Difference between revisions of "Quantization"
m (BPeat moved page Quantization-aware Training to Quantization-aware Model Training without leaving a redirect) |
|||
Line 8: | Line 8: | ||
[http://www.google.com/search?q=Google+AIY+Projects+Program+artificial+intelligence+deep+learning ...Google search] | [http://www.google.com/search?q=Google+AIY+Projects+Program+artificial+intelligence+deep+learning ...Google search] | ||
− | * [ | + | * [http://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/quantize#quantization-aware-training Quantization-aware training] |
− | <youtube> | + | Quantization-aware model training ensures that the forward pass matches precision for both training and inference. There are two aspects to this: |
+ | |||
+ | * Operator fusion at inference time is accurately modeled at training time. | ||
+ | * Quantization effects at inference are modeled at training time. | ||
+ | |||
+ | For efficient inference, TensorFlow combines batch normalization with the preceding convolutional and fully-connected layers prior to quantization by folding batch norm layers. | ||
+ | |||
+ | <youtube>eZdOkDtYMoo</youtube> |
Revision as of 20:30, 2 March 2019
YouTube search... ...Google search
Quantization-aware model training ensures that the forward pass matches precision for both training and inference. There are two aspects to this:
- Operator fusion at inference time is accurately modeled at training time.
- Quantization effects at inference are modeled at training time.
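One common way to model quantization effects during training is "fake quantization": a value is clamped to a range, snapped to one of a fixed number of levels, and converted back to floating point, so the forward pass sees the rounding error that inference would introduce. The sketch below is a minimal pure-Python illustration, not TensorFlow's implementation. The function name `fake_quantize` and its parameters are assumptions made for this example, and it leaves out details a real framework may apply, such as adjusting the range so that zero is exactly representable.

```python
def fake_quantize(x, x_min, x_max, num_bits=8):
    """Quantize x to a uniform grid on [x_min, x_max], then dequantize.

    Hypothetical helper for illustration only. The result is still a
    float, but it can only take one of 2**num_bits distinct values.
    """
    if not x_min < x_max:
        raise ValueError("x_min must be less than x_max")
    levels = 2 ** num_bits - 1            # number of steps between the endpoints
    scale = (x_max - x_min) / levels      # width of one quantization step
    clamped = min(max(x, x_min), x_max)   # saturate values outside the range
    q = round((clamped - x_min) / scale)  # integer level in [0, levels]
    return x_min + q * scale              # map the level back to a float
```

With a range of 0.0 to 255.0 and 8 bits the step size is 1.0, so an input of 3.3 comes back as 3.0, and inputs outside the range saturate at the endpoints.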
For efficient inference, TensorFlow combines batch normalization with the preceding convolutional and fully-connected layers prior to quantization by folding batch norm layers.
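Folding can be illustrated for a fully-connected layer: because batch normalization at inference is a per-output affine transform, it can be absorbed into the layer's weights and bias. The sketch below is a minimal pure-Python illustration under that assumption, not TensorFlow's own code. The helper names (`fold_batch_norm`, `dense`, `batch_norm`) and the `eps` default are hypothetical choices for this example.

```python
import math


def dense(weights, bias, x):
    """Fully-connected layer: one output per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def batch_norm(y, gamma, beta, mean, variance, eps=1e-3):
    """Inference-time batch norm using fixed (moving-average) statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, variance)]


def fold_batch_norm(weights, bias, gamma, beta, mean, variance, eps=1e-3):
    """Return (weights, bias) of a single layer equal to dense + batch norm."""
    folded_w, folded_b = [], []
    for row, b, g, bt, m, v in zip(weights, bias, gamma, beta, mean, variance):
        s = g / math.sqrt(v + eps)          # per-output scale factor
        folded_w.append([w * s for w in row])
        folded_b.append((b - m) * s + bt)   # shift absorbed into the bias
    return folded_w, folded_b
```

Applying `dense` with the folded parameters should give the same outputs, up to floating-point error, as `dense` followed by `batch_norm`. Quantizing the folded weights rather than the originals is what lets training see the same weights that inference will use.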