[http://www.youtube.com/results?search_query=Quantization+aware+model+training YouTube search...]
[http://www.google.com/search?q=Quantization+aware+model+training ...Google search]

* [http://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/quantize#quantization-aware-training Quantization-aware training]

<youtube>eZdOkDtYMoo</youtube>

Quantization-aware model training ensures that the forward pass matches precision for both training and inference. There are two aspects to this:
* Operator fusion at inference time is accurately modeled at training time.
* Quantization effects at inference are modeled at training time (see the fake-quantization sketch below).
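To give a rough sense of what "quantization effects modeled at training time" means in practice, here is a minimal fake-quantization sketch in plain NumPy (an illustration only, not TensorFlow's actual op; the function name and the asymmetric 8-bit scheme are assumptions for the example): the forward pass rounds a float tensor onto an 8-bit grid and immediately converts it back to float, so training sees the same rounding error that integer inference will introduce.

<pre>
# Illustrative fake quantization (plain NumPy sketch, not the TensorFlow op).
import numpy as np

def fake_quantize(x, num_bits=8):
    """Quantize x onto a num_bits integer grid, then dequantize back to float."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (qmax - qmin) or 1.0        # guard against a constant tensor
    q = np.clip(np.round((x - x_min) / scale), qmin, qmax)  # integer code
    return (q * scale + x_min).astype(x.dtype)              # back to float, with rounding error

w = np.random.randn(3, 3).astype(np.float32)
print(fake_quantize(w))   # float tensor carrying the 8-bit rounding error
</pre>

In the TensorFlow 1.x contrib tooling linked above, this graph rewriting is applied automatically: tf.contrib.quantize.create_training_graph inserts the fake-quantization ops for training, and create_eval_graph prepares the graph for quantized inference export.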
For efficient inference, [[TensorFlow]] combines batch normalization with the preceding convolutional and fully-connected layers prior to quantization by folding the batch norm parameters into those layers.
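To make the folding step concrete, here is a minimal NumPy sketch (illustrative only, using a hypothetical dense layer rather than TensorFlow's actual graph rewrite) showing how the batch-norm scale, shift, and running statistics can be merged into the preceding layer's weights and bias:

<pre>
# Illustrative batch norm folding (plain NumPy sketch; TensorFlow performs the
# equivalent rewrite on conv/fully-connected ops in the graph).
import numpy as np

def fold_batch_norm(W, b, gamma, beta, mean, var, eps=1e-3):
    """Fold y = gamma * (W @ x + b - mean) / sqrt(var + eps) + beta
    into an equivalent single affine layer y = W_fold @ x + b_fold."""
    scale = gamma / np.sqrt(var + eps)    # per-output-channel scale
    W_fold = W * scale[:, np.newaxis]     # scale each row (output channel) of W
    b_fold = beta + (b - mean) * scale
    return W_fold, b_fold

# Sanity check: the folded layer matches the dense layer followed by batch norm.
x = np.random.randn(4)
W, b = np.random.randn(2, 4), np.random.randn(2)
gamma, beta = np.random.randn(2), np.random.randn(2)
mean, var = np.random.randn(2), np.abs(np.random.randn(2))

y_bn = gamma * (W @ x + b - mean) / np.sqrt(var + 1e-3) + beta
W_f, b_f = fold_batch_norm(W, b, gamma, beta, mean, var)
assert np.allclose(y_bn, W_f @ x + b_f)
</pre>

Folding before quantization matters because the folded weights are what actually run at inference; modeling quantization on the unfolded weights would expose training to the wrong rounding error.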