Quantization

<youtube>eZdOkDtYMoo</youtube>
 

Revision as of 20:37, 2 March 2019

[http://www.youtube.com/results?search_query=Quantization+aware+model+training YouTube search...]
[http://www.google.com/search?q=Quantization+aware+model+training ...Google search]

* [http://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/quantize#quantization-aware-training Quantization-aware training]

Quantization-aware model training ensures that the forward pass matches precision for both training and inference. There are two aspects to this:

  • Operator fusion at inference time is accurately modeled at training time.
  • Quantization effects at inference are modeled at training time.
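The second point is the core of the technique: the forward pass rounds values exactly as the quantized inference engine would, so the network learns weights that survive that rounding. A minimal NumPy sketch of this "fake quantization" idea follows; the function name and the 8-bit affine (min/max) scheme are illustrative assumptions, not TensorFlow's actual API.

```python
import numpy as np

def fake_quantize(x, num_bits=8):
    """Simulate inference-time quantization during training: round x to
    the nearest of 2**num_bits levels spanning its observed range, then
    map back to floating point so training continues in float."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid div-by-zero on constant inputs
    q = np.clip(np.round((x - lo) / scale), qmin, qmax)
    return q * scale + lo  # de-quantized values carry the rounding error

# The returned tensor differs from x by at most half a quantization step,
# and that error is what the training loss gets to see and adapt to.
w = np.linspace(-1.0, 1.0, 7)
w_q = fake_quantize(w)
```

In a real quantization-aware training setup this operation is inserted after weights and activations, with gradients passed straight through the rounding step.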

For efficient inference, [[TensorFlow]] folds batch normalization layers into the preceding convolutional and fully-connected layers before quantization, so the fused operation can be quantized as a single linear layer.
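Folding works because batch norm at inference time is just a per-channel affine transform, which can be absorbed into the preceding layer's weights and bias. A hedged NumPy sketch, using a fully-connected layer for brevity (the function name and shapes are assumptions for illustration; a convolution folds the same way, per output channel):

```python
import numpy as np

def fold_batch_norm(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold an inference-mode batch-norm layer into the preceding layer,
    so inference needs only one linear op.
    w: (out_features, in_features); b, gamma, beta, mean, var: (out_features,)."""
    scale = gamma / np.sqrt(var + eps)      # per-output-channel BN factor
    w_folded = w * scale[:, None]           # scale each output channel's weights
    b_folded = (b - mean) * scale + beta    # absorb the BN shift into the bias
    return w_folded, b_folded

# Sanity check: layer followed by BN equals the single folded layer.
rng = np.random.default_rng(0)
w = rng.normal(size=(3, 4)); b = rng.normal(size=3)
gamma = rng.normal(size=3); beta = rng.normal(size=3)
mean = rng.normal(size=3); var = rng.random(3) + 0.1
x = rng.normal(size=(5, 4))

y_unfused = (x @ w.T + b - mean) / np.sqrt(var + 1e-5) * gamma + beta
w_f, b_f = fold_batch_norm(w, b, gamma, beta, mean, var)
y_fused = x @ w_f.T + b_f
assert np.allclose(y_unfused, y_fused)
```

Quantizing the folded weights directly is what lets training-time quantization match the fused graph the inference engine actually runs.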