Quantization



Quantization-aware model training ensures that the forward pass uses the same numerical precision during training as it will during inference. There are two aspects to this:

  • Operator fusion at inference time is accurately modeled at training time.
  • Quantization effects at inference are modeled at training time (see the sketch after this list).
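To illustrate the second point, quantization-aware training inserts "fake quantization" operations into the forward pass so the model is trained against the same rounding and clipping error it will see at int8 inference. The sketch below uses TensorFlow's tf.quantization.fake_quant_with_min_max_args op; the single dense layer, the value ranges, and the function and variable names are illustrative assumptions, not the actual graph-rewriting pass.

  import tensorflow as tf
  
  def fake_quantized_dense(x, weights, bias):
      # Simulate 8-bit quantization of the weights so training sees the
      # same rounding error as int8 inference (ranges are placeholders).
      w_q = tf.quantization.fake_quant_with_min_max_args(
          weights, min=-1.0, max=1.0, num_bits=8)
      y = tf.matmul(x, w_q) + bias
      # Simulate 8-bit quantization of the activations as well.
      return tf.quantization.fake_quant_with_min_max_args(
          y, min=-6.0, max=6.0, num_bits=8)

In the real training pipeline these ranges are learned or tracked with moving statistics rather than fixed by hand; the fixed min/max values here only keep the sketch self-contained.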

For efficient inference, TensorFlow folds batch normalization into the preceding convolutional and fully-connected layers prior to quantization, so the batch norm parameters are absorbed into those layers' weights and biases and only the combined operation needs to be quantized.
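A minimal sketch of what batch-norm folding computes, assuming per-output-channel batch norm parameters (gamma, beta, moving mean, moving variance) following a convolution; the function and variable names are illustrative, not TensorFlow API.

  import numpy as np
  
  def fold_batch_norm(conv_w, conv_b, gamma, beta, mean, var, eps=1e-3):
      """Fold batch norm into the preceding conv: BN(conv(x, w) + b)
      becomes conv(x, w * scale) + ((b - mean) * scale + beta)."""
      scale = gamma / np.sqrt(var + eps)            # per-output-channel scale
      folded_w = conv_w * scale                     # conv_w: [kh, kw, cin, cout]
      folded_b = (conv_b - mean) * scale + beta     # absorb the BN shift into the bias
      return folded_w, folded_b

After folding, only the single fused convolution is quantized, which is why the folded graph must also be what quantization-aware training simulates.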