Learn MQBench configuration

Configuration

MQBench provides a primary API, prepare_by_platform, for quantizing a model. MQBench ships with presets for many backends to align with hardware behavior, but you may want to customize the configuration for your own backend. This guide explains how MQBench configuration works.

1. The API prepare_by_platform accepts an extra configuration parameter, which follows this format:

extra_config = {
    'extra_qconfig_dict': {
        'w_observer': 'MSEObserver',                              # custom weight observer
        'a_observer': 'MSEObserver',                              # custom activation observer
        'w_fakequantize': 'FixedFakeQuantize',                    # custom weight fake quantize function
        'a_fakequantize': 'FixedFakeQuantize',                    # custom activation fake quantize function
        'w_qscheme': {
            'bit': 8,                                             # custom bitwidth for weight,
            'symmetry': False,                                    # custom whether quant is symmetric for weight,
            'per_channel': True,                                  # custom whether quant is per-channel or per-tensor for weight,
            'pot_scale': False,                                   # custom whether scale is power of two for weight.
        },
        'a_qscheme': {
            'bit': 8,                                             # custom bitwidth for activation,
            'symmetry': False,                                    # custom whether quant is symmetric for activation,
            'per_channel': True,                                  # custom whether quant is per-channel or per-tensor for activation,
            'pot_scale': False,                                   # custom whether scale is power of two for activation.
        }
    },
    'extra_quantizer_dict': {
        'additional_function_type': [operator.add,],              # additional function type, a list, use function full name, like operator.add.
        'additional_module_type': (torch.nn.Upsample,),           # additional module type, a tuple, use class full name, like torch.nn.Upsample.
        'additional_node_name': ['layer1_1_conv1'],               # additional node name, a list, use full node name, like layer1_1_conv1.
        'exclude_module_name': ['layer2.1.relu'],                 # skip specific modules, a list, use the module's qualified name, like layer2.1.relu.
        'exclude_function_type': [operator.mul,],                 # skip specific function types, a list, use function full name, like operator.mul.
        'exclude_node_name': ['layer1_1_conv1'],                  # skip specific nodes, a list, use full node name, like layer1_1_conv1.
    },
    'preserve_attr': {
        '': ['func1'],                                            # specify attributes of the model that should be preserved
        'backbone': ['func2'],                                    # after prepare. Since symbolic_trace only keeps attributes
                                                                  # used in forward, to keep model.func1 and model.backbone.func2
                                                                  # pass {'': ['func1'], 'backbone': ['func2']}.
    },
    'extra_fuse_dict': {                                          # check out https://github.com/ModelTC/MQBench/blob/main/mqbench/fuser_method_mappings.py for more fuse details.
        'additional_fuser_method_mapping': {
            (torch.nn.Linear, torch.nn.BatchNorm1d):
                fuse_linear_bn,                                   # fuse via a custom fuser method
        },
        'additional_fusion_pattern': {
            (torch.nn.BatchNorm1d, torch.nn.Linear):
                ConvBNReLUFusion,                                 # fuse via a fusion pattern (note: fx patterns list ops in reverse order)
        },
        'additional_qat_module_mapping': {
            nn.ConvTranspose2d: qnn.qat.ConvTranspose2d,          # map a module to its QAT counterpart (e.g. nn = torch.nn, qnn = mqbench.nn)
        },
    },
    'concrete_args': {}                                           # custom tracer behavior, see https://github.com/pytorch/pytorch/blob/efcbbb177eacdacda80b94ad4ce34b9ed6cf687a/torch/fx/_symbolic_trace.py#L836
}

2. Customize the preparation simply by passing the config in:

prepared = prepare_by_platform(model, backend, extra_config)
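
Putting it together, a minimal end-to-end sketch (assumptions: the TensorRT backend preset, torchvision's resnet18, and random tensors standing in for real calibration data; adapt these to your setup):

import torch
import torchvision.models as models

from mqbench.prepare_by_platform import prepare_by_platform, BackendType
from mqbench.utils.state import enable_calibration, enable_quantization

model = models.resnet18(pretrained=True).eval()

# Trace the model and insert observers / fake-quantize nodes,
# overriding the backend preset with extra_config from above.
prepared = prepare_by_platform(model, BackendType.Tensorrt, extra_config)

# Calibration phase: observers collect statistics, fake quantization stays off.
enable_calibration(prepared)
with torch.no_grad():
    for _ in range(8):
        prepared(torch.randn(1, 3, 224, 224))    # replace with real calibration batches

# Turn fake quantization on for evaluation or QAT fine-tuning.
enable_quantization(prepared)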

Observer

1. MinMaxObserver
2. EMAMinMaxObserver        # More general choice
3. MinMaxFloorObserver      # For Vitis HW
4. EMAMinMaxFloorObserver   # For Vitis HW
5. EMAQuantileObserver      # Quantile observer.
6. ClipStdObserver          # Usually used for DSQ.
7. LSQObserver              # Usually used for LSQ.
8. MSEObserver
9. EMAMSEObserver
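
The observer names above are passed as plain strings in extra_qconfig_dict. For example, a sketch of a learnable-quantization setup (the particular pairing of observers and fake-quantize functions is illustrative, not a backend preset):

extra_config = {
    'extra_qconfig_dict': {
        'w_observer': 'LSQObserver',                              # initializes scale for learnable weight quantization
        'a_observer': 'EMAMinMaxObserver',                        # EMA min/max statistics for activations
        'w_fakequantize': 'LearnableFakeQuantize',
        'a_fakequantize': 'LearnableFakeQuantize',
    }
}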

Quantizer

1. FixedFakeQuantize        # Unlearnable scale/zeropoint
2. LearnableFakeQuantize    # Learnable scale/zeropoint
3. NNIEFakeQuantize         # Quantize function for NNIE
4. DoReFaFakeQuantize       # Dorefa
5. DSQFakeQuantize          # DSQ
6. PACTFakeQuantize         # PACT
7. TqtFakeQuantize          # TQT
8. AdaRoundFakeQuantize
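
Fake-quantize functions are selected the same way, by name, and are typically paired with a matching qscheme. A sketch of a low-bit setup (the 4-bit DoReFa/PACT combination is illustrative; whether it matches your target hardware is an assumption to verify):

extra_config = {
    'extra_qconfig_dict': {
        'w_fakequantize': 'DoReFaFakeQuantize',                   # DoReFa-style weight quantization
        'a_fakequantize': 'PACTFakeQuantize',                     # PACT learnable clipping for activations
        'w_qscheme': {'bit': 4, 'symmetry': True,  'per_channel': True,  'pot_scale': False},
        'a_qscheme': {'bit': 4, 'symmetry': False, 'per_channel': False, 'pot_scale': False},
    }
}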