Natural Language Processing Benchmark

Based on MQBench, we provide a sentence classification benchmark on GLUE, reporting results for bert-base-uncased on eight GLUE tasks.

Post-training Quantization

  • Backend: Academic

  • W_calibration: MinMax

  • A_calibration: EMAQuantile
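
For reference, the settings above map onto MQBench's prepare_by_platform API roughly as in the sketch below. This is a minimal illustration, not the exact benchmark script: `calib_loader` stands in for a small calibration split, and any torch.fx tracing details needed for the Hugging Face model are omitted.

```python
# A minimal PTQ sketch, assuming MQBench's prepare_by_platform API.
# calib_loader is a placeholder for a small calibration data loader.
import torch
from transformers import BertForSequenceClassification
from mqbench.prepare_by_platform import prepare_by_platform, BackendType
from mqbench.utils.state import enable_calibration, enable_quantization

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

# Match the settings above: MinMax for weights, EMAQuantile for activations.
extra_qconfig_dict = {
    "w_observer": "MinMaxObserver",
    "a_observer": "EMAQuantileObserver",
}
prepare_custom_config_dict = {"extra_qconfig_dict": extra_qconfig_dict}
model = prepare_by_platform(model, BackendType.Academic, prepare_custom_config_dict)

enable_calibration(model)       # observers collect statistics; no fake-quant yet
with torch.no_grad():
    for batch in calib_loader:  # calib_loader: a small calibration split (assumed)
        model(**batch)

enable_quantization(model)      # evaluate with int8 fake quantization enabled
```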

| Task | Metrics                | FP32 results | INT8 results |
| ---- | ---------------------- | ------------ | ------------ |
| mrpc | acc/F1                 | 87.75/91.35  | 87.75/91.20  |
| mnli | acc (m/mm)             | 84.94/84.76  | 84.69/84.59  |
| cola | Matthews corr.         | 59.60        | 59.41        |
| sst2 | acc                    | 93.35        | 92.78        |
| stsb | Pearson/Spearman corr. | 89.70/89.28  | 89.36/89.22  |
| qqp  | F1/acc                 | 87.82/90.91  | 87.46/90.72  |
| rte  | acc                    | 72.56        | 71.84        |
| qnli | acc                    | 91.84        | 91.32        |