Natural Language Processing Benchmark

Based on MQBench, we provide a sentence classification benchmark on GLUE, reporting results for bert-base-uncased on eight GLUE tasks.

Post-training Quantization

  • Backend: Academic

  • W_calibration: MinMax

  • A_calibration: EMAQuantile
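
For reference, the settings above map onto MQBench's prepare_by_platform API roughly as in the sketch below. This is a minimal illustration, not the exact benchmark script: `calib_loader` stands in for a small calibration split, and any torch.fx tracing details needed for the Hugging Face model are omitted.

```python
# A minimal PTQ sketch, assuming MQBench's prepare_by_platform API.
# calib_loader is a placeholder for a small calibration data loader.
import torch
from transformers import BertForSequenceClassification
from mqbench.prepare_by_platform import prepare_by_platform, BackendType
from mqbench.utils.state import enable_calibration, enable_quantization

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

# Match the settings above: MinMax for weights, EMAQuantile for activations.
extra_qconfig_dict = {
    "w_observer": "MinMaxObserver",
    "a_observer": "EMAQuantileObserver",
}
prepare_custom_config_dict = {"extra_qconfig_dict": extra_qconfig_dict}
model = prepare_by_platform(model, BackendType.Academic, prepare_custom_config_dict)

enable_calibration(model)       # observers collect statistics; no fake-quant yet
with torch.no_grad():
    for batch in calib_loader:  # calib_loader: a small calibration split (assumed)
        model(**batch)

enable_quantization(model)      # evaluate with int8 fake quantization enabled
```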

| Task | Metrics                | FP32 results | INT8 results |
| ---- | ---------------------- | ------------ | ------------ |
| mrpc | acc/F1                 | 87.75/91.35  | 87.75/91.20  |
| mnli | acc (m/mm)             | 84.94/84.76  | 84.69/84.59  |
| cola | Matthews corr.         | 59.60        | 59.41        |
| sst2 | acc                    | 93.35        | 92.78        |
| stsb | Pearson/Spearman corr. | 89.70/89.28  | 89.36/89.22  |
| qqp  | F1/acc                 | 87.82/90.91  | 87.46/90.72  |
| rte  | acc                    | 72.56        | 71.84        |
| qnli | acc                    | 91.84        | 91.32        |