mlperf-v0.7

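MLPerf is the industry's first general-purpose benchmark suite for measuring the performance of machine learning hardware and software. It was launched in 2018 by Turing Award winner David Patterson together with Google and several well-known universities.
MLPerf serves as a benchmark for AI chips and covers two kinds of performance tests: Training and Inference. Training measures how quickly a system trains a model to a target quality metric; Inference measures how quickly a system processes inputs and produces results with a trained model.
The MLPerf benchmark consortium currently has 83 members: 73 companies, including Google, NVIDIA, Microsoft, Facebook, and Alibaba, and 10 universities, including Stanford, Harvard, and the University of Toronto.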

As AI technology advances, this year's benchmarks have been made more demanding.

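The MLPerf Training benchmark covers 8 machine learning tasks, including image classification, translation, recommender systems, and Go. The final result is the training time on each of these 8 tasks: the shorter the time, the better the performance.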
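The time-to-train idea described above (train until a target quality is reached, then report the elapsed time) can be illustrated with a minimal sketch. This is not the official MLPerf harness: the function name `time_to_train`, its parameters, and the toy "model" whose score rises by 10 points per epoch are all assumptions made for illustration.

```python
import time


def time_to_train(train_one_epoch, evaluate, target_quality, max_epochs=100):
    """Train until evaluate() reaches target_quality.

    Returns (elapsed_seconds, epochs_used), or (None, max_epochs) if the
    target is never reached within max_epochs.
    """
    start = time.perf_counter()
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        if evaluate() >= target_quality:
            return time.perf_counter() - start, epoch
    return None, max_epochs


# Hypothetical stand-in for a real model: the score (in percent)
# improves by 10 points per epoch.
state = {"score": 0}


def fake_train():
    state["score"] += 10


def fake_eval():
    return state["score"]


seconds, epochs = time_to_train(fake_train, fake_eval, target_quality=75)
print(f"reached target after {epochs} epochs in {seconds:.6f} s")
```

A faster system would finish the same loop in less wall-clock time, which is the quantity the benchmark compares.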

The 8 tasks are as follows:

[Figure: table of the 8 MLPerf Training tasks]

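The last three are newly added or reworked benchmarks:

1. BERT: trains BERT on the Wikipedia corpus. This is the first time BERT has been included in the MLPerf benchmarks.

2. DLRM: a deep learning recommendation model (DLRM) trained on the Criteo AI Lab Terabyte click-through-rate dataset. Models of this kind are widely used for online shopping recommendations, search results, and social media content ranking.

3. MiniGo: MLPerf v0.5 and v0.6 also included a reinforcement learning task that trains a Go player, but on a reduced-size board. v0.7 enlarges the board to the full 19×19 size, which better reflects research results.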

Thanks to zixuan for compiling the data. Some of the summarized conclusions are listed below:

Submitter – Software Relationship

[Figure: submitter vs. software]

  • MXNet, PyTorch, and TensorFlow remain the mainstream deep learning frameworks.
  • Customized frameworks, e.g. Huawei MindSpore and NVIDIA Merlin, are also entering public view.

Submitter – Field Relationship

[Figure: submitter vs. field]

  • NVIDIA and Google are active in all the deep learning fields.
  • The image classification benchmark is popular and is adopted by almost all the companies.
  • RL, recommendation, and NLP received less attention.
  • Intel has no submissions for NLP or object detection.
  • Companies from mainland China, e.g. Alibaba, Inspur, Shenzhen*, still need more effort to stand out.

Software – Field Relationship

[Figure: software vs. field]

  • TensorFlow is adopted in all the fields.
  • MXNet has submissions for image classification and object detection (perhaps because the gluon-cv toolkit is much more developed than others, like gluon-nlp?).
  • Combining two frameworks (MXNet + PyTorch) is also an option.
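
The three relationship views above (submitter vs. software, submitter vs. field, software vs. field) can all be derived from one flat list of submission records. The sketch below shows one way to do that; the helper `relate`, the record layout, and the placeholder submitters "SubmitterA" and "SubmitterB" are hypothetical and do not reproduce the actual v0.7 results.

```python
from collections import defaultdict


def relate(records, row, col):
    """Map each distinct value of `row` to the sorted values of `col` seen with it."""
    table = defaultdict(set)
    for rec in records:
        table[rec[row]].add(rec[col])
    return {key: sorted(values) for key, values in table.items()}


# Hypothetical records; real ones would come from the published results table.
records = [
    {"submitter": "SubmitterA", "software": "TensorFlow", "field": "image classification"},
    {"submitter": "SubmitterA", "software": "TensorFlow", "field": "NLP"},
    {"submitter": "SubmitterB", "software": "MXNet", "field": "image classification"},
    {"submitter": "SubmitterB", "software": "PyTorch", "field": "object detection"},
]

submitter_software = relate(records, "submitter", "software")
submitter_field = relate(records, "submitter", "field")
software_field = relate(records, "software", "field")

for name, table in [
    ("submitter -> software", submitter_software),
    ("submitter -> field", submitter_field),
    ("software -> field", software_field),
]:
    print(name, table)
```

Using sets inside `relate` means that repeated submissions with the same pairing are counted once, which matches a presence/absence style matrix.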

References:
https://xueqiu.com/1097649362/155335766
https://mlperf.org/training-results-0-7