An Implementation of ERNIE For Language Understanding (including Pre-training models and Fine-tuning tools)
ERNIE (Wenxin) is Baidu's family of industrial-grade, knowledge-enhanced large models, covering both NLP and cross-modal models. In March 2019 Baidu open-sourced ERNIE 1.0, the first pre-trained model open-sourced in China. Since then the family has delivered a series of technical breakthroughs in language and cross-modal understanding and generation, and a series of models has been open-sourced to support large-model research and industrial applications.

Note: the legacy ERNIE code has been moved to the repro branch. We recommend developing with the newly upgraded ERNIE toolkit, which combines dynamic and static graphs. You are also welcome to try EasyDL and BML for richer functionality.
News:
- The ERNIE-ViL 2.0 (base) model is officially open-sourced!
- ERNIE-M is officially open-sourced.
- ERNIE-GEN is officially open-sourced (accepted at IJCAI-2020).
```bash
git clone https://github.com/PaddlePaddle/ERNIE.git

# download the ERNIE 3.0 model
# enter the models_hub directory
cd ./applications/models_hub
# run the download script
sh download_ernie_3.0_base_ch.sh
```
```bash
# enter the text classification task directory
cd ./applications/tasks/text_classification/
# inspect the dataset bundled with the text classification task
ls ./data
# inspect the config for fine-tuning the ERNIE 3.0 pre-trained model on text classification
cat ./examples/cls_ernie_fc_ch.json
```
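Beyond `cat`, the config can also be inspected programmatically. A minimal sketch, assuming only that the file is valid JSON (the section names inside vary by task):

```python
import json

# Load the text classification task config and list its top-level sections.
with open("./examples/cls_ernie_fc_ch.json") as f:
    config = json.load(f)

for section, value in config.items():
    print(f"{section}: {type(value).__name__}")
```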
Training is launched with run_trainer.py. As shown below, it trains an ERNIE-based Chinese text classification model locally on the training set:
```bash
# ERNIE Chinese text classification model
# trains a preset network from a JSON config, here ./examples/cls_ernie_fc_ch.json
python run_trainer.py --param_path ./examples/cls_ernie_fc_ch.json
```
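To try a variant without editing the original file, you can patch the config in code and relaunch the trainer. A sketch; the key paths inside the JSON (for example, where the epoch count lives) are not shown in this README and are assumptions, so check the actual file first:

```python
import json
import subprocess

with open("./examples/cls_ernie_fc_ch.json") as f:
    config = json.load(f)

# Patch whatever fields you need here, e.g. an epoch count or output path.
# The exact key paths depend on the task config, so adapt the edits to the
# real layout of the file.

patched_path = "./examples/cls_ernie_fc_ch_patched.json"  # hypothetical file name
with open(patched_path, "w") as f:
    json.dump(config, f, ensure_ascii=False, indent=2)

# Relaunch training with the patched config via the documented flag.
subprocess.run(["python", "run_trainer.py", "--param_path", patched_path], check=True)
```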
For multi-GPU training, launch the same script with fleetrun, replacing x,y with your GPU ids:

```bash
fleetrun --gpus=x,y run_trainer.py --param_path ./examples/cls_ernie_fc_ch.json
```
Prediction is driven by its own config, ./examples/cls_ernie_fc_ch_infer.json; the key fields are the input data path, the exported inference model, and the output file:

```json
{
  "dataset_reader": {
    "train_reader": {
      "config": {
        "data_path": "./data/predict_data"
      }
    }
  },
  "inference": {
    "inference_model_path": "./output/cls_ernie_fc_ch/save_inference_model/inference_step_251",
    "output_path": "./output/predict_result.txt"
  }
}
```
Then run prediction with that config:

```bash
python run_infer.py --param_path ./examples/cls_ernie_fc_ch_infer.json
```
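After prediction finishes, results land in the `output_path` configured above. A quick peek at the first few lines (the per-line format is not documented here, so this just prints raw lines):

```python
# Print the first five raw lines of the prediction output.
with open("./output/predict_result.txt") as f:
    for i, line in enumerate(f):
        if i >= 5:
            break
        print(line.rstrip())
```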
To reproduce the benchmark results below, first download the pre-trained model:

```bash
# enter the pre-trained model download directory
cd ./applications/models_hub
# download the ERNIE 3.0 Base model
sh download_ernie_3.0_base_ch.sh
```
| Config | Model | CLUEWSC2020 | IFLYTEK | TNEWS | AFQMC | CMNLI | CSL | OCNLI | Avg |
|---|---|---|---|---|---|---|---|---|---|
| 24L1024H | RoBERTa-wwm-ext-large | 90.79 | 62.02 | 59.33 | 76.00 | 83.88 | 83.67 | 78.81 | 76.36 |
| 20L1024H | ERNIE 3.0-XBase | 91.12 | 62.22 | 60.34 | 76.95 | 84.98 | 84.27 | 82.07 | 77.42 |
| 12L768H | RoBERTa-wwm-ext-base | 88.55 | 61.22 | 58.08 | 74.75 | 81.66 | 81.63 | 77.25 | 74.73 |
| 12L768H | ERNIE 3.0-Base | 88.18 | 60.72 | 58.73 | 76.53 | 83.65 | 83.30 | 80.31 | 75.63 |
| 6L768H | RBT6, Chinese | 75.00 | 59.68 | 56.62 | 73.15 | 79.26 | 80.04 | 73.15 | 70.99 |
| 6L768H | ERNIE 3.0-Medium | 79.93 | 60.14 | 57.16 | 74.56 | 80.87 | 81.23 | 77.02 | 72.99 |
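The Avg column is the arithmetic mean of the seven task scores; for example, for ERNIE 3.0-Medium:

```python
# CLUEWSC2020, IFLYTEK, TNEWS, AFQMC, CMNLI, CSL, OCNLI for ERNIE 3.0-Medium
scores = [79.93, 60.14, 57.16, 74.56, 80.87, 81.23, 77.02]
print(round(sum(scores) / len(scores), 2))  # 72.99, matching the Avg column
```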
Classification and matching tasks use the following fixed fine-tuning hyperparameters:

| TASK | AFQMC | TNEWS | IFLYTEK | CMNLI | OCNLI | CLUEWSC2020 | CSL |
|---|---|---|---|---|---|---|---|
| epoch | 3 | 3 | 3 | 2 | 5 | 50 | 5 |
| max_seq_length | 128 | 128 | 128 | 128 | 128 | 128 | 256 |
| warmup_proportion | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
The best batch size and learning rate per task (cell format: bsz_{batch size}_lr_{learning rate}):

| Model | AFQMC | TNEWS | IFLYTEK | CMNLI | OCNLI | CLUEWSC2020 | CSL |
|---|---|---|---|---|---|---|---|
| ERNIE 3.0-Medium | bsz_32_lr_2e-05 | bsz_16_lr_3e-05 | bsz_16_lr_5e-05 | bsz_16_lr_1e-05 / bsz_64_lr_2e-05 | bsz_64_lr_2e-05 | bsz_8_lr_2e-05 | bsz_32_lr_1e-05 |
| ERNIE 3.0-Base | bsz_16_lr_2e-05 | bsz_64_lr_3e-05 | bsz_16_lr_5e-05 | bsz_16_lr_2e-05 | bsz_16_lr_2e-05 | bsz_8_lr_2e-05 (dropout 0.1) | bsz_16_lr_3e-05 |
| ERNIE 3.0-XBase | bsz_16_lr_1e-05 | bsz_16_lr_2e-05 | bsz_16_lr_3e-05 | bsz_16_lr_1e-05 | bsz_32_lr_2e-05 | bsz_8_lr_2e-05 | bsz_64_lr_1e-05 |
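For scripting sweeps, a row of this table can be transcribed into a lookup table. A sketch for ERNIE 3.0-Base; how these values map into each task's JSON config is task-specific:

```python
# Best (batch size, learning rate) per CLUE task for ERNIE 3.0-Base,
# transcribed from the table above.
BEST_HPARAMS_BASE = {
    "AFQMC":       (16, 2e-05),
    "TNEWS":       (64, 3e-05),
    "IFLYTEK":     (16, 5e-05),
    "CMNLI":       (16, 2e-05),
    "OCNLI":       (16, 2e-05),
    "CLUEWSC2020": (8,  2e-05),  # also uses dropout 0.1
    "CSL":         (16, 3e-05),
}

for task, (bsz, lr) in sorted(BEST_HPARAMS_BASE.items()):
    print(f"{task}: batch_size={bsz}, learning_rate={lr}")
```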
Task examples included in the toolkit:
- Text classification
- Text matching
- Sequence labeling
- Information extraction
- Text generation
- Image-text matching
- Data distillation
- Tool usage
If ERNIE is helpful to your work, please cite the corresponding papers:

```bibtex
@article{sun2019ernie,
  title={Ernie: Enhanced representation through knowledge integration},
  author={Sun, Yu and Wang, Shuohuan and Li, Yukun and Feng, Shikun and Chen, Xuyi and Zhang, Han and Tian, Xin and Zhu, Danxiang and Tian, Hao and Wu, Hua},
  journal={arXiv preprint arXiv:1904.09223},
  year={2019}
}

@inproceedings{sun2020ernie,
  title={Ernie 2.0: A continual pre-training framework for language understanding},
  author={Sun, Yu and Wang, Shuohuan and Li, Yukun and Feng, Shikun and Tian, Hao and Wu, Hua and Wang, Haifeng},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={05},
  pages={8968--8975},
  year={2020}
}

@article{xiao2020ernie,
  title={Ernie-gen: An enhanced multi-flow pre-training and fine-tuning framework for natural language generation},
  author={Xiao, Dongling and Zhang, Han and Li, Yukun and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2001.11314},
  year={2020}
}

@article{yu2020ernie,
  title={Ernie-vil: Knowledge enhanced vision-language representations through scene graph},
  author={Yu, Fei and Tang, Jiji and Yin, Weichong and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2006.16934},
  year={2020}
}

@article{xiao2020erniegram,
  title={ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding},
  author={Xiao, Dongling and Li, Yu-Kun and Zhang, Han and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2010.12148},
  year={2020}
}

@article{ding2020ernie,
  title={ERNIE-Doc: A retrospective long-document modeling transformer},
  author={Ding, Siyu and Shang, Junyuan and Wang, Shuohuan and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2012.15688},
  year={2020}
}

@article{li2020unimo,
  title={Unimo: Towards unified-modal understanding and generation via cross-modal contrastive learning},
  author={Li, Wei and Gao, Can and Niu, Guocheng and Xiao, Xinyan and Liu, Hao and Liu, Jiachen and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2012.15409},
  year={2020}
}

@article{ouyang2020ernie,
  title={Ernie-m: Enhanced multilingual representation by aligning cross-lingual semantics with monolingual corpora},
  author={Ouyang, Xuan and Wang, Shuohuan and Pang, Chao and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2012.15674},
  year={2020}
}
```