PaddlePALM

English | 简体中文

PaddlePALM (PArallel Learning from Multi-tasks) is a flexible, general and easy-to-use framework for large-scale NLP pretraining and multi-task learning. PALM is a high-level framework aiming at fast development of high-performance NLP models.

With PaddlePALM, it is easy and flexible to explore robust machine reading comprehension models trained with various auxiliary tasks. D-Net, a model trained with PALM, won first place in the EMNLP 2019 MRQA international reading comprehension evaluation.

[Figure: MRQA2019 leaderboard]

Beyond lowering the cost of NLP research, PaddlePALM has been applied in Baidu Search, effectively improving the accuracy of query understanding and the quality of mined answers, with high reliability and high training/inference performance.

Features:

  • Easy to use: with PALM, a typical NLP task can be implemented in 8 steps. Moreover, model backbones, dataset readers and task output heads are decoupled, so any component can be swapped for another candidate with very small code changes.
  • Multi-task learning support: a multi-task learning job can be implemented in 6 steps.
  • Large-scale tasks and pretraining support: multi-GPU acceleration of training and inference is exploited automatically; distributed training on a cluster requires little extra code.
  • Popular NLP backbones and pretrained models: many state-of-the-art general-purpose architectures and pretrained models (e.g., BERT, ERNIE, RoBERTa) are built in.
  • Easy to customize: any component (e.g., backbone, task head, reader, optimizer) can be custom-developed and reused alongside the predefined ones, giving developers high flexibility and efficiency across different NLP scenarios.

You can easily reproduce strong results with little code, covering most NLP tasks such as classification, matching, sequence labeling, reading comprehension, dialogue understanding and more. Further details can be found in the examples.

| Dataset | Split | Metric | ERNIE Base |
| ------- | ----- | ------ | ---------- |
| chnsenticorp | test | accuracy | 95.8 |
| chnsenticorp | test | f1-score | 95.8 |
| Quora Question Pairs matching | test | accuracy | 86.2 |
| Quora Question Pairs matching | test | f1-score | 82.2 |
| MSRA-NER (SIGHAN2006) | test | f1-score | 99.2 |
| CMRC2018 | dev | em | 64.3 |
| CMRC2018 | dev | f1-score | 85.2 |

Package Overview

[Figure: PALM architecture diagram]

PaddlePALM is a well-designed high-level NLP framework. Lightweight code built on PaddlePALM can efficiently implement supervised learning, unsupervised/self-supervised learning, multi-task learning and transfer learning. The PaddlePALM architecture has three layers, from bottom to top: the component layer, the trainer layer and the high-level trainer layer.

At the component layer, PaddlePALM provides six decoupled components for implementing NLP tasks. Each component contains rich predefined classes and one base class. The predefined classes target typical NLP tasks, while the base class helps users develop new classes (based on a predefined class or on the base class itself).

The trainer layer builds the computation graph from the selected components and performs training and prediction. This layer covers the training strategy, model saving and loading, evaluation and the prediction process. One trainer handles exactly one task.

The high-level trainer layer serves complex learning and inference strategies such as multi-task learning. You can add auxiliary tasks to train robust NLP models (improving test-set and out-of-domain performance), or jointly train multiple related tasks to achieve higher performance on each of them.

| Module | Description |
| ------ | ----------- |
| paddlepalm | A high-level NLP pretraining and multi-task learning framework built on the PaddlePaddle framework. |
| paddlepalm.reader | Predefined dataset readers and preprocessing tools for each task. |
| paddlepalm.backbone | Predefined backbone networks, such as BERT, ERNIE and RoBERTa. |
| paddlepalm.head | Predefined task output heads. |
| paddlepalm.lr_sched | Predefined learning-rate scheduling strategies. |
| paddlepalm.optimizer | Predefined optimizers. |
| paddlepalm.downloader | The management and download module for pretrained models. |
| paddlepalm.Trainer | The single-task training/prediction unit. A trainer builds the computation graph, manages training and evaluation, and implements model/checkpoint saving as well as pretrained-model/checkpoint loading. |
| paddlepalm.MultiHeadTrainer | The module for multi-task training/prediction. A MultiHeadTrainer is built on several Trainers and implements backbone reuse across tasks, multi-task learning, multi-task inference, and more. |

Installation

PaddlePALM supports Python 2 and Python 3, Linux and Windows, and CPU and GPU. The preferred way to install PaddlePALM is via pip. Simply run:

```shell
pip install paddlepalm
```

Installing from source

```shell
git clone https://github.com/PaddlePaddle/PALM.git
cd PALM && python setup.py install
```

Requirements

  • Python >= 2.7
  • cuda >= 9.0
  • cudnn >= 7.0
  • PaddlePaddle >= 1.7.0 (please refer to the installation guide)

Downloading pretrained models

We provide many pretrained models to initialize the backbone parameters. Training a large NLP model such as a 12-layer Transformer from a pretrained model is, in practice, much more effective than training from randomly initialized parameters. To list all available pretrained models and download one, run the following code in a Python interpreter (enter the interpreter with the command python in your shell):

```python
>>> from paddlepalm import downloader
>>> downloader.ls('pretrain')
Available pretrain items:
  => RoBERTa-zh-base
  => RoBERTa-zh-large
  => ERNIE-v2-en-base
  => ERNIE-v2-en-large
  => XLNet-cased-base
  => XLNet-cased-large
  => ERNIE-v1-zh-base
  => ERNIE-v1-zh-base-max-len-512
  => BERT-en-uncased-large-whole-word-masking
  => BERT-en-cased-large-whole-word-masking
  => BERT-en-uncased-base
  => BERT-en-uncased-large
  => BERT-en-cased-base
  => BERT-en-cased-large
  => BERT-multilingual-uncased-base
  => BERT-multilingual-cased-base
  => BERT-zh-base

>>> downloader.download('pretrain', 'BERT-en-uncased-base', './pretrain_models')
...
```

Usage

Quick start

Eight steps to start a typical NLP training task.

  1. Use paddlepalm.reader to create a reader for dataset loading and input feature generation, then call reader.load_data to load the training data.
  2. Use paddlepalm.backbone to create a model backbone that extracts text features (e.g., contextual word embeddings, sentence embeddings).
  3. Register the reader with the backbone via reader.register_with. After this step, the reader can produce the input features required by the backbone.
  4. Use paddlepalm.head to create a task head, which provides the task loss for training and the prediction results for inference.
  5. Use paddlepalm.Trainer to create a task Trainer, then build the forward graph containing the backbone and the task head (created in steps 2 and 4) with trainer.build_forward.
  6. Use paddlepalm.optimizer (and paddlepalm.lr_sched if needed) to create an optimizer, then build the backward graph with trainer.build_backward.
  7. Use trainer.fit_reader to feed the prepared reader and data (from step 1) to the trainer.
  8. Load a pretrained model with trainer.load_pretrain, or load a checkpoint with trainer.load_ckpt, or load no trained parameters at all, then start training with trainer.train.

A condensed sketch of these eight steps is shown below; more implementation details can be found in the examples.
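The following is a minimal sketch of the eight steps for a sentiment classification task, assuming an ERNIE model downloaded under ./pretrain and a TSV training file. The class names (ClassifyReader, Classify), the signatures of TriangularSchedualer and Adam, and all paths and hyperparameters are assumptions based on the repository's classification example and should be verified there:

```python
# A condensed sketch of steps 1-8 for sentiment classification.
# All paths and hyperparameter values below are illustrative.
import json
import paddlepalm as palm

max_seqlen, batch_size, num_classes = 128, 16, 2
pretrain_dir = './pretrain/ERNIE-v1-zh-base'  # assumed download location

# step 1: create a reader and load the training data
cls_reader = palm.reader.ClassifyReader(pretrain_dir + '/vocab.txt', max_seqlen)
cls_reader.load_data('./data/train.tsv', batch_size, num_epochs=3)

# step 2: create the backbone from the pretrained model's config
config = json.load(open(pretrain_dir + '/ernie_config.json'))
ernie = palm.backbone.ERNIE.from_config(config)

# step 3: register the reader with the backbone
cls_reader.register_with(ernie)

# step 4: create the task head
cls_head = palm.head.Classify(num_classes, config['hidden_size'], 0.1)

# step 5: create a trainer and build the forward graph
trainer = palm.Trainer('senti_cls')
loss_var = trainer.build_forward(ernie, cls_head)

# step 6: create an optimizer (with an optional lr scheduler), build backward
sched = palm.lr_sched.TriangularSchedualer(1000, 10000)  # warmup, total steps
adam = palm.optimizer.Adam(loss_var, 5e-5, sched)
trainer.build_backward(optimizer=adam, weight_decay=0.01)

# steps 7-8: feed the reader, load pretrained parameters and train
trainer.fit_reader(cls_reader)
trainer.load_pretrain(pretrain_dir + '/params')
trainer.train(print_steps=10)
```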

Multi-task learning

To run in multi-task learning mode:

  1. Repeatedly create the components (follow steps 1-5 above for each task).
  2. Create an empty Trainer for each task, and create a MultiHeadTrainer from them.
  3. Build the multi-task forward graph with multi_head_trainer.build_forward.
  4. Use paddlepalm.optimizer (and paddlepalm.lr_sched if needed) to create an optimizer, then build the backward graph with multi_head_trainer.build_backward.
  5. Use multi_head_trainer.fit_readers to feed all prepared readers and data to the multi_head_trainer.
  6. Load a pretrained model with multi_head_trainer.load_pretrain, or load a checkpoint with multi_head_trainer.load_ckpt, or load no trained parameters at all, then start training with multi_head_trainer.train.

The save/load and predict operations of a multi_head_trainer are the same as those of a trainer.

More implementation details of multi_head_trainer can be found here; a minimal sketch is shown below.
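This sketch follows the six steps above, assuming two tasks whose readers, shared backbone and heads were already created as in steps 1-5 of the quick start. The mix_ratio argument and all other names and values are illustrative assumptions to check against the repository's multi-task example:

```python
# A minimal multi-task sketch; backbone, head_a/head_b and reader_a/reader_b
# are assumed to exist already (created as in quick-start steps 1-5).
import paddlepalm as palm

# step 2: one empty trainer per task, combined into a MultiHeadTrainer
trainer_a = palm.Trainer('task_a', mix_ratio=1.0)   # mix_ratio is assumed
trainer_b = palm.Trainer('task_b', mix_ratio=0.5)
multi_head_trainer = palm.MultiHeadTrainer([trainer_a, trainer_b])

# step 3: build the multi-task forward graph over the shared backbone
loss_var = multi_head_trainer.build_forward(backbone, [head_a, head_b])

# step 4: create the optimizer and build the backward graph
adam = palm.optimizer.Adam(loss_var, 3e-5)
multi_head_trainer.build_backward(optimizer=adam)

# steps 5-6: feed all readers, load pretrained parameters and train
multi_head_trainer.fit_readers([reader_a, reader_b])
multi_head_trainer.load_pretrain('./pretrain/ERNIE-v1-zh-base/params')
multi_head_trainer.train(print_steps=10)
```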

Setting the saver

To save models/checkpoints and logs during training, call the trainer.set_saver method. More implementation details can be found here.
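For example (the save_path and save_steps argument names are assumptions based on the example referenced above):

```python
# save a checkpoint every 1000 steps under ./outputs/ckpt (illustrative
# values); call set_saver after build_backward and before trainer.train
trainer.set_saver(save_path='./outputs/ckpt', save_steps=1000)
```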

Evaluation/Prediction

To run prediction and evaluation after training, simply create an extra reader, backbone and head (repeat steps 1-4 above), passing phase='predict' at creation time, then run prediction with the trainer's predict method (no extra trainer is needed). More implementation details can be found here, and a sketch follows below.
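Continuing the classification sketch from the quick start (reusing its pretrain_dir, config, max_seqlen, num_classes, batch_size and trainer), prediction might look as follows; build_predict_forward and the phase='predict' arguments are assumptions based on the repository examples:

```python
# Create predict-phase copies of the reader, backbone and head.
predict_reader = palm.reader.ClassifyReader(pretrain_dir + '/vocab.txt',
                                            max_seqlen, phase='predict')
predict_ernie = palm.backbone.ERNIE.from_config(config, phase='predict')
predict_reader.register_with(predict_ernie)
predict_head = palm.head.Classify(num_classes, config['hidden_size'],
                                  phase='predict')

# Reuse the trainer from training; no extra trainer is needed.
trainer.build_predict_forward(predict_ernie, predict_head)
predict_reader.load_data('./data/test.tsv', batch_size)
trainer.fit_reader(predict_reader, phase='predict')
trainer.predict(output_dir='./outputs/predict')
```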

Using multiple GPUs

If multiple GPUs exist in your environment, you can control how many of them are used, and which ones, via the environment variable CUDA_VISIBLE_DEVICES. For example, if four GPUs with indices 0, 1, 2 and 3 are available, you can run the following command to use only GPU 2:

```shell
CUDA_VISIBLE_DEVICES=2 python run.py
```

To use multiple GPUs, separate their indices with commas. For example, to use GPU 2 and GPU 3, run:

```shell
CUDA_VISIBLE_DEVICES=2,3 python run.py
```

In multi-GPU mode, PaddlePALM automatically splits each batch of data across the available GPUs. For example, if batch_size is set to 64 and 4 GPUs are available to PaddlePALM, the actual batch_size on each GPU is 64/4 = 16. Therefore, when using multiple GPUs, make sure that batch_size is divisible by the number of GPUs exposed to PALM.

License

This tutorial is contributed by PaddlePaddle and licensed under the Apache-2.0 license.


This open-source project is supported by the National Key Research and Development Program of China under the "Cloud Computing and Big Data" special program (Project No. 2018YFB1004300).