Transformers v4.x: Converting a slow tokenizer to a fast tokenizer

html5 • October 28, 2022, 9:26 pm • Q&A

I am following the Transformers example for the pretrained model xlm-roberta-large-xnli:

from transformers import pipeline
classifier = pipeline("zero-shot-classification",
                      model="joeddav/xlm-roberta-large-xnli")

I get the following error:

ValueError: Couldn't instantiate the backend tokenizer from one of: (1) a `tokenizers` library serialization file, (2) a slow tokenizer instance to convert or (3) an equivalent slow tokenizer class to instantiate and convert. You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

I am using Transformers version '4.1.1'.

Answer

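According to the Transformers v4.0.0 release notes, sentencepiece was removed as a required dependency. This means that

"the tokenizers that depend on the SentencePiece library will not be available with a standard transformers installation"

This includes XLMRobertaTokenizer, which is why the fast tokenizer cannot be built from the slow one here. However, sentencepiece can be installed as an extra dependency: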

pip install transformers[sentencepiece]

or

pip install sentencepiece

if you already have transformers installed.

Running pip install sentencepiece and then restarting the kernel/runtime solves the issue.
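To confirm that the install took effect before loading the model again, a small check along these lines may help. It is only a sketch: `missing_packages` and `build_classifier` are hypothetical helper names, not part of Transformers, and the check only looks at whether the packages can be found for import in the current environment.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import.

    Only top-level names (no dots) are expected here.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


def build_classifier():
    """Build the zero-shot pipeline from the question, failing early with a
    readable message if a required package is missing (hypothetical helper)."""
    missing = missing_packages(["transformers", "sentencepiece"])
    if missing:
        raise RuntimeError(
            "Missing packages: " + ", ".join(missing)
            + ". Try: pip install " + " ".join(missing)
            + " and then restart the kernel/runtime."
        )
    # Imported here so the dependency check above runs first.
    from transformers import pipeline

    return pipeline("zero-shot-classification",
                    model="joeddav/xlm-roberta-large-xnli")
```

Calling `build_classifier()` in a fresh kernel should then either return the pipeline or say which package still needs to be installed. In a notebook, the restart matters because a session started before the install may not pick up the new package.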
