
I'm here to ask if it is possible to use an existing trained HuggingFace Transformers model with spaCy.

My first naive attempt was to load it via spacy.load('bert-base-uncased'). It didn't work, because spaCy demands a certain structure, which is understandable.

Now I'm trying to figure out how to use the spacy-transformers library to load the model, create the spaCy structure, and use it from that point on as a normal spaCy-aware model.

I don't know if this is even possible, as I couldn't find anything on the subject. I've tried to read the documentation, but all the guides, examples, and posts I found start from a spaCy-structured model like spacy/en_core_web_sm. How was that model created in the first place? I can't believe someone has to train everything again with spaCy.

Can I get some help from you?

Thanks.

1 Answer


What you do is add a Transformer component to your pipeline and pass the name of your HuggingFace model as a parameter to it. This is covered in the docs, though people do have trouble finding it. It's important to understand that a Transformer is only one piece of a spaCy pipeline, and you should understand how it all fits together.

To pull from the docs, this is how you specify a custom model in a config:

[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
# You can change the model name here
name = "bert-base-cased"
tokenizer_config = {"use_fast": true}
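
If you prefer to assemble the pipeline in code rather than through a config file, the same override can be passed to add_pipe. Here is a minimal sketch; it assumes spacy-transformers is installed (which registers the "transformer" factory) and that settings not given here fall back to the component's defaults:

import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "transformer",
    config={
        "model": {
            "@architectures": "spacy-transformers.TransformerModel.v3",
            "name": "bert-base-uncased",  # any HuggingFace model name
            "tokenizer_config": {"use_fast": True},
        }
    },
)
nlp.initialize()  # downloads and loads the HuggingFace weights

doc = nlp("spaCy can use a HuggingFace transformer as a feature source.")
# The transformer output is stored on the Doc as features only; there are
# no entities or tags here. For BERT, tensors[0] is the last hidden state.
print(doc._.trf_data.tensors[0].shape)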

Going back to why you need to understand spaCy's structure: it is very important to understand that in spaCy, Transformers are only sources of features. If your HuggingFace model has an NER head or something like that, it will not work. So if you use a custom model, you'll need to train the other components, like NER, on top of it, as in the config sketch below.
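
Concretely, the docs pair the transformer with a "listener" layer so that a freshly trained component shares the transformer's features. A sketch of that config pattern for NER, using the hyperparameter values that spaCy's generated configs default to:

[components.ner]
factory = "ner"

[components.ner.model]
@architectures = "spacy.TransitionBasedParser.v2"
state_type = "ner"
extra_state_tokens = false
hidden_width = 64
maxout_pieces = 2
use_upper = false

[components.ner.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0

[components.ner.model.tok2vec.pooling]
@layers = "reduce_mean.v1"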

Also note that spaCy has a variety of non-Transformer built-in models. These are very fast to train and in many situations will give performance comparable to Transformers; even if they aren't as accurate, you can use the built-in models to get your pipeline configured and then just swap in a Transformer.

all the guides, examples, and posts I found start from a spaCy-structured model like spacy/en_core_web_sm. How was that model created in the first place?

Did you see the quickstart? The pretrained models are created using configs similar to what you get from that; the commands below show the CLI equivalent.
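
For example, a config very similar to the quickstart output can be generated and then trained with the documented CLI (the --gpu flag selects a transformer-based pipeline; the .spacy paths are placeholders for your own data):

python -m spacy init config config.cfg --lang en --pipeline ner --optimize accuracy --gpu
python -m spacy train config.cfg --output ./output --paths.train ./train.spacy --paths.dev ./dev.spacy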

  • Hello @polm23. Thank you very much. I'm going to accept this answer for all you wrote, but most of all for "If your HuggingFace model has an NER head or something like that, it will not work." That is exactly what I'm trying to do, and I believe it would demand more effort than I'm able to spend. Regarding the docs, they are pretty good, but all of them start from some spaCy-pretrained model like en_core_web_sm; that's why I got confused. But now it is clear. Thank you.
    – rdemorais
    Oct 29, 2021 at 14:11
  • If you want to use a model with pretrained heads, what you can do is wrap it in a small custom component and use the output to set annotations on the Doc object (see the sketch after these comments). However, if you do that you're using a very small part of spaCy's features, so it often won't be worth it. It's also hard for spaCy to support arbitrary output heads because, even more so than with the base tensors, there's a lot of variation in how they're represented.
    – polm23
    Oct 30, 2021 at 10:07
  • I understand your point. Thank you for your time. Now I know how to proceed. Best regards.
    – rdemorais
    Oct 31, 2021 at 23:07
  • There is now an official FAQ topic for this: github.com/explosion/spaCy/discussions/10327
    – polm23
    Feb 18, 2022 at 6:10
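
To illustrate the wrapping approach from the comment above, here is a rough sketch of a small custom component that runs a HuggingFace token-classification pipeline and copies its predictions onto the Doc. The component name hf_ner and the model dslim/bert-base-NER are illustrative assumptions, not an official spaCy integration:

import spacy
from spacy.language import Language
from spacy.tokens import Doc
from transformers import pipeline

@Language.factory("hf_ner", default_config={"model_name": "dslim/bert-base-NER"})
def create_hf_ner(nlp, name, model_name: str):
    return HFNerComponent(model_name)

class HFNerComponent:
    def __init__(self, model_name: str):
        # aggregation_strategy="simple" merges word pieces into whole entities
        self.ner = pipeline("ner", model=model_name, aggregation_strategy="simple")

    def __call__(self, doc: Doc) -> Doc:
        ents = []
        for pred in self.ner(doc.text):
            # Map character offsets back onto spaCy tokens; skip predictions
            # that don't align cleanly with spaCy's tokenization.
            span = doc.char_span(pred["start"], pred["end"], label=pred["entity_group"])
            if span is not None:
                ents.append(span)
        doc.ents = ents  # spaCy rejects overlapping entity spans
        return doc

nlp = spacy.blank("en")
nlp.add_pipe("hf_ner")
doc = nlp("Apple is looking at buying a U.K. startup.")
print([(ent.text, ent.label_) for ent in doc.ents])

As the comment notes, this only sets annotations; none of spaCy's training or transformer-sharing machinery is involved.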
