Questions tagged [spacy-transformers]
97
questions
0
votes
0
answers
14
views
Is there any possibility to integrate NER and Textcat Multilabel Models in the same Pipeline
I am working on extracting information from raw text and have created an NER model with 6 entities. I want to pass the output of the NER model to textcat multilabel models. Specifically, I have ...
1
vote
0
answers
36
views
State of the art word sense disambiguation on WordNet synsets
I am trying to perform a simple task: given a corpus, identify all words that are hyponyms of a certain synset (e.g., «find every mention of a "plant" or a "bird"»). In order to do ...
0
votes
0
answers
39
views
Custom Named Entity Recognition (NER) Model with spaCy V3
This is my first time building a custom NER model with spaCy.
# Define a function to create spaCy DocBin objects from the annotated data
def get_spacy_doc(file, data):
# Create a blank spaCy ...
0
votes
1
answer
45
views
SpaCy transformer NER training – zero loss on transformer, not trained
I am training a spaCy pipeline with ['transformer', 'ner'] components. The ner component trains well, but the transformer is stuck at 0 loss and, I assume, is not training.
Here is my config:
[paths]
vectors = &...
0
votes
0
answers
18
views
Training process with the default Spacy configuration file does not produce any log output
My config.cfg is generated according to the instructions provided in the spaCy quickstart.
I run the train command: python -m spacy train config.cfg --output ./output --paths.train .\train.spacy --paths.dev ...
0
votes
0
answers
39
views
Dependency parser with SpaCy
I am trying to build an entity ruler with SpaCy that will identify specific organizations based on their relationship (contract) with other organizations. For context, you are a supermarket dealing ...
0
votes
0
answers
78
views
Loading a pre-trained spaCy transformer with Hugging Face fails because of missing config.json
I am trying to get into NLP with Hugging Face, Presidio and spaCy. Following the Presidio tutorial, I tried downloading a pre-trained spaCy transformer named de_dep_news_trf like this:
import ...
0
votes
0
answers
35
views
Spacy using all available CPU when running via Docker
I'm using scrubadub_spacy to clean text documents. When processing a document in a Docker environment built on amazonlinux, it maxes out the 8 cores on my MacBook and in Amazon Fargate. I'm confused ...
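A minimal sketch of one commonly suggested mitigation, assuming the CPU saturation comes from the native thread pools (OpenMP/BLAS) used by the numeric libraries underneath spaCy; the excerpt does not confirm that this is the cause. The helper name `limit_native_threads` is hypothetical, and the variables only take effect if they are set before those libraries are imported:

```python
import os
from typing import Dict


def limit_native_threads(n_threads: int = 1) -> Dict[str, str]:
    """Cap common native thread pools via environment variables.

    Assumption: the heavy CPU use comes from OpenMP/BLAS threads.
    Call this before importing numpy, torch, or spacy.
    """
    names = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")
    for name in names:
        os.environ[name] = str(n_threads)
    # Return what was set so the caller can log or verify it.
    return {name: os.environ[name] for name in names}


settings = limit_native_threads(2)
```

The same variables could instead be set in the Dockerfile or task definition, which avoids any dependence on import order.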
0
votes
0
answers
33
views
Trying to understand how batching works with Thinc models
Since many of the Thinc layers require a Float2D as input, I've been struggling to understand how to pass a batch of tokenized text, where [batch_size, max_seq_length, embedding_size] are the ...
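One way to picture the shape question, as a sketch rather than Thinc's actual implementation: a batch of variable-length sequences can be flattened into a single 2D array of shape [total_tokens, embedding_size] plus a list of lengths, which is the idea behind ragged arrays. The function names below are hypothetical, and plain Python lists stand in for arrays:

```python
from typing import List, Tuple

Vector = List[float]
Sequence2D = List[Vector]  # one text: [seq_length, embedding_size]


def flatten_batch(batch: List[Sequence2D]) -> Tuple[Sequence2D, List[int]]:
    """Concatenate all token vectors into one 2D list and record lengths."""
    lengths = [len(seq) for seq in batch]
    flat = [vec for seq in batch for vec in seq]
    return flat, lengths


def unflatten_batch(flat: Sequence2D, lengths: List[int]) -> List[Sequence2D]:
    """Split the 2D list back into one sequence per text."""
    batch = []
    start = 0
    for n in lengths:
        batch.append(flat[start:start + n])
        start += n
    return batch


batch = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]]  # two texts, embedding_size 2
flat, lengths = flatten_batch(batch)
```

A layer that expects 2D input can then run on `flat`, and `lengths` restores the per-text grouping afterwards. Thinc provides wrappers such as `with_array` for this kind of conversion, so hand-written padding to `max_seq_length` may not be needed.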
0
votes
1
answer
37
views
Adding Linear layers to Thinc Model Example - Understanding Data Dimensions Through Model Architecture
I am trying to learn the inner workings of models trained with spaCy, which are Thinc models. I am following this tutorial and modifying the model to see what breaks and what works. Instead of tagging, I'...
0
votes
1
answer
100
views
Spacy v3 DocBin unable to save train.spacy: bytes object is too large
I want to train on a large dataset in spaCy v3.0+.
The data contains 8,000,000 tokens.
I split it into chunks of 1,000,000 each and finally merged them via DocBin in Python, but I am getting an error.
import os
import spacy
from spacy....
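A hedged sketch of one way around a single oversized file, assuming the error comes from serializing all documents into one DocBin: write several smaller shards instead of merging them. The helpers `chunked` and `shard_path` are hypothetical, and the spaCy calls appear only in comments so the block runs without spaCy installed:

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, chunk
        yield batch


def shard_path(index: int, prefix: str = "train") -> str:
    """Build a zero-padded shard file name, e.g. train_003.spacy."""
    return f"{prefix}_{index:03d}.spacy"


# With spaCy installed, each chunk of Doc objects could be saved separately:
#   from spacy.tokens import DocBin
#   for i, docs in enumerate(chunked(all_docs, 10_000)):
#       DocBin(docs=docs).to_disk(shard_path(i))
```

`spacy train` accepts a directory of `.spacy` files for `--paths.train`, so the shards would not need to be merged back into one file.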
0
votes
0
answers
45
views
spaCy contextualSpellCheck: recurring issue with HF timeout & "local variable 'model' referenced before assignment"
I am running spaCy 3.7.2 on Databricks on AWS in a network-limited environment and get an error when trying to initiate/use contextualSpellCheck. To get around what looks like a network issue I've installed the ...
0
votes
0
answers
54
views
I am encountering problems while using PyInstaller to package a Qt application that contains a spaCy language model
My Python version is 3.10.13, using a .venv environment. The spaCy model I load is zh_core_web_trf. The code runs normally in VSCode, but when I package it with PyInstaller, it shows an error. I have ...
0
votes
0
answers
222
views
Training spancat component for spacy en_core_web_trf model in prodigy: getting a KeyError: "[E001] No component 'tok2vec' found in pipeline
I want to use the pretrained spacy en_core_web_trf model with the "ner" component, add "spancat" component and train it with the labelled data using prodigy.
However, after running ...
0
votes
0
answers
506
views
No module named 'transformers.models.bark.configuration_bark'
I am trying to import spaCy's "en_core_web_trf" model in Colab and am facing an error because it "Failed to import transformers.models.bark.configuration_bark". What do I do to resolve ...