
NLP Collective

A collective focused on NLP (natural language processing), the transformation or extraction of useful information from natural language data.
38.9k questions · 7.3k members

Pinned content


NLP admins have deemed these posts noteworthy.

Pinned collection · 4 votes · 1k views

Natural Language Processing FAQ

Frequently asked questions relating to NLP. Many of these are asked over and over; duplicates would likely be closed in favor of these. Add the best answer (using the ...
Berthold (101)

Can you answer these questions?


These questions still don't have an answer

0 votes · 0 answers · 9 views

Recreating Text Embeddings From An Example Dataset

I am in a situation where I have a list of sentences and a list of their ideal embeddings as 25-dimensional vectors. I am trying to use a neural network to generate new encodings, but I am ...
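This kind of task is a regression problem: learn a map from sentence features to the given 25-dimensional target embeddings. A minimal NumPy sketch, using random features and synthetic targets as stand-ins (the question's actual sentence featurization is not shown, so everything below is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: in practice X would be sentence features (e.g. bag-of-words
# or a pretrained encoder's outputs) and Y the given 25-d target embeddings.
n, d_in, d_out = 200, 64, 25
X = rng.normal(size=(n, d_in))
W_true = rng.normal(scale=0.1, size=(d_in, d_out))
Y = np.tanh(X @ W_true)  # synthetic targets standing in for the real embeddings

# One-hidden-layer MLP trained with plain gradient descent on MSE.
h = 128
W1 = rng.normal(scale=0.1, size=(d_in, h)); b1 = np.zeros(h)
W2 = rng.normal(scale=0.1, size=(h, d_out)); b2 = np.zeros(d_out)

lr = 0.05
for step in range(500):
    Z = np.tanh(X @ W1 + b1)   # hidden activations
    P = Z @ W2 + b2            # predicted 25-d embeddings
    G = 2 * (P - Y) / n        # gradient of MSE w.r.t. P (up to a constant)
    # Backpropagation through the two layers.
    gW2 = Z.T @ G; gb2 = G.sum(0)
    GZ = (G @ W2.T) * (1 - Z ** 2)   # tanh derivative
    gW1 = X.T @ GZ; gb1 = GZ.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

mse = float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - Y) ** 2))
print(mse)
```

For new, unseen sentences, the same featurization must be applied before calling the trained network; the fit is only as good as the features capture sentence meaning.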
1 vote · 0 answers · 5 views

Why am I seeing unused parameters in position embeddings when using relative_key in BertModel?

I am training a BERT model using PyTorch and Hugging Face's BertModel. The sequences of tokens can vary in length from 1 (just a CLS token) to 128. The model trains fine when using absolute position ...
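For context on where those parameters live: in Hugging Face transformers, setting position_embedding_type="relative_key" gives each self-attention layer its own distance_embedding table covering relative distances from -(L-1) to L-1, where L is max_position_embeddings. A minimal sketch with a tiny, locally constructed config (sizes are illustrative, not the question's actual model):

```python
from transformers import BertConfig, BertModel

# Tiny config built locally; only position_embedding_type matters here.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=128,
    position_embedding_type="relative_key",
)
model = BertModel(config)

# Each layer's attention holds a table of 2 * max_position_embeddings - 1
# rows, one per possible relative distance.
table = model.encoder.layer[0].attention.self.distance_embedding
print(table.weight.shape)
```

If sequences are always shorter than max_position_embeddings, the rows for the largest distances are never indexed, which is one place to look when chasing "unused parameter" reports.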
0 votes · 0 answers · 8 views

Is it possible to integrate NER and textcat multilabel models in the same pipeline?

I am working on extracting information from raw text and have created an NER model with 6 entities. I want to pass the output of the NER model to textcat multilabel models. Specifically, I have ...
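In spaCy 3.x, both components can live in one pipeline, with the NER component running before the text categorizer. A minimal structural sketch (the entity and category labels below are illustrative, not the question's actual six entities):

```python
import spacy

# Blank English pipeline; components run in the order they are added.
nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
textcat = nlp.add_pipe("textcat_multilabel")

# Illustrative labels; the question mentions 6 NER entities.
for ent_label in ["PERSON", "ORG", "DATE", "PRODUCT", "GPE", "MONEY"]:
    ner.add_label(ent_label)
for cat in ["billing", "support"]:
    textcat.add_label(cat)

print(nlp.pipe_names)
```

Both components still need training data and a call to nlp.initialize before the pipeline can process text; note that textcat_multilabel operates on the doc's text, so if the categorizer should consume the NER output rather than raw text, a custom component reading doc.ents would be needed instead.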
-1 votes · 0 answers · 14 views

AssertionError: Unexpected kwargs: {'use_flash_attention_2': False}

I'm using EvolvingLMMs-Lab/lmms-eval to evaluate the LLaVA model after running accelerate launch --num_processes=8 -m lmms_eval --model llava --model_args pretrained="liuhaotian/llava-v1.5-7b" ...
-1 votes · 0 answers · 20 views

BERT: how to get a quoted string as token

I eventually managed to train a model, based on BERT (bert-base-uncased) and TensorFlow, to extract intents and slots for texts like this: "create a doc document named doc1". For this text, my model ...

Looking for an extra challenge?


These questions have a bounty on them

1 vote · 0 answers · 67 views · +50 bounty

How to fine-tune merlinite 7B model in Python

I am new to LLM programming in Python and I am trying to fine-tune the instructlab/merlinite-7b-lab model on my Mac M1. My goal is to teach this model about a new music composer, Xenobi Amilen. I have ...