NLP Collective

Discuss NLP with peers and experts
A new space for technical discussions about NLP
Share your insights, advice and experience with peers and experts
Engage and discuss in threaded post replies

Discussions

Browse discussion posts about NLP.

44 discussion posts
11 votes
757 views
19 replies

Is R efficient for sentiment analysis?

I would like to explore sentiment analysis further, but I cannot decide whether to start a project in Python or R. What would you suggest?

Community's user avatar
  • 1
0 votes
43 views
0 replies

Tools that can combine a CSV with metadata and feed them to an LLM for querying

I have a table that looks like this: pd.DataFrame({'HRHHID': [1,2,3,4,5], 'HEHOUSUT': [2,3,1,4,2], 'HETELHHD': [1,2,1,1,1]}) I also have a txt file with some "meta data" for this file that ...

quant's user avatar
  • 4,388
0 votes
20 views
0 replies

Using medspaCy with target rules from Metathesaurus

Before I start this discussion, here are some useful links that could provide some context: https://github.com/medspacy/medspacy https://www.nlm.nih.gov/research/umls/knowledge_sources/metathesaurus/...

Alex K's user avatar
  • 113
1 vote
96 views
3 replies

A fantastic idea about using several sentences to represent another sentence in NLP

When learning NLP, I found that current representation methods are basically word representations, so I wonder whether there is a sentence representation. My hypothesis: to represent sentences using ...

Software Engineer's user avatar
0 votes
33 views
0 replies

What are the advantages and disadvantages of using spaCy vs. NLTK for NLP tasks?

I am currently working on several NLP projects and I'm trying to decide between using spaCy and NLTK as my main NLP library. Both libraries seem to offer a range of features, but I'm not sure which ...

Thomas Markov's user avatar
0 votes
46 views
0 replies

Classifier-Free-Guidance with Transformers

I'm working on music generation using transformers. I use the decoder part for the audio tokens, with text conditioning by the T5 encoder. In Classifier-Free-Guidance, the text conditioning randomly ...

qmzp's user avatar
  • 59
1 vote
38 views
0 replies

Post-Processing Arabic OCR

Has anyone worked in a place where they extract a lot of text using Arabic OCR and then clean it to be as accurate as possible? How is this done? For example, if you digitize many documents and use ...

Hello's user avatar
  • 1
10 votes
385 views
7 replies

How are OCR texts post-processed to increase accuracy of recognition?

Has anyone worked in a company where they extract large amounts of text using OCR and then clean the text to be as accurate as possible? How is this done? Say I digitize a lot of legal documents, run ...

Hello's user avatar
  • 1
0 votes
81 views
0 replies

How to Develop an AI Model for Generating Website Templates from Text Prompts?

I am working on a project to develop an AI model that can generate website templates based on user-provided text prompts. The model should be able to interpret details such as desired features, color ...

Quartz Mode's user avatar
0 votes
62 views
0 replies

Long Context Embedding Models, e.g. bge-m3 - To Chunk or Not to Chunk?

bge-m3 is a highly performant embedding model that can produce both sparse and dense encodings. It has a context length of 8k tokens. What I am wondering is whether, with such models, it would be useful to BOTH long encode ...

Draco's user avatar
  • 11
1 vote
53 views
2 replies

Am I overengineering my lenient NER F1 measures?

Hi all, I need to customize my F1 measurement a lot when evaluating my fine-tuned NER model's performance, but I don't see other people with similar issues. I wonder if I am doing the wrong thing, or if ...

FewKey's user avatar
  • 152
9 votes
178 views
5 replies

What is your ideal development environment for deep learning/NLP?

I'm curious what the ideal development setups are for NLP developers who train deep learning models. In my journey to work more with deep learning, I use Jupyter Notebooks a lot, but I'm ...

AhmedBr's user avatar
  • 146
1 vote
206 views
1 reply

Can you help a classification algorithm by offering it cues?

Is there a way to teach a classification algorithm to learn better from sparse data? I was also thinking about the possibility of giving it cues as a list of words or phrases. If you ...

arturo-bandini-jr's user avatar
12 votes
547 views
3 replies

LLM learning roadblock: how to fine-tune a model with zero budget

As an early learner of LLMs, I completed most of the Hugging Face tutorials without much trouble on a laptop. Once I moved past that to wanting to fine-tune a model with a large amount of data, my ...

Utkarsh Dadhich's user avatar
2 votes
31 views
0 replies

Do you know of more datasets of English sentences containing idioms?

Do you know of a list of datasets of English sentences containing idioms? I use them for research. I found these, but need more: https://metatext.io/datasets/english-possible-idiomatic-expressions-(epie) https://...

Vy Do's user avatar
  • 50.7k