Collection

Natural Language Processing FAQ

Frequently asked questions about NLP. Many of these questions are asked repeatedly, so duplicates would likely be closed in favor of the ones listed here. Add the best answer (using the answer URL) if the question itself is brief.

Viewed 1k times
Part of NLP Collective
4
354 votes
7 answers
219k views

What is "entropy and information gain"?

Answer Accepted

I assume entropy was mentioned in the context of building decision trees. To illustrate, imagine the task of learning to classify first names into male/female groups. That is, given a list of names ...
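As a rough illustration of the idea in this excerpt, here is a minimal sketch of entropy and information gain for a single candidate split in a decision tree. The toy name list and the "ends in a vowel" feature are hypothetical and chosen only for demonstration; they are not taken from the linked answer.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(labels, groups):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted


# Hypothetical toy data: (first name, label), not from the original answer.
names = [
    ("Ashley", "f"), ("Brian", "m"), ("Caroline", "f"),
    ("David", "m"), ("Emma", "f"), ("Frank", "m"),
]
labels = [label for _, label in names]

# Assumed candidate feature: does the name end in a vowel?
ends_in_vowel = [label for name, label in names if name[-1].lower() in "aeiou"]
ends_in_consonant = [label for name, label in names if name[-1].lower() not in "aeiou"]

gain = information_gain(labels, [ends_in_vowel, ends_in_consonant])
print(f"parent entropy: {entropy(labels):.3f} bits")
print(f"information gain of the split: {gain:.3f} bits")
```

A split that separates the classes more cleanly leaves less entropy in the child groups, so it yields a higher gain; a decision tree learner would typically prefer the feature with the highest gain at each node.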

Amro
Thorough answer to this question
Berthold
284 votes
14 answers
304k views

How to compute the similarity between two text documents?

Answer Accepted

The common way of doing this is to transform the documents into TF-IDF vectors and then compute the cosine similarity between them. Any textbook on information retrieval (IR) covers this. See esp. ...
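The approach described in this excerpt can be sketched in plain Python as follows. This is a simplified illustration, not the linked answer's code: the tokenizer, the smoothed IDF formula, and the example documents are all assumptions made for the demo, and a real project would more likely rely on an established IR or machine-learning library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed scheme)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption; other variants exist).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    return [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical example documents.
docs = [
    "The cat sat on the mat.",
    "A cat sat on a mat.",
    "Stock prices fell sharply today.",
]
vectors = tfidf_vectors(docs)
sim_01 = cosine_similarity(vectors[0], vectors[1])
sim_02 = cosine_similarity(vectors[0], vectors[2])
print(f"doc 0 vs doc 1: {sim_01:.3f}")
print(f"doc 0 vs doc 2: {sim_02:.3f}")
```

Documents that share weighted terms score closer to 1, while documents with no terms in common score 0.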

Fred Foo
Common approach to this
Berthold
98 votes
10 answers
102k views

How to use Bert for long text classification?

Common question
Berthold
0 votes
2 answers
361 views

What is the benefit of NLP sentence segmentation over Python algorithm?

Commonly asked question
Futurist Forever
1 vote
2 answers
687 views

Information extracting from plain text using NLP

Basic action to perform Named Entity Recognition, one of the fundamental NLP tasks
SilentCloud