Questions tagged [nltk]

The Natural Language Toolkit is a Python library for computational linguistics.

0 votes
0 answers
42 views

SSL: CERTIFICATE_VERIFY_FAILED certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)

I'm trying to download NLTK packages using nltk.download() on macOS in India. I tried this: import nltk import ssl try: _create_unverified_https_context = ssl._create_unverified_context except ...
Aarav • 1
0 votes
1 answer
34 views

Adding special tokens to beginning and end of ngram function

I'm writing a function that takes in text and converts it into ngrams based on the order, n. So for bigrams n=2, for five-grams n=5, and so on. I'm trying to add special tokens at the beginning ...
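
One possible way to pad n-grams, sketched in plain Python. The function name `padded_ngrams`, the whitespace tokenization, and the `<s>` / `</s>` marker strings are assumptions for illustration and are not taken from the question; the padding width of n - 1 on each side is one common convention, not the only one.

```python
def padded_ngrams(text, n, start="<s>", end="</s>"):
    """Return the order-n n-grams of `text` as tuples, padded at both ends."""
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = text.split()  # assumption: simple whitespace tokenization
    # Pad with n - 1 markers per side so every real token appears in n n-grams.
    padded = [start] * (n - 1) + tokens + [end] * (n - 1)
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


bigrams = padded_ngrams("the cat sat", 2)
```

With n=1 no padding is added, so the result is just the unigrams.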
Kevin Veeder
1 vote
1 answer
32 views

Making a Python dictionary with a for loop for tokens and their model score

So I'm trying to make a Python dictionary consisting of a word and its model score for all of the words in my file. My issue is that I can't find a way to put the keyword for my iterator, words, into ...
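
A minimal sketch of the general pattern, assuming the loop variable itself can serve as the dictionary key. `score_word` is a hypothetical stand-in for whatever model call the question uses, and the word list is made up.

```python
def score_word(word):
    # Hypothetical placeholder for the model's scoring call.
    return len(word) / 10


words = ["good", "bad", "excellent"]

scores = {}
for word in words:
    # The iteration variable is used directly as the key.
    scores[word] = score_word(word)
```

The same mapping can be written as a dict comprehension, `{word: score_word(word) for word in words}`, if the loop body does nothing else.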
Kevin Veeder
0 votes
0 answers
16 views

What is the difference between MWETokenizer and CountVectorizer + ngram?

Looking through the documentation about ngrams and the different vectorizers, I came across the Multi-word expression tokenizer (MWETokenizer), which locates phrases in a text and converts them into a ...
linkey apiacess
1 vote
1 answer
42 views

Function that returns tuples composed of a Python dictionary

I'm trying to create a function that takes a list of tokenized words for a review, and a label and returns a list of tuples composed of a Python dictionary and the label associated. You can see what I ...
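
A rough sketch of one shape such a function could take. The feature format (`{token: True}`), the lowercasing, and the name `to_labeled_features` are assumptions, since the excerpt does not show the asker's code; a list of (dict, label) pairs is the input shape NLTK's classifiers accept for training.

```python
def to_labeled_features(tokens, label):
    """Pair a bag-of-words feature dict with its label."""
    # Assumption: presence features, lowercased.
    features = {token.lower(): True for token in tokens}
    return (features, label)


# Made-up reviews, each already tokenized and labeled.
reviews = [(["Great", "movie"], "pos"), (["Terrible", "plot"], "neg")]
labeled = [to_labeled_features(tokens, label) for tokens, label in reviews]
```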
Kevin Veeder
-1 votes
1 answer
53 views

Is English text tokenization possible in C# rather than Python? [closed]

In our software we have to analyze a plain text file. First we should break the text into paragraphs, then into sentences, then into tokens. The final steps (as far as I understand) are the stemming and ...
Zoltan Hernyak
0 votes
0 answers
21 views

How to generate "poetic text" in Python

My problem is: generate "poetic text" as per the commonly accepted definition. My solution is a generalization of my solution to the palindrome problem. The problem with my solution is ...
Innovations Anonymous
0 votes
0 answers
24 views

Why isn't tf.keras.layers.TextVectorization accepting standardization=None?

I'm still trying to get this to work (and to learn!), so I am using a tiny corpus. I do some preprocessing on the text in order to get specific bi-gram collocations using nltk (not relevant here but I ...
DS14 • 129
0 votes
1 answer
65 views

How to parse search engine keywords input

I'm implementing a tool that lets users search for terms in texts. I'm currently focused on handling more complex input from the search. The operators I am looking to support are: | = OR, & = AND ...
Equino • 47
1 vote
0 answers
36 views

State of the art word sense disambiguation on WordNet synsets

I am trying to perform a simple task: given a corpus, identify all words that are hyponyms of a certain synset (e.g., «find every mention of a "plant" or a "bird"»). In order to do ...
InfiniteSnow
0 votes
0 answers
78 views

How do I install the nltk library's "averaged_perceptron_tagger" on a Railway server?

Hi, I am building an API with Django REST Framework for generating a PowerPoint slide using the python-pptx package. I'm also using the NLTK (Natural Language Toolkit) library to process text by tokenizing and ...
Ini-ubong Isemin
0 votes
1 answer
19 views

NLTK package is not working in production but working in development

I have created a web app using Django. In this web app I want to add functionality to extract phrases from content. My code is working fine in development but not working in production. Using nltk ...
Manoj Kamble
0 votes
1 answer
33 views

How to optimize this function and improve running time?

I have a function aimed at creating a data-frame with three columns: bigram phrase, count (of the bigram phrase), and PMI score (for the bigram phrase). Since I want to run this on a large dataset ...
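
For orientation, a small sketch of computing the three columns in one pass with `collections.Counter`. This is not the asker's function, which the excerpt does not show. It assumes PMI is defined as log2(p(x, y) / (p(x) p(y))) with probabilities estimated from raw counts, and the name `bigram_pmi` is made up.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return (bigram phrase, count, PMI) rows for adjacent token pairs."""
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    n_unigrams = sum(unigram_counts.values())
    n_bigrams = sum(bigram_counts.values())
    rows = []
    for (w1, w2), count in bigram_counts.items():
        p_xy = count / n_bigrams
        p_x = unigram_counts[w1] / n_unigrams
        p_y = unigram_counts[w2] / n_unigrams
        rows.append((f"{w1} {w2}", count, math.log2(p_xy / (p_x * p_y))))
    return rows


rows = bigram_pmi("new york is in new york state".split())
```

The rows can then be handed to `pandas.DataFrame(rows, columns=[...])` if a data-frame is needed; counting everything up front avoids rescanning the corpus for each bigram.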
98fly • 31
0 votes
1 answer
34 views

Getting nltk certificate verify failed error with Visual Studio Code with Python3

I got this error. As you can see, I have import nltk and nltk.download in my code per their guide: [nltk_data] Error loading words: <urlopen error [SSL: [nltk_data] CERTIFICATE_VERIFY_FAILED] ...
Ares • 5
1 vote
2 answers
50 views

cannot import punkt nltk

Due to security settings at work I cannot simply do nltk.download('punkt'). I therefore printed out nltk.data.path and found where it's looking, then added the zip file into the location, e.g. it was ...
Maths12 • 963