Questions tagged [text-mining]

Text Mining is a process of deriving high-quality information from unstructured (textual) information.

0 votes
1 answer
33 views

Extract Keywords from Text Vector -- one set of keywords for each element

Please consider the reprex at the end of the post. It works along the lines of https://cran.r-project.org/web/packages/udpipe/vignettes/udpipe-usecase-postagging-lemmatisation.html. It extracts a set ...
larry77's user avatar
  • 1,461
0 votes
0 answers
23 views

Errors attaching metadata to corpus

I am trying to generate a corpus with two documents: one is responses of participants characterized as "supporters" and one is responses of "non-supporters". I've entered this as ...
Nicolette's user avatar
0 votes
1 answer
69 views

Unordered txt file contents: How to design in proper dictionary

I have a txt file whose contents are unordered, like the sample below. I must select the first row because it has the exact time of the train run. My txt file has a couple of summaries (1, 2, and so on); hence, the keys are the same ...
eric's user avatar
  • 53
0 votes
0 answers
31 views

pdftools – How to skip errors?

I have an R script that converts all pdf files to text, but the "pdftools" package runs into various errors and stops the process. I would like to include in the code that if it finds an ...
onlyjust17's user avatar
1 vote
1 answer
36 views

Extracting Text via Web Scraping: Loop with several optional start/ end strings

I would like to web-scrape the text of several press statements. The problem I'm currently having is defining several strings where the scraping of the text should start/end. For example the ...
Alexandra's user avatar
0 votes
1 answer
49 views

Export txt files from a corpus after preprocessing

I am struggling to export files from my corpus after preprocessing. I currently have 26 documents in my corpus, but I want to export them as txt files after they have been preprocessed so I can combine ...
Bilal Rashid's user avatar
1 vote
1 answer
33 views

I cannot get past data(stop_words) to analyze text in text mining

It's my first attempt at text mining and I have run into a wall. This is what I have done thus far: library(tm) library(tidytext) library(dplyr) library(ggplot2) text1 <- c("Dear land of ...
Rohan Sagar's user avatar
0 votes
0 answers
34 views

Preventing Automatic Fine-Tuning during Inference Loop in Python

I'm working on a Python project that involves processing documents through a language model within a for loop. Basically, I have some questions and I want to ask these questions to an LLM that will ...
BZH's user avatar
  • 11
0 votes
0 answers
18 views

NER features in ML Text Mining

I'm working on identifying fraudulent reviews, and for this I'm using some feature engineering such as 'NER'. My question is: how can I fit NER into my ML algorithm? Can I vectorize it using TF-IDF? ...
Marteusa's user avatar
0 votes
0 answers
35 views

I can't use unnest tokens properly when importing from excel

I'm a brand new R programmer and I'm doing an unguided assignment trying to start text mining / sentiment analysis. I'm supposed to get text from an Excel file (looks like this). I do some filtering ...
Andrew Morgan's user avatar
0 votes
0 answers
24 views

Disambiguate a gene symbol from an English word

Dear all, I use pubmed.mineR: Text Mining of PubMed Abstracts, to extract gene symbols from PubMed abstracts (texts). There are some gene symbols like: can (https://www.uniprot.org/uniprotkb/P61517/entry)...
Alessandro Brozzi's user avatar
0 votes
0 answers
29 views

Python code to list all the tables created and tables used to create it from sql script

Hi, I just wanted to know whether the output can be arranged so that the created tables and the tables used to create them are grouped together, e.g. 'select * from A inner join B on A.id=B.id ...
Anjan Basumatary's user avatar
0 votes
0 answers
25 views

R package syuzhet does not work in Hungarian

I would like to get sentiments of Hungarian songs. I use syuzhet 1.07. It works fine with default settings or with some languages, but not with Hungarian. Is this a bug, or should I load other ...
Jónás Balázs's user avatar
-1 votes
1 answer
20 views

Error while creating the TDM - "No applicable method for 'meta' applied to an object of class "character""

While creating a TermDocumentMatrix with the tm package, I am getting an error. I have used the following code: int_vc <- VCorpus(int_vc) int_vc <- tm_map(int_vc, tolower) int_vc <- tm_map(int_vc, ...
yem's user avatar
  • 29
0 votes
0 answers
97 views

LDA Topic Modeling Producing Identical/Empty Topics

I am running topic modeling on two large text documents (around 500-750 KB) and am asking for ten topics. I keep getting a repeat of two topics. Could this be an issue of the small number of documents? Or ...
Dez Miller's user avatar
