Questions tagged [string-matching]
String matching is the problem of finding occurrences of one string (“pattern”, “needle”) in another (“text”, “haystack”).
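As a minimal illustration of that definition, the sketch below collects the start index of every occurrence of a pattern in a text using Python's built-in `str.find`. The helper name `find_all` is hypothetical, and treating an empty pattern as having no matches is an assumption made for this example only.

```python
def find_all(text: str, pattern: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text.

    Overlapping occurrences are included. `find_all` is a hypothetical
    helper written for illustration, not part of any library.
    """
    if not pattern:
        # Assumption: an empty pattern is treated as having no matches.
        return []
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Advance by one character so overlapping matches are also found.
        start = text.find(pattern, start + 1)
    return positions
```

Calling `find_all(haystack, needle)` returns an empty list when the needle does not occur. This naive scan is fine for short inputs; dedicated algorithms or fuzzy-matching libraries are usually a better fit for the larger workloads discussed in the questions below.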
2,309 questions
0 votes, 1 answer, 19 views
Renaming dataframe column in Python with a string value in another dataframe by matching column/index names
Major edit:
Apparently my question is difficult to understand, so I'll do my best to make it more concrete.
I have two dataframes, "df1" and "df2". These are quite large, larger than in ...
0 votes, 1 answer, 81 views
Finding the longest Dictionary.Key match in a phrase
I have a SortedDictionary<string, string>, ordered by key length descending, of the form:
red fox - address1
weasel - address2
foxes - address3
fox - address3
etc.
and a list of phrases e.g.
...
0 votes, 2 answers, 50 views
Is there a way to obtain a list separated by comma as the output of str_extract_all instead of the default output in R?
I have searched high and low and nobody seems to have asked that exact question, so I'm at a loss.
I have a data frame with a couple of columns. One of these columns contains various sentences that don't ...
0 votes, 1 answer, 52 views
Identifying Correct String Order in Pandas
I have a dataframe as the following, showing the relationship of different entities in each row.
Child | Parent | Ult_Parent | Full_Family
A032 | A001 | A039 | A001, A032, A039, A040, A041, A043, A043, A045, ...
0 votes, 0 answers, 36 views
Fuzzy Match 2 Large Pandas Dataframes
I have 2 pandas dataframes that both contain company names. I want to left join df1 (~10k rows) with df2 (~1.6m rows) on company names using a fuzzy match. My current function takes too long to run, so ...
2 votes, 6 answers, 115 views
Matching the start of a sequence in R
I have a series of strings in a vector and need to remove the matching starting pattern from each string. However, I don't know the pattern or how long it is.
stringa <- c("apple_tart", ...
1 vote, 1 answer, 49 views
Given a String count the possible Permutations that satisfy a condition. How to Optimize from O(N*N!)
Hi, I recently came across an interesting question and had a hard time trying to optimize it beyond O(N*N!).
Here is the question:
Given a string, return the number of possible combinations that satisfy ...
1 vote, 2 answers, 144 views
How can I find all exact occurrences of a string, or close matches of it, in a longer string in Python?
Goal:
I'd like to find all exact occurrences of a string, or close matches of it, in a longer string in Python.
I'd also like to know the location of these occurrences in the longer string.
To define ...
1 vote, 1 answer, 31 views
Why doesn't fuzzywuzzy's process.extractBests give a 100% score when the tested string 100% contains the query string?
I'm testing fuzzywuzzy's process.extractBests() as follows:
from fuzzywuzzy import process
# Define the query string
query = "Apple"
# Define the list of choices
choices = ["Apple...
0 votes, 0 answers, 72 views
How to efficiently compute similarity scores for prefixes of a string with another string in C?
I'm working on a problem involving string matching where I need to compute the similarity scores for each prefix of a string C against another string S. The similarity score for a prefix P of C and S ...
0 votes, 0 answers, 40 views
Spotfire's "~=" not matching wildcard characters
Using Spotfire Analyst 14.0.3
I'm in the Data Canvas adding a filter via the "Add transformation" feature.
When I use the filter expression ...
[customdata_name]~='Binary Pump : 1 : ...
0 votes, 1 answer, 88 views
How to do fuzzy merge with 2 large pandas dataframes?
I have 2 pandas dataframes that both contain company names. I want to merge these 2 dataframes on company names using a fuzzy match. But the problem is 1 dataframe contains 5m rows and the other 1 ...
1 vote, 0 answers, 114 views
How to find best matching anchor texts from paragraph and list of titles?
I have a paragraph:
In today's world, keeping your personal information safe online is more important than ever. With cyber-attacks on the rise, having a strong cybersecurity strategy is essential.
...
2 votes, 3 answers, 142 views
How to Compare Hierarchy in 2 Pandas DataFrames? (New Sample Data Updated)
I have 2 dataframes that captured the hierarchy of the same dataset. Df1 is more complete compared to Df2, so I want to use Df1 as the standard to analyze if the hierarchy in Df2 is correct. However, ...
-1 votes, 1 answer, 62 views
Can I combine contain and startswith in order to match two columns from one dataframe to another's master column?
Master dataframe filled with a specific match's players and statistics.
34 columns and variable number of rows.
Column "Player" has full names
Player | Goals | Assists
Dominic Calvert-Lewin | 1 | ...