All Questions
2,820 questions
1 vote, 1 answer, 39 views
How to replace text using a regex in a Pandas dataframe [duplicate]
I have the following dataset:
meste = pd.DataFrame({'a':['06/33','40/2','05/22']})
a
0 06/33
1 40/2
2 05/22
And I want to remove the leading 0s in the text (06/33 to 6/33 for example). I ...
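A minimal sketch of one way to do this, assuming zeros should be stripped from the start of each number on either side of the slash. The `cleaned` name is introduced here only for illustration.

```python
import pandas as pd

meste = pd.DataFrame({'a': ['06/33', '40/2', '05/22']})

# \b0+(?=\d) matches zeros at the start of a number, but only when another
# digit follows, so a lone "0" and the trailing zero in "40" are left alone.
cleaned = meste['a'].str.replace(r'\b0+(?=\d)', '', regex=True)
print(cleaned.tolist())  # ['6/33', '40/2', '5/22']
```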
1 vote, 1 answer, 62 views
What are methods of parsing complex unstructured text from a docx file into pandas?
I have a docx file with unstructured text that looks like the following:
docx File
Prep
Northern Kitchen Number One K01-24-01-P01 $132,500
Background:
None
Project Description:
Some long ...
-2 votes, 0 answers, 32 views
Remove \u202f and check the regex so that it matches am and pm accordingly [duplicate]
I wrote a pattern, i.e. a regex, for:
17/07/24, 5:43 pm - Ryan: cant relate
i.e.
pattern = r'\d{1,2}\/\d{1,2}\/\d{2,4},\s\d{1,2}:\d{2}\s(?:am|pm)\s-\s'
I tried to create a data frame from it to separate the ...
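A hedged sketch of one way to handle the narrow no-break space, assuming it sits between the time and am/pm as it does in some chat exports. The sample `line` and the capture groups are illustrative assumptions, not part of the question.

```python
import re

# Hypothetical sample line; \u202f is the narrow no-break space that some
# chat exports place between the time and "am"/"pm".
line = '17/07/24, 5:43\u202fpm - Ryan: cant relate'

# Accept a regular space or \u202f (or nothing) before am/pm, in any case.
pattern = re.compile(
    r'(\d{1,2}/\d{1,2}/\d{2,4}),\s(\d{1,2}:\d{2})[ \u202f]?([ap]m)\s-\s',
    re.IGNORECASE,
)

match = pattern.match(line)
date, time, meridiem = match.groups()
message = line[match.end():]  # everything after the "date, time am/pm - " prefix
```

An alternative under the same assumption is to replace `\u202f` with a plain space first, for example with `str.replace('\u202f', ' ')`, and keep the original pattern.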
-3 votes, 0 answers, 67 views
Split a pandas series/dataframe column that has long mail chains into multiple rows using regex
I have a pandas data frame in which one column of text strings contains long mail chains. I want to split each mail chain in one row to extract fields like to:, from:, cc:, date and body and create ...
1 vote, 1 answer, 44 views
How do I create a regex dynamically using strings in a list for use in a pandas dataframe search?
The following code allows me to successfully identify the 2nd and 3rd texts, and only those texts, in a pandas dataframe by searching for rows that contain the word "cod" or "i":
...
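A minimal sketch of building such a pattern from a list, assuming whole-word matching is wanted. The data frame contents and the names `words`, `pattern`, and `hits` are hypothetical, since the question's own code is not shown in the excerpt.

```python
import re
import pandas as pd

# Hypothetical data; the question's own texts are not shown in the excerpt.
df = pd.DataFrame({'text': ['a big dog', 'cod and chips', 'i like tea', 'icod']})
words = ['cod', 'i']

# Escape each word so regex metacharacters are taken literally, join the
# alternatives with |, and wrap them in word boundaries for whole-word matches.
pattern = r'\b(?:' + '|'.join(re.escape(w) for w in words) + r')\b'

hits = df[df['text'].str.contains(pattern, regex=True)]
print(hits.index.tolist())  # [1, 2]
```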
0 votes, 0 answers, 18 views
Python regular expression adorns string with visible delimiters, yields extra delimiter [duplicate]
I am fairly new to Python and pandas. In my data cleaning, I would like to check that I performed the previous cleaning steps correctly on a string column. In particular, I want to see where the strings ...
0 votes, 1 answer, 56 views
Python_Pandas regex with vectorization is generating NaN
I have a data set which has columns that look like this:
The Loc_Description column contains the names of towns and roads. The position of a town name within each string varies and is mostly dependent ...
1 vote, 1 answer, 64 views
Checking values row per row to assign proper work center to product
I am tasked with matching the products from a table to the Work Center they are supposed to be sent to. To do this, I have 2 tables: one contains the details about what product goes to what Work ...
1 vote, 4 answers, 97 views
Pandas Extract Phone Number if it is in Correct Format
I have a column that has phone numbers. They are usually formatted as (555) 123-4567, but sometimes they are in a different format or they are not proper numbers. I am trying to convert this field to ...
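A hedged sketch of extracting only well-formed numbers, assuming the target is a 10-digit string (the excerpt is truncated before it names the target type). Every sample value except `(555) 123-4567` is hypothetical.

```python
import pandas as pd

# Hypothetical values; only "(555) 123-4567" comes from the question.
phones = pd.Series(['(555) 123-4567', '555-123-4567', 'call me', '(555) 123-45678'])

# Anchored pattern: anything that is not exactly "(ddd) ddd-dddd" yields NaN.
parts = phones.str.extract(r'^\((\d{3})\) (\d{3})-(\d{4})$')

# Join the three groups into one digit string; non-matching rows stay NaN.
digits = parts[0] + parts[1] + parts[2]
```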
1 vote, 2 answers, 79 views
Extracting int values from a string (in different formats) using a regex
I have a string value (football score) in my Pandas dataset. I would like to extract the home goals and the away goals from this score.
The score can be written in a couple of ways (sometimes it is ...
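A rough sketch only: the excerpt is cut off before it lists the score formats, so the variants below (`2-1`, `0 - 3`, `1:1`) are assumptions, as are the `home` and `away` group names.

```python
import pandas as pd

# Hypothetical score formats; the question's actual variants are truncated.
scores = pd.Series(['2-1', '0 - 3', '1:1', 'postponed'])

# Two runs of digits separated by "-" or ":" with optional surrounding spaces.
goals = scores.str.extract(r'^\s*(?P<home>\d+)\s*[-:]\s*(?P<away>\d+)\s*$')

# Go through float so missing rows survive, then use nullable Int64 so the
# non-matching row becomes <NA> instead of forcing a float column.
goals = goals.astype(float).astype('Int64')
```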
0 votes, 2 answers, 39 views
Drop specific column in pandas
My data has some columns with the format ABC_number_AX and ABC_number_AX_MED. I would like to exclude the ABC_number_AX columns. To do that I define the following pattern to filter out columns with a specific ...
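One possible approach, sketched under the assumption that `number` is a run of digits. The column names and the `to_drop` and `kept` variables are made up for illustration.

```python
import pandas as pd

# Hypothetical column names following the two formats described above.
df = pd.DataFrame(
    columns=['ABC_1_AX', 'ABC_1_AX_MED', 'ABC_22_AX', 'ABC_22_AX_MED', 'other']
)

# str.match anchors at the start; the trailing $ stops the pattern from
# also matching the *_AX_MED columns.
to_drop = df.columns.str.match(r'ABC_\d+_AX$')

kept = df.loc[:, ~to_drop]
print(kept.columns.tolist())  # ['ABC_1_AX_MED', 'ABC_22_AX_MED', 'other']
```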
2 votes, 1 answer, 50 views
How to make regex code apply only to empty target cells
An example of my data:
StreetAddress    City     State  Zip
1 Main St 01123  Winsted  CT
1 Main St        Winsted  CT     01123
I am trying to use regex and pandas to clean a spreadsheet that I have. The problem I am ...
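A minimal sketch of restricting the regex to rows with an empty Zip, assuming the misplaced zip code is a trailing 5-digit run in StreetAddress. The mask name `empty_zip` and the helper `found` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'StreetAddress': ['1 Main St 01123', '1 Main St'],
    'City': ['Winsted', 'Winsted'],
    'State': ['CT', 'CT'],
    'Zip': [None, '01123'],
})

# Only touch rows whose Zip cell is empty (missing or blank).
empty_zip = df['Zip'].fillna('').str.strip() == ''

# Pull a trailing 5-digit run out of the address for those rows only.
found = df.loc[empty_zip, 'StreetAddress'].str.extract(r'\s(\d{5})$', expand=False)
df.loc[empty_zip, 'Zip'] = found

# Remove the moved digits from the address, again only on the masked rows.
df.loc[empty_zip, 'StreetAddress'] = (
    df.loc[empty_zip, 'StreetAddress'].str.replace(r'\s\d{5}$', '', regex=True)
)
```

Rows that already have a Zip are never read or written by the regex steps, so an address that legitimately ends in five digits is left alone there.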
2 votes, 1 answer, 46 views
Regex capture group in pandas extract call for single digit followed by single letter
I need to extract substring instances in a pandas series that match this regex: "[3-6]X"
i.e. "3X", "4X", "5X", or "6X" from arbitrary strings like ...
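A short sketch of two ways to pull these substrings out, using hypothetical input strings because the question's examples are truncated.

```python
import pandas as pd

# Hypothetical strings; the question's examples are truncated.
s = pd.Series(['pack 3X large', 'size 7X', 'no match', '5X and 6X'])

# str.extract needs at least one capture group, so the whole pattern is
# wrapped in parentheses. It returns the first match per row, NaN if none.
first = s.str.extract(r'([3-6]X)', expand=False)

# str.findall returns every match per row and needs no capture group.
every = s.str.findall(r'[3-6]X')
```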
-1 votes, 3 answers, 50 views
Pandas: replace regex with string ending with tab not working
I have the following dataframe:
df = pd.DataFrame({'Depth':['7500', '7800', '8300', '8500'],
                   'Gas':['25-13 PASON', '9/8 PASON', '19/14', '56/26'],
                   'ID':[1, 2, 3, 4]})
...
0 votes, 0 answers, 26 views
Assign output of expanded pandas str.extract to new columns of same dataframe when rows are filtered with .loc [duplicate]
I have a dataframe with some registration types and some alphanumeric registration numbers for certain types. If a row has a certain type, I need to perform a regex operation on the relevant number, ...
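A hedged sketch of one way to attach the extracted groups as new columns for the filtered rows only. The column names, the registration values, and the pattern are all assumptions, since the excerpt shows none of them.

```python
import pandas as pd

# Hypothetical registration data; column names and the pattern are assumptions.
df = pd.DataFrame({
    'type': ['A', 'B', 'A'],
    'number': ['AB123', 'CD456', 'EF789'],
})

mask = df['type'] == 'A'

# Named groups give the extracted columns real names instead of 0 and 1.
parts = df.loc[mask, 'number'].str.extract(r'(?P<prefix>[A-Z]+)(?P<digits>\d+)')

# join aligns on the row index, so rows filtered out by the mask get NaN.
df = df.join(parts)
```

Joining on the index sidesteps the label alignment that happens when a DataFrame with columns `0` and `1` is assigned directly through `.loc`.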