
All Questions

0 votes
0 answers
11 views

Pyspark Regex Lookbehind Beginning Of String [duplicate]

My string in column "Key" is: "+One+Two+Three-Four" I want to extract all words following the "+" sign: df.select(regexp_extract_all("Key", F.lit(r"(?<=...
shwan's user avatar
  • 568
1 vote
1 answer
21 views

Is there any situation where re.search could not be used instead of re.match? [duplicate]

The documentation seems clear but it begs the question, what is the purpose of re.match? Couldn't re.search with the caret (^) be used instead as long as the MULTILINE flag is not enabled? Is re.match ...
Kevin Eldurson's user avatar
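
A minimal sketch of where the two calls diverge, using only the standard `re` module. The sample strings are made up for illustration. Without `MULTILINE`, `re.match` and `re.search` with a leading `^` agree. They differ once `MULTILINE` is set, and also when a compiled pattern is given a start position.

```python
import re

text = "first line\nsecond line"

# Without MULTILINE, match() and search() with a leading ^ agree:
# neither finds "second", because it is not at the start of the string.
plain_match = re.match(r"second", text)
plain_search = re.search(r"^second", text)

# With MULTILINE, ^ also matches right after a newline, but match()
# is still anchored at the very start of the string.
multi_match = re.match(r"second", text, re.MULTILINE)
multi_search = re.search(r"^second", text, re.MULTILINE)

# With a compiled pattern and a start position, match() anchors at pos,
# while ^ still refers to the real beginning of the string.
at_pos_match = re.compile(r"abc").match("xabc", 1)
at_pos_search = re.compile(r"^abc").search("xabc", 1)

print(plain_match, plain_search)
print(multi_match, multi_search)
print(at_pos_match, at_pos_search)
```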
0 votes
0 answers
15 views

Regex unicode categories \p{L} combined with a Python formatted string [duplicate]

I would like to write a regular expression in Python that is a formatted string and includes a unicode category. The regex would look like this: import regex as re mystring = "abc" m = re....
Oriane Nédey's user avatar
1 vote
1 answer
51 views

Removing string between two specified strings in Python 3 [duplicate]

I am working on an NLP project that requires me to remove computer code from a piece of text. The code is encased between the tags <pre><code> and </code></pre>. Now I could do ...
Namrata Banerji's user avatar
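
One possible approach to the tag removal described above, as a sketch and not necessarily what the asker ended up using: a non-greedy pattern with `re.DOTALL`, so that code spanning several lines is removed too. The helper name `strip_code` and the sample text are hypothetical.

```python
import re

# Non-greedy match between the opening and closing tag pair;
# DOTALL lets "." cross line breaks inside the code block.
CODE_BLOCK = re.compile(r"<pre><code>.*?</code></pre>", re.DOTALL)

def strip_code(text):
    """Remove every <pre><code>...</code></pre> block from text."""
    return CODE_BLOCK.sub("", text)

sample = "Intro text. <pre><code>x = 1\nprint(x)</code></pre> Closing text."
cleaned = strip_code(sample)
print(cleaned)
```

For markup that may be nested or malformed, an HTML parser is usually a safer choice than a regex.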
-4 votes
0 answers
48 views

Why is using regex group feature on Python giving different outputs? [duplicate]

import re string1 = "aaabaa" zusuchen = "aa" #1 m_start = re.finditer(fr'(?=({zusuchen}))', string1) results = [(match.start(1), match.end(1)-1) for match in m_start] for z in ...
Hotbread's user avatar
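
A short sketch of why the group index matters for the lookahead pattern in this excerpt. The lookahead itself is zero-width, so group 0 is always an empty span, while group 1 holds the captured text. The `re.escape` call is an addition not present in the excerpt; it only matters if the search term contains regex metacharacters.

```python
import re

string1 = "aaabaa"
zusuchen = "aa"

# Lookahead with a capture group: finds overlapping occurrences.
pattern = re.compile(fr"(?=({re.escape(zusuchen)}))")

# Group 0 is the zero-width lookahead match itself.
group0_spans = [m.span(0) for m in pattern.finditer(string1)]

# Group 1 is the text captured inside the lookahead.
group1_spans = [(m.start(1), m.end(1) - 1) for m in pattern.finditer(string1)]

# A plain pattern consumes characters, so matches cannot overlap.
plain_spans = [m.span() for m in re.finditer(re.escape(zusuchen), string1)]

print(group0_spans)
print(group1_spans)
print(plain_spans)
```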
-1 votes
0 answers
21 views

Splicing Relevant Text from a Screenshot using pytesseract and OCR for Scheduling script

Hi, I'm currently making a script that can take screenshots of a university class schedule and automatically sync it to either Google Calendar or Outlook Calendar. from PIL import Image import ...
Daniel George's user avatar
1 vote
1 answer
46 views

How to extract the volume from a string using a regular expression?

I need to extract the volume with regular expression from strings like "Candy BAR 350G" (volume = 350G), "Gin Barrister 0.9ml" (volume = 0.9ml), "BAXTER DRY Gin 40% 0.5 ml&...
Veronica Isakova's user avatar
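
A minimal sketch for the volume extraction described above. The unit list (`ml`, `g`, `l`) is an assumption based on the two complete examples in the excerpt, and the third test string below is hypothetical, since the original example is truncated.

```python
import re

# A number with an optional decimal part, optional whitespace, then a unit.
# The unit list is assumed; extend it to match the real data.
VOLUME = re.compile(r"\d+(?:\.\d+)?\s*(?:ml|g|l)\b", re.IGNORECASE)

def extract_volume(name):
    """Return the first volume-like token in name, or None if absent."""
    m = VOLUME.search(name)
    return m.group(0) if m else None

print(extract_volume("Candy BAR 350G"))
print(extract_volume("Gin Barrister 0.9ml"))
# Hypothetical string: the percentage is skipped because "%" is not a unit.
print(extract_volume("Dry Gin 40% 0.5 ml"))
```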
1 vote
1 answer
40 views

How to replace text using a regex in a Pandas dataframe [duplicate]

I have the following dataset: meste = pd.DataFrame({'a':['06/33','40/2','05/22']}) a 0 06/33 1 40/2 2 05/22 And I want to remove the leading 0s in the text (06/33 to 6/33 for example). I ...
Alexis's user avatar
  • 2,242
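
A sketch of the leading-zero removal using the standard `re` module on the same three values. The pattern name `LEADING_ZEROS` is hypothetical. The lookahead keeps at least one digit, so a value such as "0/5" would not lose its only leading digit.

```python
import re

values = ["06/33", "40/2", "05/22"]

# Strip zeros at the start of the string, but only when a digit follows.
LEADING_ZEROS = re.compile(r"^0+(?=\d)")

stripped = [LEADING_ZEROS.sub("", v) for v in values]
print(stripped)
```

In pandas, the same pattern could presumably be passed to `Series.str.replace` with `regex=True`, although that is not shown in the excerpt.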
1 vote
1 answer
40 views

Regex pattern for Pyspark: match start and middle of a text and extract the middle

I have text in a pyspark column called TEXT that looks like the text below: The sky is red. I have 2 apples and I am fine. ---------------------------------------------- The sky is back. I have 8 apples or I am ...
maede nasri's user avatar
3 votes
1 answer
60 views

Parsing formulas efficiently using regex and Polars

I am trying to parse a series of mathematical formulas and need to extract variable names efficiently using Polars in Python. Regex support in Polars seems to be limited, particularly with look-around ...
Oyibo's user avatar
  • 97
1 vote
2 answers
60 views

How to extract or capture the value from stdout_lines of an Ansible playbook?

I am looking for help to extract or capture the free MB value from the stdout_lines of an Ansible playbook execution and use that value as a criterion to proceed further in the playbook. My task output ...
PraveenPrasannan's user avatar
2 votes
1 answer
52 views

How do I fix this regex so that it matches hyphenated words where the final segment ends in a consonant other than the letter m?

I want to match all cases where a hyphenated string (which could be made up of one or multiple hyphenated segments) ends in a consonant that is not the letter m. In other words, it needs to match ...
Paige Cox's user avatar
-4 votes
0 answers
35 views

Using regex for Account Number Extraction [closed]

Using regex, how can I read the accounts from the table below so that from the first row, four IDs can be extracted (300501798101, 359073848101, 359073848102 and 300501798101), whereas from the ...
Rohit's user avatar
  • 9
2 votes
1 answer
79 views

Using callable_iterator (re.finditer) causes Python to freeze

I have a function that is called for every line of a text. def tokenize_line(line: str, cmd = ''): matches = re.finditer(Patterns.SUPPORTED_TOKENS, line) tokens_found, not_found, start_idx = []...
Martin A.'s user avatar
0 votes
0 answers
33 views

Regex: Find all matches between varying length sets of identical special characters [duplicate]

I have texts similar to this: <FILE_NAME> ��������������� </FILE_NAME> <SHEET_NAMES> ['������'] </SHEET_NAMES> <RAW_STRINGS> [������������] Where any length of the ...
Idodo's user avatar
  • 1,429
