All Questions
Tagged with python python-polars
1,339 questions
0 votes, 0 answers, 34 views
Column value lookup based on multiple conditions and wild cards
I have a dataframe:
df = pl.DataFrame({'Col1': ['a', 'a', 'a', 'b', 'b', 'b', 'b', 'aa', 'aa', 'aa']
, 'Col2': ['c', 'd', 'e', 'c', 'd', 'e', 'f', 'd', 'e', 'f']
,...
2 votes, 1 answer, 29 views
Polars - when/then conditions from dict
I would like to have a function that accepts a list of conditions as a parameter and filters a given dataframe by all of them.
Pseudocode should look like this:
def Filter(df, conditions = ["a",&...
3 votes, 3 answers, 51 views
How to reorder duplicate answers in a Polars dataframe
I have a Polars dataframe that contains multiple questions and answers. The problem is that each answer is contained in its own column, which means that I have a lot of redundant information. ...
3 votes, 3 answers, 74 views
Polars - Filter DataFrame using another DataFrame's rows
I have two DataFrames, graph and search, with the same schema
Schema for graph:
SCHEMA = {
START_RANGE: pl.Int64,
END_RANGE: pl.Int64,
}
Schema for search:
SCHEMA = {
START: pl.Int64,
...
2 votes, 1 answer, 40 views
Cast columns that might not exist in Polars
I want to cast a column to another type, but there is a possibility that the column does not exist in the df.
import polars as pl
import polars.selectors as cs
# Sample DataFrame
data = {
"...
4 votes, 1 answer, 62 views
Get max date column name on polars
I'm trying to get the column name containing the maximum date value in my Polars DataFrame. I found a similar question that was already answered here.
However, in my case, I have many columns, and ...
3 votes, 2 answers, 61 views
Polars: Replace elements in list of List column
Consider the following example series.
s = pl.Series('s', [[1, 2, 3], [3, 4, 5]])
I'd like to replace all 3s with 10s to obtain the following.
res = pl.Series('s', [[1, 2, 10], [10, 4, 5]])
Is it ...
3 votes, 1 answer, 59 views
Parsing formulas efficiently using regex and Polars
I am trying to parse a series of mathematical formulas and need to extract variable names efficiently using Polars in Python.
Regex support in Polars seems to be limited, particularly with look-around ...
2 votes, 0 answers, 43 views
Python + Polars: a DataFrame which keeps track of the history of operations it was derived from?
I found myself making a variant of pl.DataFrame which keeps track of the operations performed on it. For example:
from pprint import pformat, pprint
import polars as pl
import polars._typing as plt
...
2 votes, 0 answers, 70 views
Best way to aggregate an iterable of `polars.DataFrame` or `polars.Series` objects
I am looking for the best way to compute a per-row running sum (average) over a large number of polars.DataFrames, where each of the frames can potentially have a large number of rows. I'd like the ...
2 votes, 1 answer, 86 views
How can I efficiently `fill_null` only certain columns of a DataFrame?
For example, let us say I want to fill_null(strategy="zero") only the numeric columns of my DataFrame. My current strategy is to do this:
import polars as pl
import polars.selectors as cs
...
0 votes, 1 answer, 86 views
Using Polars expressions to apply `eval()` to a column
I would like to achieve the following via Polars expressions, as opposed to mapping the elements row-by-row, but I have not been able to figure out a way.
import polars
def foo():
return 1 + 1
...
2 votes, 0 answers, 55 views
Reshape issue on a pandas dataframe which was converted from polars
Update:
I was able to work around the issue by converting it to numpy() first and then doing a reshape.
Before edit:
I have a Python program where I am using a Polars dataframe for reading from the file and ...
2 votes, 2 answers, 136 views
How to update a polars dataframe column at a specific range?
I want to update a specific column at a specific row index range.
Here's what I want to achieve in pandas:
df = pd.DataFrame({ "foo": [0,0,0,0] })
df["foo"].iloc[0:3] = 1
# or
df....
2 votes, 1 answer, 67 views
Contradictory error when using Polars read_csv() with multiple files for csv.gz
I'm trying to read multiple csv.gz files into a dataframe but it's not working as I expect.
When I use this globbing pattern:
pl.read_csv('folder_1\*.csv.gz')
It returns this error:
ComputeError: ...