Questions tagged [python-polars]
Polars is a DataFrame library/in-memory query engine.
2,163 questions
0 votes, 0 answers, 23 views
Column value lookup based on multiple conditions and wild cards
I have a dataframe:
df = pl.DataFrame({'Col1': ['a', 'a', 'a', 'b', 'b', 'b', 'b', 'aa', 'aa', 'aa']
, 'Col2': ['c', 'd', 'e', 'c', 'd', 'e', 'f', 'd', 'e', 'f']
,...
3 votes, 1 answer, 26 views
How can I convert a float value to datetime with higher precision in polars?
The restrictions I have:
I have to concatenate many CSV files horizontally, where the proper types cannot be inferred, so I have to scan the CSV files with string columns: data_lf = [pl.scan_csv(...
2 votes, 1 answer, 27 views
Polars - when/then conditions from dict
I would like to have a function that accepts a list of conditions as a parameter and filters a given dataframe by all of them.
The pseudocode should look like this:
def Filter(df, conditions = ["a",...
0 votes, 2 answers, 19 views
Polars: output of aggregate function (sum)
When I compute the sum of the "Price" column for a subset of rows, I get the correct answer.
ca = [df.filter((pl.col('Created') <= datum[j]) & (pl.col('Year') == jaar[j])).select(pl.sum('...
2 votes, 3 answers, 50 views
How to reorder duplicate answers in a Polars dataframe
I have a Polars dataframe that contains multiple questions and answers. The problem is that each answer is contained in its own column, which means that I have a lot of redundant information. ...
3 votes, 3 answers, 73 views
Polars - Filter DataFrame using another DataFrame's rows
I have two DataFrames, graph and search, with the same schema.
Schema for graph:
SCHEMA = {
    START_RANGE: pl.Int64,
    END_RANGE: pl.Int64,
}
Schema for search:
SCHEMA = {
    START: pl.Int64,
...
2 votes, 1 answer, 40 views
Cast columns that might not exist in Polars
I want to cast a column to another type, but there is a possibility that the column does not exist in the df.
import polars as pl
import polars.selectors as cs
# Sample DataFrame
data = {
"...
3 votes, 1 answer, 58 views
Get max date column name in Polars
I'm trying to get the column name containing the maximum date value in my Polars DataFrame. I found a similar question that was already answered here.
However, in my case, I have many columns, and ...
3 votes, 2 answers, 60 views
Polars: Replace elements in list of List column
Consider the following example series.
s = pl.Series('s', [[1, 2, 3], [3, 4, 5]])
I'd like to replace all 3s with 10s to obtain the following.
res = pl.Series('s', [[1, 2, 10], [10, 4, 5]])
Is it ...
3 votes, 1 answer, 59 views
Parsing formulas efficiently using regex and Polars
I am trying to parse a series of mathematical formulas and need to extract variable names efficiently using Polars in Python.
Regex support in Polars seems to be limited, particularly with look-around ...
2 votes, 0 answers, 43 views
Python + Polars: a DataFrame which keeps track of the history of operations it was derived from?
I found myself making a variant of pl.DataFrame which keeps track of the operations performed on it. For example:
from pprint import pformat, pprint
import polars as pl
import polars._typing as plt
...
2 votes, 0 answers, 70 views
Best way to aggregate an iterable of `polars.DataFrame` or `polars.Series` objects
I am looking for the best way to compute a per-row running sum (average) over a large number of polars.DataFrames, where each of the frames can potentially have a large number of rows. I'd like the ...
2 votes, 1 answer, 86 views
How can I efficiently `fill_null` only certain columns of a DataFrame?
For example, let us say I want to fill_null(strategy="zero") only the numeric columns of my DataFrame. My current strategy is to do this:
import polars as pl
import polars.selectors as cs
...
1 vote, 0 answers, 57 views
Write/read to/from JSON a Python dict that has a Polars DataFrame for one or more keys
I am new to Polars in Python. I am trying to save a dict that contains single pl.DataFrames for the values of some of the keys. I am currently trying to save the dict in JSON format, but I am ...
0 votes, 1 answer, 86 views
Using Polars expressions to apply `eval()` to a column
I would like to achieve the following via Polars expressions, as opposed to mapping the elements row-by-row, but I have not been able to figure out a way.
import polars
def foo():
    return 1 + 1
...