
All Questions

Tagged with
2 votes
0 answers
55 views

Reshape issue on a pandas dataframe which was converted from polars

Update: I was able to work around the issue by converting it to numpy() first and then doing a reshape. Before edit: I have a Python program where I am using a polars dataframe for reading from the file and ...
curiouscoder007's user avatar
7 votes
1 answer
137 views

Translate Pandas groupby plus resample to Polars in Python

I have this code that generates a toy DataFrame (the production df is much more complex): import polars as pl import numpy as np import pandas as pd def create_timeseries_df(num_rows): date_rng = pd....
girdeux's user avatar
  • 700
0 votes
3 answers
97 views

How to replace a specific field inside a JSON string in each row of a csv file in Python with a random value?

I have a CSV file named input.csv with the following columns: row_num start_date_time id json_message 120 2024-02-02 00:01:00.001+00 1020240202450 {'amount': 10000, 'currency': 'NZD','seqnbr': 161 } ...
Balaji Venkatachalam's user avatar
1 vote
0 answers
70 views

Utilize polars library to compute Levenshtein distance

I have sequences in the form of tuples in a column of a pandas DataFrame. A sample of my DataFrame is the following: id sequence 33268 [(59, 2), (91, 2), (112, 2), (126, 2), (0, 3),... 44360 ...
exch_cmmnt_memb's user avatar
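For reference, the Levenshtein distance the question above asks about can be sketched in plain Python over sequences of tuples. This is not a polars solution and not taken from the question; the function name `levenshtein` and the two sample sequences are illustrative assumptions, built from tuples that resemble the ones in the excerpt.

```python
from typing import Hashable, Sequence


def levenshtein(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Edit distance between two sequences, comparing whole elements.

    Each element (here a tuple) is treated as one symbol, so the
    distance counts inserted, deleted, or substituted tuples.
    """
    # previous[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty sequence
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete item_a
                    current[j - 1] + 1,      # insert item_b
                    previous[j - 1] + cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


# Hypothetical sample sequences, not the asker's data.
seq_a = [(59, 2), (91, 2), (112, 2)]
seq_b = [(59, 2), (112, 2), (126, 2)]
distance = levenshtein(seq_a, seq_b)
print(distance)
```

The two-row table keeps memory proportional to the shorter dimension of the comparison; whether this is fast enough for a large DataFrame column depends on the sequence lengths.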
4 votes
3 answers
125 views

How to create a new column within a polars DataFrame that is equal to a list?

I am currently trying to create a new column within a polars dataframe (df). Within my df, there are many many rows, and within this new column I only want my existing list to populate wherever ...
Datawiz's user avatar
  • 41
1 vote
1 answer
75 views

Polars read AWS RDS DB with a table containing column of type jsonb

I'm trying to read AWS RDS DB using the following method through polars: df_rds_table_test = pl.read_database_uri(sql_query, uri) Postgres DB contains a table with column name 'json_message' of type ...
Balaji Venkatachalam's user avatar
2 votes
2 answers
82 views

Pandas vs. Polars: mean() function

This code computes the mean value of a column in Pandas: pd.DataFrame({'id': ['A', 'A', 'B', 'B', 'B', 'B'], 'a': [1, 2, 3, 4, float('inf'), float('inf')]}).groupby('id').mean(). The result is: ...
Krows's user avatar
  • 21
0 votes
1 answer
70 views

How to concat multiple lazyframe created from numpy ndarray and datetime.datetime in Polars

I wish to convert this snippet of pandas code into polars code to learn polars and see if I can benefit in terms of speed: df_list = [] for datum in data: df = pd.DataFrame() temp_data =...
Luca's user avatar
  • 119
1 vote
1 answer
76 views

Can I optimize this cpu-bound pandas code with polars?

I have this pandas code: def last_non_null(s): return s.dropna().iloc[-1] if not s.dropna().empty else np.nan def merge_rows_of_final_df(df_final): # Group by columns A, B, and C cols = ['...
Luca's user avatar
  • 119
1 vote
0 answers
98 views

Polars Downcast the Datatype without Precision Loss

I've observed that when I use polars.Expr.shrink_dtype to optimize the datatype of columns, it often alters the float values slightly. For instance, a float64 value of 2.7 becomes 2.7000001 when ...
sci9's user avatar
  • 776
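One possible explanation for the effect described above, assuming the column was narrowed from float64 to float32 (the excerpt is cut off before it confirms this): 2.7 has no exact binary representation, and the nearest float32 value differs from the nearest float64 value. The sketch below uses only the standard library rather than polars; `round_trip_float32` is a hypothetical helper name.

```python
import struct


def round_trip_float32(value: float) -> float:
    """Store a Python float (float64) as IEEE 754 float32 and read it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


original = 2.7
narrowed = round_trip_float32(original)

# The float32 neighbour of 2.7 is not the same number as the float64 one.
print(repr(narrowed))
print(narrowed == original)  # False

# Values that are exactly representable in float32 survive unchanged.
print(round_trip_float32(2.5) == 2.5)  # True
```

Under that assumption, a lossless downcast of a float column is only possible when every value in it happens to be exactly representable in the narrower type.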
0 votes
0 answers
46 views

Dataframe larger than memory

In polars, pandas, or another dataframe library, is it possible to have a dataframe with data larger than RAM, as you can in DuckDB? My current solution is to use the polars streaming API, polars....
Test's user avatar
  • 1,521
3 votes
1 answer
117 views

Python - Rolling Indexing in Polars library?

I'd like to ask around if anyone knows how to do rolling indexing in polars? I have personally tried a few solutions which did not work for me (I'll show them below): What I'd like to do: Indexing the ...
user24758287's user avatar
1 vote
2 answers
180 views

Polars read_parquet method converts the original date value to a different value if the date is invalid

I'm reading a parquet file from an S3 bucket using polars, and below is the code that I use: df = pl.read_parquet(parquet_file_name, storage_options=storage_options, hive_partitioning=False) In the S3 ...
Balaji Venkatachalam's user avatar
1 vote
0 answers
42 views

How can I expand a column of lists into the neighbouring columns (using Polars)? [duplicate]

Say I have the DataFrame: >>> df = polars.DataFrame({"a": [0, 1, 2], "b": [1, 2, 3], "c": [[1, 2, 3], [2], [5, 0]]}) >>> df shape: (3, 3) ┌─────┬─────┬──...
FISR's user avatar
  • 63
2 votes
3 answers
377 views

Polars compare two dataframes - is there a way to fail immediately on first mismatch

I'm using the polars.testing assert_frame_equal method to compare two sorted dataframes containing the same columns, and below is my code: assert_frame_equal(src_df, tgt_df, check_dtype=False, check_row_order=...
Balaji Venkatachalam's user avatar
