
All Questions

Tagged with
0 votes
2 answers
41 views

Sort Pandas dataframe by Sub Total and count

I have a very large dataset called bin_df. Using pandas and the following code I've assigned sub-total "Total" to each group: bin_df = df[df["category"].isin(model....
Charlotte's user avatar
  • 411
1 vote
2 answers
33 views

Group By Two Variables and then Create New Column which is the Value of One Variable Based on the Value of Another Variable in Python (pandas)

I can do this in R but have no idea how to do this in Python. I have data with sbj, num_item, visit, and height. I want to create baseline_height using pandas. Ex: sbj num_item visit height ...
NICE8x's user avatar
  • 11
1 vote
2 answers
54 views

Pandas dataframe groupby apply function with variable number of arguments

I have a pandas dataframe that looks like import pandas as pd data = { "Race_ID": [2,2,2,2,2,5,5,5,5,5,5], "Student_ID": [1,2,3,4,5,9,10,2,3,6,5], "theta": [8,9,2,...
Ishigami's user avatar
  • 239
0 votes
0 answers
64 views

Sorting a DataFrame by Multiple Conditions in Pandas

I'm struggling with a specific sort that I'm not managing to implement in Python. Here's a sample dataframe import pandas as pd data = { 'product': ['A', 'A', 'A', 'B', 'B', 'B'], 'quantity': ...
Johann Robette's user avatar
1 vote
3 answers
68 views

dataframe filter groupby based on a subset

df_example = pd.DataFrame({'name': ['a', 'a', 'a', 'b', 'b', 'b'], 'class': [1, 2, 2, 3, 2, 2], 'price': [3, 4, 2, 1, 6, 5]}) I want to filter each ...
user6703592's user avatar
  • 1,102
2 votes
1 answer
48 views

number every first unique piece in each group

In each group, each 1st unique item should be given a different number in new column 'num'. I can form the groups but I don't know how to number the unique pieces. Is there a way to do that? Unique ...
mxplk's user avatar
  • 57
1 vote
1 answer
179 views

Permutation summation in Pandas dataframe growing super exponentially

I have a pandas dataframe that looks like import pandas as pd data = { "Race_ID": [2,2,2,2,2,5,5,5,5,5,5], "Student_ID": [1,2,3,4,5,9,10,2,3,6,5], "theta": [8,9,2,...
Ishigami's user avatar
  • 239
0 votes
1 answer
46 views

Pandas, How can I group column 1 by column 2 with column 1's absolute max values without changing column 1 to absolute values?

So let's say I have a df_1 like this:

   Floor   UV
1      1   -2
2      1    3
3      1   -5
4      1    4
5      2   14
6      2  -15

And I have written this code: output_df = df_1.loc[df_1.groupby(&...
Hür Doğan ÜNLÜ's user avatar
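
For readers skimming this entry, here is a minimal sketch of one way to approach what the title asks. It is not the asker's code (their snippet is cut off in the excerpt), and it assumes the goal is to keep, per Floor, the row whose UV has the largest absolute value while leaving the sign intact. The frame is rebuilt from the values shown in the excerpt.

```python
import pandas as pd

# Rebuilt from the excerpt; the 1..6 index matches the printed frame.
df_1 = pd.DataFrame(
    {"Floor": [1, 1, 1, 1, 2, 2], "UV": [-2, 3, -5, 4, 14, -15]},
    index=range(1, 7),
)

# Take absolute values only to locate the rows, never to overwrite UV:
# idxmax returns, per Floor, the index label of the largest |UV|.
idx = df_1["UV"].abs().groupby(df_1["Floor"]).idxmax()

# Select those rows from the untouched frame, so the signs are preserved.
output_df = df_1.loc[idx]
print(output_df)
```

The absolute values live in a temporary Series, so the UV column in df_1 is never modified.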
1 vote
1 answer
26 views

Rolling Average with variable min_periods from another column

I have a dataframe with multiple accounts across the last few years and am trying to get the rolling average of a column, per account, which is easy enough. However I also need to have the min_periods ...
Nolan G's user avatar
  • 13
3 votes
1 answer
43 views

Groupby multiple columns and extract top rows based on non-grouped column value

I am trying to solve a problem very similar to: https://platform.stratascratch.com/coding/10362-top-monthly-sellers?code_type=2 Here is my data frame: product seller market ...
Theja 's user avatar
  • 55
1 vote
3 answers
64 views

Problem using groupby and transform with conditional lambda on multiple columns in Pandas

I'm curious about a weird behavior I got while using Pandas. My initial purpose was, for each group in my data, to replace all values in a column with NA when said column contains more than x% missing ...
TimDdckr's user avatar
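
A rough sketch of the stated goal, under assumptions: the excerpt is truncated, so the column names (`group`, `value`), the sample data, and the 50% threshold below are all hypothetical. It avoids a conditional lambda by computing the share of missing values per group first and then masking.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "a" is 2/3 missing, group "b" is 1/3 missing.
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b", "b"],
        "value": [1.0, np.nan, np.nan, 4.0, 5.0, np.nan],
    }
)

threshold = 0.5  # hypothetical stand-in for the "x%" in the question

# Share of missing values per group, broadcast back onto every row.
missing_share = df["value"].isna().groupby(df["group"]).transform("mean")

# Blank out the whole column within any group that exceeds the threshold.
df["value_masked"] = df["value"].mask(missing_share > threshold)
print(df)
```

With several columns, the same two steps could be repeated per column; whether that matches the asker's intent is a guess.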
0 votes
0 answers
36 views

Pandas interpolate on 2 missing values based on different columns and other specific filter function after groupby

After a groupby on a date column I would like to interpolate on 2 specific values based on different columns and also retrieve the value of another for which the sum of two columns is minimum ... I ...
Hotone's user avatar
  • 451
4 votes
1 answer
76 views

How to vectorize groupby combination lists of two columns in Pandas Dataframe

I have a dataframe and need to group by two columns from all possible combinations of dataframe columns ['A','B','C','D','E','F','G'] import pandas as pd d = {'A': [0,1,1,0,0,1,0,0], 'B': [1,1,0,0,...
black cat's user avatar
7 votes
1 answer
136 views

Translate Pandas groupby plus resample to Polars in Python

I have this code that generates a toy DataFrame (production df is much more complex): import polars as pl import numpy as np import pandas as pd def create_timeseries_df(num_rows): date_rng = pd....
girdeux's user avatar
  • 700
3 votes
3 answers
95 views

Efficiently remove rows from pandas df based on second latest time in column

I have a pandas Dataframe that looks similar to this: Index ID time_1 time_2 0 101 2024-06-20 14:32:22 2024-06-20 14:10:31 1 101 2024-06-20 15:21:31 2024-06-20 14:32:22 2 101 2024-06-20 15:21:31 ...
Frede's user avatar
  • 45
