All Questions
90,990
questions
4
votes
2
answers
31
views
How do I categorize projects in a dataframe according to their titles?
I have a dataframe where I want to categorize energy-related projects into 4 different topics according to their titles.
For that I want to use pre-defined keywords to identify which topic the project ...
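A minimal sketch of keyword-based categorization, assuming the titles live in a column called `title`. The topic names, the keyword lists, and the sample titles are all hypothetical, since the excerpt does not list the real ones. Each title gets the first topic whose keyword appears in it as a case-insensitive substring, with `"other"` as the fallback.

```python
import pandas as pd

# Hypothetical topics and keywords; the question does not show the real ones.
TOPIC_KEYWORDS = {
    "solar": ["solar", "photovoltaic"],
    "wind": ["wind", "turbine"],
    "storage": ["battery", "storage"],
    "grid": ["grid", "transmission"],
}


def categorize(title: str) -> str:
    """Return the first topic whose keyword occurs in the title, else 'other'."""
    lowered = title.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return topic
    return "other"


# Made-up sample data standing in for the real project titles.
df = pd.DataFrame({
    "title": [
        "Offshore wind turbine maintenance",
        "Rooftop solar rollout",
        "Battery storage pilot",
        "Smart grid upgrade",
        "Office renovation",
    ]
})

# Apply the keyword lookup to every title.
df["topic"] = df["title"].map(categorize)
```

Plain substring matching is the simplest choice here; if keywords can appear inside unrelated words, a word-boundary regular expression may be a better fit.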
0
votes
0
answers
39
views
Why is data written to Excel not starting from the 'A' column?
I'm using pandas to copy data from one Excel file to another, and the data is being copied, just not to the right place.
I have this function that reads the data:
def updated_file(self, progress_bar):
...
0
votes
1
answer
21
views
Renaming dataframe column in Python with a string value in another dataframe by matching column/index names
Major edit:
Apparently it is difficult to understand my question, so I'll do my best to make it concrete.
I have two dataframes, "df1" and "df2". These are quite large, larger than in ...
3
votes
3
answers
51
views
How to reorder duplicate answers in a Polars dataframe
I have a Polars dataframe that contains multiple questions and answers. The problem is that each answer is contained in its own column, which means that I have a lot of redundant information. ...
-1
votes
0
answers
44
views
How to split info in a single row in Excel into columns using Python [duplicate]
I have read a CSV file using pd.read_csv()
I am trying to clean up this data but it is proving a bit difficult.
Essentially all of the information is in a single column and row 1 and I need to split ...
-1
votes
1
answer
37
views
How do I perform a smear between two dataframes in python/pandas? [duplicate]
I have two dataframes and I need to perform a smear (if that is what it's generally called). Basically the first one is smaller (5 million rows) and the other is 40 million rows. I want to add the ...
3
votes
3
answers
73
views
Polars - Filter DataFrame using another DataFrame's rows
I have two DataFrames, graph and search, with the same schema.
Schema for graph:
SCHEMA = {
    START_RANGE: pl.Int64,
    END_RANGE: pl.Int64,
}
Schema for search:
SCHEMA = {
    START: pl.Int64,
...
0
votes
0
answers
13
views
A value is trying to be set on a copy of a slice from a DataFrame while using loc [duplicate]
I am aware that this is a common issue, but I am confused why I am getting it here:
train_df.loc[:,'decision'] = np.where(train_probs[:,1]>cutoff, 1, 0)
I am doing exactly what the warning says:
...
2
votes
2
answers
38
views
How to compare lists in two Pandas dataframes to get the common elements?
I want to compare lists from columns set_1 and set_2 in df_2 with ins column in df_1 to find all common elements.
I've started doing it for one row and one column but I have no idea how to compare all ...
2
votes
1
answer
41
views
How do you sort column names in Date in descending order in pandas?
I have this DataFrame:
Node     Interface  Speed  Band_In  carrier  Date
Server1  wan1       100    80       ATT      2024-05-09
Server1  wan1       100    50       ...
-7
votes
2
answers
65
views
How to apply "if" condition on dataframes [duplicate]
I am trying to create a list based on the height column in the dataframe: if the height is above 70, I want to append 2; if it is between 66 and 70, append 1; otherwise append 0 ...
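One way to express this rule without an explicit loop is `numpy.select`, sketched below. The sample heights and the `category` column name are made up, and treating "between 66 and 70" as inclusive on both ends is an assumption, since the excerpt does not say.

```python
import numpy as np
import pandas as pd

# Made-up sample heights; the real dataframe is not shown in the question.
df = pd.DataFrame({"height": [72, 68, 60, 70, 66]})

# Conditions are checked in order; the first match wins.
conditions = [
    df["height"] > 70,              # above 70 -> 2
    df["height"].between(66, 70),   # 66..70 inclusive (assumed) -> 1
]
choices = [2, 1]

# Rows matching neither condition fall back to 0.
df["category"] = np.select(conditions, choices, default=0)

# If a plain Python list is needed, as in the question:
categories = df["category"].tolist()
```

`Series.between` is inclusive on both bounds by default; its `inclusive` parameter changes that if the boundaries should be handled differently.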
4
votes
1
answer
60
views
Get max date column name in Polars
I'm trying to get the column name containing the maximum date value in my Polars DataFrame. I found a similar question that was already answered here.
However, in my case, I have many columns, and ...
0
votes
2
answers
56
views
Vectorized way to check if a string is in a dataframe column (set of strings)?
I have a pandas dataframe df. This dataframe has a column to_filter. to_filter is either an empty set or a set of strings. This dataframe also has an integer column id. The id may not be unique.
Given ...
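A hedged sketch of two ways to test membership in a column of sets, assuming the goal is a boolean mask marking rows whose `to_filter` set contains a given string. The sample data and the `target` value are hypothetical. A column of Python sets has object dtype, so neither option is vectorized in the NumPy sense; the second only moves the work into pandas operations.

```python
import pandas as pd

# Made-up sample data: ids may repeat, and to_filter may be an empty set.
df = pd.DataFrame({
    "id": [1, 1, 2, 3],
    "to_filter": [{"a", "b"}, set(), {"c"}, {"a"}],
})
target = "a"  # hypothetical string to look for

# Option 1: a list comprehension over the sets (simple, often fast enough).
mask = pd.Series([target in s for s in df["to_filter"]], index=df.index)

# Option 2: explode each set into one row per element, compare, then
# collapse back to one boolean per original row via the index.
exploded = df["to_filter"].explode()  # empty sets become NaN
mask_exploded = exploded.eq(target).groupby(level=0).any()

# Either mask can be used to filter the original dataframe.
matching_ids = df.loc[mask, "id"].tolist()
```

The explode approach relies on the index being unique per row, which holds for the default `RangeIndex` used here.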
0
votes
3
answers
38
views
How to use Python Pandas Groupby for multiple columns?
I have a dataframe that I am trying to do some calculations on and add a few columns.
Here is an example of the input dataframe:
df1:
Index Type Product Late or On Time
0 A X ...
1
vote
1
answer
32
views
Apply sklearn logloss with rolling on pandas dataframe
My function call looks something like
loss = log_loss(y_true=validate_df['y'], y_pred=validate_probs, sample_weight=validate_df['weight'], normalize=True)
Is there any way to combine this with pandas ...