
Questions tagged [pandas]

Pandas is a Python library for data manipulation and analysis, e.g. dataframes, multidimensional time series and cross-sectional datasets commonly found in statistics, experimental science results, econometrics, or finance. Pandas is one of the main data science libraries in Python.

-1 votes
0 answers
44 views

How to split info in a single row in Excel into columns using Python [duplicate]

I have read a CSV file using pd.read_csv(). I am trying to clean up this data, but it is proving a bit difficult. Essentially, all of the information is in a single column and in row 1, and I need to split ...
rogue1's user avatar
  • 1
2 votes
3 answers
57 views

How to convert JSONL to parquet efficiently?

Given a jsonl file like this: {"abc1": "hello world", "foo2": "foo bar"} {"foo2": "bar bar blah", "foo3": "blah foo"} ...
alvas's user avatar
  • 120k
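One possible approach to the JSONL question above, as a minimal sketch: pandas can read JSON Lines with `pd.read_json(..., lines=True)` and write Parquet with `DataFrame.to_parquet`. The helper names `read_jsonl` and `jsonl_to_parquet` are hypothetical, the sample data is the two records shown in the excerpt, and writing Parquet assumes an engine such as pyarrow is installed. Whether this is fast enough for a large file is not something the excerpt lets us judge.

```python
import io

import pandas as pd

# The two sample records shown in the question excerpt.
SAMPLE_JSONL = (
    '{"abc1": "hello world", "foo2": "foo bar"}\n'
    '{"foo2": "bar bar blah", "foo3": "blah foo"}\n'
)


def read_jsonl(source, chunksize=None):
    """Hypothetical helper: read JSON Lines into one DataFrame.

    Keys that are missing from a record end up as NaN in that row.
    """
    if chunksize is None:
        return pd.read_json(source, lines=True)
    # With chunksize set, read_json returns an iterator of DataFrames.
    with pd.read_json(source, lines=True, chunksize=chunksize) as reader:
        return pd.concat(reader, ignore_index=True)


def jsonl_to_parquet(source, destination):
    """Hypothetical helper: requires a Parquet engine (e.g. pyarrow)."""
    read_jsonl(source).to_parquet(destination, index=False)


# Only the reading step is exercised here; StringIO stands in for a file.
df = read_jsonl(io.StringIO(SAMPLE_JSONL))
```

Passing `chunksize` keeps each parsing step small, but the sketch still concatenates everything before writing, so it does not bound peak memory.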
0 votes
0 answers
29 views

Python threads 'starved' by pandas operations

I am creating a UI application with Qt in Python. It performs operations on pandas DataFrames in a separate threading.Thread to keep the UI responsive; no individual pandas instruction takes noticeable ...
AirToTec's user avatar
0 votes
0 answers
32 views

Minimize RAM usage of pandas operations in python

I have a Python function using pandas that does operations on some dataframes. This Python function currently consumes a lot of RAM. I have tried to minimize RAM usage as much as possible but ...
BD O's user avatar
  • 13
-1 votes
1 answer
37 views

How do I perform a smear between two dataframes in python/pandas? [duplicate]

I have two dataframes and I need to perform a smear (if that is what it's generally called). Basically the first one is smaller (5 million rows) and the other is 40 million rows. I want to add the ...
babyface's user avatar
-2 votes
1 answer
62 views

Convert time into seconds [removing milliseconds]

I have a pandas DataFrame where column 'A' holds date and time values (as of now it is of type string). Column A Column B 2024-07-11 13:09:37.466 PC2 2024-07-11 13:24:43.03 PC1 May 6 2024 22:49:...
FarahR's user avatar
  • 3
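A hedged sketch of one reading of the question above, namely dropping the fractional seconds: parse the strings with `pd.to_datetime`, then round down with `.dt.floor("s")`. Only the two complete values from the excerpt are used; the third value is truncated there and is left out. The output column name `A_seconds` is an assumption.

```python
import pandas as pd

# The two complete rows visible in the excerpt.
df = pd.DataFrame(
    {
        "A": ["2024-07-11 13:09:37.466", "2024-07-11 13:24:43.03"],
        "B": ["PC2", "PC1"],
    }
)

# Convert the strings to datetime64 values.
parsed = pd.to_datetime(df["A"])

# Round down to whole seconds, discarding the milliseconds.
df["A_seconds"] = parsed.dt.floor("s")
```

If the column mixes several date layouts, as the truncated third value suggests, pandas 2.0 and later accept `format="mixed"` in `pd.to_datetime` to parse each element individually.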
0 votes
0 answers
13 views

A value is trying to be set on a copy of a slice from a DataFrame while using loc [duplicate]

I am aware that this is a common issue, but I am confused why I am getting it here: train_df.loc[:,'decision'] = np.where(train_probs[:,1]>cutoff, 1, 0) I am doing exactly what the warning says: ...
Baron Yugovich's user avatar
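The excerpt does not show how `train_df` was created. A common cause of this warning is that the frame was itself derived by filtering or slicing another DataFrame. Under that assumption, the sketch below takes an explicit `.copy()` before assigning. The names `full_df` and the sample values are hypothetical; `train_df`, `train_probs`, `cutoff`, and the assignment line come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical parent frame; the question does not show it.
full_df = pd.DataFrame({"x": [1, 2, 3, 4]})

# Taking an explicit copy makes train_df an independent DataFrame,
# so later assignments are unambiguous.
train_df = full_df[full_df["x"] > 1].copy()

# Hypothetical class probabilities, one row per row of train_df.
train_probs = np.array([[0.8, 0.2], [0.4, 0.6], [0.1, 0.9]])
cutoff = 0.5

# The assignment from the question, now applied to the copy.
train_df.loc[:, "decision"] = np.where(train_probs[:, 1] > cutoff, 1, 0)
```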
1 vote
1 answer
31 views

How do you pick the max value of each row of certain columns in pandas

I have this data frame: df Node Interface Speed Band_In carrier 1-Jun 10-Jun Server1 wan1 100 80 ATT 80 30 Server1 wan2 ...
user1471980's user avatar
  • 10.5k
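For the row-wise maximum asked about above, a minimal sketch is to select the columns of interest and call `max(axis=1)`. The first row below is taken from the excerpt. The second row is truncated in the source, so its values here are invented for illustration, as are the output column names `max_value` and `max_column`.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Node": ["Server1", "Server1"],
        "Interface": ["wan1", "wan2"],
        "Speed": [100, 100],       # second value is hypothetical
        "Band_In": [80, 60],       # second value is hypothetical
        "carrier": ["ATT", "ATT"], # second value is hypothetical
        "1-Jun": [80, 20],         # second value is hypothetical
        "10-Jun": [30, 45],        # second value is hypothetical
    }
)

# Columns to compare within each row.
date_cols = ["1-Jun", "10-Jun"]

# Largest value across the selected columns, per row.
df["max_value"] = df[date_cols].max(axis=1)

# Name of the column that holds that largest value, per row.
df["max_column"] = df[date_cols].idxmax(axis=1)
```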
2 votes
2 answers
38 views

How to compare lists in two Pandas dataframes to get the common elements?

I want to compare lists from columns set_1 and set_2 in df_2 with ins column in df_1 to find all common elements. I've started doing it for one row and one column but I have no idea how to compare all ...
emor's user avatar
  • 157
-5 votes
0 answers
46 views

Pandas introducing lineterminators via to_csv without cause or reason [closed]

I've bug checked this thoroughly. I know that the bug is introduced when outputting to csv via the df.to_csv method. The method is randomly adding lineterminators which aren't called for in any way. I ...
Josh's user avatar
  • 1
0 votes
1 answer
20 views

RDKit PandasTools WriteSDF: RuntimeError: Bad pickle format: unexpected End-of-File while reading

I face the error: PandasTools.WriteSDF(pp, args.output_file, molColName='ID', properties=list(pp.columns)) File "/scratch/micromamba/envs/biotools_py39/lib/python3.9/site-packages/rdkit/Chem/...
M.Vu's user avatar
  • 460
-1 votes
0 answers
25 views

NameError Traceback (most recent call last) <ipython-input-3-9ec55f7a7976> in <module> : NameError: name 'books' is not defined

I am trying to plot the evolution of degree centrality over the books for some of the characters from Game of Thrones. I have a list evol that contains the computed degree centrality from all the ...
acharyabibash's user avatar
0 votes
1 answer
61 views

How do I handle merged cells in Excel using Pandas parse function?

I have an Excel file with merged columns and rows, and I want to read the Excel file and parse it to convert it into a DataFrame. This is just a small example of what happened because the real data ...
RMB's user avatar
  • 1
0 votes
1 answer
51 views

Multi-level rolling average with missing values

I have data on frequencies (N), for combinations of [from, to, subset], and the month. Importantly, when N=0, the row is missing. N from to subset month ...
FooBar's user avatar
  • 16.3k
0 votes
0 answers
22 views

Pandas to_sql takes forever with Google Cloud SQL

I'm attempting to insert some data into Google Cloud SQL (running postgres) and it takes forever. It takes roughly 1 minute to insert 10 rows. I am not doing anything fancy, just initializing the ...
wizmer's user avatar
  • 930
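One setting that is often worth trying for slow `to_sql` inserts is `method="multi"` together with `chunksize`, which batches many rows into each INSERT statement instead of sending them one by one. The sketch below only illustrates the call. It uses an in-memory SQLite database as a stand-in for the asker's Cloud SQL PostgreSQL instance, and the table name and data are hypothetical. Whether it helps in the asker's setup cannot be determined from the excerpt.

```python
import sqlite3

import pandas as pd

# Hypothetical data; the question does not show the real frame.
df = pd.DataFrame(
    {"id": range(250), "value": [i * 0.5 for i in range(250)]}
)

# In-memory SQLite stands in for the remote PostgreSQL database.
conn = sqlite3.connect(":memory:")

# method="multi" packs several rows into each INSERT statement;
# chunksize caps how many rows go into one statement.
df.to_sql(
    "measurements",
    conn,
    index=False,
    if_exists="replace",
    method="multi",
    chunksize=100,
)

# Count the stored rows to confirm the insert.
row_count = conn.execute("SELECT COUNT(*) FROM measurements").fetchone()[0]
conn.close()
```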
