Questions tagged [pandas]

Pandas is a Python library for data manipulation and analysis, e.g. dataframes, multidimensional time series and cross-sectional datasets commonly found in statistics, experimental science results, econometrics, or finance. Pandas is one of the main data science libraries in Python.

0 votes
0 answers
21 views

Pandas to_sql takes forever with Google Cloud SQL

I'm attempting to insert some data into Google Cloud SQL (running postgres) and it takes forever. It takes roughly 1 minute to insert 10 rows. I am not doing anything fancy, just initializing the ...
wizmer
  • 930
2 votes
1 answer
46 views

How to convert timedelta to integer in pandas dataframe

I am trying to convert a timedelta to an integer. time = (pd.to_datetime(each_date2)-pd.to_datetime(each_date1)) pd.to_numeric(time, downcast='integer') time has the following value: Timedelta('7 days 00:00:...
mona
  • 91
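
A minimal sketch of one common approach to the timedelta question above, assuming the goal is a whole number of days. The sample dates and the `start`, `end`, and `days` column names are hypothetical and do not come from the question.

```python
import pandas as pd

# Hypothetical sample dates (not taken from the question).
each_date1 = "2024-05-01"
each_date2 = "2024-05-08"

# Subtracting two Timestamps yields a single Timedelta; .days is a plain int.
time = pd.to_datetime(each_date2) - pd.to_datetime(each_date1)
days = time.days

# For whole columns, the .dt accessor exposes the same attribute element-wise.
df = pd.DataFrame({
    "start": pd.to_datetime(["2024-05-01", "2024-01-01"]),
    "end": pd.to_datetime(["2024-05-08", "2024-01-31"]),
})
df["days"] = (df["end"] - df["start"]).dt.days
```

If seconds are wanted rather than days, `time.total_seconds()` returns a float that can be cast with `int()`.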
0 votes
1 answer
45 views

String value from JSON response becomes a numeric value in Python pandas dataframe

I am using Python to pull data out of a REST API and store it in a SQL Database. Everything works fine except for one JSON value in the response. JSON Response [ { "pbxId": "...
Genius86
0 votes
0 answers
45 views

Spark EOF Error (Parquet Read from S3)- Spark to Pandas conversion

I am reading close to 1 million rows stored in S3 as parquet files into a dataframe (900 MB size data in a bucket). Filtering the dataframe based on values and then later converting to a Pandas ...
Don Woodward
1 vote
1 answer
39 views

How to replace text using a regex in a Pandas dataframe [duplicate]

I have the following dataset: meste = pd.DataFrame({'a':['06/33','40/2','05/22']}) a 0 06/33 1 40/2 2 05/22 And I want to remove the leading 0s in the text (06/33 to 6/33 for example). I ...
Alexis
  • 2,242
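
A short sketch of how the leading-zero question above might be handled with `Series.str.replace`. The excerpt is truncated, so it is an assumption whether only the first number or every number in the string should lose its zeros; both variants are shown, and the `start_only` and `every_number` column names are hypothetical.

```python
import pandas as pd

meste = pd.DataFrame({'a': ['06/33', '40/2', '05/22']})

# Variant 1: strip zeros only at the very start of each string.
# The lookahead (?=\d) keeps a lone "0" from being removed entirely.
meste['start_only'] = meste['a'].str.replace(r'^0+(?=\d)', '', regex=True)

# Variant 2: strip leading zeros from every number in the string.
# \b anchors the match to the start of each run of digits.
meste['every_number'] = meste['a'].str.replace(r'\b0+(?=\d)', '', regex=True)
```

On this sample both variants give the same result; they differ only when a number after the slash also starts with a zero.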
0 votes
1 answer
37 views

Split a dataframe into sub-dataframes using pandas

I have a large dataframe and I want to split it into smaller dataframes based on two conditions. Below is a small piece of the dataframe: | | id |outcome| | -----...
WilliamAshoti
1 vote
2 answers
33 views

Group By Two Variables and then Create New Column which is the Value of One Variable Based on the Value of Another Variable in Python (pandas)

I can do this in R but have no idea how to do this in Python. I have data with sbj, num_item, visit, and height. I want to create baseline_height using pandas. Ex: sbj num_item visit height ...
NICE8x
  • 11
1 vote
2 answers
54 views

Pandas dataframe groupby apply function with variable number of arguments

I have a pandas dataframe that looks like import pandas as pd data = { "Race_ID": [2,2,2,2,2,5,5,5,5,5,5], "Student_ID": [1,2,3,4,5,9,10,2,3,6,5], "theta": [8,9,2,...
Ishigami
  • 239
1 vote
2 answers
57 views

How to find the number of rows within a group since a nonzero value occurred for a pandas dataframe?

I have a dataframe like so: ID value A 0 A 1 A 0 A 0 B 0 B 0 B 2 B 0 B 4 B 0 I want to add a column that counts the number of rows since a nonzero value occurred within the group (in this ...
mdrishan
  • 499
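
One possible sketch for the "rows since a nonzero value" question above, using a per-group running count of nonzero rows as a block label. The excerpt is truncated, so the convention for rows that come before the first nonzero value in a group is an assumption (left as NaN here), and the `since_nonzero` column name is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": list("AAAABBBBBB"),
    "value": [0, 1, 0, 0, 0, 0, 2, 0, 4, 0],
})

# 1 where the value is nonzero, 0 otherwise.
nonzero = df["value"].ne(0).astype(int)

# Running count of nonzero rows within each ID; it increases by one at every
# nonzero row, so it labels the block that each row belongs to.
block = nonzero.groupby(df["ID"]).cumsum()

# Position of each row inside its (ID, block) pair: 0 at the nonzero row itself.
since = df.groupby(["ID", block]).cumcount()

# Assumption: rows before the first nonzero value in a group have no count.
df["since_nonzero"] = since.where(block > 0)
```

Replacing the final `where` with a different fill rule changes only how the rows before the first nonzero value are treated.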
0 votes
1 answer
34 views

Create a new column based on other columns for time series data in pandas

I have the following pandas dataframe with columns May, June, and July. Month June July Aug June a d g July b e h Aug c f i I want to create several new columns with a 1 month forecast, 2 month ...
kmm2204
2 votes
1 answer
41 views

How do you sort column names in Date in descending order in pandas

I have this DataFrame: Node Interface Speed Band_In carrier Date Server1 wan1 100 80 ATT 2024-05-09 Server1 wan1 100 50 ...
user1471980
  • 10.5k
0 votes
1 answer
45 views

Keeping a running total of quantities while matching items and dates within a range

I'm attempting to match job lines to purchase orders on items within a date range while tracking the available quantity of the items. If I have three dataframes: joblines = pd.DataFrame({ '...
Warcupine
  • 4,590
0 votes
2 answers
56 views

Vectorized way to check if a string is in a dataframe column (set of strings)?

I have a pandas dataframe df. This dataframe has a column to_filter. to_filter is either an empty set or a set of strings. This dataframe also has an integer column id. The id may not be unique. Given ...
roulette01
  • 2,354
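
A hedged sketch for the set-membership question above. The excerpt is cut off before it states the inputs, so this assumes a single target string and a boolean mask as the desired result; the sample data and the `target` and `mask` names are hypothetical. It relies on `Series.explode` accepting sets, which holds in recent pandas versions.

```python
import pandas as pd

# Hypothetical sample data: to_filter holds an empty set or a set of strings,
# and id is not unique.
df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "to_filter": [{"a", "b"}, set(), {"c"}, {"a"}],
})
target = "a"  # hypothetical string to look for

# explode() gives one row per set element and repeats the original index;
# an empty set becomes a single NaN row, which never equals the target.
exploded = df["to_filter"].explode()

# Collapse back to one boolean per original row.
mask = exploded.eq(target).groupby(level=0).any()

matching_ids = df.loc[mask, "id"].tolist()
```

This assumes the dataframe index is unique, since the mask is rebuilt by grouping on index labels.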
0 votes
0 answers
16 views

Concurrency Control Mechanism For Dataframe Processing In Django WebApp

I have a Django web app that processes Excel file data directly using a pandas dataframe. Now I want to add concurrency control so that multiple requests can be processed simultaneously. Suggest ...
Enthu Learner
0 votes
3 answers
38 views

How to use Python Pandas Groupby for multiple columns?

I have a dataframe that I am trying to do some calculations on and add a few columns. Here is an example of the input dataframe: df1: Index Type Product Late or On Time 0 A X ...
hobbsac
  • 21
