Questions tagged [pandas]
Pandas is a Python library for data manipulation and analysis, e.g. dataframes, multidimensional time series and cross-sectional datasets commonly found in statistics, experimental science results, econometrics, or finance. Pandas is one of the main data science libraries in Python.
288,238
questions
0
votes
0
answers
21
views
Pandas to_sql takes forever with Google Cloud SQL
I'm attempting to insert some data into Google Cloud SQL (running postgres) and it takes forever.
It takes roughly 1 minute to insert 10 rows.
I am not doing anything fancy, just initializing the ...
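One thing sometimes tried for slow `to_sql` inserts is batching rows with the `chunksize` and `method="multi"` parameters, so that several rows travel in each INSERT statement. The sketch below only shows the call shape. It uses an in-memory SQLite connection as a stand-in so it runs anywhere, and the table name `demo_table` and the data are made up. Whether batching helps against Cloud SQL depends on the actual cause of the slowness, which the excerpt does not reveal.

```python
import sqlite3

import pandas as pd

# Hypothetical data; the question's real frame is not shown.
df = pd.DataFrame({"id": range(1000), "value": [i * 0.5 for i in range(1000)]})

# Stand-in connection (assumption): the question targets Postgres on Cloud SQL.
conn = sqlite3.connect(":memory:")

# chunksize limits rows per batch; method="multi" packs each batch
# into a single multi-row INSERT statement.
df.to_sql(
    "demo_table",
    conn,
    if_exists="replace",
    index=False,
    chunksize=100,
    method="multi",
)

row_count = conn.execute("SELECT COUNT(*) FROM demo_table").fetchone()[0]
```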
2
votes
1
answer
46
views
How to convert timedelta to integer in pandas dataframe
I am trying to convert timedelta to integer.
time = (pd.to_datetime(each_date2)-pd.to_datetime(each_date1))
pd.to_numeric(time, downcast='integer')
time has the following value:
Timedelta('7 days 00:00:...
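A minimal sketch of one possible conversion, assuming whole days are the integer unit wanted (the excerpt does not say). A scalar `Timedelta` exposes `.days`, and a timedelta Series exposes `.dt.days`. The dates below are invented for illustration.

```python
import pandas as pd

# Hypothetical dates standing in for each_date1 / each_date2 from the question.
each_date1 = "2024-05-01"
each_date2 = "2024-05-08"

delta = pd.to_datetime(each_date2) - pd.to_datetime(each_date1)
days = delta.days  # integer number of whole days in the Timedelta

# The same idea for a whole column of timedeltas.
s = pd.Series(pd.to_timedelta(["7 days", "1 days 12:00:00"]))
days_series = s.dt.days                    # whole days; the 12 hours are dropped
hours_series = s // pd.Timedelta(hours=1)  # integer hours via floor division
```

Floor division by a `Timedelta` is handy when a unit other than days is needed.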
0
votes
1
answer
45
views
String value from JSON response becomes a numeric value in Python pandas dataframe
I am using Python to pull data out of a REST API and store it in a SQL Database. Everything works fine except for one JSON value in the response.
JSON Response
[
{
"pbxId": "...
0
votes
0
answers
45
views
Spark EOF Error (Parquet Read from S3) - Spark to Pandas conversion
I am reading close to 1 million rows stored in S3 as parquet files into a dataframe (900 MB size data in a bucket). Filtering the dataframe based on values and then later converting to a Pandas ...
1
vote
1
answer
39
views
How to replace text using a regex in a Pandas dataframe [duplicate]
I have the following dataset:
meste = pd.DataFrame({'a':['06/33','40/2','05/22']})
       a
0  06/33
1   40/2
2  05/22
And I want to remove the leading 0s in the text (06/33 to 6/33 for example). I ...
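A minimal sketch of one way to do this, assuming only zeros at the very start of each string should go: `Series.str.replace` with an anchored pattern and `regex=True`.

```python
import pandas as pd

meste = pd.DataFrame({'a': ['06/33', '40/2', '05/22']})

# ^0+ matches one or more zeros only at the start of each string,
# so the zero inside '40/2' is left alone.
meste['a'] = meste['a'].str.replace(r'^0+', '', regex=True)
```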
0
votes
1
answer
37
views
Split a dataframe into sub-dataframes using pandas
I have a large dataframe and I want to split it into smaller dataframes based on two conditions. Below is a small piece of the dataframe:
| | id |outcome|
| -----...
1
vote
2
answers
33
views
Group By Two Variables and then Create New Column which is the Value of One Variable Based on the Value of Another Variable in Python (pandas)
I can do this in R but have no idea how to do this in Python.
I have data with sbj, num_item, visit, and height. I want to create baseline_height using pandas.
Ex:
sbj  num_item  visit  height
...
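A rough sketch of one way to build `baseline_height`, under loud assumptions: the excerpt's data is cut off, so the values below are invented, and "baseline" is assumed to mean the height recorded at `visit == 1` for each (`sbj`, `num_item`) pair.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({
    "sbj": [1, 1, 1, 2, 2, 2],
    "num_item": [1, 1, 1, 1, 1, 1],
    "visit": [1, 2, 3, 1, 2, 3],
    "height": [150.0, 151.0, 152.0, 160.0, 161.0, 163.0],
})

# Assumption: the baseline row is the one with visit == 1.
baseline = (
    df.loc[df["visit"].eq(1), ["sbj", "num_item", "height"]]
    .rename(columns={"height": "baseline_height"})
)

# Attach each group's baseline height to all of its rows.
df = df.merge(baseline, on=["sbj", "num_item"], how="left")
```

A left merge keeps every original row and yields NaN for groups that have no baseline visit.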
1
vote
2
answers
54
views
Pandas dataframe groupby apply function with variable number of arguments
I have a pandas dataframe that looks like
import pandas as pd
data = {
"Race_ID": [2,2,2,2,2,5,5,5,5,5,5],
"Student_ID": [1,2,3,4,5,9,10,2,3,6,5],
"theta": [8,9,2,...
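As a sketch of the general mechanism: `groupby(...).apply` forwards extra positional and keyword arguments to the applied function, so a function written with `*args` can take a variable number of them. The data is hypothetical because the excerpt's frame is truncated. The `omega` column and the `scaled_total` helper are invented for illustration.

```python
import pandas as pd

# Hypothetical data: the question's values are cut off, so these are made up.
df = pd.DataFrame({
    "Race_ID": [2, 2, 5, 5, 5],
    "theta": [8, 9, 2, 4, 6],
    "omega": [1, 1, 1, 1, 1],
})

def scaled_total(group, *cols, scale=1.0):
    """Sum the named columns within one group, then scale the total."""
    return scale * sum(group[c].sum() for c in cols)

# Extra arguments after the function are passed through to it for every group.
out = df.groupby("Race_ID")[["theta", "omega"]].apply(
    scaled_total, "theta", "omega", scale=0.5
)
```

Selecting the value columns before `apply` keeps the grouping column out of each group's frame.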
1
vote
2
answers
57
views
How to find the number of rows within a group since a nonzero value occurred for a pandas dataframe?
I have a dataframe like so:
ID  value
A   0
A   1
A   0
A   0
B   0
B   0
B   2
B   0
B   4
B   0
I want to add a column that counts the number of rows since a nonzero value occurred within the group (in this ...
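A sketch of one possible approach, with an assumption about the truncated spec: rows before the first nonzero value in a group get NaN, and the nonzero row itself counts as 0. The column name `rows_since_nonzero` is made up.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": list("AAAABBBBBB"),
    "value": [0, 1, 0, 0, 0, 0, 2, 0, 4, 0],
})

# Running count of nonzero values seen so far within each ID.
# It increases at every nonzero row, so it labels the "blocks" between them.
nonzero = df["value"].ne(0).astype(int)
block = nonzero.groupby(df["ID"]).cumsum().rename("block")

# Position of each row inside its (ID, block) run: 0 at the nonzero row, then 1, 2, ...
since = df.groupby(["ID", block]).cumcount()

# Assumption: before the first nonzero value there is nothing to count from, so use NaN.
df["rows_since_nonzero"] = since.where(block > 0)
```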
0
votes
1
answer
34
views
Create a new column based on other columns for time series data in pandas
I have the following pandas dataframe with columns June, July, and Aug.
Month  June  July  Aug
June   a     d     g
July   b     e     h
Aug    c     f     i
I want to create several new columns with a 1-month forecast, 2-month ...
2
votes
1
answer
41
views
How do you sort column names in Date in descending order in pandas?
I have this DataFrame:
Node Interface Speed Band_In carrier Date
Server1 wan1 100 80 ATT 2024-05-09
Server1 wan1 100 50 ...
0
votes
1
answer
45
views
Keeping a running total of quantities while matching items and dates within a range
I'm attempting to match job lines to purchase orders on items within a date range while tracking the available quantity of the items.
If I have three dataframes:
joblines = pd.DataFrame({
'...
0
votes
2
answers
56
views
Vectorized way to check if a string is in a dataframe column (set of strings)?
I have a pandas dataframe df. This dataframe has a column to_filter. to_filter is either an empty set or a set of strings. This dataframe also has an integer column id. The id may not be unique.
Given ...
0
votes
0
answers
16
views
Concurrency Control Mechanism For Dataframe Processing In Django WebApp
I have a Django web app that processes Excel file data directly using pandas dataframes. Now I want to add concurrency control so that multiple requests can be processed simultaneously. Suggest ...
0
votes
3
answers
38
views
How to use Python Pandas Groupby for multiple columns?
I have a dataframe that I am trying to do some calculations on and add a few columns.
Here is an example of the input dataframe:
df1:
Index Type Product Late or On Time
0 A X ...
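A minimal sketch of grouping on more than one column, assuming the goal is a per-group summary. The rows are invented because the excerpt's data is cut off, and the `n_rows` name is hypothetical. Passing a list of column names to `groupby` groups on every combination that occurs.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the question.
df1 = pd.DataFrame({
    "Type": ["A", "A", "A", "B", "B"],
    "Product": ["X", "X", "Y", "X", "X"],
    "Late or On Time": ["Late", "On Time", "On Time", "Late", "Late"],
})

# Count rows for each (Type, Product) pair.
counts = df1.groupby(["Type", "Product"]).size().reset_index(name="n_rows")

# Count late rows per pair by summing a boolean mask within each group.
late = (
    df1["Late or On Time"].eq("Late")
    .groupby([df1["Type"], df1["Product"]])
    .sum()
)
```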