Questions tagged [pandas]
Pandas is a Python library for data manipulation and analysis, e.g. dataframes, multidimensional time series and cross-sectional datasets commonly found in statistics, experimental science results, econometrics, or finance. Pandas is one of the main data science libraries in Python.
pandas
288,095
questions
0
votes
0
answers
23
views
Why are data written to Excel not starting from the 'A' column?
I'm using pandas to copy data from one Excel file to another; the data are copied, just not to the right place.
I have this function that reads the data:
def updated_file(self, progress_bar):
...
0
votes
0
answers
19
views
Is using a Pandas Dataframe as a read-only table scalable in a Flask App?
I'm developing a small website in Flask that relies on data from a CSV file to output data to a table on the frontend using JQuery.
The user would select an ID from a drop-down on the front-end, then ...
0
votes
0
answers
18
views
Pandas check if a column has NaT type, unable to find date diff with NaT values [duplicate]
I have two columns, StartDate and ExitDate, in my DataFrame, with NaT values in the ExitDate column.
I wish to create a third column, Tenure, by finding the difference between ExitDate and StartDate.
StartDate ...
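A minimal sketch for the Tenure question above, assuming both columns are already datetime64 and using made-up sample dates (the question's real data is not shown). Subtraction propagates NaT, and `isna()` detects it. Filling a missing ExitDate with a reference date is only one possible choice, and `reference` is a hypothetical name.

```python
import pandas as pd

# Hypothetical sample data; the question's real values are not shown.
df = pd.DataFrame({
    "StartDate": pd.to_datetime(["2020-01-01", "2021-06-15"]),
    "ExitDate": pd.to_datetime(["2020-03-01", None]),
})

# isna() is True wherever the column holds NaT.
has_no_exit = df["ExitDate"].isna()

# Subtraction propagates NaT, so rows without an exit date get a NaT tenure.
df["Tenure"] = df["ExitDate"] - df["StartDate"]

# Assumed alternative: measure open-ended tenure up to a reference date.
reference = pd.Timestamp("2022-01-01")
df["TenureDays"] = (df["ExitDate"].fillna(reference) - df["StartDate"]).dt.days
```

Whether a missing ExitDate should stay NaT or be measured up to some cutoff depends on what Tenure is meant to represent.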
1
vote
1
answer
25
views
Python Pandas difference in boolean indexing between ~ != and ==
I am confused about different results of boolean indexing when using ~ after != versus when using just ==
I have a pandas df with 4 columns:
dic = {
"a": [1,1,1,0,0,1,1],
"b...
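One possible explanation for the `~` versus `==` question above, which the truncated excerpt does not confirm, is operator precedence: `~` binds more tightly than `!=`. The sketch below reuses only column "a" from the question; the variable names are hypothetical.

```python
import pandas as pd

# Only column "a" is taken from the question; the other columns are cut off.
df = pd.DataFrame({"a": [1, 1, 1, 0, 0, 1, 1]})

# ~ binds tighter than !=, so this inverts the integers bitwise first
# (1 -> -2, 0 -> -1) and only then compares the result with 1.
unparenthesized = ~df["a"] != 1

# Parenthesizing the comparison negates the boolean mask instead.
parenthesized = ~(df["a"] != 1)

# With parentheses, the mask matches the plain == comparison.
same_as_eq = parenthesized.equals(df["a"] == 1)
```

If the original code already parenthesizes each comparison, the difference likely comes from something else, such as how several conditions are combined with `&` and `|`.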
0
votes
0
answers
32
views
Pandas.ExcelFile Include first header row but only searching on second header row
ugds = set()
for sheet_name in excel_data.sheet_names:
    df = pd.read_excel(excel_data, sheet_name=sheet_name, header=1)
    ugds.update(df["Pool\n Gestão"].unique())
for UGD in ugds:
...
0
votes
0
answers
9
views
FutureWarning in emobpy: incompatible dtype assignment with Pandas DataFrame
I am using the emobpy library to set custom rules for a mobility analysis, but I encounter a FutureWarning about incompatible data types when trying to modify DataFrame items. Here's the problematic ...
-3
votes
0
answers
22
views
I cannot find the libraries I installed, such as pandas and opencv [closed]
My problem is that I installed libraries such as numpy, opencv, pandas, and many others from the command prompt on Windows 10, but when I open the Python development environment, PyCharm, and ...
-1
votes
1
answer
31
views
How do you merge values across rows and replace NaN values in pandas?
I am doing some manipulation on a data frame:
df
Node Interface Speed carrier 1-May 9-May 2-Jun 21-Jun
Server1 internet1 10 ATT 20 30 ...
0
votes
0
answers
20
views
text_auto Parameter Not Working in Plotly
The text_auto parameter for a Plotly Express bar chart is not functioning for me, despite seemingly correct syntax. I am using both Jupyter Notebook and Eclipse and the issue persists in both. Plotly ...
0
votes
2
answers
39
views
Sort Pandas dataframe by Sub Total and count
I have a very large dataset called bin_df.
Using pandas and the following code I've assigned sub-total "Total" to each group:
bin_df = df[df["category"].isin(model....
0
votes
3
answers
56
views
How to find rows with value on either side of a given value?
Using Python and pandas, I have a DataFrame containing datetimes and values.
# Create an empty DataFrame with 'timestamp' and 'value' columns
df = pd.DataFrame(columns=['timestamp', 'value'])
df.set_index('...
-1
votes
1
answer
31
views
pd.to_datetime() not consistently working to convert objects
I have been working with this data (csv) that exists in an AWS S3 bucket. When I am pulling the data I have to transform all the columns to their correct dtypes.
All other dtypes are working properly ...
1
vote
1
answer
66
views
How can I filter df “A” using as a condition a comparison to df “B”?
I’ve got 2 dataframes, dfA and dfB, with different shapes and with different orders. dfA is contained in dfB.
There are 3 columns in this example, “Job Title”, “Job Department” and “Job Salary”. dfA ...
-1
votes
0
answers
30
views
Index into fields of a DataFrame row without needing a row index? [duplicate]
In pandas, is there a way to work with one row of a DataFrame at a time and, for each row, index into the columns by name without indexing into the rows? My current approach is (say) modifying ...
0
votes
1
answer
24
views
Pandas UDF to derive new column
In Spark/Databricks, I have a pandas dataframe with a string column. I need to perform multiple actions on this column (data cleansing type stuff), and produce a new column from the result.
Here's my ...
-5
votes
0
answers
39
views
Pandas .isin returns an empty dataframe even though I know the data is there [closed]
I have two dataframes, one titled small and one titled big. Small has a column called 'Part Number' while big has columns called 'fpartno' and 'fgroup'. I want to find all values in the 'Part Number' ...
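A frequent reason for an empty `isin` result is that the keys differ in whitespace, case, or dtype. The excerpt does not say whether that applies here, so the sketch below is an assumption, with made-up part numbers and a hypothetical `normalize` helper.

```python
import pandas as pd

# Made-up data; only the frame and column names follow the question.
small = pd.DataFrame({"Part Number": ["A100 ", "b200", "C300"]})
big = pd.DataFrame({
    "fpartno": ["A100", "B200", "Z999"],
    "fgroup": ["g1", "g2", "g3"],
})

# Exact matching finds nothing because of trailing spaces and case differences.
raw = big[big["fpartno"].isin(small["Part Number"])]


def normalize(keys: pd.Series) -> pd.Series:
    """Hypothetical helper: compare keys as trimmed, upper-cased strings."""
    return keys.astype(str).str.strip().str.upper()


matched = big[normalize(big["fpartno"]).isin(normalize(small["Part Number"]))]
```

Comparing `small["Part Number"].dtype` with `big["fpartno"].dtype` and printing a few values with `repr()` is a quick way to check whether normalization is needed at all.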
-1
votes
0
answers
44
views
How to split info in a single row in Excel into columns using Python [duplicate]
I have read a CSV file using pd.read_csv().
I am trying to clean up this data but it is proving a bit difficult.
Essentially all of the information is in a single column and row 1 and I need to split ...
2
votes
3
answers
57
views
How to convert JSONL to parquet efficiently?
Given a jsonl file like this:
{"abc1": "hello world", "foo2": "foo bar"}
{"foo2": "bar bar blah", "foo3": "blah foo"}
...
0
votes
0
answers
28
views
Python threads 'starved' by pandas operations
I am creating a UI application with Qt in Python. It performs operations on pandas DataFrames in a separate threading.Thread to keep the UI responsive; no individual pandas instruction takes noticeable ...
0
votes
0
answers
31
views
Minimize RAM usage of pandas operations in python
I have a Python function using pandas that does operations on some dataframes. This Python function currently consumes a lot of RAM. I have tried to minimize RAM usage as much as possible but ...
-1
votes
1
answer
37
views
How do I perform a smear between two dataframes in python/pandas? [duplicate]
I have two dataframes and I need to perform a smear (if that is what it's generally called). Basically the first one is smaller (5 million rows) and the other is 40 million rows. I want to add the ...
-2
votes
1
answer
62
views
Convert time into seconds [removing milliseconds]
I have a pandas DataFrame where column 'A' holds date and time values (currently of type string).
Column A                   Column B
2024-07-11 13:09:37.466    PC2
2024-07-11 13:24:43.03     PC1
May 6 2024 22:49:...
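A rough sketch for the question above, assuming the goal is to parse the mixed-format strings and drop the fractional seconds. The third timestamp is cut off in the excerpt, so the value used below is a made-up stand-in, and `A_seconds` is a hypothetical column name.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [
        "2024-07-11 13:09:37.466",
        "2024-07-11 13:24:43.03",
        "May 6 2024 22:49:10.5",  # made-up stand-in for the truncated value
    ],
    "B": ["PC2", "PC1", "PC1"],   # last entry is also made up
})

# Parse each string on its own so differing formats do not conflict.
parsed = df["A"].apply(pd.to_datetime)

# floor("s") truncates to whole seconds, dropping the milliseconds.
df["A_seconds"] = parsed.dt.floor("s")
```

Parsing element by element is slow on large frames. If all rows share one layout, passing an explicit `format=` to `pd.to_datetime` on the whole column is usually faster.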
0
votes
0
answers
13
views
A value is trying to be set on a copy of a slice from a DataFrame while using loc [duplicate]
I am aware that this is a common issue, but I am confused why I am getting it here:
train_df.loc[:,'decision'] = np.where(train_probs[:,1]>cutoff, 1, 0)
I am doing exactly what the warning says:
...
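A common cause of this warning, which the excerpt does not confirm, is that `train_df` was itself created by slicing another DataFrame. The sketch below assumes that situation; `full_df` and the sample values for `train_probs` and `cutoff` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's objects.
full_df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "split": ["train", "train", "test", "train"],
})
train_probs = np.array([[0.9, 0.1], [0.3, 0.7], [0.4, 0.6]])
cutoff = 0.5

# An explicit copy makes train_df independent of full_df, so the later
# assignment cannot be mistaken for a write into a slice of the original.
train_df = full_df[full_df["split"] == "train"].copy()
train_df.loc[:, "decision"] = np.where(train_probs[:, 1] > cutoff, 1, 0)
```

In that scenario the warning concerns how `train_df` was created, not the `.loc` call itself, which is why following the warning's advice on the assignment line does not silence it.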
1
vote
1
answer
31
views
How do you pick the max value of each row across certain columns in pandas?
I have this data frame:
df
Node Interface Speed Band_In carrier 1-Jun 10-Jun
Server1 wan1 100 80 ATT 80 30
Server1 wan2 ...
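A minimal sketch of a row-wise maximum over selected columns, assuming the date columns hold numbers. The second row's values are made up because the excerpt is truncated, and `date_cols` and `Max` are hypothetical names.

```python
import pandas as pd

# First row follows the excerpt; the second row's numbers are made up.
df = pd.DataFrame({
    "Node": ["Server1", "Server1"],
    "Interface": ["wan1", "wan2"],
    "1-Jun": [80, 45],
    "10-Jun": [30, 60],
})

# Restrict to the columns of interest, then take the maximum across each row.
date_cols = ["1-Jun", "10-Jun"]
df["Max"] = df[date_cols].max(axis=1)
```

`idxmax(axis=1)` on the same selection returns the name of the column holding each row's maximum, if that is also needed.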
2
votes
2
answers
38
views
How to compare lists in two Pandas dataframes to get the common elements?
I want to compare lists from columns set_1 and set_2 in df_2 with ins column in df_1 to find all common elements.
I've started doing it for one row and one column but I have no idea how to compare all ...
-5
votes
0
answers
45
views
Pandas introducing lineterminators via to_csv without cause or reason [closed]
I've bug checked this thoroughly. I know that the bug is introduced when outputting to csv via the df.to_csv method.
The method is randomly adding lineterminators which aren't called for in any way.
I ...
0
votes
1
answer
19
views
RDKit PandasTools WriteSDF: RuntimeError: Bad pickle format: unexpected End-of-File while reading
I face the error:
PandasTools.WriteSDF(pp, args.output_file, molColName='ID', properties=list(pp.columns))
File "/scratch/micromamba/envs/biotools_py39/lib/python3.9/site-packages/rdkit/Chem/...
-1
votes
0
answers
25
views
NameError Traceback (most recent call last) <ipython-input-3-9ec55f7a7976> in <module> : NameError: name 'books' is not defined
I am trying to plot the evolution of degree centrality over the books for some of the characters from Game of Thrones. I have a list evol that contains the computed degree centrality from all the ...
0
votes
1
answer
60
views
How do I handle merged cells in Excel using Pandas parse function?
I have an Excel file with merged columns and rows, and I want to read the Excel file and parse it into a DataFrame.
This is just a small example of what happened because the real data ...
0
votes
1
answer
51
views
Multi-level rolling average with missing values
I have data on frequencies (N), for combinations of [from, to, subset], and the month. Importantly, when N=0, the row is missing.
N from to subset
month ...