
Questions tagged [dataframe]

A data frame is a 2D tabular data structure. Typically, rows are observations and columns are variables, and the columns may be of different types (unlike an array or matrix). While "data frame" or "dataframe" is the term used for this concept in several languages (R, Apache Spark, deedle, Maple, the pandas library in Python and the DataFrames library in Julia), "table" is the term used in MATLAB and SQL.
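As a minimal sketch of the description above, assuming pandas is installed: the column names and values here are hypothetical sample data, chosen only to show that each column keeps its own type while rows stay aligned as observations.

```python
import pandas as pd

# Hypothetical sample data: each row is one observation,
# each column is one variable with its own type.
df = pd.DataFrame(
    {
        "name": ["Ada", "Grace", "Linus"],   # text
        "age": [36, 45, 28],                 # integers
        "height_m": [1.65, 1.70, 1.80],      # floats
    }
)

# Map each column to the one-letter NumPy kind code of its dtype
# ("i" = signed integer, "f" = floating point).
column_kinds = {col: dtype.kind for col, dtype in df.dtypes.items()}

print(df.shape)       # (rows, columns)
print(column_kinds)
```

A plain 2D array would force all three columns into a single common type, which is the distinction the tag description draws.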

dataframe
1 vote
2 answers
62 views

Identify starting row of actual data in Pandas DataFrame with merged header cells

My original df looks like this: df. Note that in the data frame the headers run through row 3, and the values for those headers start from row 4 onwards. The numbers of rows & columns ...
Debojit Roy's user avatar
0 votes
0 answers
34 views

Spark SQL query returns column that has 0 length but is non-null

I have a spark dataframe for a parquet file. The column is string type. spark.sql("select col_a, length(col_a) from df where col_a is not null") +-------------------+------------------------...
Dozel's user avatar
  • 159
0 votes
2 answers
52 views

Converting JSON list with multiple nested dictionaries to csv or excel

I have a JSON that I download from a website that has multiple nested dictionaries inside the main list. This is a very simplified version of it. [ { "id": 1, "...
TxHemi's user avatar
  • 9
0 votes
1 answer
54 views

Grid data to Pandas

I have written code that successfully creates a grid, allows the user to input data, and saves it in a list. However, when I press the button to save the grid data to pandas, an error occurs: ...
abdullah qureshi's user avatar
3 votes
2 answers
72 views

How do I get variable length slices of values using Pandas?

I have data that includes a full name and first name, and I need to make a new column with the last name. I can assume full - first = last. I've been trying to use slice with an index the length of ...
J Web's user avatar
  • 65
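One way to read "full - first = last" from the excerpt above is as prefix removal. The sketch below assumes pandas is installed; the column names `full`, `first`, and `last`, the sample rows, and the helper `strip_first` are all hypothetical, since the question's actual data is not shown.

```python
import pandas as pd

# Hypothetical sample data; the real column names are not given in the excerpt.
df = pd.DataFrame(
    {
        "full": ["Jane Doe", "John van Dyke"],
        "first": ["Jane", "John"],
    }
)

def strip_first(full: str, first: str) -> str:
    """Return `full` with the leading `first` removed, if it is a prefix."""
    if full.startswith(first):
        # The slice length varies per row, which a fixed-width slice cannot do.
        return full[len(first):].strip()
    return full  # leave the value unchanged when the prefix does not match

# Pair the two columns row by row and build the new column.
df["last"] = [strip_first(full, first) for full, first in zip(df["full"], df["first"])]

print(df)
```

A row-wise comprehension is used here because the slice position differs for every row; whether a vectorized alternative fits depends on the real data.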
3 votes
1 answer
86 views

How to compare rows within the same csv file faster

I have a csv file containing 720,000 rows and 10 columns. The columns that are relevant to the problem are ['timestamp_utc', 'looted_by__name', 'item_id', 'quantity']. This file is logs of items ...
banom's user avatar
  • 41
0 votes
1 answer
38 views

Dataframe replace columns and save to new df

I loaded in a .dat file to a pandas dataframe. Two of the columns are mean and error. I used the values in these two columns to create a randomized value for mean. I want to replace the mean column in ...
Allyand Camshow's user avatar
1 vote
1 answer
34 views

How to apply an expression from a column to another column in pyspark dataframe?

I would like to know if it is possible to apply an expression stored in one column to another column. For example, I have this table: new_feed_dt regex_to_apply expr_to_apply 053021 | _(\d+) | date_format(to_date(new_feed_dt, '...
Tomás Jullier's user avatar
1 vote
2 answers
94 views

How to calculate the Relative Strength Index (RSI) through record iterations in pandas dataframe

I have created a pandas dataframe as follows: import pandas as pd import numpy as np ds = { 'trend' : [1,1,1,1,2,2,3,3,3,3,3,3,4,4,4,4,4], 'price' : [23,43,56,21,43,55,54,32,9,12,11,12,23,3,2,1,1]...
Giampaolo Levorato's user avatar
0 votes
1 answer
86 views

Using Polars expressions to apply `eval()` to a column

I would like to achieve the following via Polars expressions, as opposed to mapping the elements row-by-row, but I have not been able to figure out a way. import polars def foo(): return 1 + 1 ...
FISR's user avatar
  • 63
1 vote
1 answer
51 views

How to group rows based on column ID in a pandas dataframe?

I have the dataframe df1 below: ID Label Value id_1 A id_1 B id_1 C id_1 D id_1 E id_1 10 id_1 20 id_1 30 id_2 F id_2 G ...
Saly07's user avatar
  • 25
0 votes
1 answer
62 views

Split pandas dataframe based on given row string

I have a text file with a data set of the form Line 1 Line 2 ! 1.01499999 0.504999995 6.19969398E-7 5.38933136E-7 1.35450875E-6 1.74000001 0.220000029 7.92876381E-6 4.1831604E-6 6.61433387E-6 2....
Py-ser's user avatar
  • 2,020
1 vote
5 answers
86 views

String Manipulation based on Char Length in a dataframe

I want to do some string manipulation based on a character-length condition. I have this table, let's call it the sample table. RiskCode A01 A02.999 I want to transform the RiskCode column in sample ...
Dhestar Bagus Wirawan's user avatar
0 votes
0 answers
63 views

Resample ohlc pandas

There is a dataframe with the data specified below, with Datetime as the index. I need to resample it to a 3-hour interval, even if there are only 2 hours of data at the end of the day. 2024-07-17 09:...
Ruslan's user avatar
  • 1
1 vote
4 answers
68 views

Shift part of row in dataframe to new row

I have a dataframe (pandas) that I want to transform for display purposes. Therefore, I want to shift some parts of the dataframe to new rows, as shown below: col1 col2 col_to_shift col_not_to_shift1 ...
Arthur's user avatar
  • 623
