Questions tagged [dataframe]
A data frame is a two-dimensional tabular data structure. Typically, rows are observations and columns are variables, and the columns may hold different types (unlike an array or matrix). "Data frame" or "dataframe" is the term for this concept in several languages and libraries (R, Apache Spark, Deedle, Maple, the pandas library in Python, and the DataFrames library in Julia), while MATLAB and SQL use "table".
10,829 questions
1547 votes, 14 answers, 1.8m views
How to join (merge) data frames (inner, outer, left, right)
Given two data frames:
df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 = data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))
...
251 votes, 9 answers, 335k views
Reshaping data.frame from wide to long format
I am having trouble converting my data.frame from a wide table to a long table.
At the moment it looks like this:
Code Country 1950 1951 1952 1953 1954
AFG Afghanistan 20,249 ...
1439 votes, 26 answers, 2.4m views
How to deal with SettingWithCopyWarning in Pandas
Background
I just upgraded my Pandas from 0.11 to 0.13.0rc1. Now the application is emitting many new warnings. One of them looks like this:
E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value ...
517 votes, 19 answers, 998k views
How to sum a variable by group
I have a data frame with two columns. The first column contains categories such as "First", "Second", "Third", and the second column has numbers that represent the number of times I saw the specific ...
3548 votes, 19 answers, 6.6m views
How do I select rows from a DataFrame based on column values?
How can I select rows from a DataFrame based on values in some column in Pandas?
In SQL, I would use:
SELECT *
FROM table
WHERE column_name = some_value
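One common way to express this kind of filter in pandas is boolean indexing. The following is a minimal sketch rather than the only approach; the sample data is made up for illustration, and `column_name` and `some_value` simply mirror the names in the SQL above.

```python
import pandas as pd

# Hypothetical sample data; the names mirror the SQL in the question.
df = pd.DataFrame({"column_name": ["a", "b", "a", "c"], "other": [1, 2, 3, 4]})
some_value = "a"

# Boolean mask: True for each row whose column equals the value.
mask = df["column_name"] == some_value

# .loc with a boolean mask keeps only the matching rows
# (the original index labels are preserved).
selected = df.loc[mask]
print(selected)
```

The mask is an ordinary boolean Series, so it can be stored, inspected, or combined with other masks before it is applied.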
271 votes, 10 answers, 384k views
How do I make a list of data frames?
How do I make a list of data frames and how do I access each of those data frames from the list?
For example, how can I put these data frames in a list?
d1 <- data.frame(y1 = c(1, 2, 3),
...
4135 votes, 34 answers, 7.5m views
How can I iterate over rows in a Pandas DataFrame?
I have a pandas dataframe, df:
   c1   c2
0  10  100
1  11  110
2  12  120
How do I iterate over the rows of this dataframe? For every row, I want to access its elements (values in cells) by the name ...
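As a rough illustration of row-wise access by column name, the sketch below uses `DataFrame.itertuples`, which yields one named tuple per row. It rebuilds the small frame shown above; the `pairs` list is only there to collect the values and is not part of the question.

```python
import pandas as pd

# The frame from the question.
df = pd.DataFrame({"c1": [10, 11, 12], "c2": [100, 110, 120]})

pairs = []  # illustrative container for the values read from each row
for row in df.itertuples(index=False):
    # Each row is a named tuple, so cells are reachable by column name.
    pairs.append((row.c1, row.c2))

print(pairs)
```

Explicit row loops tend to be slow on large frames, so column-wise (vectorized) operations are usually worth trying first when the task allows it.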
500 votes, 14 answers, 848k views
How do I create a new column where the values are selected based on an existing column?
How do I add a color column to the following dataframe so that color='green' if Set == 'Z', and color='red' otherwise?
Type Set
1 A Z
2 B Z
3 B X
4 C Y
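One possible approach, sketched here with `numpy.where`, picks between two values element by element based on a condition. The frame is rebuilt from the table above with its 1-based index; treat this as one option among several.

```python
import numpy as np
import pandas as pd

# The frame from the question, with its 1-based index.
df = pd.DataFrame(
    {"Type": ["A", "B", "B", "C"], "Set": ["Z", "Z", "X", "Y"]},
    index=[1, 2, 3, 4],
)

# np.where(condition, value_if_true, value_if_false), applied per row.
df["color"] = np.where(df["Set"] == "Z", "green", "red")
print(df)
```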
877 votes, 12 answers, 1.3m views
How to filter Pandas dataframe using 'in' and 'not in' like in SQL
How can I achieve the equivalents of SQL's IN and NOT IN?
I have a list with the required values. Here's the scenario:
df = pd.DataFrame({'country': ['US', 'UK', 'Germany', 'China']})
...
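A minimal sketch of how `Series.isin` can stand in for SQL's IN, with `~` negating the mask for NOT IN. The list `countries_to_keep` is a hypothetical example, since the question's own list is not shown in the excerpt.

```python
import pandas as pd

df = pd.DataFrame({'country': ['US', 'UK', 'Germany', 'China']})

# Hypothetical list of required values (not taken from the question).
countries_to_keep = ['UK', 'China']

# IN: keep rows whose country appears in the list.
in_list = df[df['country'].isin(countries_to_keep)]

# NOT IN: invert the boolean mask with ~.
not_in_list = df[~df['country'].isin(countries_to_keep)]

print(in_list)
print(not_in_list)
```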
867 votes, 15 answers, 2.5m views
Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
I want to filter my dataframe with an or condition to keep rows with a particular column's values that are outside the range [-0.25, 0.25]. I tried:
df = df[(df['col'] < -0.25) or (df['col'] > 0....
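The error in the title arises because Python's `or` asks for a single truth value from a whole Series. A sketch of the element-wise alternative uses the `|` operator, with each comparison in parentheses. The sample values are made up; the bounds come from the range [-0.25, 0.25] stated above.

```python
import pandas as pd

# Hypothetical sample values for illustration.
df = pd.DataFrame({"col": [-0.5, -0.1, 0.0, 0.2, 0.7]})

# `|` combines the two boolean Series element by element.
# The parentheses matter because `|` binds tighter than `<` and `>`.
outside = df[(df["col"] < -0.25) | (df["col"] > 0.25)]
print(outside)
```

The same pattern applies to `&` for "and" and `~` for "not".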
181 votes, 10 answers, 190k views
Dynamically select data frame columns using $ and a character value
I have a vector of different column names and I want to be able to loop over each of them to extract that column from a data.frame. For example, consider the data set mtcars and some variable names ...
395 votes, 11 answers, 932k views
How do I Pandas group-by to get sum?
I am using this dataframe:
Fruit Date Name Number
Apples 10/6/2016 Bob 7
Apples 10/6/2016 Bob 8
Apples 10/6/2016 Mike 9
Apples 10/7/2016 Steve 10
Apples 10/7/2016 Bob 1
Oranges ...
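A hedged sketch of a grouped sum with `DataFrame.groupby`. It uses only the five rows visible above (the remaining rows are elided and not reconstructed here), and the choice to group by `Fruit` and `Name` is an assumption, since the desired grouping is not shown in the excerpt.

```python
import pandas as pd

# Only the rows visible in the excerpt; the rest of the data is elided.
df = pd.DataFrame({
    "Fruit": ["Apples"] * 5,
    "Date": ["10/6/2016", "10/6/2016", "10/6/2016", "10/7/2016", "10/7/2016"],
    "Name": ["Bob", "Bob", "Mike", "Steve", "Bob"],
    "Number": [7, 8, 9, 10, 1],
})

# Assumed grouping keys: sum Number for each (Fruit, Name) pair.
totals = df.groupby(["Fruit", "Name"])["Number"].sum()
print(totals)
```

Passing a different list of column names to `groupby` changes the grouping without altering the rest of the code.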
203 votes, 10 answers, 232k views
Aggregate / summarize multiple variables per group (e.g. sum, mean)
From a data frame, is there an easy way to aggregate (sum, mean, max, etc.) multiple variables simultaneously?
Below are some sample data:
library(lubridate)
days = 365*2
date = seq(as.Date("2000-01-...
221 votes, 16 answers, 155k views
How to unnest (explode) a column in a pandas DataFrame, into multiple rows
I have the following DataFrame where one of the columns is an object (list type cell):
df = pd.DataFrame({'A': [1, 2], 'B': [[1, 2], [1, 2]]})
Output:
   A       B
0  1  [1, 2]
1  2  [1, 2]
My ...
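For the list-valued column above, one option in pandas 0.25 and later is `DataFrame.explode`, which emits one row per list element and repeats the other columns. This is a minimal sketch; resetting the index is an optional extra step, not something the question asks for.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [[1, 2], [1, 2]]})

# One output row per element of each list in column B.
# explode repeats the original index labels, so reset_index(drop=True)
# is used here to get a fresh 0..n-1 index (an optional choice).
exploded = df.explode('B').reset_index(drop=True)
print(exploded)
```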
680 votes, 11 answers, 337k views
The difference between bracket [ ] and double bracket [[ ]] for accessing the elements of a list or dataframe
R provides two different methods for accessing the elements of a list or data.frame: [] and [[]].
What is the difference between the two, and when should I use one over the other?