Set value for particular cell in pandas DataFrame using index

Question

I have created a Pandas DataFrame

df = DataFrame(index=['A','B','C'], columns=['x','y'])

Now, I would like to assign a value to particular cell, for example to row C and column x. In other words, I would like to perform the following transformation:

     x    y             x    y
A  NaN  NaN        A  NaN  NaN
B  NaN  NaN   ⟶   B  NaN  NaN
C  NaN  NaN        C   10  NaN

with this code:

df.xs('C')['x'] = 10

However, the contents of df have not changed; the DataFrame still contains only NaNs. How do I do what I want?

Don't use 'chained indexing' (df['x']['C']), use df.ix['x','C']. — Yariv, Commented Jan 22, 2014 at 15:55
The order of index access needs to be: dataframe[column (series)] [row (Series index)], whereas many people (including myself) are more used to the dataframe[row][column] order. As a Matlab and R programmer the latter feels more intuitive to me but that apparently is not the way Pandas works.. — Zhubarb, Commented Jan 31, 2014 at 11:24
i tried that, but i ended up adding another row names x and another column names C. you have to do the row first then the column. so df.ix['C','x']=10 — Matthew, Commented Apr 1, 2016 at 14:58
To @Yariv's comment. Warning: Starting in 0.20.0, the .ix indexer is deprecated, in favor of the more strict .iloc and .loc indexers. pandas.pydata.org/pandas-docs/stable/generated/… . df.at looks like it is sticking around. — jeffhale, Commented Aug 30, 2018 at 23:24
Be sure to check (and upvote to undig) Atta Jutt’s answer if you need to change values for a whole subset of the dataframe using the index values. — Skippy le Grand Gourou, Commented Mar 3, 2021 at 11:02

unutbu · Accepted Answer · 2017-09-16 02:35:13Z

945

RukTech's answer, df.set_value('C', 'x', 10), is far and away faster than the options I've suggested below. However, it has been slated for deprecation.

Going forward, the recommended method is .iat/.at.

Why df.xs('C')['x']=10 does not work:

df.xs('C') returns, by default, a new object (a Series for a single row label) holding a copy of the data, so

df.xs('C')['x']=10

modifies only that copy.

In the pandas versions this answer was written for, df['x'] returns a view of the df DataFrame, so

df['x']['C'] = 10

modifies df itself.

Warning: It is sometimes difficult to predict if an operation returns a copy or a view. For this reason the docs recommend avoiding assignments with "chained indexing".

So the recommended alternative is

df.at['C', 'x'] = 10

which does modify df.
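As a minimal sketch of the difference (assuming a recent pandas; the name row_copy is only illustrative), the following contrasts writing to a copied row with writing through .at on the frame from the question:

```python
import pandas as pd

df = pd.DataFrame(index=['A', 'B', 'C'], columns=['x', 'y'])

# Writing to a copy of row 'C' leaves df untouched.
# The explicit .copy() stands in for the copy that df.xs('C') is said to return.
row_copy = df.xs('C').copy()
row_copy['x'] = 10
print(df.isna().all().all())   # df itself still holds only NaN

# A single indexing operation on df itself writes into df.
df.at['C', 'x'] = 10
print(df)
```

The point is only that the assignment has to target df directly, in one indexing step, rather than an intermediate object.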

In [18]: %timeit df.set_value('C', 'x', 10)
100000 loops, best of 3: 2.9 µs per loop

In [20]: %timeit df['x']['C'] = 10
100000 loops, best of 3: 6.31 µs per loop

In [81]: %timeit df.at['C', 'x'] = 10
100000 loops, best of 3: 9.2 µs per loop

edited Sep 16, 2017 at 2:35

answered Dec 12, 2012 at 14:51

unutbu


1

There's no such thing as df.x in the API. What did you mean?
– smci
Commented May 20, 2013 at 2:21
5

@smci: 'x' is the name of a column in df. df.x returns a Series with the values in column x. I'll change it to df['x'] since this notation will work with any column name (unlike the dot notation) and I think is clearer.
– unutbu
Commented May 20, 2013 at 11:58
1

I knew that, I thought you were saying df.x was some unknown new method alongside df.xs, df.ix
– smci
Commented May 20, 2013 at 23:27
7

According to the maintainers, this is not the recommended way to set a value. See stackoverflow.com/a/21287235/1579844 and my answer.
– Yariv
Commented Jan 22, 2014 at 15:45
2

In my case I had a mix, i.e. index location and column label. I found this way to get it to work : df_temp.iat[0, df_temp.columns.get_loc('Cash')] = df_temp['Cash'].iloc[0] + start_val
– Pete
Commented Mar 26, 2019 at 2:09


InSync · Accepted Answer · 2023-04-16 01:21:04Z

265

Update: The .set_value method is going to be deprecated. .iat/.at are good replacements; unfortunately, pandas provides little documentation for them.

The fastest way to do this is using set_value. This method is ~100 times faster than the .ix method. For example:

df.set_value('C', 'x', 10)

edited Apr 16, 2023 at 1:21

InSync


answered Jul 1, 2014 at 19:16

RukTech


6

It's even better than df['x']['C'] = 10 .
– Alireza
Commented Oct 17, 2015 at 13:16
7

1000 loops, best of 3: 195 µs per loop "df['x']['C'] = 10" 1000 loops, best of 3: 310 µs per loop "df.ix['C','x'] = 10" 1000 loops, best of 3: 189 µs per loop "df.xs('C', copy=False)['x'] = 10" 1000 loops, best of 3: 7.22 µs per loop "df.set_value('C', 'x', 10)"
– propjk007
Commented Jan 12, 2016 at 17:37
1

does this also work for adding a new row/col to the dataframe?
– st.ph.n
Commented Feb 24, 2016 at 18:46
Yes it does (for pandas 0.16.2)
– RukTech
Commented Mar 2, 2016 at 0:33
3

Seems to be deprecated pandas.pydata.org/pandas-docs/version/0.24/reference/api/…
– henning no longer feeds AI
Commented Aug 4, 2023 at 10:17


Blairg23 · Accepted Answer · 2018-12-18 20:19:54Z

169

You can also use a conditional lookup using .loc as seen here:

df.loc[df[<some_column_name>] == <condition>, [<another_column_name>]] = <value_to_add>

where <some_column_name> is the column you want to check the <condition> value against and <another_column_name> is the column you want to write to (it can be a new column or one that already exists). <value_to_add> is the value you want to assign to that column for the matching rows.

This example doesn't apply precisely to the question at hand, but it might be useful for someone who wants to set a specific value based on a condition.
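A concrete version of the template above might look like this. It is a sketch with made-up data; the column names age and age_group are borrowed from the comment below, and the target column is created up front to keep the example unambiguous:

```python
import pandas as pd

people = pd.DataFrame({'name': ['Ann', 'Bob', 'Cy'], 'age': [3, 30, 3]})
people['age_group'] = 'other'   # target column created up front

# The boolean mask selects the rows; the second argument selects the column.
people.loc[people['age'] == 3, 'age_group'] = 'toddler'
print(people)
```

Only the rows where the mask is True are touched; every other row keeps its previous value.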

edited Dec 18, 2018 at 20:19

answered Jul 19, 2016 at 19:59

Blairg23


14

the second column needs to be on brackets, otherwise all of columns will be overwritten with value. Like this: df.loc[df['age']==3, ['age-group']] = 'toddler'
– Piizei
Commented Sep 12, 2018 at 10:55
I can't get this to work when <some_column_name> is my index (unixtime index say) and I'm trying to add a timestamp which doesn't exit yet (i.e. a new timestamp reading). Any thoughts?
– yeliabsalohcin
Commented Jun 14, 2019 at 20:43
Is it possible to change a value based on the index and cell values?
– BND
Commented Jan 8, 2020 at 10:59
@BND I'm not sure, but you could get around this apparent pitfall but simply duplicating the index column with another column with the same value? The short answer is I don't know.
– Blairg23
Commented Jan 17, 2020 at 23:58
@yeliabsalohcin see above answer.
– Blairg23
Commented Jan 17, 2020 at 23:59


sdgluck · Accepted Answer · 2015-10-16 13:31:32Z

85

Try using df.loc[row_index,col_indexer] = value

edited Oct 16, 2015 at 13:31

sdgluck


answered Oct 15, 2015 at 13:32

Yash


10

Welcome to Stack Overflow! Please consider editing your post to add more explanation about what your code does and why it will solve the problem. An answer that mostly just contains code (even if it's working) usually wont help the OP to understand their problem. It's also recommended that you don't post an answer if it's just a guess. A good answer will have a plausible reason for why it could solve the OP's issue.
– SuperBiasedMan
Commented Oct 15, 2015 at 16:46
3

This code is not working (or no longer working?) because it does not replace the value.
– Muhammad Yasirroni
Commented Dec 5, 2021 at 10:11
Note: I've used this method to set values to None, however, Pandas expects the column to already exist; otherwise it may set a wrong value - in my case it was NaN which led to other problems. If the column doesn't exist, Pandas gives a hidden AttributeError that I was only able to read during debugging.
– Kelvin
Commented Jun 13, 2022 at 15:01


Community · Accepted Answer · 2017-05-23 12:26:36Z

49

The recommended way (according to the maintainers) to set a value is:

df.ix['C','x']=10

Using 'chained indexing' (df['x']['C']) may lead to problems.
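Since the comments under this answer report that .ix has been deprecated and later removed, a rough equivalent using .loc and .iloc could look like the sketch below. It reuses the frame from the question; the values 20 and 30 are arbitrary:

```python
import pandas as pd

df = pd.DataFrame(index=['A', 'B', 'C'], columns=['x', 'y'])

# Label-based: row label first, then column label.
df.loc['C', 'x'] = 10

# Position-based: third row, second column.
df.iloc[2, 1] = 20

# Mixed (row position, column label): translate the label to a position first.
df.iloc[0, df.columns.get_loc('y')] = 30
print(df)
```

Each assignment is a single indexing operation on df, so none of them goes through chained indexing.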


edited May 23, 2017 at 12:26

CommunityBot


answered Jan 22, 2014 at 15:48

Yariv


17

ix is deprecated: pandas-docs.github.io/pandas-docs-travis/…
– ecoe
Commented Nov 14, 2018 at 21:27
works perfect! although it gonna be deprecated sometime!
– Pavlos Ponos
Commented May 29, 2020 at 6:59
As of today (and Pandas version 1.4.4) ix appears to no longer be supported at all. (i.e. it isn't present in the API)
– The Photon
Commented Mar 9, 2023 at 0:15


Vid Merljak · Accepted Answer · 2017-01-23 09:46:08Z

33

This is the only thing that worked for me!

df.loc['C', 'x'] = 10

Learn more about .loc here.

edited Jan 23, 2017 at 9:46

Vid Merljak


answered Oct 28, 2015 at 20:56

Alon Galor


1

did .loc replace .iat/.at?
– Gabriel Fair
Commented Jul 17, 2018 at 22:48
2

at Similar to loc, in that both provide label-based lookups. Use at if you only need to get or set a single value in a DataFrame or Series. From padas doc
– Rutrus
Commented Jul 31, 2018 at 1:31
Nice this worked for me when my index elements were numerical.
– Christopher John
Commented Feb 25, 2019 at 8:13
2

This doesn't work for a mix of numerical and string indices.
– Seanny123
Commented Mar 26, 2019 at 17:21


Miladiouss · Accepted Answer · 2019-05-07 04:56:20Z

33

To set values, use:

df.at[0, 'clm1'] = 0

This is the fastest recommended method for setting a single value.
set_value and ix have been deprecated.
It gives no warning, unlike iloc and loc.

answered May 7, 2019 at 4:56

Miladiouss


1

I came up to the exactly same conclusion.
– prosti
Commented May 7, 2019 at 14:07


DINA TAKLIT · Accepted Answer · 2019-04-29 15:33:13Z

.iat/.at are a good solution. Suppose you have this simple DataFrame:

   A   B   C
0  1   8   4 
1  3   9   6
2  22 33  52

If we want to modify the value of the cell [0, "A"], we can use either of these:

df.iat[0,0] = 2
df.at[0,'A'] = 2

And here is a complete example of how to use iat to get and set the value of a cell:

def preprocessing(df):
    # double every value in the first column, one cell at a time
    for index in range(0, len(df)):
        df.iat[index, 0] = df.iat[index, 0] * 2
    return df

Calling preprocessing on y_train uses iat to multiply each value in its first column by 2. (The before/after screenshots of y_train are not available.)
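For comparison, the same doubling can be written without a Python loop. This is a sketch on the small sample frame shown earlier in this answer, not a replacement the author proposed:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 22], 'B': [8, 9, 33], 'C': [4, 6, 52]})

# Double every value in the first column in one vectorized step,
# instead of visiting each row with .iat.
df.iloc[:, 0] = df.iloc[:, 0] * 2
print(df)
```

The loop with .iat is still the natural choice when each cell needs its own logic; the vectorized form is usually preferable when the same operation applies to the whole column.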

KasperGL · Accepted Answer · 2022-09-14 09:49:36Z

26

I would suggest:

df.loc[index_position, "column_name"] = some_value

To modify multiple cells at the same time:

df.loc[start_idx_pos:end_idx_pos, "column_name"] = some_value
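One detail worth checking before relying on the slice form: .loc slices are label-based and include both endpoints, unlike ordinary Python slices. A small sketch with a default integer index and an arbitrary value of 7:

```python
import pandas as pd

df = pd.DataFrame({'column_name': [0, 0, 0, 0, 0]})   # default index 0..4

# With .loc the slice is label-based and includes BOTH endpoints,
# so rows 1, 2 and 3 are updated here.
df.loc[1:3, 'column_name'] = 7
print(df['column_name'].tolist())
```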

edited Sep 14, 2022 at 9:49

answered Dec 15, 2020 at 23:15

KasperGL


Serge Stroobandt · Accepted Answer · 2021-09-26 18:56:43Z

9

Avoid Assignment with Chained Indexing

You are dealing with an assignment with chained indexing which will result in a SettingWithCopy warning. This should be avoided by all means.

Your assignment will have to resort to one single .loc[] or .iloc[] slice, as explained here. Hence, in your case:

df.loc['C', 'x'] = 10

edited Sep 26, 2021 at 18:56

answered Aug 16, 2021 at 9:04

Serge Stroobandt


Grzegorz Brzęczyszczykiewicz · Accepted Answer · 2018-07-23 10:19:27Z

8

In my example I just change the value in the selected cells:

    for index, row in result.iterrows():
        if np.isnan(row['weight']):
            result.at[index, 'weight'] = 0.0

'result' is a DataFrame with a column 'weight'.
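If the goal is simply to replace every NaN in one column, the same effect can be had without iterating. A sketch with made-up weights:

```python
import numpy as np
import pandas as pd

result = pd.DataFrame({'weight': [1.5, np.nan, 3.0, np.nan]})

# Select the rows whose weight is NaN and assign to all of them in one step.
result.loc[result['weight'].isna(), 'weight'] = 0.0
print(result)
```

The iterrows/.at version above remains useful when the replacement value depends on other per-row logic.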

answered Jul 23, 2018 at 10:19

Grzegorz Brzęczyszczykiewicz


Serge Stroobandt · Accepted Answer · 2021-10-07 18:10:26Z

Here is a summary of the valid solutions provided by all users, for data frames indexed by integer and string.

df.iloc, df.loc and df.at work for both types of data frames. df.iloc only works with integer row/column positions, while df.loc and df.at set values by row/column labels (which may themselves be integers).

When the specified label does not exist, both df.loc and df.at append the new row/column to the existing data frame, whereas df.iloc raises "IndexError: positional indexers are out-of-bounds". A working example, tested in Python 2.7 and 3.7, is as follows:

import numpy as np, pandas as pd

df1 = pd.DataFrame(index=np.arange(3), columns=['x','y','z'])
df1['x'] = ['A','B','C']
df1.at[2,'y'] = 400

# rows/columns specified does not exist, appends new rows/columns to existing data frame
df1.at['D','w'] = 9000
df1.loc['E','q'] = 499

# using df[<some_column_name>] == <condition> to select target rows
# (.at is meant for single scalar labels, so a boolean mask goes through .loc)
df1.loc[df1['x']=='B', 'y'] = 10000
df1.loc[df1['x']=='B', ['z','w']] = 10000

# using a list of indices or a slice to set values (via .iloc/.loc rather than .at)
df1.iloc[[1,2,4], 2] = 9999
df1.loc[[0,'D','E'],'w'] = 7500
df1.loc[[0,2,"D"],'x'] = 10
df1.loc[:, ['y', 'w']] = 8000

>>> df1
     x     y     z     w      q
0   10  8000   NaN  8000    NaN
1    B  8000  9999  8000    NaN
2   10  8000  9999  8000    NaN
D   10  8000   NaN  8000    NaN
E  NaN  8000  9999  8000  499.0
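The difference between enlarging with .loc and the out-of-bounds error from .iloc can be checked in isolation. This is a sketch on a small hypothetical frame; the flag iloc_failed exists only for the demonstration:

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3]}, index=['A', 'B', 'C'])

# .iloc refuses to write past the end of the frame.
try:
    df.iloc[5, 0] = 99
    iloc_failed = False
except IndexError:
    iloc_failed = True

# .loc with an unknown row label appends a new row instead.
df.loc['D', 'x'] = 4
print(iloc_failed, df.index.tolist())
```

Silent enlargement is convenient, but it also means a typo in a label creates a new row or column instead of raising an error.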

Keiku · Accepted Answer · 2018-02-14 08:17:50Z

6

you can use .iloc.

df.iloc[[2], [0]] = 10

edited Feb 14, 2018 at 8:17

Keiku


answered Jun 28, 2017 at 15:39

Muge Cevik


This method seems not supporting several values, e.g. df.iloc[[2:8], [0]] = [2,3,4,5,6,7] which the method df.loc() does natively.
– strpeter
Commented Nov 23, 2017 at 10:58


ivanleoncz · Accepted Answer · 2019-02-07 04:19:32Z

set_value() is deprecated.

Starting from the release 0.23.4, Pandas "announces the future"...

>>> df
                   Cars  Prices (U$)
0               Audi TT        120.0
1 Lamborghini Aventador        245.0
2      Chevrolet Malibu        190.0
>>> df.set_value(2, 'Prices (U$)', 240.0)
__main__:1: FutureWarning: set_value is deprecated and will be removed in a future release.
Please use .at[] or .iat[] accessors instead

                   Cars  Prices (U$)
0               Audi TT        120.0
1 Lamborghini Aventador        245.0
2      Chevrolet Malibu        240.0

Considering this advice, here's a demonstration of how to use them:

by row/column integer positions

>>> df.iat[1, 1] = 260.0
>>> df
                   Cars  Prices (U$)
0               Audi TT        120.0
1 Lamborghini Aventador        260.0
2      Chevrolet Malibu        240.0

by row/column labels

>>> df.at[2, "Cars"] = "Chevrolet Corvette"
>>> df
                  Cars  Prices (U$)
0               Audi TT        120.0
1 Lamborghini Aventador        260.0
2    Chevrolet Corvette        240.0


Atta Jutt · Accepted Answer · 2020-01-08 09:18:14Z

One way to use the index with a condition is to first get the index of all the rows that satisfy your condition and then use those row labels in a number of ways:

conditional_index = df.loc[ df['col name'] <condition> ].index

Example conditions look like

==5, >10 , =="Any string", >= DateTime

Then you can use these row indexes in a variety of ways, such as

Replace the value of one column for conditional_index

df.loc[conditional_index , [col name]]= <new value>

Replace value of multiple column for conditional_index

df.loc[conditional_index, [col1,col2]]= <new value>

One benefit with saving the conditional_index is that you can assign value of one column to another column with same row index

df.loc[conditional_index, [col1,col2]]= df.loc[conditional_index,'col name']

This is all possible because .index returns an array of index labels which .loc can use for direct addressing, so it avoids traversing the frame again and again.
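Put together with concrete (made-up) data, the pattern could look like the sketch below. The column names score, bonus and label are illustrative, and df.index[mask] is used as an equivalent of df.loc[mask].index:

```python
import pandas as pd

df = pd.DataFrame({'score': [5, 12, 20], 'bonus': [0, 0, 0], 'label': ['', '', '']})

# Compute the matching row labels once ...
conditional_index = df.index[df['score'] > 10]

# ... then reuse them for several assignments.
df.loc[conditional_index, 'label'] = 'high'
df.loc[conditional_index, 'bonus'] = df.loc[conditional_index, 'score']
print(df)
```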

just use, df.loc[conditional_index, ]= <new value> It will replace new value in all columns of rows which satisfy the condition — Atta Jutt, Commented Jan 8, 2020 at 9:16

chirob · Accepted Answer · 2023-04-26 18:18:59Z

I have been frustrated by multiple answers that use .loc and .iloc, such as

df.loc[index_position, "column_name"] = some_value

because when I try them they continue to throw errors or warnings, such as

Must have equal len keys and value when setting with an iterable

and

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

However, I would like to point out to others having this issue that

df.at[2, "Cars"] = mystuff

eliminated the warning for me and works like a charm!
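One common source of SettingWithCopyWarning is that the frame being assigned to was itself produced by filtering another frame. Whether that applies here is an assumption, since the failing code is not shown. A sketch of that case, with made-up data, takes an explicit copy before assigning:

```python
import pandas as pd

cars = pd.DataFrame({'Cars': ['Audi TT', 'Chevrolet Malibu'],
                     'Prices': [120.0, 190.0]})

# An explicit .copy() makes 'expensive' an independent frame,
# so assigning into it is unambiguous.
expensive = cars[cars['Prices'] > 150].copy()
expensive.loc[1, 'Cars'] = 'Chevrolet Corvette'
print(expensive)
print(cars)   # the original frame is unchanged
```

If the intent is instead to change the original frame, the assignment should target it directly, for example with cars.loc[mask, 'Cars'] = value.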

prosti · Accepted Answer · 2019-05-15 11:25:29Z

I tested, and df.set_value is a little faster, but the official method df.at looks like the fastest non-deprecated way to do it.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(100, 100))

%timeit df.iat[50,50]=50 # ✓
%timeit df.at[50,50]=50 #  ✔
%timeit df.set_value(50,50,50) # will deprecate
%timeit df.iloc[50,50]=50
%timeit df.loc[50,50]=50

7.06 µs ± 118 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
5.52 µs ± 64.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
3.68 µs ± 80.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
98.7 µs ± 1.07 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
109 µs ± 1.42 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Note that this is setting the value of a single cell. For setting many values at once, loc and iloc should be better options since they are vectorized.
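Outside IPython, where %timeit is not available, a comparable measurement can be sketched with the standard timeit module. The helper names set_with_at and set_with_loc are made up for this example, and the absolute numbers will differ by machine and pandas version:

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(100, 100))

def set_with_at():
    df.at[50, 50] = 50.0

def set_with_loc():
    df.loc[50, 50] = 50.0

# Time 1000 single-cell assignments with each accessor.
t_at = timeit.timeit(set_with_at, number=1000)
t_loc = timeit.timeit(set_with_loc, number=1000)
print(f"at:  {t_at:.4f} s per 1000 calls")
print(f"loc: {t_loc:.4f} s per 1000 calls")
```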

Gonçalo Peres · Accepted Answer · 2021-02-08 11:33:57Z

If one wants to change the cell in the position (0,0) of the df to a string such as '"236"76"', the following options will do the work:

df[0][0] = '"236"76"'
# %timeit df[0][0] = '"236"76"'
# 938 µs ± 83.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Or using pandas.DataFrame.at

df.at[0, 0] = '"236"76"'
#  %timeit df.at[0, 0] = '"236"76"' 
#15 µs ± 2.09 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Or using pandas.DataFrame.iat

df.iat[0, 0] = '"236"76"'
#  %timeit df.iat[0, 0] = '"236"76"'
# 41.1 µs ± 3.09 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Or using pandas.DataFrame.loc

df.loc[0, 0] = '"236"76"'
#  %timeit df.loc[0, 0] = '"236"76"'
# 5.21 ms ± 401 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Or using pandas.DataFrame.iloc

df.iloc[0, 0] = '"236"76"'
#  %timeit df.iloc[0, 0] = '"236"76"'
# 5.12 ms ± 300 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

If time is of relevance, using pandas.DataFrame.at is the fastest approach.

Trenton McKinney · Accepted Answer · 2020-12-18 03:53:31Z

2

So, your question is how to convert the NaN at ['C', 'x'] to the value 10.

The answer is:

df['x'].loc['C':]=10
df

alternative code is

df.loc['C', 'x']=10
df

edited Dec 18, 2020 at 3:53

Trenton McKinney


answered Feb 7, 2020 at 11:27

Ichsan


gilch · Accepted Answer · 2018-10-07 23:30:06Z

1

df.loc['C','x']=10 This will change the value in the row labelled C and the column labelled x.

edited Oct 7, 2018 at 23:30

gilch


answered Oct 7, 2018 at 19:01

Sujit Singh


Kirill Dolmatov · Accepted Answer · 2018-03-13 15:48:11Z

0

If you want to change the values of only some columns in a row, rather than the whole row:

x = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
x.iloc[1] = dict(A=10, B=-10)
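A label-based way to touch only some columns of a row is to pass a list of column names to .loc. This is a sketch; the extra column C is added here only to show that it is left alone:

```python
import pandas as pd

x = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Update columns A and B of the row labelled 1; column C keeps its value.
x.loc[1, ['A', 'B']] = [10, -10]
print(x)
```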

answered Mar 13, 2018 at 15:48

Kirill Dolmatov


andrei deusteanu · Accepted Answer · 2018-07-11 10:42:35Z

0

From version 0.21.1 you can also use .at method. There are some differences compared to .loc as mentioned here - pandas .at versus .loc, but it's faster on single value replacement

answered Jul 11, 2018 at 10:42

andrei deusteanu


gebbissimo · Accepted Answer · 2018-11-21 16:29:54Z

In addition to the answers above, here is a benchmark comparing different ways to add rows of data to an already existing dataframe. It shows that using at or set-value is the most efficient way for large dataframes (at least for these test conditions).

Create new dataframe for each row and...
- ... append it (13.0 s)
- ... concatenate it (13.1 s)
Store all new rows in another container first, convert to new dataframe once and append...
- container = lists of lists (2.0 s)
- container = dictionary of lists (1.9 s)
Preallocate whole dataframe, iterate over new rows and all columns and fill using
- ... at (0.6 s)
- ... set_value (0.4 s)

For the test, an existing dataframe comprising 100,000 rows and 1,000 columns and random numpy values was used. To this dataframe, 100 new rows were added.

The code is shown below:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Wed Nov 21 16:38:46 2018

@author: gebbissimo
"""

import pandas as pd
import numpy as np
import time

NUM_ROWS = 100000
NUM_COLS = 1000
data = np.random.rand(NUM_ROWS,NUM_COLS)
df = pd.DataFrame(data)

NUM_ROWS_NEW = 100
data_tot = np.random.rand(NUM_ROWS + NUM_ROWS_NEW,NUM_COLS)
df_tot = pd.DataFrame(data_tot)

DATA_NEW = np.random.rand(1,NUM_COLS)


#%% FUNCTIONS

# create and append
def create_and_append(df):
    for i in range(NUM_ROWS_NEW):
        df_new = pd.DataFrame(DATA_NEW)
        df = df.append(df_new)
    return df

# create and concatenate
def create_and_concat(df):
    for i in range(NUM_ROWS_NEW):
        df_new = pd.DataFrame(DATA_NEW)
        df = pd.concat((df, df_new))
    return df


# store as list of lists, convert to a dataframe once and append
def store_as_list(df):
    lst = [[] for i in range(NUM_ROWS_NEW)]
    for i in range(NUM_ROWS_NEW):
        for j in range(NUM_COLS):
            lst[i].append(DATA_NEW[0,j])
    df_new = pd.DataFrame(lst)
    df_tot = df.append(df_new)
    return df_tot

# store as dict of lists, convert to a dataframe once and append
def store_as_dict(df):
    dct = {}
    for j in range(NUM_COLS):
        dct[j] = []
        for i in range(NUM_ROWS_NEW):
            dct[j].append(DATA_NEW[0,j])
    df_new = pd.DataFrame(dct)
    df_tot = df.append(df_new)
    return df_tot




# preallocate and fill using .at
def fill_using_at(df):
    for i in range(NUM_ROWS_NEW):
        for j in range(NUM_COLS):
            #print("i,j={},{}".format(i,j))
            df.at[NUM_ROWS+i,j] = DATA_NEW[0,j]
    return df


# preallocate and fill using set_value
def fill_using_set(df):
    for i in range(NUM_ROWS_NEW):
        for j in range(NUM_COLS):
            #print("i,j={},{}".format(i,j))
            df.set_value(NUM_ROWS+i,j,DATA_NEW[0,j])
    return df


#%% TESTS
t0 = time.time()    
create_and_append(df)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))

t0 = time.time()    
create_and_concat(df)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))

t0 = time.time()    
store_as_list(df)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))

t0 = time.time()    
store_as_dict(df)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))

t0 = time.time()    
fill_using_at(df_tot)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))

t0 = time.time()    
fill_using_set(df_tot)
t1 = time.time()
print('Needed {} seconds'.format(t1-t0))
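DataFrame.append was removed in pandas 2.0, so the "collect first, then append once" variants above need pd.concat on current versions. A scaled-down sketch of that approach, with much smaller sizes than in the benchmark and the illustrative name new_rows:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(5, 3))
new_rows = [np.random.rand(3) for _ in range(2)]   # rows collected in a plain list

# Build one frame from the collected rows and concatenate a single time.
df_new = pd.DataFrame(new_rows, columns=df.columns)
df_tot = pd.concat([df, df_new], ignore_index=True)
print(df_tot.shape)
```

Concatenating once avoids copying the whole frame for every added row, which is what makes the per-row append and concat variants slow.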

cottontail · Accepted Answer · 2023-09-27 19:42:18Z

You can also change the underlying array of a cell.

values/to_numpy() returns a view of the underlying array of a DataFrame, so if a particular value in the array is changed, the change is reflected on the DataFrame as well.

df = pd.DataFrame(index=['A','B','C'], columns=['x','y'])

# change the last value in the first column
df.values[-1, 0] = 10
df.to_numpy()[-1, 0] = 10


     x    y
A  NaN  NaN
B  NaN  NaN
C   10  NaN

You can also select a column, view its underlying array and change it by index as well. This method works even if the dtype is Extension Dtype.

# change the last value in column 'x'
df['x'].values[-1] = 100

Changing the DataFrame through this view is the fastest way (5 times faster than the next fastest method) to set a value in a cell, which becomes relevant if this is done in a loop.

df = pd.DataFrame(index=['A', 'B', 'C'], columns=['x', 'y'])
%timeit df.values[-1, 0] = 10   # 1.89 µs ± 85.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
%timeit df.iat[-1, 0] = 10      # 10.9 µs ± 380 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
%timeit df.at['C', 'x'] = 10    # 13 µs ± 307 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
%timeit df.loc['C', 'x'] = 10   # 55.4 µs ± 6.16 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit df.iloc[-1, 0] = 10     # 39.7 µs ± 1.85 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
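A caveat worth testing before relying on this: the approach depends on .values actually being a view. When the columns have different dtypes, pandas has to build a new array, and writes to it do not reach the DataFrame. A sketch with a hypothetical mixed-dtype frame:

```python
import pandas as pd

mixed = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})

# Columns of different dtypes cannot share one NumPy array,
# so .values builds a new object array here.
arr = mixed.values
arr[0, 0] = 99
print(mixed)   # column 'a' still holds its original values
```

For frames like this, .at or .iat is the safer choice.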

Joshua Magaña · Accepted Answer · 2016-01-26 21:40:21Z

I too was searching for this topic and I put together a way to iterate through a DataFrame and update it with lookup values from a second DataFrame. Here is my code.

src_df = pd.read_sql_query(src_sql,src_connection)
for index1, row1 in src_df.iterrows():
    for index, row in vertical_df.iterrows():
        src_df.set_value(index=index1,col=u'etl_load_key',value=etl_load_key)
        if (row1[u'src_id'] == row['SRC_ID']) is True:
            src_df.set_value(index=index1,col=u'vertical',value=row['VERTICAL'])
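Since set_value is no longer available in current pandas, and the nested loops visit every pair of rows, the same lookup can be sketched with a column-wise map instead. The data and the value of etl_load_key below are placeholders, and the sketch assumes each SRC_ID appears only once in vertical_df:

```python
import pandas as pd

src_df = pd.DataFrame({'src_id': [1, 2, 3]})
vertical_df = pd.DataFrame({'SRC_ID': [1, 3], 'VERTICAL': ['retail', 'auto']})
etl_load_key = 42   # placeholder value for illustration

# A scalar is broadcast to every row.
src_df['etl_load_key'] = etl_load_key

# Build a src_id -> vertical lookup and apply it to the whole column.
# Assumes SRC_ID values are unique; ids without a match become NaN.
lookup = vertical_df.set_index('SRC_ID')['VERTICAL']
src_df['vertical'] = src_df['src_id'].map(lookup)
print(src_df)
```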
