If a Series is provided, then its name attribute must match the name of the column you wish to update.
If a DataFrame is provided, then the column names must match.

2. overwrite | boolean | optional

If True, then all values in the source DataFrame will be updated using other.
If False, then only NaN values in the source DataFrame will be updated using other.

By default, overwrite=True.

3. filter_func | function | optional

The values you wish to update. The function takes in a column of the source DataFrame as a 1D NumPy array, and returns a 1D array of booleans that indicate whether or not each value should be updated.

4. errors | string | optional

Whether or not to raise errors:

Value	Description
`"raise"`	An error will be raised if a non-`NaN` value is to be updated by another non-`NaN` value.
`"ignore"`	No error will be raised.

By default, errors="ignore".

Return value

Nothing is returned since the update is performed in-place. This means that the source DataFrame will be directly modified.
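Because the method returns nothing, assigning its result back to a variable is a common slip. The following is a minimal sketch of that behaviour; the variable name result is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df_other = pd.DataFrame({"B": [5, 6]})

# update(~) modifies df in place and returns None,
# so writing df = df.update(df_other) would lose the DataFrame.
result = df.update(df_other)

print(result)  # None
print(df)      # column B now holds the values from df_other
```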

Examples

Basic usage

Consider the following DataFrames:

import pandas as pd

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df_other = pd.DataFrame({"B":[5,6], "C":[7,8]})

   [df]     [df_other]
   A  B        B  C
0  1  3     0  5  7
1  2  4     1  6  8

Notice how the two DataFrames both have a column with label B. Performing the update gives:

df.update(df_other)
df

   A  B
0  1  5
1  2  6

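The values in column B of the original DataFrame have been replaced by those in column B of the other DataFrame. Column C of the other DataFrame is ignored, since the source DataFrame has no column with that label.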
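As noted in the parameter description, other can also be a Series whose name matches the column to update. A small sketch of that case, where the variable name s is just an illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# The Series name decides which column is updated.
s = pd.Series([5, 6], name="B")

df.update(s)
print(df)
```

Here only column B is touched, because that is the only label the Series carries.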

Case when other DataFrame contains missing values

Consider the following DataFrames:

import numpy as np

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df_other = pd.DataFrame({"B":[5,np.nan], "C":[7,8]})

   [df]     [df_other]
   A  B        B    C
0  1  3     0  5.0  7
1  2  4     1  NaN  8

Notice how the other DataFrame has a NaN.

Performing the update gives:

df.update(df_other)
df

   A  B
0  1  5.0
1  2  4.0

The takeaway here is that if the new value is a missing value, then no update is performed for that value.
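One way to picture this rule is as a per-value choice: take the new value where it is present, otherwise keep the old one. The sketch below rebuilds that choice by hand with Series.where and compares it with what update(~) produces. It is an illustration of the behaviour, not a description of how pandas implements it, and the name expected_B is made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df_other = pd.DataFrame({"B": [5, np.nan], "C": [7, 8]})

# Keep the incoming value where it is not missing,
# and fall back to the original value where it is NaN.
expected_B = df_other["B"].where(df_other["B"].notna(), df["B"])

df.update(df_other)

# Both approaches agree value by value.
print(df["B"].tolist() == expected_B.tolist())
```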

Specifying the overwrite parameter

Consider the following DataFrames:

df = pd.DataFrame({"A":[1,2], "B":[3,np.nan]})
df_other = pd.DataFrame({"B":[5,6], "C":[7,8]})

    [df]       [df_other]
   A  B          B  C
0  1  3.0     0  5  7
1  2  NaN     1  6  8

Performing the update with the default parameter overwrite=True gives:

df.update(df_other)
df

   A  B
0  1  5.0
1  2  6.0

Notice how all the values in column B of the source DataFrame got updated.

Now, let's compare this with overwrite=False, starting again from the original df (the previous call has already modified it in place):

df.update(df_other, overwrite=False)
df

   A  B
0  1  3.0
1  2  6.0

Here, the value 3 was left intact, while the NaN was replaced by the corresponding value of 6. This is because overwrite=False ensures that only NaNs get updated, while non-NaN values remain unchanged.
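Since update(~) works in place, comparing the two settings side by side requires a fresh copy of the source DataFrame for each call. A minimal sketch, where make_df is a hypothetical helper introduced only for this illustration:

```python
import numpy as np
import pandas as pd

def make_df():
    # Hypothetical helper: rebuilds the source DataFrame from scratch.
    return pd.DataFrame({"A": [1, 2], "B": [3, np.nan]})

df_other = pd.DataFrame({"B": [5, 6], "C": [7, 8]})

# overwrite=True (default): every value in B is replaced.
df_overwrite = make_df()
df_overwrite.update(df_other)

# overwrite=False: only the NaN in B is filled in.
df_fill_only = make_df()
df_fill_only.update(df_other, overwrite=False)

print(df_overwrite)
print(df_fill_only)
```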

Specifying the filter_func parameter

Consider the following DataFrames:

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df_other = pd.DataFrame({"B":[5,6], "C":[7,8]})

   [df]     [df_other]
   A  B        B  C
0  1  3     0  5  7
1  2  4     1  6  8

Suppose we wanted to update only the values in df that are larger than 3. We could do so by specifying a custom function like so:

def foo(vals):
    return vals > 3

df.update(df_other, filter_func=foo)
df

   A  B
0  1  3
1  2  6

Notice how the value 3 was left unchanged.
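The same idea works with a lambda. The sketch below assumes a made-up convention in which -1 marks a placeholder in the source DataFrame, so only those entries should be replaced:

```python
import pandas as pd

# Assumption for this example: -1 is a placeholder value in df.
df = pd.DataFrame({"A": [1, 2], "B": [3, -1]})
df_other = pd.DataFrame({"B": [5, 6], "C": [7, 8]})

# The function receives the current values of a column in df
# and returns True for the entries that may be updated.
df.update(df_other, filter_func=lambda vals: vals == -1)

print(df)
```

The value 3 is kept because the function returns False for it, while the placeholder is replaced by 6.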

Specifying the errors parameter

Consider the following DataFrames:

df = pd.DataFrame({"A":[1,2], "B":[3,4]})
df_other = pd.DataFrame({"B":[5,6]})

   [df]     [df_other]
   A  B        B
0  1  3     0  5
1  2  4     1  6

Performing the update with the default parameter errors="ignore" gives:

df.update(df_other)   # errors="ignore"
df

   A  B
0  1  5
1  2  6

The update completes without any error, even though non-NaN values are updated with non-NaN values.

Performing the update with errors="raise" gives:

df.update(df_other, errors="raise")
df

ValueError: Data overlaps.

We end up with an error because we are trying to update non-NaN values with non-NaN values. Note that if column B in df_other contained only NaN values, then no error would be raised.
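In a script, the error can be caught like any other ValueError. The following sketch shows both cases described above; the names raised and df_all_nan are illustrative only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df_other = pd.DataFrame({"B": [5, 6]})

# Overlapping non-NaN values: update(~) raises a ValueError.
try:
    df.update(df_other, errors="raise")
    raised = False
except ValueError as exc:
    raised = True
    print("Update rejected:", exc)

# A column holding only NaN does not overlap, so no error is raised
# and, since NaN never overwrites, df keeps its values.
df_all_nan = pd.DataFrame({"B": [np.nan, np.nan]})
df.update(df_all_nan, errors="raise")

print(df)
```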

Published by Isshin Inada

Edited by 0 others


Official Pandas Documentation

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.update.html
