Whether to return a new copy. If copy=False and no reindexing is performed, then the original DataFrames/Series will be returned. By default, copy=True.

6. fill_value | scalar | optional

The value used to fill missing values (NaN). By default, fill_value=np.nan, that is, missing values are left as is.
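As a minimal sketch of fill_value, the snippet below aligns two small DataFrames column-wise and fills the newly created columns with 0 instead of NaN. The names df_one and df_two are illustrative and mirror the DataFrames used in the examples on this page.

```python
import pandas as pd

df_one = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
df_two = pd.DataFrame({"A": [7, 8], "E": [9, 10], "B": [11, 12]}, index=["a", "b"])

# Columns that exist in only one DataFrame are created in the other
# and filled with 0 rather than NaN.
a_one, a_two = df_one.align(df_two, axis=1, fill_value=0)

print(a_one)
print(a_two)
```

Here, column E of a_one and column C of a_two are the ones that receive the fill value.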

7. method | None or string | optional

The method by which to fill missing values:

Method	Description
`"pad"` or `"ffill"`	Fill using the previous valid observation
`"backfill"` or `"bfill"`	Fill using the next valid observation

By default, method=None.

8. limit | int | optional

The maximum number of consecutive fills allowed. For instance, if you have 3 consecutive NaNs, and you set limit=2, then only the first two NaNs will be filled, and the third will be left as is. By default, limit=None.
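The effect of limit can be sketched with a small Series example. As far as we are aware, pandas 2.1 and later deprecate the method, limit and fill_axis parameters of align in favour of filling the aligned results explicitly, so this sketch applies the limit through Series.ffill after alignment rather than inside align. The Series s_one and s_two are hypothetical and exist only for illustration.

```python
import pandas as pd

s_one = pd.Series([1.0], index=[0])
s_two = pd.Series([9.0, 9.0, 9.0, 9.0], index=[0, 1, 2, 3])

# After alignment, s_one gains labels 1, 2 and 3, all holding NaN.
a_one, a_two = s_one.align(s_two)

# Forward-fill at most two consecutive NaNs; the third stays NaN.
filled = a_one.ffill(limit=2)
print(filled)
```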

9. fill_axis | int or string | optional

Whether to apply the method horizontally or vertically:

Axis	Description
`0` or `"index"`	Filling is applied vertically.
`1` or `"columns"`	Filling is applied horizontally.

By default, fill_axis=0.

10. broadcast_axis | int or string | optional

The axis along which to perform broadcasting:

Axis	Description
`0` or `"index"`	Broadcast along the index axis.
`1` or `"columns"`	Broadcast along the columns axis.

By default, broadcast_axis=None. This is only relevant when the source DataFrame and other have different dimensions.

Return value

A tuple of size two containing the aligned objects: (aligned source DataFrame, aligned other DataFrame/Series).
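To illustrate the shape of the return value, here is a minimal sketch that aligns a DataFrame with a Series along the row labels. The names df and s are hypothetical; note that axis is given explicitly here, since we are aligning objects of different dimensions.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2]}, index=["x", "y"])
s = pd.Series([10, 20], index=["y", "z"])

# align returns a tuple, which we unpack into the two aligned objects.
result = df.align(s, axis=0)
left, right = result

print(type(left).__name__, type(right).__name__)
print(list(left.index), list(right.index))
```

The first element keeps the type of the source (a DataFrame) and the second keeps the type of other (a Series), while both now share the same row labels.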

Examples

Specifying the join type

Consider the following two DataFrames:

import pandas as pd

df_one = pd.DataFrame({"A":[1,2], "B":[3,4], "C":[5,6]})
df_two = pd.DataFrame({"A":[7,8], "E":[9,10], "B":[11,12]}, index=["a","b"])

   [df_one]         [df_two]
   A  B  C          A  E   B
0  1  3  5       a  7  9   11
1  2  4  6       b  8  10  12

Outer full-join

To align the two DataFrames via a full outer join:

a_one, a_two = df_one.align(df_two, axis=1)      # join="outer"

      [a_one]      |        [a_two]
   A  B  C  E      |      A  B   C    E
0  1  3  5  NaN    |   a  7  11  NaN  9
1  2  4  6  NaN    |   b  8  12  NaN  10

Here, note the following:

By default, join="outer", which means that the resulting DataFrames will contain every column label that appears in either input DataFrame. This is why we see column label E in a_one and column label C in a_two.
The axis=1 parameter tells Pandas to perform the alignment column-wise.
The newly added columns hold no data; they are filled with NaN.
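The points above can be checked with a short, self-contained sketch. It rebuilds the two DataFrames and confirms that the columns added by the outer join contain only NaN.

```python
import pandas as pd

df_one = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
df_two = pd.DataFrame({"A": [7, 8], "E": [9, 10], "B": [11, 12]}, index=["a", "b"])

a_one, a_two = df_one.align(df_two, axis=1)  # join="outer" by default

# Both results now carry the union of the column labels.
print(list(a_one.columns), list(a_two.columns))

# The columns that were added hold only NaN.
print(a_one["E"].isna().all(), a_two["C"].isna().all())
```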

Inner join

To align via an inner join:

a_one, a_two = df_one.align(df_two, join="inner", axis=1)

   [a_one]       [a_two]
   A  B          A  B
0  1  3       a  7  11
1  2  4       b  8  12

We obtain this result because column labels "A" and "B" are the only ones present in both DataFrames; all other columns are stripped away.

Left join

To align via a left join:

a_one, a_two = df_one.align(df_two, join="left", axis=1)

   [a_one]         [a_two]
   A  B  C        A  B   C
0  1  3  5     a  7  11  NaN
1  2  4  6     b  8  12  NaN

By performing a left join, we ensure that both results carry exactly the column labels of the source DataFrame. This is why column C appears in a_two (filled with NaN) and column E is dropped from it.
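To compare the join types side by side, the following sketch loops over them and records the resulting column labels. The join="right" option is not covered above; it is included here on the assumption that it mirrors the left join by keeping the column labels of other. The dictionary columns_by_join is a hypothetical helper used only for this illustration.

```python
import pandas as pd

df_one = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
df_two = pd.DataFrame({"A": [7, 8], "E": [9, 10], "B": [11, 12]}, index=["a", "b"])

columns_by_join = {}
for how in ["outer", "inner", "left", "right"]:
    a_one, a_two = df_one.align(df_two, join=how, axis=1)
    # After alignment, both results always share the same column labels.
    assert a_one.columns.equals(a_two.columns)
    columns_by_join[how] = list(a_one.columns)
    print(how, columns_by_join[how])
```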

Specifying the axis

Once again, suppose we have the following two DataFrames:

df_one = pd.DataFrame({"A":[1,2], "B":[3,4], "C":[5,6]})
df_two = pd.DataFrame({"A":[7,8], "E":[9,10], "B":[11,12]}, index=["a","b"])

   [df_one]         [df_two]
   A  B  C          A  E   B
0  1  3  5       a  7  9   11
1  2  4  6       b  8  10  12

axis=0

a_one, a_two = df_one.align(df_two, axis=0)

      [a_one]                [a_two]
   A    B    C          A    E     B
0  1.0  3.0  5.0     0  NaN  NaN   NaN
1  2.0  4.0  6.0     1  NaN  NaN   NaN
a  NaN  NaN  NaN     a  7.0  9.0   11.0
b  NaN  NaN  NaN     b  8.0  10.0  12.0

By setting axis=0, we are telling Pandas to align the row labels, that is, for both resulting DataFrames to have the exact same row labels. However, notice how the column labels are kept intact for both DataFrames.
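A quick way to confirm this behaviour is to compare the labels of the two results directly, as in the sketch below.

```python
import pandas as pd

df_one = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
df_two = pd.DataFrame({"A": [7, 8], "E": [9, 10], "B": [11, 12]}, index=["a", "b"])

a_one, a_two = df_one.align(df_two, axis=0)

# The row labels now match across both results...
print(a_one.index.equals(a_two.index))

# ...while each result keeps its original column labels.
print(list(a_one.columns), list(a_two.columns))
```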

axis=1

a_one, a_two = df_one.align(df_two, axis=1)

      [a_one]      |        [a_two]
   A  B  C  E      |      A  B   C    E
0  1  3  5  NaN    |   a  7  11  NaN  9
1  2  4  6  NaN    |   b  8  12  NaN  10

By setting axis=1, we are telling Pandas to align the column labels, that is, for both resulting DataFrames to have the exact same column labels. However, notice how the row labels are kept intact for both DataFrames.

axis=None

The default parameter value is axis=None:

a_one, a_two = df_one.align(df_two)      # axis=None

        [a_one]                    [a_two]
   A    B    C    E         A    B     C    E
0  1.0  3.0  5.0  NaN    0  NaN  NaN   NaN  NaN
1  2.0  4.0  6.0  NaN    1  NaN  NaN   NaN  NaN
a  NaN  NaN  NaN  NaN    a  7.0  11.0  NaN  9.0
b  NaN  NaN  NaN  NaN    b  8.0  12.0  NaN  10.0

Setting axis=None combines axis=0 and axis=1: the resulting DataFrames share the same row labels as well as the same column labels.
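The following sketch checks that, with the default axis=None, both the row labels and the column labels of the two results match.

```python
import pandas as pd

df_one = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
df_two = pd.DataFrame({"A": [7, 8], "E": [9, 10], "B": [11, 12]}, index=["a", "b"])

a_one, a_two = df_one.align(df_two)  # axis=None aligns rows and columns

# Both axes are aligned, so the two results have identical shapes and labels.
print(a_one.index.equals(a_two.index))
print(a_one.columns.equals(a_two.columns))
print(a_one.shape, a_two.shape)
```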

Performing filling

Consider the same DataFrames we had before:

df_one = pd.DataFrame({"A":[1,2], "B":[3,4], "C":[5,6]})
df_two = pd.DataFrame({"A":[7,8], "E":[9,10], "B":[11,12]}, index=["a","b"])

   [df_one]         [df_two]
   A  B  C          A  E   B
0  1  3  5       a  7  9   11
1  2  4  6       b  8  10  12

Aligning the column labels using a full outer join yields:

a_one, a_two = df_one.align(df_two, axis=1)      # join="outer"

      [a_one]      |        [a_two]
   A  B  C  E      |      A  B   C    E
0  1  3  5  NaN    |   a  7  11  NaN  9
1  2  4  6  NaN    |   b  8  12  NaN  10

Notice how we end up with missing values here since no filling is performed by default.

To fill the NaNs, we can specify the method parameter and, optionally, fill_axis:

a_one, a_two = df_one.align(df_two, axis=1, method="ffill", fill_axis=1)

        [a_one]         |           [a_two]
   A    B    C    E     |      A    B     C     E
0  1.0  3.0  5.0  5.0   |   a  7.0  11.0  11.0  9.0
1  2.0  4.0  6.0  6.0   |   b  8.0  12.0  12.0  10.0

Here, note the following:

method="ffill" applies a forward-fill, meaning NaNs are filled using the previous valid observation.
fill_axis=1 performs the forward-fill horizontally.
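The same result can be sketched by aligning first and filling afterwards. As far as we are aware, pandas 2.1 and later deprecate the method and fill_axis parameters of align, so this two-step form may be the safer choice on recent versions. It uses DataFrame.ffill with axis=1, which is assumed to be equivalent here to the combined call above.

```python
import pandas as pd

df_one = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})
df_two = pd.DataFrame({"A": [7, 8], "E": [9, 10], "B": [11, 12]}, index=["a", "b"])

# Step 1: align the column labels without any filling.
a_one, a_two = df_one.align(df_two, axis=1)

# Step 2: forward-fill each result horizontally (along the columns).
a_one = a_one.ffill(axis=1)
a_two = a_two.ffill(axis=1)

print(a_one)
print(a_two)
```

Column E of a_one takes its values from column C to its left, and column C of a_two takes its values from column B.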

Published by Isshin Inada

Official Pandas Documentation

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.align.html
