Pandas DataFrame | fillna method
Pandas' DataFrame.fillna(~) method fills NaN (missing values) with a specified value or with a filling rule.
Parameters
1. value | scalar or dict or Series or DataFrame | optional
The value to replace NaN. If a dict or Series is specified, then the key/index is the column label, and the value is the filler.
2. method | None or string | optional
The rule by which to fill NaN:
| Value | Description |
|---|---|
| "bfill" or "backfill" | Use the next non-NaN value to fill. |
| "ffill" or "pad" | Use the previous non-NaN value to fill. |
| None | Does not perform any filling. |
By default, method=None.
Specify either value or method, not both.
3. axis | int or string | optional
Whether to fill each column or each row:
| Axis | Description |
|---|---|
| 0 or "index" | Fill column-wise |
| 1 or "columns" | Fill row-wise |
By default, axis=0. Note that this is only relevant if you specified method instead of value.
4. inplace | boolean | optional
If True, then the method will directly modify the source DataFrame instead of creating a new DataFrame.
If False, then a new DataFrame will be created and returned.
By default, inplace=False.
5. limit | None or int | optional
If method is specified, then limit is the maximum number of consecutive NaN to fill. If a gap contains more than limit consecutive NaN, then only the first limit of them (in the fill direction) are filled and the rest stay NaN.
If method is not specified, then limit is the maximum number of fills to perform per column.
By default, limit=None.
Return Value
A DataFrame with NaN replaced by a filler.
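As a minimal sketch of how inplace affects what comes back, the snippet below uses a small made-up DataFrame (its values are only for illustration and are not the article's df). It compares the default call with inplace=True by counting the NaNs left in the source.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (values are made up for this sketch).
df = pd.DataFrame({"A": [np.nan, 5.0, 6.0], "B": [7.0, np.nan, 8.0]})

# Default (inplace=False): a new DataFrame comes back, df keeps its NaNs.
filled = df.fillna(0)
nans_left_in_source = int(df.isna().sum().sum())

# inplace=True: df itself is modified, so the return value is not used here.
df.fillna(0, inplace=True)
nans_after_inplace = int(df.isna().sum().sum())

print(nans_left_in_source, nans_after_inplace)
```

Reassigning the result (df = df.fillna(0)) is usually easier to follow than inplace=True, since the data flow stays explicit.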
Examples
Consider the following DataFrame:
df
     A    B    C
0  NaN  7.0  9.0
1  5.0  NaN  NaN
2  6.0  8.0  NaN
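The article shows df only as printed output. One way to build an equivalent DataFrame, assuming the missing entries are represented with NumPy's np.nan, might look like this:

```python
import numpy as np
import pandas as pd

# Reconstruct the example DataFrame; np.nan marks the missing values.
df = pd.DataFrame({
    "A": [np.nan, 5.0, 6.0],
    "B": [7.0, np.nan, 8.0],
    "C": [9.0, np.nan, np.nan],
})

# Number of missing values in each column.
nan_counts = df.isna().sum()
print(df)
print(nan_counts)
```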
Filling with a value
To fill NaN with the value 10:
df.fillna(10)
      A     B     C
0  10.0   7.0   9.0
1   5.0  10.0  10.0
2   6.0   8.0  10.0
Filling certain columns only
To specify which columns to fill, provide a dict or Series like so:
df.fillna({"A":"*","C":10})
   A    B     C
0  *  7.0   9.0
1  5  NaN  10.0
2  6  8.0  10.0
Notice how column B still has NaN here since the provided dict has no B key.
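The dict above can also be replaced with a Series whose index holds the column labels. The sketch below rebuilds the same df (the constructor call is an assumption, since the article only shows printed output) and fills once with a hand-written Series and once with df.mean(), which happens to return a Series indexed by column label.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [np.nan, 5.0, 6.0],
    "B": [7.0, np.nan, 8.0],
    "C": [9.0, np.nan, np.nan],
})

# A Series indexed by column label behaves like the dict: B is left alone.
fillers = pd.Series({"A": 0.0, "C": 10.0})
partial = df.fillna(fillers)

# df.mean() skips NaN by default and yields one filler per column.
by_mean = df.fillna(df.mean())

print(partial)
print(by_mean)
```

Filling with the column mean is only one possible choice; whether it suits the data depends on the analysis.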
Specifying a filling method
Consider the same df as before:
df
     A    B    C
0  NaN  7.0  9.0
1  5.0  NaN  NaN
2  6.0  8.0  NaN
Backward-fill
To fill NaNs with the next non-NaN value column-wise:
df.fillna(method="bfill") # or method="backfill"
     A    B    C
0  5.0  7.0  9.0
1  5.0  8.0  NaN
2  6.0  8.0  NaN
Notice how we still have some NaNs remaining. This is because there is no value that comes after the NaN, and so we don't have a filling value.
Forward-fill
To fill NaNs with the previous non-NaN value column-wise:
df.fillna(method="ffill") # or method="pad"
     A    B    C
0  NaN  7.0  9.0
1  5.0  7.0  9.0
2  6.0  8.0  9.0
Again, we have a NaN at A[0] because there is no value that comes before the NaN, and so we don't have a filling value.
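In recent pandas releases the method argument of fillna is deprecated in favor of the dedicated DataFrame.bfill and DataFrame.ffill methods. Assuming such a version, the two fills above can be sketched as follows (the df constructor is reconstructed from the printed output):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [np.nan, 5.0, 6.0],
    "B": [7.0, np.nan, 8.0],
    "C": [9.0, np.nan, np.nan],
})

# Backward-fill: each NaN takes the next non-NaN value in its column.
back = df.bfill()

# Forward-fill: each NaN takes the previous non-NaN value in its column.
fwd = df.ffill()

print(back)
print(fwd)
```

Leading NaNs survive a forward-fill and trailing NaNs survive a backward-fill, so chaining both (for example df.ffill().bfill()) is one common way to cover both ends.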
Specifying the axis
Just for your reference, here's the same df:
df
     A    B    C
0  NaN  7.0  9.0
1  5.0  NaN  NaN
2  6.0  8.0  NaN
By default, axis=0, which means that the filling method is applied column-wise:
df.fillna(method="ffill") # axis=0
     A    B    C
0  NaN  7.0  9.0
1  5.0  7.0  9.0
2  6.0  8.0  9.0
We can perform the filling row-wise by setting axis=1:
df.fillna(method="ffill", axis=1)
     A    B    C
0  NaN  7.0  9.0
1  5.0  5.0  5.0
2  6.0  8.0  8.0
Note that the axis parameter is only relevant if you specify the method parameter instead of value.
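The axis can also be given by name. The sketch below uses DataFrame.ffill and DataFrame.bfill (the dedicated equivalents of the method argument in recent pandas releases) on a df reconstructed from the printed output, and checks that axis="columns" behaves like axis=1.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [np.nan, 5.0, 6.0],
    "B": [7.0, np.nan, 8.0],
    "C": [9.0, np.nan, np.nan],
})

# Row-wise forward-fill: values propagate left to right within each row.
row_fwd = df.ffill(axis=1)

# The string alias "columns" selects the same axis as 1.
same_as_alias = row_fwd.equals(df.ffill(axis="columns"))

# Row-wise backward-fill: values propagate right to left within each row.
row_back = df.bfill(axis=1)

print(row_fwd)
print(row_back)
```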
Specifying a limit
When parameter method is specified
If method is specified, then limit is the maximum number of consecutive NaN to fill.
Just for your reference, here's the same df:
df
     A    B    C
0  NaN  7.0  9.0
1  5.0  NaN  NaN
2  6.0  8.0  NaN
To limit the number of consecutive NaNs to fill to 1:
df.fillna(method="ffill", limit=1)
     A    B    C
0  NaN  7.0  9.0
1  5.0  7.0  9.0
2  6.0  8.0  NaN
Notice how cell C[2] still has a NaN value. This is because we have 2 consecutive NaNs here, and since we specified limit=1, only the first one got filled.
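A longer gap makes the partial fill easier to see. The single-column DataFrame below is hypothetical (it is not the article's df), and DataFrame.ffill is used as the dedicated equivalent of fillna(method="ffill") in recent pandas releases.

```python
import numpy as np
import pandas as pd

# Hypothetical column with a gap of three consecutive NaNs.
gap = pd.DataFrame({"x": [1.0, np.nan, np.nan, np.nan, 5.0]})

# At most 2 consecutive NaNs are filled; the third one in the gap remains.
limited = gap.ffill(limit=2)

# Without a limit the whole gap is filled.
unlimited = gap.ffill()

print(limited)
print(unlimited)
```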
When parameter method is not specified
When method is not specified, then the limit represents the maximum number of fills to perform per row or column. For instance, consider the following DataFrame:
df
     A    B    C
0  NaN  7.0  9.0
1  5.0  NaN  NaN
2  NaN  8.0  NaN
Performing a fill with limit=1 yields:
df.fillna(2, limit=1)
     A    B    C
0  2.0  7.0  9.0
1  5.0  2.0  2.0
2  NaN  8.0  NaN
Notice how columns A and C still have a NaN in them: each started with two NaNs, and limit=1 allows only one fill per column.
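To confirm that the limit is applied per column, one option is to count how many cells were actually filled. The sketch below rebuilds this section's df (the constructor is an assumption based on the printed output) and compares the NaN masks before and after the fill.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [np.nan, 5.0, np.nan],
    "B": [7.0, np.nan, 8.0],
    "C": [9.0, np.nan, np.nan],
})

filled = df.fillna(2, limit=1)

# A cell counts as filled if it was NaN before and is not NaN afterwards.
fills_per_column = (df.isna() & filled.notna()).sum()
remaining_nans = filled.isna().sum()

print(fills_per_column)
print(remaining_nans)
```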