Difference between None and NaN in Pandas
The distinction between None and NaN in Pandas is subtle:
None represents a missing entry, but its type is not numeric. This means that any column (Series) that contains a None cannot have a numeric dtype (e.g. int or float).
NaN, which stands for not-a-number, is a numeric (floating-point) value. This means that NaN can appear in columns with a numeric dtype such as float64.
Consider a Series initialised with a number and None:

    import pandas as pd
    s = pd.Series([3, None])
    s

    0    3.0
    1    NaN
    dtype: float64
The resulting Series contains a NaN instead of None. This is because Pandas automatically converted None to NaN given that the other value (3) is numeric, which then allows the column type to be float64. If None were not cast into NaN, the column type would end up as object, which is inaccurate and makes certain operations in Pandas less performant.
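The casting described above can be checked with a minimal sketch. Passing dtype=object explicitly is an assumption added here for illustration (it is not part of the original example); it asks Pandas to skip the conversion so the two outcomes can be compared side by side.

```python
import pandas as pd

# Default behaviour: None is converted to NaN so the Series can be float64.
numeric = pd.Series([3, None])

# Illustrative assumption: forcing dtype=object keeps None as-is.
forced = pd.Series([3, None], dtype=object)

print(numeric.dtype)            # float64
print(forced.dtype)             # object
print(forced.iloc[1] is None)   # True
```

The object-typed Series stores plain Python objects, which is why numeric operations on it tend to be slower than on the float64 version.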
Let us create a Series with NaN instead:

    import numpy as np
    s = pd.Series([3, np.nan])
    s

    0    3.0
    1    NaN
    dtype: float64
As you would expect, the result is identical. The only difference is that Pandas did not need to perform any casting from None, since NaN was directly given.
We have seen that
None is automatically converted into
NaN when the Series type is numeric.
For non-numeric Series, None does not get cast to NaN:

    s = pd.Series(["3", None])
    s

    0       3
    1    None
    dtype: object
In comparison, creating a Series with NaN:

    s = pd.Series(["3", np.nan])
    s

    0      3
    1    NaN
    dtype: object
NaN simply remains a NaN since numeric values are allowed in a Series that holds other data types (a string in this case). Note that since the Series holds mixed types, the dtype is object.
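To see what is actually stored in each mixed-type Series, one can inspect the Python type of the missing entry. This is only a sketch: dtype=object is passed explicitly here as an assumption, so that the result does not depend on how a given Pandas version infers the dtype of string data.

```python
import numpy as np
import pandas as pd

# dtype=object is set explicitly (an assumption for reproducibility).
with_none = pd.Series(["3", None], dtype=object)
with_nan = pd.Series(["3", np.nan], dtype=object)

print(type(with_none.iloc[1]))  # <class 'NoneType'>
print(type(with_nan.iloc[1]))   # <class 'float'>
```

In both cases the Series is an object Series; what differs is the element sitting in the second slot.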
The fact that
None is not a numeric type, whereas
NaN is, has consequences when performing arithmetic.
When performing arithmetic with None:

    None + 5

    TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
Here, we get an error because a summation between a non-numeric type (
None) and a number is not defined.
When performing arithmetic with NaN:

    np.nan + 5

    nan
Here, no error is thrown and instead, a NaN is returned. Any arithmetic operation that involves a NaN will result in another NaN.
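The same propagation can be observed element-wise on a Series. The following is a small sketch, with the variable names s and shifted chosen here purely for illustration:

```python
import pandas as pd

s = pd.Series([3, None])   # stored as [3.0, NaN] with dtype float64
shifted = s + 5            # the NaN entry stays NaN after the addition

print(shifted.tolist())    # [8.0, nan]

# For contrast, plain Python arithmetic with None raises an error.
try:
    None + 5
except TypeError as exc:
    print("TypeError:", exc)
```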
Another difference in how None and NaN behave is in equality comparison.
Equating None with None will result in True:

    None == None

    True
Equating NaN with NaN will result in False:

    np.nan == np.nan

    False
As a side note, equating anything with NaN will result in False:

    np.nan == None

    False
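A practical consequence is that comparing a whole Series against NaN with == never finds the missing entries. A minimal sketch, using a made-up Series for illustration:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

# Element-wise comparison with NaN is False everywhere,
# including at the position that actually holds NaN.
mask = s == np.nan

print(mask.tolist())  # [False, False, False]
```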
To check for values that are NaN, instead of using ==, opt to use a dedicated check such as pd.isna(), which detects None as well: