Difference between None and NaN in Pandas
The distinction between None and NaN in Pandas is subtle:
None represents a missing entry, but its type is not numeric. A column (Series) that contains a None therefore cannot have a numeric dtype (e.g. int or float).
NaN, which stands for not-a-number, is a floating-point value. It can appear in columns of type float, but not in columns of a NumPy integer type such as int64: when integer data contains a NaN, Pandas stores the column as float64 instead.
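As a quick check of the dtype rule above, the following minimal sketch builds one all-integer Series and one that also contains a NaN. The variable names are illustrative only, and the default NumPy-backed dtypes are assumed.

```python
import numpy as np
import pandas as pd

int_series = pd.Series([1, 2])            # integers only
float_series = pd.Series([1, 2, np.nan])  # the NaN forces a float dtype

print(int_series.dtype)    # int64
print(float_series.dtype)  # float64
```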
Numeric Series
Consider a Series initialised with None:
import pandas as pd
s = pd.Series([3, None])
s
0    3.0
1    NaN
dtype: float64
The resulting Series contains a NaN instead of None. Pandas automatically converted None to NaN because the other value (3) is numeric, which allows the column dtype to be float64. If None were not cast to NaN, the column dtype would end up as object, which is inaccurate and makes certain operations in Pandas less performant.
Let us create a Series with NaN:
import numpy as np
s = pd.Series([3, np.nan])
s
0    3.0
1    NaN
dtype: float64
As you would expect, the result is identical, and the only difference is that Pandas did not need to perform any casting from None to NaN since NaN was directly given.
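To confirm that the two constructions really do give the same result, here is a small sketch that compares them with Series.equals, which treats NaNs in the same position as equal. The names from_none and from_nan are made up for this example.

```python
import numpy as np
import pandas as pd

from_none = pd.Series([3, None])    # None is converted to NaN
from_nan = pd.Series([3, np.nan])   # NaN is given directly

print(from_none.dtype, from_nan.dtype)
print(from_none.equals(from_nan))   # True
```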
Non-numeric Series
We have seen that None is automatically converted into NaN when the Series type is numeric.
For non-numeric Series, None is not cast to NaN:
s = pd.Series(["3", None])
s
0       3
1    None
dtype: object
In comparison, creating a Series with NaN:
s = pd.Series(["3", np.nan])
s
0      3
1    NaN
dtype: object
Here, NaN simply remains a NaN, since numeric values are allowed in a Series that also holds other data types (a string in this case). Note that because the Series holds mixed types, the dtype is object.
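The sketch below inspects the stored values directly to show that an object Series keeps None and NaN as they were given. It passes dtype=object explicitly, which is an assumption added here so that the result does not depend on how the installed Pandas version infers string columns.

```python
import numpy as np
import pandas as pd

# dtype=object is set explicitly (an assumption for this sketch)
with_none = pd.Series(["3", None], dtype=object)
with_nan = pd.Series(["3", np.nan], dtype=object)

print(with_none.iloc[1] is None)             # True: None is kept as-is
print(isinstance(with_nan.iloc[1], float))   # True: NaN is a float
print(with_none.isna().tolist())             # [False, True]
print(with_nan.isna().tolist())              # [False, True]
```

Either way, both values are still reported as missing by isna().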
Arithmetic
The fact that None is not a numeric type, whereas NaN is, has consequences when performing arithmetic.
When performing arithmetic with None:
None + 5
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
Here, we get an error because addition between a non-numeric type (None) and a number is not defined.
In contrast:
np.nan + 5
nan
Here, no error is thrown; instead, a NaN is returned. Almost every arithmetic operation that involves a NaN results in another NaN (a rare exception is np.nan ** 0, which evaluates to 1.0).
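The contrast can be checked in one short script. This is only a sketch: the variable names are illustrative, and math.isnan is used to test the results because NaN cannot be compared with ==.

```python
import math
import numpy as np

missing = None
try:
    missing + 5
except TypeError as exc:
    none_error = str(exc)   # arithmetic on None raises TypeError

# NaN propagates through ordinary arithmetic
nan_results = [np.nan + 5, np.nan * 0, np.nan - np.nan]
all_nan = all(math.isnan(x) for x in nan_results)

print(none_error)
print(all_nan)       # True
print(np.nan ** 0)   # 1.0, one of the few exceptions
```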
Equality comparison
Another difference in how None and NaN behave is in equality comparison.
Equating None will result in True:
None == None
True
Equating NaN will result in False:
np.nan == np.nan
False
As a side note, equating anything with NaN will result in False:
np.nan == None
False
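Because NaN never compares equal to anything, including itself, a value that is not equal to itself must be a NaN. The following sketch illustrates this with a made-up variable name.

```python
import numpy as np

value = np.nan

print(value == value)   # False
print(value != value)   # True: only NaN is unequal to itself
print(None == None)     # True
```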
To check whether a value is NaN, use pd.isna(~) instead of ==:
pd.isna(np.nan)
True
Note that isna(~) returns True for None as well:
pd.isna(None)
True
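The same applies element-wise to a whole Series. In this minimal sketch (the Series contents are made up for illustration), comparing with == finds nothing, whereas isna() flags both missing entries.

```python
import numpy as np
import pandas as pd

s = pd.Series([3, None, np.nan])

print((s == np.nan).tolist())   # [False, False, False]
print(s.isna().tolist())        # [False, True, True]
```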