Pandas Series str | split method
Pandas Series.str.split(~) method performs a split on each string in the Series.
Parameters
1. pat | string | optional
The string or regular expression pattern to split the strings on. By default, pat=None, which splits on whitespace.
2. n | int | optional
The number of splits to allow for each value. By default, there is no limit. Note that parameter values None, 0 or -1 will be interpreted as no limit.
3. expand | boolean | optional
If True, then the returned lists are horizontally expanded into separate columns. If False, then a list is returned for each value. By default, expand=False.
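The parameter descriptions above can be checked with a minimal sketch. The sample strings are made up for illustration, and the behaviour shown is what the parameter notes describe rather than a full specification:

```python
import pandas as pd

# Illustrative data (not from the article): note the double space in "a b  c".
s = pd.Series(["a b  c", "d e"])

# With no pat given, consecutive whitespace is treated as one separator.
default_split = s.str.split()
print(default_split.tolist())  # [['a', 'b', 'c'], ['d', 'e']]

# n=None, n=0 and n=-1 are all interpreted as "no limit".
unlimited = [s.str.split(n=n).tolist() for n in (None, 0, -1)]

# n=1 allows at most one split per value; the remainder is kept intact.
one_split = s.str.split(n=1)
print(one_split.tolist())  # [['a', 'b  c'], ['d', 'e']]
```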
Return Value
If expand=True, then a DataFrame/MultiIndex is returned. Otherwise, a Series/Index is returned.
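As a quick check of the return types described above, here is a small sketch that calls the method on both a Series and an Index. The values are assumed purely for illustration:

```python
import pandas as pd

# Illustrative inputs (not from the article).
s = pd.Series(["a_1", "b_2"])
idx = pd.Index(["a_1", "b_2"])

as_lists = s.str.split("_")                 # expand=False on a Series
as_frame = s.str.split("_", expand=True)    # expand=True on a Series
as_multi = idx.str.split("_", expand=True)  # expand=True on an Index

print(type(as_lists).__name__)  # Series
print(type(as_frame).__name__)  # DataFrame
print(type(as_multi).__name__)  # MultiIndex
```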
Examples
Basic usage
Consider the following Series:
s = pd.Series(["a", "a_1", "a_2"])
s
0      a
1    a_1
2    a_2
dtype: object
To split each string by _:
s.str.split("_")
0       [a]
1    [a, 1]
2    [a, 2]
dtype: object
Notice how each value in the Series is now a list.
Using regex
A regular expression can be used directly as the separator:
s.str.split(r'[_*]')
0    [a, 1]
1    [a, 2]
dtype: object
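The input Series for the regex example is not shown, so the sketch below assumes one that would produce the same output. It also illustrates that a single-character pattern is matched literally, so a lone "." does not act as a regex wildcard:

```python
import pandas as pd

# Assumed input: one value uses "_" and the other uses "*" as the delimiter.
s = pd.Series(["a_1", "a*2"])

# The character class [_*] matches either an underscore or an asterisk.
parts = s.str.split(r"[_*]")
print(parts.tolist())  # [['a', '1'], ['a', '2']]

# A pattern of length 1 is treated as a literal string, not as a regex.
dotted = pd.Series(["a.b"]).str.split(".")
print(dotted.tolist())  # [['a', 'b']]
```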
Specifying n
By default, there is no limit on how many splits can be made. For a Series holding the values a_1 and a_2_3:
s.str.split("_")
0       [a, 1]
1    [a, 2, 3]
dtype: object
To allow at most 1 split to take place for each value:
s.str.split("_", n=1)
0      [a, 1]
1    [a, 2_3]
dtype: object
Specifying expand
By default, expand=False, which means that each value becomes a list. For a Series holding the values a, a_1 and a_2:
s.str.split("_")
0       [a]
1    [a, 1]
2    [a, 2]
dtype: object
You can expand the list by setting expand=True like so:
s.str.split("_", expand=True) # returns a DataFrame
   0     1
0  a  None
1  a     1
2  a     2
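A common way to use expand=True is together with n to produce a fixed number of columns. The following is only a sketch: the column names prefix and rest are hypothetical, and the data is assumed for illustration:

```python
import pandas as pd

# Illustrative data: the second value contains two underscores.
s = pd.Series(["a_1", "a_2_3"])

# n=1 guarantees at most two pieces per value, so the frame has two columns.
df = s.str.split("_", n=1, expand=True)

# Hypothetical column names, chosen only for readability.
df.columns = ["prefix", "rest"]
print(df)
```

Limiting n keeps the column count predictable even when some values contain more separators than others.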
Handling missing values
The result of a split for an individual missing value (NaN) is also NaN:
s.str.split("_")
0    [a, 1]
1       NaN
dtype: object
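The missing-value behaviour can be verified with a short sketch. The input Series is assumed to match the output shown above, and the expand=True case is included to show that the missing row stays missing across all columns:

```python
import numpy as np
import pandas as pd

# Assumed input: one regular string and one missing value.
s = pd.Series(["a_1", np.nan])

# Without expand, the missing entry is left as a missing value, not a list.
lists = s.str.split("_")
print(lists.isna().tolist())  # [False, True]

# With expand=True, every column of the missing row is missing.
frame = s.str.split("_", expand=True)
print(frame.isna().all(axis=1).tolist())  # [False, True]
```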