PySpark SQL Functions | explode method
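PySpark SQL Functions' explode(~) method flattens a column of lists (arrays) or dictionaries (maps), returning one row for each list element or each key-value pair. Rows whose list or dictionary is null or empty produce no output rows.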
Parameters
1. col | string or Column
The column containing lists or dictionaries to flatten.
Return Value
A new PySpark Column.
Examples
Flattening lists
Consider the following PySpark DataFrame:
+------+
|  vals|
+------+
|[a, b]|
|   [d]|
+------+
Here, the column vals contains lists.
To flatten the lists in the column vals, use the explode(~) method:
Here, we are using the alias(~) method to assign a label to the column returned by explode(~).
Flattening dictionaries
Consider the following PySpark DataFrame:
+----------------+
|            vals|
+----------------+
|        {a -> b}|
|{e -> f, c -> d}|
+----------------+
Here, the column vals contains dictionaries.
To flatten each dictionary in column vals, use the explode(~) method:
In the case of dictionaries, the explode(~) method produces two columns, named key and value by default: the first contains the keys and the second contains the corresponding values. Each key-value pair becomes its own row.