PySpark SQL Functions | explode method

Last updated: Aug 12, 2023
Tags: PySpark

PySpark SQL Functions' explode(~) method flattens a column whose values are lists or dictionaries, producing one row per list element or per key-value pair.

Parameters

1. col | string or Column

The column containing lists or dictionaries to flatten.

Return Value

A new PySpark Column.

Examples

Flattening lists

Consider the following PySpark DataFrame:

df = spark.createDataFrame([[['a','b']],[['d']]], ['vals'])
df.show()
+------+
|  vals|
+------+
|[a, b]|
|   [d]|
+------+

Here, the column vals contains lists.

To flatten the lists in the column vals, use the explode(~) method:

import pyspark.sql.functions as F
df.select(F.explode('vals').alias('exploded')).show()
+--------+
|exploded|
+--------+
|       a|
|       b|
|       d|
+--------+

Here, we are using the alias(~) method to assign a label to the column returned by explode(~).
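
To make the row-level behaviour concrete, here is a plain-Python sketch of what exploding a list column does to each row. It is only a conceptual model, not how Spark implements explode(~): the helper explode_rows and the extra id column are hypothetical and exist purely for illustration. The model also reflects that explode(~) produces no row for an empty or null list.

```python
def explode_rows(rows, column, alias):
    """Conceptual model of exploding a list column (hypothetical helper, not a PySpark API)."""
    out = []
    for row in rows:
        values = row[column]
        if not values:
            # A null (None) or empty list yields no output row.
            continue
        for value in values:
            # Copy the other columns and add one exploded element per new row.
            new_row = {k: v for k, v in row.items() if k != column}
            new_row[alias] = value
            out.append(new_row)
    return out

rows = [
    {'id': 1, 'vals': ['a', 'b']},
    {'id': 2, 'vals': ['d']},
    {'id': 3, 'vals': []},
]
exploded = explode_rows(rows, 'vals', 'exploded')
for row in exploded:
    print(row)
```

The row with id 1 appears twice, once per element, while the row with id 3 disappears because its list is empty.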

Flattening dictionaries

Consider the following PySpark DataFrame:

df = spark.createDataFrame([[{'a':'b'}],[{'c':'d','e':'f'}]], ['vals'])
df.show()
+----------------+
|            vals|
+----------------+
|        {a -> b}|
|{e -> f, c -> d}|
+----------------+

Here, the column vals contains dictionaries.

To flatten each dictionary in column vals, use the explode(~) method:

df.select(F.explode('vals').alias('exploded_key', 'exploded_val')).show()
+------------+------------+
|exploded_key|exploded_val|
+------------+------------+
|           a|           b|
|           e|           f|
|           c|           d|
+------------+------------+

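In the case of dictionaries, the explode(~) method returns two columns: the first holds the keys and the second holds the corresponding values. This is why alias(~) is given two labels here. Note that the order of the key-value pairs within a dictionary is not guaranteed, as the output above shows.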

Published by Isshin Inada