PySpark SQL Functions | element_at method
PySpark SQL Functions' element_at(~) method is used to extract values from lists or maps in a PySpark Column.
Parameters
1. col | string or Column
The column of lists or maps from which to extract values.
2. extraction | int (array position) or key (map key)
The position of the value that you wish to extract. Negative positioning is supported - extraction=-1 will extract the last element from each list.
The position is not index-based. This means that extraction=1 will extract the first value in the lists or maps.
Return Value
A new PySpark Column.
Examples
Extracting n-th value from arrays in PySpark Column
Consider the following PySpark DataFrame that contains some lists:
+------+
|  vals|
+------+
|[5, 6]|
|[7, 8]|
+------+
To extract the second value from each list in vals, we can use element_at(~) like so:
Here, note the following:
- the position 2 is not index-based, that is, it refers to the second element rather than the third.
- we are using the alias(~) method to assign a label to the column returned by element_at(~).
Note that extracting values that are out of bounds will return null:
We can also extract the last element by supplying a negative value for extraction:
Extracting values from maps in PySpark Column
Consider the following PySpark DataFrame containing some dict values:
+----------------+
|            vals|
+----------------+
|        {A -> 4}|
|{A -> 5, B -> 6}|
+----------------+
To extract the value under the key 'A' from each map in the vals column:
Note that extracting values using keys that do not exist will return null:
Here, the key 'B' does not exist in the map {'A': 4}, so null is returned for that row.