PySpark SQL Functions | element_at method
PySpark SQL Functions' element_at(~) method is used to extract values from lists or maps in a PySpark Column.
Parameters
1. col | string or Column
The column of lists or maps from which to extract values.
2. extraction | int or key
The position of the array value, or the key of the map value, that you wish to extract. Negative positioning is supported - extraction=-1 will extract the last element from each list.
The position is 1-based rather than 0-based. This means that extraction=1 will extract the first value from each list.
Return Value
A new PySpark Column.
Examples
Extracting n-th value from arrays in PySpark Column
Consider the following PySpark DataFrame that contains some lists:
+------+
|  vals|
+------+
|[5, 6]|
|[7, 8]|
+------+
To extract the second value from each list in vals, we can use element_at(~) like so:
Here, note the following:
- the position 2 is 1-based, not 0-based.
- we are using the alias(~) method to assign a label to the column returned by element_at(~).
Note that extracting values that are out of bounds will return null:
We can also extract the last element by supplying a negative value for extraction:
Extracting values from maps in PySpark Column
Consider the following PySpark DataFrame containing some dict values:
+----------------+
|            vals|
+----------------+
|        {A -> 4}|
|{A -> 5, B -> 6}|
+----------------+
To extract the values that have the key 'A' in the vals column:
    Note that extracting values using keys that do not exist will return null:
Here, the key 'B' does not exist in the map {'A': 4}, so null is returned for that row.
