PySpark SQL Functions | element_at method

Last updated: Jul 8, 2022
Tags: PySpark

PySpark SQL Functions' element_at(~) method is used to extract values from lists or maps in a PySpark Column.

Parameters

1. col | string or Column

The column of lists or maps from which to extract values.

2. extraction | int or key

For lists, the position of the value that you wish to extract; for maps, the key to look up. Negative positioning is supported for lists - extraction=-1 will extract the last element from each list.

WARNING

The position is 1-based rather than 0-based. This means that extraction=1 will extract the first value in the lists.

Return Value

A new PySpark Column.

Examples

Extracting n-th value from arrays in PySpark Column

Consider the following PySpark DataFrame that contains some lists:

rows = [[[5,6]], [[7,8]]]
df = spark.createDataFrame(rows, ['vals'])
df.show()
+------+
| vals|
+------+
|[5, 6]|
|[7, 8]|
+------+

To extract the second value from each list in vals, we can use element_at(~) like so:

import pyspark.sql.functions as F

df_res = df.select(F.element_at('vals', 2).alias('2nd value'))
df_res.show()
+---------+
|2nd value|
+---------+
| 6|
| 8|
+---------+

Here, note the following:

  • the position 2 is 1-based, not 0-based - it refers to the second element, not the third.

  • we are using the alias(~) method to assign a label to the column returned by element_at(~).

Note that extracting values that are out of bounds will return null:

df_res = df.select(F.element_at('vals',3))
df_res.show()
+-------------------+
|element_at(vals, 3)|
+-------------------+
| null|
| null|
+-------------------+

We can also extract the last element by supplying a negative value for extraction:

df_res = df.select(F.element_at('vals',-1).alias('last value'))
df_res.show()
+----------+
|last value|
+----------+
| 6|
| 8|
+----------+

Extracting values from maps in PySpark Column

Consider the following PySpark DataFrame containing some dict values:

rows = [[{'A':4}], [{'A':5, 'B':6}]]
df = spark.createDataFrame(rows, ['vals'])
df.show()
+----------------+
| vals|
+----------------+
| {A -> 4}|
|{A -> 5, B -> 6}|
+----------------+

To extract the values that have the key 'A' in the vals column:

df_res = df.select(F.element_at('vals', F.lit('A')))
df_res.show()
+-------------------+
|element_at(vals, A)|
+-------------------+
| 4|
| 5|
+-------------------+

Note that extracting values using keys that do not exist will return null:

df_res = df.select(F.element_at('vals', F.lit('B')))
df_res.show()
+-------------------+
|element_at(vals, B)|
+-------------------+
| null|
| 6|
+-------------------+

Here, the key 'B' does not exist in the map {'A':4} so a null was returned for that row.

Published by Isshin Inada