PySpark SQL Functions | lit method
PySpark SQL Functions' lit(~) method creates a Column object with the specified value.
Parameters
1. col | value
A value to fill the column.
Return Value
A Column object.
Examples
Consider the following PySpark DataFrame:
+----+---+
|name|age|
+----+---+
|Alex| 20|
| Bob| 30|
+----+---+
Creating a column of constants in PySpark DataFrame
To create a new PySpark DataFrame with the name column of df and a new column called is_single made up of True values:
Here, F.lit(True) returns a Column object, which has a method called alias(~) that assigns a label.
Note that you could append a new column of constants using the withColumn(~) method:
import pyspark.sql.functions as F
df.withColumn("is_single", F.lit(True)).show()
+----+---+---------+
|name|age|is_single|
+----+---+---------+
|Alex| 20|     true|
| Bob| 30|     true|
+----+---+---------+
Creating a column whose values are based on a condition in PySpark
We can also use lit(~) to create a column whose values depend on some condition:
import pyspark.sql.functions as F
+----+---+------+
|name|age|status|
+----+---+------+
|Alex| 20|junior|
| Bob| 30|senior|
+----+---+------+
Note the following:
- We are using the when(~) and otherwise(~) pattern to fill the values of the column conditionally.
- We are using the withColumn(~) method to append a new column named status.
- The F.lit("junior") can actually be replaced by "junior" - this is just to demonstrate one usage of lit(~).