PySpark SQL Functions | date_format method
PySpark SQL Functions' date_format(~) method converts a date, timestamp or string into a date string with the specified format.
Parameters
1. date | Column or string
The date column - this could be of type date, timestamp or string.
2. format | string
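The format of the resulting date string, given as a datetime pattern built from letters such as yyyy (year), MM (month) and dd (day), for example "dd/MM/yyyy".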
Return Value
A Column object of date strings.
Examples
Formatting date strings in PySpark DataFrame
Consider the following PySpark DataFrame with some date strings:
+----+----------+
|name|  birthday|
+----+----------+
|Alex|1995-12-16|
| Bob|1998-05-06|
+----+----------+
To convert the date strings in the column birthday:
Here:
"dd/MM/yyyy" indicates a date string starting with the day, then the month, then the year.
alias(~) is used to give a name to the Column object returned by date_format(~).
Formatting datetime values in PySpark DataFrame
Consider the following PySpark DataFrame whose birthday column holds datetime.date values rather than strings:
import datetime
df = spark.createDataFrame([["Alex", datetime.date(1995, 12, 16)], ["Bob", datetime.date(1995, 5, 9)]], ["name", "birthday"])
+----+----------+
|name|  birthday|
+----+----------+
|Alex|1995-12-16|
| Bob|1995-05-09|
+----+----------+
To convert the date values in the column birthday into formatted date strings:
Here, we are using the date format "dd-MM-yyyy", which means day first, then month, then year. We also assign the column name "birthday_new" to the Column returned by date_format(~).