PySpark Column | alias method
PySpark Column's alias(~) method assigns a column label to a PySpark Column.
Parameters
1. *alias | string
The column label.
2. metadata | dict | optional
A dictionary holding additional meta-information to store in the StructField of the returned Column.
Return Value
A new PySpark Column.
Examples
Consider the following PySpark DataFrame:
+-----+---+
| name|age|
+-----+---+
| ALEX| 20|
|  BOB| 30|
|CATHY| 40|
+-----+---+
Most methods in the PySpark SQL Functions library return Column objects whose label is governed by the method that we use. For instance, consider the lower(~) method:
Here, the PySpark Column returned by lower(~) has the label lower(name) by default.
Assigning a new label to a PySpark Column using the alias method
We can assign a new label to a column by using the alias(~) method:
Here, we have assigned the label "lower_name" to the column returned by lower(~).
Storing metadata in PySpark Column's alias method
To store some metadata in a PySpark Column, we can pass the metadata option to alias(~):
The metadata is a dictionary that is stored in the StructField of the returned Column.
To access the metadata, we can use the PySpark DataFrame's schema property:
df_new.schema["lower_name"].metadata["some_data"]
10