PySpark DataFrame | toDF method
Last updated: Jul 1, 2022
Tags: PySpark
PySpark DataFrame's toDF(~) method returns a new DataFrame whose columns are relabeled, by position, with the names that you specify. The underlying data is not moved.
WARNING
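This method only changes the column labels; it does not move the values between columns. You must pass exactly as many labels as the original DataFrame has columns. To reorder columns together with their values, use select(~) instead.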
Parameters
1. *cols | str
The new column labels, given in positional order.
Return Value
A PySpark DataFrame.
Examples
Consider the following PySpark DataFrame:
+----+---+
|name|age|
+----+---+
|Alex| 20|
| Bob| 30|
+----+---+
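Assigning column labels in a specific order in PySpark
Calling df.toDF("age", "name").show() labels the first column age and the second column name:
+----+----+
| age|name|
+----+----+
|Alex|  20|
| Bob|  30|
+----+----+
Notice that only the labels have changed: the values Alex and Bob now sit under the label age.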
Note that if the number of labels you pass does not match the number of columns in the original DataFrame, an error is thrown:
IllegalArgumentException: requirement failed: The number of columns doesn't match.
Old column names (2): name, age
New column names (1): age
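Assigning column labels in alphabetical order in PySpark
To pass the column labels in alphabetical order, call df.toDF(*sorted(df.columns)).
Here:
sorted(~) returns the column labels in alphabetical order.
the * is used to unpack the list into positional arguments.
As before, only the labels are reassigned by position; the values do not move.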
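The sorting and unpacking steps can be tried in plain Python without Spark. In this sketch, `collect_labels` is a hypothetical stand-in for any function that, like toDF(~), accepts a variable number of positional string arguments:

```python
def collect_labels(*cols):
    # Hypothetical helper: returns the positional arguments it received as a tuple.
    return cols

columns = ["name", "age"]        # same labels as df.columns in the example
ordered = sorted(columns)        # alphabetical order
unpacked = collect_labels(*ordered)   # same as collect_labels("age", "name")
as_single = collect_labels(ordered)   # without *, the whole list is one argument

print(ordered)
print(unpacked)
print(as_single)
```

Without the *, the function receives a single list argument instead of one argument per label.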
Published by Isshin Inada
Official PySpark Documentation
https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.toDF.html