PySpark | User Guide
PySpark is the Python API for Apache Spark, an open-source distributed computing framework for processing big data.
The RDD (Resilient Distributed Dataset) is Spark's central data structure: its data is partitioned across a number of worker nodes so that operations can run in parallel.
Databricks offers a platform where you can gain hands-on experience with PySpark for free using the Community Edition.