PySpark: Save a DataFrame To ADLS

(Last Updated On: )

This how-to is how to save a DataFrame to ADLS

First we need a spark Session. See PySpark: Create a Spark Session for my details on that.

Then we need to create a DataFrame. See PySpark: Create a DataFrame.

Then we do the following:

You should note you don’t need all the options below. I just gave an example.

path = 'abfss://my_container@my_storage_account.dfs.core.windows.net/my_folder/'
mode = 'overwrite'
format = 'delta'
partitions = []

df.write.mode(mode).format(format).option('mergeSchema', False).partitionBy(*partitions).save(path)