I have a folder that contains
Sales_December.csv
Sales_January.csv
Sales_February.csv
etc.
How can I make PySpark read all of them into one DataFrame?
CodePudding user response:
- create an empty list
- read your CSV files one by one and append each resulting DataFrame to the list
- use reduce(DataFrame.unionAll, <list>) to combine them into a single DataFrame, as in the sketch below
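
A minimal sketch of those three steps, assuming the file names from the question sit in the working directory and that each file has a header row (the header/inferSchema options are assumptions, adjust them to your data):

```python
from functools import reduce

from pyspark.sql import DataFrame, SparkSession

spark = SparkSession.builder.getOrCreate()

# File names taken from the question; adjust the paths to your folder.
paths = [
    "Sales_December.csv",
    "Sales_January.csv",
    "Sales_February.csv",
]

# Read each CSV into its own DataFrame and collect them in a list.
dfs = [spark.read.csv(path, header=True, inferSchema=True) for path in paths]

# Fold the list into a single DataFrame by repeated unionAll.
sales = reduce(DataFrame.unionAll, dfs)
```

Note that unionAll matches columns by position, so every file must have the same columns in the same order; if the column names match but the order may differ, use DataFrame.unionByName instead.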