How to load all csv files in a folder with pyspark


I have a folder containing the files

Sales_December.csv
Sales_January.csv
Sales_February.csv
etc.

How can I make PySpark read all of them into a single DataFrame?

CodePudding user response:

  • create an empty list
  • read your CSV files one by one and append each resulting DataFrame to the list
  • use reduce(DataFrame.unionAll, <list>) to combine them into one single DataFrame