printSchema shows all columns as a single field

Time:12-11

I loaded a text file with Spark's CSV reader, but when I print the schema it shows only one field under the root, with every column name merged into that single field, like this:

root
 |-- Prscrbr_Geo_Lvl    Prscrbr_Geo_Cd  Prscrbr_Geo_Desc    Brnd_Name

Any idea how to fix this?

CodePudding user response:

Adding my comment as an answer since it seems to have solved the problem.

Judging by the output, the file uses tab characters rather than commas to separate columns. Spark's CSV reader splits on commas by default, so it treats each line as one column. To make Spark split on tabs, set the sep option: spark.read.format("csv").option("sep", "\t").load("/path/to/file")
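If you are not sure which separator a file uses, a quick check of its first line can tell you before you call the reader. The following is only a minimal sketch in PySpark style, not part of the original answer: the helper names guess_separator and read_with_guessed_separator are hypothetical, the file is assumed to be on the local filesystem so plain open() can read it, and the header option is assumed to be wanted because the schema above shows column names.

```python
def guess_separator(header_line, candidates=("\t", ",", ";", "|")):
    """Return the candidate separator that occurs most often in the line.

    Returns None when none of the candidates appears at all.
    (Hypothetical helper, for illustration only.)
    """
    counts = {sep: header_line.count(sep) for sep in candidates}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def read_with_guessed_separator(spark, path):
    """Read a delimited text file with Spark using a guessed separator.

    Assumes `spark` is an existing SparkSession and `path` points to a
    local file; falls back to a comma if no separator is detected.
    """
    with open(path, encoding="utf-8") as f:
        sep = guess_separator(f.readline()) or ","
    return (
        spark.read.format("csv")
        .option("header", "true")  # assumption: first line holds column names
        .option("sep", sep)
        .load(path)
    )
```

With a tab-separated file, read_with_guessed_separator(spark, "/path/to/file").printSchema() should then list each column as its own field. The guess is a simple character count, so it can be wrong when column names themselves contain commas or semicolons; in that case pass the separator explicitly as in the answer above.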
