I have loaded a text file using the load csv function, but when I try to print the schema it shows just one field under the root, with every column of each row packed into that single field. Like this:
root
|-- Prscrbr_Geo_Lvl Prscrbr_Geo_Cd Prscrbr_Geo_Desc Brnd_Name
Any idea how to fix this?
CodePudding user response:
Adding my comment as an answer since it seems to have solved the problem.
From the output, it looks like the file is actually using tab characters, not commas, as the separator between columns. Spark's CSV reader splits on commas by default, so each line ends up as a single field. To get Spark to use tabs as the separator, set the sep option: spark.read.format("csv").option("sep", "\t").load("/path/to/file")
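
If you are not sure which separator a file uses, here is a rough sketch of one way to check before loading. The helper names guess_separator and read_delimited are made up for illustration and are not part of Spark. It also assumes the first line of the file is a header row and that `spark` is an existing SparkSession.

```python
def guess_separator(header_line, candidates=(",", "\t", ";", "|")):
    """Hypothetical helper: pick the candidate that occurs most often
    in the header line. Ties go to the earliest candidate."""
    return max(candidates, key=header_line.count)


def read_delimited(spark, path):
    """Hypothetical helper: read a delimited text file with a guessed separator.

    Assumes `spark` is an existing SparkSession and that the first
    line of the file is a header row.
    """
    # Read the file as plain text and take the first line only.
    first_row = spark.read.text(path).first()
    header_line = first_row[0] if first_row is not None else ""
    sep = guess_separator(header_line)
    # "sep" and "header" are standard options of Spark's CSV reader.
    return spark.read.option("sep", sep).option("header", True).csv(path)


# Possible usage, given a SparkSession named `spark`:
# df = read_delimited(spark, "/path/to/file")
# df.printSchema()
```

With a tab-separated file, printSchema() should then list each column as its own field instead of one long field name. If you already know the file is tab-separated, passing "\t" directly as in the answer above is simpler.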