How to get the schema of the Dataframe as a string using spark-scala?

Time:03-16

I'm new to dataframes. Is there any way I could get the dataframe schema as a string in Spark Scala?

val df = spark.read.option("inferSchema","true").option("header","true").csv("sample_file1.txt")
df.show(truncate = false)

I have read the above dataframe, and in the result section I'm getting the schema as:

Schema: code0, code1, date, hi, _c4, first_name, _c6, last_name1, _c8, _c9, _c10, _c11, _c12, _c13, _c14, _c15, _c16, _c17, _c18, _c19, _c20, _c21, _c22, _c23, _c24

How can I read this as a string? I need to validate this schema in Spark SQL by passing it as a string.

Please share your suggestions.

CodePudding user response:

Generation of a sample dataframe

val df = spark.range(1).selectExpr("'hello' as mystr","1 as myint","2.3 as mydec","current_date as mydt")

The solution

val cols = df.columns.mkString(",")

println(cols)

mystr,myint,mydec,mydt
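
Since the goal is to validate the column list against a string, here is a minimal sketch of one way to do that comparison. The names `columns`, `expected`, `missing` and `unexpected` are made up for illustration, and the array stands in for `df.columns` (which returns an `Array[String]`) so the snippet runs without a Spark session.

```scala
// Stand-in for df.columns from the sample dataframe above (assumption for illustration)
val columns: Array[String] = Array("mystr", "myint", "mydec", "mydt")

// Hypothetical expected schema, passed in as a single comma-separated string
val expected = "mystr, myint, mydec, mydt"

// Split the expected string and trim whitespace around each name
val expectedCols: Array[String] = expected.split(",").map(_.trim)

// True only if the names match in the same order
val sameOrder: Boolean = columns.sameElements(expectedCols)

// Names that are expected but absent, and names present but not expected
val missing: Array[String] = expectedCols.diff(columns)
val unexpected: Array[String] = columns.diff(expectedCols)

println(s"sameOrder=$sameOrder missing=${missing.mkString(",")} unexpected=${unexpected.mkString(",")}")
```

Comparing the split arrays rather than the raw strings avoids false mismatches caused by spacing after the commas.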

CodePudding user response:

If you want the list of columns as a string, David's answer will work. If you want the actual schema as a string (for some reason):

val schemaAsString = yourDF.schema.toString
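
For what it's worth, `schema` returns a `StructType`, which has a few other string forms besides `toString`. A rough sketch, assuming Spark 2.4 or later (for `toDDL`) and a hand-built two-column schema standing in for `yourDF.schema`:

```scala
import org.apache.spark.sql.types._

// Hand-built schema used only for illustration; with a real dataframe use yourDF.schema
val schema = StructType(Seq(
  StructField("mystr", StringType, nullable = false),
  StructField("myint", IntegerType, nullable = false)
))

// Column names only, comma-separated
val names: String = schema.fieldNames.mkString(",")

// Compact form with names and types: struct<mystr:string,myint:int>
val simple: String = schema.simpleString

// DDL-style string; exact quoting differs between Spark versions
val ddl: String = schema.toDDL

// JSON representation of the schema
val json: String = schema.json

println(names)
println(simple)
println(ddl)
println(json)
```

Which one to use depends on what the validation expects: `simpleString` and `toDDL` include the types, while `fieldNames` gives just the names.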