I have a CSV file with no header row and 49 columns. I was also given a separate CSV file that lists each column's name and description. Instead of writing out StructField 49 times (like StructField("srcip", StringType(), True)), is there another way to build the schema, such as a function?
Thank you.
CodePudding user response:
Assuming you already have the column names in a list (for example, read from the description CSV), you can loop over it and build the schema with a list comprehension:
from pyspark.sql import types as T
cols = ['a', 'b', 'c']
schema = T.StructType([T.StructField(c, T.StringType()) for c in cols])
# StructType(List(StructField(a,StringType,true),StructField(b,StringType,true),StructField(c,StringType,true)))
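To get the list of names from the description file in the first place, something like the following might work. It is only a sketch: the helper names `read_column_names` and `build_string_schema` are made up for illustration, and it assumes the description CSV has a header row with a column called "Name", which may not match your file.

```python
import csv


def read_column_names(description_path, name_field="Name"):
    """Return the column names listed in the description CSV, in file order.

    Assumes the description file has a header row and that the names live
    in the column given by name_field (hypothetical default: "Name").
    """
    with open(description_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Strip stray whitespace so the names are clean Spark column names.
        return [row[name_field].strip() for row in reader]


def build_string_schema(names):
    """Build an all-string, nullable StructType from a list of column names."""
    # Imported here so the name-reading helper also works without PySpark.
    from pyspark.sql import types as T

    return T.StructType([T.StructField(n, T.StringType(), True) for n in names])
```

You could then pass the result to the reader, e.g. `spark.read.csv("data.csv", schema=build_string_schema(read_column_names("description.csv")), header=False)`, where both file names are placeholders. If every column is going to be a string anyway, reading the file without a schema and calling `df.toDF(*names)` should also rename the default `_c0`, `_c1`, ... columns.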