I have a PySpark DataFrame and I want to cast the type of its columns

Time:09-30

df = spark.read.csv("Sales_December.csv", header=True)

df.printSchema()

shows Order Id as string.

I want to change the schema so that Order Id is an int instead.

CodePudding user response:

from pyspark.sql import functions as func
from pyspark.sql.types import IntegerType

df = df.withColumn('Order Id', func.col('Order Id').cast(IntegerType()))

CodePudding user response:

You can use withColumn.

withColumn signature --> withColumn(colName: String, col: Column): DataFrame

For example:

from pyspark.sql.functions import col
from pyspark.sql.types import IntegerType

df2 = df.withColumn("Order Id", col("Order Id").cast(IntegerType()))
df2.printSchema()