Converting to Encoding Cyclical Features in Pyspark

Time:03-04

I am trying to convert the month, week and dayofyear columns into cyclical features (sin, cos). My Python code looks like this:

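import numpy as np

def encode(data, col, max_val):
    # Map the value onto a circle so the last and first values of the cycle end up adjacent
    data[col + '_sin'] = np.sin(2 * np.pi * data[col]/max_val)
    data[col + '_cos'] = np.cos(2 * np.pi * data[col]/max_val)
    return data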

The PySpark version is this:

df = df.withColumn('month_sin',np.sin(2 * np.pi * df['month']/12)) 

I get this error:

TypeError: loop of ufunc does not support argument 0 of type Column which has no callable sin method

The month column is of integer type. I cast it to float and to double, but that did not help.

Note: The column has no zero (0) value.

CodePudding user response:

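NumPy functions cannot operate on a PySpark Column, which is why np.sin raises this error. You'd have to use sin from pyspark.sql.functions and Python's math.pi instead of np.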
