First, thank you for your time.
I need to cross (merge) two Pandas DataFrames using the values I have in a field. The values come in the following form within the field: [A,B,C,N]
I am trying to apply the split() function to the DataFrame field as follows:
df_test = df_temp["NAME"].str.split(expand=True)
The "Name" field is of type object.
My problem is that for some reason my split() splits the values of my NAME field with Null (NaN) values. I don't understand what I'm doing wrong.
From already thank you very much.
CodePudding user response:
You should provide a separator value to split the column on. By default, str.split() splits on whitespace, so your commas are never used as a delimiter.
df['Name'].str.split(',', expand=True)
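As a minimal sketch, assuming the values are stored as strings like "[A,B,C,N]" (the sample data here is illustrative, not from the question), you may also need to strip the surrounding brackets before splitting:

import pandas as pd

# Hypothetical sample data mirroring the question's "[A,B,C,N]" format
df_temp = pd.DataFrame({"NAME": ["[A,B,C,N]", "[D,E,F,G]"]})

# Strip the surrounding brackets, then split on the comma separator
df_test = df_temp["NAME"].str.strip("[]").str.split(",", expand=True)
print(df_test)
#    0  1  2  3
# 0  A  B  C  N
# 1  D  E  F  G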
CodePudding user response:
Based upon the input data -
from pyspark.sql.types import *
from pyspark.sql.functions import *
df = spark.createDataFrame([(['A', 'B', 'C', 'D'],), ], schema = ['Name'])
df.show()
+------------+
|        Name|
+------------+
|[A, B, C, D]|
+------------+
Required Output -
df.select(explode(col("Name")).alias("exploded_Name")).show()
+-------------+
|exploded_Name|
+-------------+
|            A|
|            B|
|            C|
|            D|
+-------------+
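Since the original question is about Pandas, the equivalent there is DataFrame.explode. A sketch assuming the column already holds Python lists (as in the Spark example above; the names are illustrative):

import pandas as pd

# Hypothetical DataFrame whose Name column holds lists, like the Spark example
df = pd.DataFrame({"Name": [["A", "B", "C", "D"]]})

# explode turns each list element into its own row
exploded = df.explode("Name").rename(columns={"Name": "exploded_Name"})
print(exploded)
#   exploded_Name
# 0             A
# 0             B
# 0             C
# 0             D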