Add a DataFrame to a list of DataFrames (PySpark)

Time:12-03

I have created a list of DataFrames and now want to add another DataFrame, so that I can union the whole list afterwards.

from functools import reduce
from pyspark.sql import DataFrame

df_list = [spark.createDataFrame([("3", "4")], db_tables_schema), spark.createDataFrame([("1", "2")], db_tables_schema), spark.createDataFrame([("7", "4")], db_tables_schema)]

df_list = df_list.append([spark.createDataFrame([("7", "4")], db_tables_schema)])

# Reduce to one df
df_complete = reduce(DataFrame.union, df_list)

I thought that .append would simply add the element. However, when I do this, df_list becomes a "NoneType" object, so my list basically no longer exists. What is going wrong here? Thanks for helping!

CodePudding user response:

Careful: list.append adds an item to the list in place and returns None. Assigning its return value back to df_list is therefore the same as writing df_list = None. Call it without the assignment:

df_list.append(my_element)
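
A quick way to see this, as a minimal sketch with plain integers standing in for DataFrames (the names numbers and result are only illustrative):

```python
numbers = [1, 2, 3]

# append mutates the list in place; its return value is always None.
result = numbers.append(4)

print(result)   # None
print(numbers)  # [1, 2, 3, 4]
```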

Secondly, you are appending a list of items rather than a single item. append adds that whole list as one nested element, so your list ends up looking like this:

my_list = [1, 2, 3]
my_list.append([4])
print(my_list)  # [1, 2, 3, [4]]
my_list.append([5, 6])
print(my_list)  # [1, 2, 3, [4], [5, 6]]

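To add the items of a list one by one, use extend instead (or, for a single DataFrame, call append without the surrounding brackets):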

my_list = [1, 2, 3]
my_list.extend([4])
print(my_list)  # [1, 2, 3, 4]
my_list.extend([5, 6])
print(my_list)  # [1, 2, 3, 4, 5, 6]
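
Putting both points together for the original problem, here is a rough sketch. So that it runs without a Spark session, it uses FakeFrame, a hypothetical stand-in class that is not part of PySpark and only mimics the union method. With real DataFrames you would presumably keep spark.createDataFrame(...) and DataFrame.union from the question.

```python
from functools import reduce


class FakeFrame:
    """Hypothetical stand-in for a Spark DataFrame (not part of PySpark)."""

    def __init__(self, rows):
        self.rows = list(rows)

    def union(self, other):
        # Like DataFrame.union, keep all rows from both sides, duplicates included.
        return FakeFrame(self.rows + other.rows)


df_list = [FakeFrame([("3", "4")]), FakeFrame([("1", "2")])]

# One new frame: append it directly, without wrapping it in a list
# and without reassigning df_list.
df_list.append(FakeFrame([("7", "4")]))

# Several new frames at once: extend with a list of them.
df_list.extend([FakeFrame([("5", "6")]), FakeFrame([("7", "4")])])

# Fold the list into a single frame, as in the question.
df_complete = reduce(FakeFrame.union, df_list)

print(len(df_list))      # 5
print(df_complete.rows)  # all five rows, in list order
```

The list stays flat in both cases, so reduce only ever sees frame objects and never a nested list.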