How can I add a list or a NumPy array as a column to a Dask dataframe? When I try the regular pandas syntax, df['x'] = x, it raises the error TypeError: Column assignment doesn't support type list.
CodePudding user response:
You can add a pandas Series:
df["new_col"] = pd.Series(my_list, index=index_matching_df_index)
The index matters here because Dask uses it to align the new values with the correct partitions. The length of each partition in a Dask dataframe is not always known, so you cannot assign by position.
CodePudding user response:
I finally solved it by converting the list into a Dask array with dask.array.from_array(), which I think is the most direct way.