Add list or numpy array as column to a dask dataframe

Time:08-21

How can I add a list or a NumPy array as a column to a Dask dataframe? When I try the regular pandas syntax, df['x'] = x, I get the error TypeError: Column assignment doesn't support type list.

CodePudding user response:

You can assign a pandas Series instead:

df["new_col"] = pd.Series(my_list, index=index_matching_df_index)

The index matters because Dask uses it to align the new values with the dataframe's partitions. The number of rows in each partition of a Dask dataframe is not always known, so values cannot be assigned by position.

CodePudding user response:

I finally solved it by converting the list into a Dask array with dask.array.from_array(), which I think is the most direct way.
