Python: How to assign elements of an array to multiple columns in DataFrame?

Time:03-10

I have a function like the one below:

def do_something(row, args):
    # doing something
    return arr 

where arr is a NumPy array such as array([1, 2, 3, 4, 5]).

In my main function I have a pandas DataFrame df, which has columns A, B, C, D, E along with other columns.

I call do_something as shown below, trying to assign each element of the returned array to one of those columns:

df[['A', 'B', 'C', 'D', 'E']] = df.apply(lambda row: do_something(row, args), axis=1)

However, this raises an error:

ValueError: Must have equal len keys and values when setting with an iterable

I think it means I'm trying to assign the whole array as a single value to multiple columns in df, and the mismatch in length triggers the error. But I'm not sure how to achieve my goal.

CodePudding user response:

The error is telling you that the number of values you are assigning does not match the number of keys: the left-hand side names five columns, while the right-hand side is a single Series. You can use the zip function to pair each column name with its values in a dictionary, and then use that dictionary to assign the values to the columns.
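One possible reading of the zip suggestion, as a minimal sketch rather than a definitive implementation. The do_something stub, the sample df, and the names cols, results and new_values are assumptions made up for illustration; the real function would compute its array from row and args.

```python
import numpy as np
import pandas as pd

def do_something(row, args):
    # Stand-in for the real function (assumption): always returns five values.
    return np.array([1, 2, 3, 4, 5])

df = pd.DataFrame(np.ones((2, 6)),
                  columns=["A", "B", "C", "D", "E", "some other column"])
cols = ["A", "B", "C", "D", "E"]

# One array per row of the DataFrame.
results = [do_something(row, 2) for _, row in df.iterrows()]

# zip(*results) regroups the per-row arrays into per-column tuples;
# the outer zip pairs each tuple with its column name.
new_values = dict(zip(cols, zip(*results)))

# Assign column by column from the dictionary.
for col, values in new_values.items():
    df[col] = list(values)

print(df)
```

This loops over rows in Python, so it is likely slower than the list-based assignment on large frames; it is shown only to make the dictionary idea concrete.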

CodePudding user response:

apply returns a Series object holding one array per row, so you can't assign it to five columns directly. One possible solution is to convert it to a list and then assign:

df[['A','B','C','D','E']] = df.apply(lambda row: do_something(row, args), axis=1).tolist()
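As an alternative to converting to a list, DataFrame.apply accepts result_type="expand", which turns list-like return values into columns. The sketch below assumes a stub do_something and a small sample df; the names cols and expanded are hypothetical. It is not part of the original answer.

```python
import numpy as np
import pandas as pd

def do_something(row, args):
    # Stand-in for the real function (assumption): always returns five values.
    return np.array([1, 2, 3, 4, 5])

df = pd.DataFrame(np.ones((2, 6)),
                  columns=["A", "B", "C", "D", "E", "some other column"])
cols = ["A", "B", "C", "D", "E"]

# result_type="expand" makes apply return a DataFrame with one column
# per element of the returned array instead of a Series of arrays.
expanded = df.apply(lambda row: do_something(row, 2), axis=1,
                    result_type="expand")

# Rename the integer column labels so the assignment lines up by name.
expanded.columns = cols
df[cols] = expanded

print(df)
```

Renaming the columns before assigning is a defensive choice here, so the result does not depend on how pandas matches an unlabeled frame to the target columns.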

CodePudding user response:

This is because apply returns a pd.Series, which is a single column of arrays, while you are trying to assign it to five columns. See this example:

import numpy as np
import pandas as pd

def do_something(row, args):
    arr = np.array([1, 2, 3, 4, 5])
    return arr

df = pd.DataFrame(np.ones((2, 6)), columns=['A', 'B', 'C', 'D', 'E', 'some other column'])
df.apply(lambda row: do_something(row, 2), axis=1)

Output:

0    [1, 2, 3, 4, 5]
1    [1, 2, 3, 4, 5]
dtype: object

The solution is to convert the Series to a list of rows, which pandas can then spread across the five columns when you assign:

df[['A', 'B', 'C', 'D', 'E']] = df.apply(lambda row: do_something(row, 2), axis=1).to_list()
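To see why the list version works, it may help to compare shapes before and after the conversion. This is a sketch under the same assumptions as the example above (a stub do_something and a two-row df); the names cols, series and values are hypothetical.

```python
import numpy as np
import pandas as pd

def do_something(row, args):
    # Stand-in for the real function (assumption): always returns five values.
    return np.array([1, 2, 3, 4, 5])

df = pd.DataFrame(np.ones((2, 6)),
                  columns=["A", "B", "C", "D", "E", "some other column"])
cols = ["A", "B", "C", "D", "E"]

series = df.apply(lambda row: do_something(row, 2), axis=1)
print(series.shape)   # (2,): one object (an array) per row

# A list of equal-length arrays converts to a 2-D array:
# one row per DataFrame row, one column per target column.
values = np.array(series.to_list())
print(values.shape)   # (2, 5)

df[cols] = values
print(df)
```

The assignment succeeds because the second dimension of values matches the number of columns listed on the left-hand side.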