ValueError: Must have equal len keys and value when setting with an iterable

Time:03-31

When I ran this toy code

import pandas as pd

test = pd.DataFrame({'a': [1, 2, 3, 4]})
test['b'] = ''
for i in range(len(test)):
    test['b'].loc[i] = [5, 6, 7]

I got this warning:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)

But when I used loc directly on the DataFrame, like this:

test = pd.DataFrame({'a': [1, 2, 3, 4]})
test['b'] = ''
for i in range(len(test)):
    test.loc[i, 'b'] = [5, 6, 7]

I got an error

ValueError: Must have equal len keys and value when setting with an iterable

What does the error mean? In particular, which keys, value, and iterable does it refer to?

CodePudding user response:

You could use apply:

test['b'] = test['b'].apply(lambda x: [5,6,7])

or list repetition (or a list comprehension):

test['b'] = [[5,6,7]] * len(test)
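One caveat with the multiplication form, shown as a minimal sketch: * repeats a reference to the same inner list, whereas a comprehension builds a separate list per row. That matters if you later modify the lists in place. The names shared and separate are only for illustration.

```python
import pandas as pd

test = pd.DataFrame({'a': [1, 2, 3, 4]})

# Multiplication repeats a reference to one inner list.
shared = [[5, 6, 7]] * len(test)
print(shared[0] is shared[1])      # True

# A comprehension builds a new list for every row.
separate = [[5, 6, 7] for _ in range(len(test))]
print(separate[0] is separate[1])  # False

# Assign the independent lists as the new column.
test['b'] = separate
```

If the lists are never mutated afterwards, either form gives the same column.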

If you have to use a loop, you could use at. This works because assigning '' initially made "b" an object-dtype column, and at sets exactly one cell, so the whole list is stored as a single value:

for i in range(len(test)):
    test.at[i, 'b'] = [5, 6, 7]

Output:

   a          b
0  1  [5, 6, 7]
1  2  [5, 6, 7]
2  3  [5, 6, 7]
3  4  [5, 6, 7]
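The at loop depends on "b" being an object column. If you would rather not rely on how '' is inferred (some pandas versions may pick a dedicated string dtype instead), a sketch that creates the column as object explicitly could look like this. The None placeholder and the loop over test.index are assumptions for illustration, not part of the original code.

```python
import pandas as pd

test = pd.DataFrame({'a': [1, 2, 3, 4]})

# Assumption: start from an explicit object column filled with None,
# so the column can hold arbitrary Python objects such as lists.
test['b'] = pd.Series([None] * len(test), index=test.index, dtype=object)

for label in test.index:
    # at addresses exactly one cell, so the list is stored whole.
    test.at[label, 'b'] = [5, 6, 7]

print(test['b'].dtype)
print(test)
```

Looping over test.index rather than range(len(test)) keeps the sketch valid when the index labels are not 0..n-1.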

CodePudding user response:

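This happens because, in the second snippet, test.loc[i, 'b'] selects a single cell, while loc treats a list on the right-hand side as one value per selected position. In the message, the "keys" are the positions picked by the indexer, the "value" is what you assign, and the "iterable" is the list [5, 6, 7]: one position is being given three values.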

If you do:

for i in range(len(test)):
    print(test.loc[i])

This gives you:

a    1
b     
Name: 0, dtype: object
a    2
b     
Name: 1, dtype: object
a    3
b     
Name: 2, dtype: object
a    4
b     
Name: 3, dtype: object

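That means loc with the i variable addresses the frame row-wise by index label, and adding 'b' narrows the selection to one cell. A three-element list does not match that single position, which is why you get the length-mismatch error.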
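To make the "equal len" wording concrete, here is a small sketch of loc assignments where the number of selected positions does match the number of values. The values 'x', 'y', 'z', 40 and 'w' are made up for illustration.

```python
import pandas as pd

test = pd.DataFrame({'a': [1, 2, 3, 4]})
test['b'] = ''

# Three selected rows (loc slices include the end label), three values.
test.loc[0:2, 'b'] = ['x', 'y', 'z']

# Two selected columns in one row, two values.
test.loc[3, ['a', 'b']] = [40, 'w']

print(test)
```

By contrast, test.loc[i, 'b'] = [5, 6, 7] selects one position but supplies three values, which is the mismatch the error reports.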

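To get around this error, you can select the column with loc first and then the row. Note that this is still chained indexing, the pattern the SettingWithCopyWarning in the question warns about, so it may not update test in every pandas version (for example with copy-on-write enabled); at is the safer choice: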

test = pd.DataFrame({'a': [1, 2, 3, 4]})
test['b'] = ''
for i in range(len(test)):
    test.loc[:,'b'].loc[i] = [5, 6, 7]