When I ran this toy code:
import pandas as pd
test = pd.DataFrame({'a': [1, 2, 3, 4]})
test['b'] = ''
for i in range(len(test)):
    test['b'].loc[i] = [5, 6, 7]
I got this warning:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_block(indexer, value, name)
But when I used loc directly on the DataFrame, as per this approach:
test = pd.DataFrame({'a': [1, 2, 3, 4]})
test['b'] = ''
for i in range(len(test)):
    test.loc[i, 'b'] = [5, 6, 7]
I got an error:
ValueError: Must have equal len keys and value when setting with an iterable
What does the error mean? In particular, which keys, value, and iterable does it refer to?
CodePudding user response:
You could use apply:
test['b'] = test['b'].apply(lambda x: [5,6,7])
or list repetition (or a list comprehension):
test['b'] = [[5,6,7]] * len(test)
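One caveat with the repetition form, sketched below on the assumption that you might later modify a cell in place: [[5, 6, 7]] * n repeats a reference to one list object, whereas a list comprehension builds a separate list per row. The names shared and independent are illustrative only.

```python
import pandas as pd

test = pd.DataFrame({'a': [1, 2, 3, 4]})

# Repetition: every element refers to the same list object.
shared = [[5, 6, 7]] * len(test)

# Comprehension: a fresh list is built for each row.
independent = [[5, 6, 7] for _ in range(len(test))]

print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False

# Either form can be assigned as the new column.
test['b'] = independent
print(test)
```

If the cells are only ever read, both forms behave the same; the difference matters only when a cell is changed in place, for example with append.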
If you have to use a loop, you could use at (this works because you've set "b" as dtype object by assigning it with '' initially, and at can assign a value to a single cell, as long as the types match):
for i in range(len(test)):
    test.at[i, 'b'] = [5, 6, 7]
Output:
   a          b
0  1  [5, 6, 7]
1  2  [5, 6, 7]
2  3  [5, 6, 7]
3  4  [5, 6, 7]
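Here is a minimal sketch of the same loop that does not rely on '' to produce an object column. Creating the column with an explicit dtype=object is an assumption added here, not part of the original snippet.

```python
import pandas as pd

test = pd.DataFrame({'a': [1, 2, 3, 4]})

# Assumption: create 'b' explicitly as an object column so that each cell
# can hold an arbitrary Python object such as a list.
test['b'] = pd.Series([None] * len(test), dtype=object)

for i in range(len(test)):
    # at addresses exactly one cell, so the list is stored as a single value.
    test.at[i, 'b'] = [5, 6, 7]

print(test['b'].dtype)
print(type(test.at[0, 'b']))
```

Making the dtype explicit keeps the intent visible: the column is meant to hold Python objects, not strings.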
CodePudding user response:
This happens because, in the second snippet, loc[i, 'b'] selects by row label, so the three-element list is being assigned to a single row of the column rather than to the column as a whole.
If you do:
for i in range(len(test)):
    print(test.loc[i])
This gives you:
a 1
b
Name: 0, dtype: object
a 2
b
Name: 1, dtype: object
a 3
b
Name: 2, dtype: object
a 4
b
Name: 3, dtype: object
That means that with loc and the variable i you are accessing the frame row-wise by index label, which is why you get the length-mismatch error: the key selects one cell, but the iterable value has three elements.
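To see which keys and value the message compares, here is a hedged sketch. With a list on the right-hand side, loc tries to spread the elements over the selected cells, so the list length has to match the number of rows selected. The explicit dtype=object column and the name error_message are assumptions made for illustration, so the example does not depend on how '' is typed.

```python
import pandas as pd

test = pd.DataFrame({'a': [1, 2, 3, 4]})
test['b'] = pd.Series([''] * len(test), dtype=object)

# Three rows selected (loc slices include both ends), three values:
# each element goes into its own row.
test.loc[0:2, 'b'] = [5, 6, 7]
print(test['b'].tolist())

# One row selected, three values: the lengths do not match.
error_message = None
try:
    test.loc[3, 'b'] = [5, 6, 7]
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

In other words, the "keys" are the cells picked out by the indexer and the "iterable" is the list being assigned.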
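To avoid the warning from the first snippet, select the column explicitly with loc before indexing the row. Note that this is still chained indexing, so whether it updates test depends on loc returning a view rather than a copy: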
test = pd.DataFrame({'a': [1, 2, 3, 4]})
test['b'] = ''
for i in range(len(test)):
    test.loc[:, 'b'].loc[i] = [5, 6, 7]