I am trying to set all values of a row to the same value based on another dataframe (or a series derived from a dataframe).
Simple dfs:
import pandas as pd
df=pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]],columns=['a','b','c'])
df2=pd.DataFrame([['const',10,'other'],['const',20,'other'],['var',30,'other'],['var',40,'other']],columns=['type','val','z'])
df
a b c
0 1 2 3
1 4 5 6
2 7 8 9
df2
type val z
0 const 10 other
1 const 20 other
2 var 30 other
3 var 40 other
I want to set every column of df to the 'val' found in df2 wherever the 'type' column in df2 at the same index is 'const'.
The elements I want to change are given by:
df.loc[df2['type']=='const',:]
a b c
0 1 2 3
1 4 5 6
The values I want to change them to are:
df2.loc[df2['type']=='const','val']
0 10
1 20
And the end result would be:
a b c
0 10 10 10
1 20 20 20
2 7 8 9
I was hoping broadcasting the series to the dataframe would do the trick, but it doesn't:
df.loc[df2['type']=='const',:] = df2.loc[df2['type']=='const','val']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\indexing.py", line 723, in __setitem__
iloc._setitem_with_indexer(indexer, value, self.name)
File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\indexing.py", line 1732, in _setitem_with_indexer
self._setitem_single_block(indexer, value, name)
File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\indexing.py", line 1968, in _setitem_single_block
self.obj._mgr = self.obj._mgr.setitem(indexer=indexer, value=value)
File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\internals\managers.py", line 355, in setitem
return self.apply("setitem", indexer=indexer, value=value)
File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\internals\managers.py", line 327, in apply
applied = getattr(b, f)(**kwargs)
File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\internals\blocks.py", line 984, in setitem
values[indexer] = value
ValueError: shape mismatch: value array of shape (2,) could not be broadcast to indexing result of shape (2,3)
I'd appreciate any thoughts on what I am missing. The closest thing I could find shows how to set one column at a time: "Pandas - Replace values based on index".
CodePudding user response:
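You're close, but pandas doesn't appear to broadcast a Series across the columns here: the right-hand side has shape (2,), which is exactly what the ValueError is complaining about. With a tiny change, selecting the column as a list (['val'] instead of 'val'), .loc returns a one-column DataFrame instead of a Series, and the assignment goes through.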
m = df2['type'].eq('const')
df.loc[m] = df2.loc[m, ['val']]
print(df)
Output:
a b c
0 10 10 10
1 20 20 20
2 7 8 9
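If you would rather not depend on how pandas aligns a DataFrame on the right-hand side, here is a minimal sketch of a more explicit variant. It is not the approach above, just an alternative under the same sample data: it builds a value array with exactly the shape of the selection. The names mask, rows, vals, series_rhs and frame_rhs are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=["a", "b", "c"])
df2 = pd.DataFrame(
    [["const", 10, "other"], ["const", 20, "other"],
     ["var", 30, "other"], ["var", 40, "other"]],
    columns=["type", "val", "z"],
)

# df2 has one more row than df, so restrict the mask to df's index labels.
mask = df2["type"].eq("const").reindex(df.index, fill_value=False)
rows = df.index[mask.to_numpy()]

# Selecting 'val' gives a 1-D Series; selecting ['val'] gives a one-column DataFrame.
series_rhs = df2.loc[rows, "val"]
frame_rhs = df2.loc[rows, ["val"]]
print(series_rhs.shape, frame_rhs.shape)

# Repeat each value across all columns so the value array matches the selection.
vals = series_rhs.to_numpy()
df.loc[rows, :] = np.repeat(vals[:, None], df.shape[1], axis=1)
print(df)
```

The reindex step is only there because df2 is longer than df in this example. If both frames share the same index, the mask can be used directly.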