Set value for all columns based on series

Time:10-02

I am trying to set all values of a row to the same value, based on another DataFrame (or a Series derived from a DataFrame).

Simple dfs:

import pandas as pd

df = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]], columns=['a','b','c'])
df2 = pd.DataFrame([['const',10,'other'],['const',20,'other'],['var',30,'other'],['var',40,'other']], columns=['type','val','z'])

df
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9

df2
    type   val      z
0  const    10  other
1  const    20  other
2    var    30  other
3    var    40  other

I want to set all columns of df to the 'val' found in df2 wherever the 'type' column in df2 is 'const' for the same index.

The elements I want to change are given by:

df.loc[df2['type']=='const',:]
   a  b  c
0  1  2  3
1  4  5  6

The values I want to change them to are:

df2.loc[df2['type']=='const','val']

0    10
1    20

And the end result would be:

    a   b   c
0  10  10  10
1  20  20  20
2   7   8   9

I was hoping broadcasting the series to the dataframe would do the trick, but it doesn't:

df.loc[df2['type']=='const',:] = df2.loc[df2['type']=='const','val']


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\indexing.py", line 723, in __setitem__
    iloc._setitem_with_indexer(indexer, value, self.name)
  File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\indexing.py", line 1732, in _setitem_with_indexer
    self._setitem_single_block(indexer, value, name)
  File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\indexing.py", line 1968, in _setitem_single_block
    self.obj._mgr = self.obj._mgr.setitem(indexer=indexer, value=value)
  File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\internals\managers.py", line 355, in setitem
    return self.apply("setitem", indexer=indexer, value=value)
  File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\internals\managers.py", line 327, in apply
    applied = getattr(b, f)(**kwargs)
  File "C:\ProgramData\Anaconda3\envs\py37\lib\site-packages\pandas\core\internals\blocks.py", line 984, in setitem
    values[indexer] = value
ValueError: shape mismatch: value array of shape (2,) could not be broadcast to indexing result of shape (2,3)

Appreciate thoughts on what I am missing. The closest thing I could find is how to set one column at a time: Pandas - Replace values based on index

CodePudding user response:

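You're close, but pandas doesn't appear to broadcast a Series across the columns of a DataFrame selection by default. As the traceback shows, df2.loc[m, 'val'] ends up as a 1-D array of shape (2,), which cannot be broadcast to the (2,3) selection. With a tiny change, passing the column as a list (['val']), we can ask for a one-column DataFrame back from .loc instead of a Series; its values have shape (2,1), which can be broadcast across the three columns.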

m = df2['type'].eq('const')
df.loc[m] = df2.loc[m, ['val']]
print(df)

Output:

    a   b   c
0  10  10  10
1  20  20  20
2   7   8   9
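
The shape argument can also be made explicit. The following is only a sketch of an alternative, not the answer's method: it assumes the same df and df2 as in the question, drops to NumPy with to_numpy(), and reshapes the values into a column vector by hand. The names mask, rows, vals and col are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=["a", "b", "c"])
df2 = pd.DataFrame(
    [["const", 10, "other"], ["const", 20, "other"],
     ["var", 30, "other"], ["var", 40, "other"]],
    columns=["type", "val", "z"],
)

# df2 has one more row than df, so align the mask to df's index first.
mask = df2["type"].eq("const").reindex(df.index, fill_value=False)
rows = mask[mask].index          # index labels of the rows to overwrite

vals = df2.loc[rows, "val"].to_numpy()   # 1-D, shape (2,)
col = vals[:, None]                      # column vector, shape (2, 1)

# A (2, 1) array broadcasts across the three columns of the (2, 3) selection.
df.loc[rows] = col
print(df)
```

This should reproduce the result table above. Working on plain arrays skips pandas' index alignment on the right-hand side, so it relies on rows being in the same order on both sides of the assignment.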