I'm working with a time-series dataset. The columns I'm interested in are the month (mes), the year (ano), the minimum price (min) and the modal price (mcom).
Some months of 2013 don't have any value in the min column, and for those months I want to fill min with the values from the mcom column.
I used the following line to inspect one of the months where there are no min values:
dfc.loc[(dfc['ano'] == 2013) & (dfc['mes'] == 1)]
Basically, I want to fill the 'min' column with the values in the 'mcom' column when
(dfc['ano'] == 2013) & (dfc['mes'] == 1)
How can I do it?
Here is the head of the dataframe:
          data  dia  mes   ano     wday                   prod  und   proc      tipo  min  mcom   max  merc                 date    julian
 6  02/01/2013    2    1  2013   quarta  Peixe de agua salgada   Kg  RS-SC  Sardinha  NaN  5.28  5.28   Est  2013-01-02 12:00:00  1.416667
14  03/01/2013    3    1  2013   quinta  Peixe de agua salgada   Kg  RS-SC  Sardinha  NaN  5.28  5.28   Est  2013-01-03 12:00:00  2.416667
22  04/01/2013    4    1  2013    sexta  Peixe de agua salgada   Kg  RS-SC  Sardinha  NaN  5.28  5.28   Est  2013-01-04 12:00:00  3.416667
30  07/01/2013    7    1  2013  segunda  Peixe de agua salgada   Kg  RS-SC  Sardinha  NaN  5.28  5.28   Est  2013-01-07 12:00:00  6.416667
38  08/01/2013    8    1  2013      nan  Peixe de agua salgada   Kg  RS-SC  Sardinha  NaN  5.28  5.28   Est  2013-01-08 12:00:00  7.416667
The 'tipo' variable only contains 'Sardinha', and the types of the variables are:
dfc.dtypes
data object
dia int64
mes int64
ano int64
wday object
prod object
und object
proc object
tipo object
min float64
mcom float64
max float64
merc object
date object
julian float64
dtype: object
Answer:
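import numpy as np

# Boolean mask selecting the rows for January 2013
m = (dfc['ano'] == 2013) & (dfc['mes'] == 1)
# Overwrite 'min' with 'mcom' on the masked rows only
dfc.loc[m, 'min'] = dfc.loc[m, 'mcom']
# or
dfc['min'] = dfc['min'].mask(m, dfc['mcom'])
# or
dfc['min'] = np.where(m, dfc['mcom'], dfc['min'])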
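The question mentions several months of 2013 without min values, so it may help to detect those months instead of hard-coding mes == 1. The following is a minimal sketch under assumptions: the small dfc below is made-up toy data standing in for the real dataframe, and the names is_2013, all_missing and empty_months are only illustrative. It also fills only the rows where min is actually NaN, which is a slightly different choice from the snippets above, which overwrite every masked row.

```python
import numpy as np
import pandas as pd

# Toy stand-in for dfc (values are made up for illustration)
dfc = pd.DataFrame({
    'ano':  [2013, 2013, 2013, 2013, 2014],
    'mes':  [1, 1, 2, 2, 1],
    'min':  [np.nan, np.nan, 4.00, np.nan, np.nan],
    'mcom': [5.28, 5.30, 4.50, 4.60, 6.00],
})

# Months of 2013 in which every 'min' value is missing
is_2013 = dfc['ano'] == 2013
all_missing = dfc.loc[is_2013, 'min'].isna().groupby(dfc.loc[is_2013, 'mes']).all()
empty_months = all_missing[all_missing].index.tolist()

# Fill only the missing 'min' values in those months; existing values are kept
m = is_2013 & dfc['mes'].isin(empty_months) & dfc['min'].isna()
dfc.loc[m, 'min'] = dfc.loc[m, 'mcom']
```

With this toy data, only the January 2013 rows are filled. February 2013 is left alone because it still has one real min value, and the 2014 row is outside the year filter. To fill every missing min in 2013 regardless of month, dropping the isin(empty_months) term from the mask should be enough.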