I'm working with a prices time-series from 2013 to 2020 which the columns I'm interested about are: 'date': date of when the data was collected 'max': maximum price commercialized that day 'mcom': most common price commercialized that day 'min': minimum price commercialized that day.
In exploratory analysis I realized that in some dates the 'mcom' are higher than 'max' and I discovered that in these specific rows the 'mcom' values should be the 'max' values and vice versa.
I used the loc function as bellow to see the dates where this happen:
dfc.loc[dfc['mcom'] > dfc['max']]
and it returned me:
data dia mes ano wday prod und proc tipo min mcom max merc julian mm
date
2016-10-04 12:00:00 04/10/2016 4 10 2016 nan Peixe de agua salgada Kg nan Sardinha 4.89 8.85 6.00 Est 1372.416667 1.11
2016-10-05 12:00:00 05/10/2016 5 10 2016 quarta Peixe de agua salgada Kg nan Sardinha 4.89 8.85 6.00 Est 1373.416667 1.11
2016-10-06 12:00:00 06/10/2016 6 10 2016 quinta Peixe de agua salgada Kg nan Sardinha 4.89 8.85 6.00 Est 1374.416667 1.11
2016-10-07 12:00:00 07/10/2016 7 10 2016 sexta Peixe de agua salgada Kg nan Sardinha 4.89 8.85 6.00 Est 1375.416667 1.11
2016-10-10 12:00:00 10/10/2016 10 10 2016 segunda Peixe de agua salgada Kg nan Sardinha 4.89 8.85 6.00 Est 1378.416667 1.11
2016-10-11 12:00:00 11/10/2016 11 10 2016 nan Peixe de agua salgada Kg nan Sardinha 4.89 8.85 6.00 Est 1379.416667 1.11
2016-10-13 12:00:00 13/10/2016 13 10 2016 quinta Peixe de agua salgada Kg nan Sardinha 4.89 8.85 6.00 Est 1381.416667 1.11
2016-10-14 12:00:00 14/10/2016 14 10 2016 sexta Peixe de agua salgada Kg nan Sardinha 4.89 8.85 6.00 Est 1382.416667 1.11
2016-10-17 12:00:00 17/10/2016 17 10 2016 segunda Peixe de agua salgada Kg nan Sardinha 4.89 8.85 6.00 Est 1385.416667 1.11
2016-10-18 12:00:00 18/10/2016 18 10 2016 nan Peixe de agua salgada Kg nan Sardinha 4.89 8.85 6.00 Est 1386.416667 1.11
2016-10-19 12:00:00 19/10/2016 19 10 2016 quarta Peixe de agua salgada Kg nan Sardinha 4.89 8.85 6.00 Est 1387.416667 1.11
2017-10-03 12:00:00 03/10/2017 3 10 2017 nan Peixe de agua salgada Kg nan Sardinha 6.80 6.90 6.89 Est 1736.416667 0.09
2017-10-04 12:00:00 04/10/2017 4 10 2017 quarta Peixe de agua salgada Kg nan Sardinha 6.80 6.90 6.89 Est 1737.416667 0.09
2017-10-05 12:00:00 05/10/2017 5 10 2017 quinta Peixe de agua salgada Kg nan Sardinha 6.80 6.90 6.89 Est 1738.416667 0.09
How can I put the 'mcom' values in where the 'max' values are and vice versa?
CodePudding user response:
IIUC, you can try
m = dfc['mcom'] > dfc['max']
dfc.loc[m, 'mcom'], dfc.loc[m, 'max'] = dfc.loc[m, 'max'], dfc.loc[m, 'mcom']
# or
dfc.loc[m, ['max', 'mcom']] = dfc.loc[m, ['mcom', 'max']].values
print(dfc)
min mcom max
0 4.89 6.0 8.85
1 4.89 6.0 8.85
2 4.89 6.0 8.85