Compare rows and replace values in one of the rows in pandas dataframe

Time:06-21

I have the following pandas dataframe:

Year Quarter Year-Month Current Value Future Value Date
2021 Q3 2021-10 0.411 NaN 2021-10-01
2021 Q4 2022-01 -0.100 NaN 2022-01-01
2022 Q1 2022-04 -0.224 NaN 2022-04-01
2022 Q1 2022-03 0.110 0.092 2022-03-01

In the last two rows, the Year and Quarter values are duplicated but the Year-Month values differ. I want to identify such cases. In this example, where Year and Quarter are duplicated, I want to take the Future Value from the last row, move it to the row above, and delete the last row. The result should look like this:

Year Quarter Year-Month Current Value Future Value Date
2021 Q3 2021-10 0.411 NaN 2021-10-01
2021 Q4 2022-01 -0.100 NaN 2022-01-01
2022 Q1 2022-04 -0.224 0.092 2022-04-01
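
For the "identify such cases" step, one possible sketch is to flag every row whose (Year, Quarter) pair occurs more than once. The frame below is rebuilt by hand from the table above; storing Year-Month and Date as plain strings and the name `dup_mask` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (dates kept as plain strings here).
df = pd.DataFrame({
    "Year": [2021, 2021, 2022, 2022],
    "Quarter": ["Q3", "Q4", "Q1", "Q1"],
    "Year-Month": ["2021-10", "2022-01", "2022-04", "2022-03"],
    "Current Value": [0.411, -0.100, -0.224, 0.110],
    "Future Value": [np.nan, np.nan, np.nan, 0.092],
    "Date": ["2021-10-01", "2022-01-01", "2022-04-01", "2022-03-01"],
})

# keep=False marks every member of a duplicated (Year, Quarter) group,
# not only the second and later occurrences.
dup_mask = df.duplicated(subset=["Year", "Quarter"], keep=False)

# Rows that share Year and Quarter with at least one other row.
print(df[dup_mask])
```

With the sample data, only the two 2022 Q1 rows are flagged.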

CodePudding user response:

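Assuming the rows are sorted by decreasing Year-Month within each (Year, Quarter) group, you can use GroupBy.agg with an aggregation dictionary: 'first' for every column by default and 'last' for "Future Value". Note that 'first' and 'last' skip missing values, so 'last' returns the last non-null "Future Value" in each group: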

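# Default: keep the first value of every column within each group
d = {c: 'first' for c in df}
# Exception: for "Future Value", keep the last non-null value
d['Future Value'] = 'last'

# Collapse to one row per (Year, Quarter) pair
out = df.groupby(['Year', 'Quarter'], as_index=False).agg(d)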

output:

   Year Quarter Year-Month  Current Value  Future Value        Date
0  2021      Q3    2021-10          0.411           NaN  2021-10-01
1  2021      Q4    2022-01         -0.100           NaN  2022-01-01
2  2022      Q1    2022-04         -0.224         0.092  2022-04-01
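
A self-contained sketch of the same approach, for anyone who wants to run it end to end. It departs from the answer in two ways, both of which are assumptions rather than part of the original: the frame is sorted explicitly so the "decreasing Year-Month" precondition always holds, and the grouping keys are left out of the aggregation dictionary because `as_index=False` already returns them as columns. The names `keys` and `agg_map` are illustrative.

```python
import numpy as np
import pandas as pd

# Example frame rebuilt from the question (dates kept as plain strings here).
df = pd.DataFrame({
    "Year": [2021, 2021, 2022, 2022],
    "Quarter": ["Q3", "Q4", "Q1", "Q1"],
    "Year-Month": ["2021-10", "2022-01", "2022-04", "2022-03"],
    "Current Value": [0.411, -0.100, -0.224, 0.110],
    "Future Value": [np.nan, np.nan, np.nan, 0.092],
    "Date": ["2021-10-01", "2022-01-01", "2022-04-01", "2022-03-01"],
})

keys = ["Year", "Quarter"]

# Make the ordering assumption explicit: newest Year-Month first.
# Zero-padded "YYYY-MM" strings sort correctly as text.
df = df.sort_values("Year-Month", ascending=False)

# 'first' for every non-key column, 'last' for "Future Value".
# Both skip NaN, so 'last' picks the last non-null value in the group.
agg_map = {c: "first" for c in df.columns if c not in keys}
agg_map["Future Value"] = "last"

# One row per (Year, Quarter); the keys come back as regular columns.
out = df.groupby(keys, as_index=False).agg(agg_map)
print(out)
```

Groups in which every "Future Value" is missing still yield NaN, so the 2021 rows are unchanged.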