I have the following table in Pandas:
index | project | category | period | update | amount
0 | 100130 | labour | 202201 | 202203 | 1000
1 | 100130 | labour | 202202 | 202203 | 1000
2 | 100130 | labour | 202203 | 202203 | 1000
3 | 100130 | labour | 202204 | 202203 | 1000
4 | 100130 | labour | 202205 | 202203 | 1000
My final goal is a table grouped by project and category with the sum of the amount column, but only from the month of the update until now. For the example above, that means summing 202203 through 202205, which gives 3000 for project 100130 and category labour.
As a first step I tried the following condition:
for index, row in table.iterrows():
    if row["period"] < row["update"]:
        row["amount"] = 0
But:
- this iteration is not working
- is there a simple and less time-consuming way to do it? My table has over 60,000 rows, so iterating is probably not a good idea.
CodePudding user response:
table.loc[table["period"] < table["update"], "amount"] = 0
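To get from there to the grouped total the question asks for, one option is to filter the rows and aggregate in a single pass instead of zeroing amounts first. This is only a minimal sketch: it assumes the columns are integers as in the sample data, and that "until now" simply means every period on or after the update month. The names `sample` and `totals` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
sample = pd.DataFrame({
    "project": [100130] * 5,
    "category": ["labour"] * 5,
    "period": [202201, 202202, 202203, 202204, 202205],
    "update": [202203] * 5,
    "amount": [1000] * 5,
})

# Keep only rows whose period is on or after the update month,
# then sum the amount per project and category.
totals = (
    sample[sample["period"] >= sample["update"]]
    .groupby(["project", "category"], as_index=False)["amount"]
    .sum()
)
print(totals)
```

One difference to keep in mind: a project/category pair whose rows all fall before the update month disappears from `totals` here, whereas setting the amount to 0 and then grouping would keep it with a sum of 0.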
CodePudding user response:
I did some more research and this code seems to solve my problem:
def check_update(row):
    if row["period"] < row["update"]:
        return 0
    else:
        return row["amount"]

table["amount2"] = table.apply(check_update, axis=1)
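Since `apply` with `axis=1` still calls the function once per row, a column-wise version may be noticeably faster on 60,000 rows. A possible equivalent, sketched with `Series.where` (which keeps a value where the condition is true and substitutes the second argument elsewhere); the sample frame below is just the data from the question, rebuilt so the snippet runs on its own:

```python
import pandas as pd

# Sample data copied from the question.
table = pd.DataFrame({
    "project": [100130] * 5,
    "category": ["labour"] * 5,
    "period": [202201, 202202, 202203, 202204, 202205],
    "update": [202203] * 5,
    "amount": [1000] * 5,
})

# Keep the amount where period >= update, otherwise use 0.
# This mirrors check_update without a Python-level loop over rows.
table["amount2"] = table["amount"].where(table["period"] >= table["update"], 0)
print(table)
```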