I have a dataframe with account information:
Date | Transaction type | Amount
I've sorted the data into deposit and withdrawals like:
deposit = accountHistory.loc[(accountHistory['Transaction type']=="Cash In")]
withdrawal = accountHistory.loc[(accountHistory['Transaction type']=="Withdrawal")]
So, that returns all of the locations of deposits and withdrawals.
The current account balance seems easy to get:
sum(deposit["Amount"])-sum(withdrawal["Amount"])
Now, I want to graph the account balance over time.
Should I iterate through the dataframe and create a new col called Account balance
or is there some better way to do it using pd
functions I'm not aware of?
For each row, account balance should show the deposit - withdrawal
(to that date).
To add some more information: I have date like:
Date | Transaction type | Amount
X/Y/Z In 50
X/Y/Z Interest 1
X/Y/Z Withdrawal 20
X/Y/Z Investment 30
And I want:
Date | Transaction type | Amount | Balance
X/Y/Z In 50 50
X/Y/Z Fees 1 49
X/Y/Z Withdrawal 20 29
X/Y/Z Investment 10 19
X/Y/Z In 20 39
X/Y/Z In 10 49
X/Y/Z Withdrawal 10 39
Is there any other way to do this than using a for loop, iterating over the entire df?
CodePudding user response:
I would use pivot_table()
and then create the Balance column as the difference between Cash in (as C) and Withdraw (W). Alternatively, you can use assign()
to make it one line:
Let's suppose this is our dataframe:
Date Transaction type Amount
0 2021-01-01 C 100
1 2021-01-01 W 80
2 2021-02-02 C 120
3 2021-02-02 W 40
4 2021-02-03 C 200
5 2021-02-03 W 150
Using the proposed answer:
df = df.pivot_table(index='Date',columns='Transaction type',values='Amount',fillna=0)
df['Balance'] = df['C'] - df['W']
Outputs:
Transaction type C W Balance
Date
2021-01-01 100 80 20
2021-02-02 120 40 80
2021-02-03 200 150 50
If you want the cummulative value then simply add .cumsum()
. So:
df['Balance'] = (df['C'] - df['W']).cumsum()
CodePudding user response:
I noticed that Transaction type == "In" is the only case when the balance increases. All other transaction types decrease the balance.
So the first step is to check the Transaction type. "In" means increase (the value should stay positive (as is)), otherwise it should actually be negative.
Then the only step to do is to compute the cumulative sum.
So the whole code to compute the new column can be:
df['Balance'] = df.Amount.where(df['Transaction type'] == 'In',
-df.Amount).cumsum()