I am trying to format some data from an external source and make some sorting and groupings. Below is the snippet that I am using currently. I am trying to make the sorting and grouping work.
pmtAccount = []
pmtAmount = []
for payment in payments_sent:
print ("{} {:>15}".format(payment['to'], payment['amount']))
pmtAmount.append(payment['amount'])
pmtAccount.append(payment['to'])
#-- Attempt to summarize output
df = pd.DataFrame(list(zip(pmtAccount, pmtAmount)), columns =['To', 'Amount'])
result = df.groupby('To')['Amount'].agg(['count','sum']).reset_index()
result['count'] = result['count'].astype(str) ' Transactions'
print(result)
Current Output:
#-- Output of the line print ("{} {:>15}".format(payment['to'], payment['amount']))
199SJFW1K 100.0000000
ABC123ABC 100.0000000
ABC123ABC 50.0000000
ABC123ABC 300.0000000
123ABC123 200.0000000
123ABC123 50.0000000
123ABC123 100.0000000
ABC123ABC 300.0000000
ABC123ABC 250.0000000
XYXY12XYX 250.0000000
XYXY12XYX 500.0000000
1OPR12DRT 41.0000000
KSJDHW812 0.1900000
#--
To count sum
0 ABC123ABC ... 5 Transactions 100.0000000 50.0000000 300.0000000 300.0000000250...
1 XYXY12XYX ... 2 Transactions 250.0000000500.0000000
2 KSJDHW812 ... 1 Transactions 0.1900000
3 1OPR12DRT ... 1 Transactions 41.0000000
4 123ABC123 ... 3 Transactions 200.000000050.0000000100.0000000
5 199SJFW1K ... 1 Transactions 100.0000000
Needed Output: #-- Sorted and grouped
0 ABC123ABC ... 5 Transactions 1,100.0000000 #-- Sum of (100.0000000, 50.0000000, 300.0000000, 300.0000000, 250.0000000)
1 123ABC123 ... 3 Transactions 350.0000000
2 XYXY12XYX ... 2 Transactions 750.0000000
3 1OPR12DRT ... 1 Transactions 41.0000000
4 199SJFW1K ... 1 Transactions 100.0000000
5 KSJDHW812 ... 1 Transactions 0.1900000
Total Transactions: 13 Total Amount: 2,341.19
CodePudding user response:
You have the right approach, but after you thought the data, you have not converted Amount into Float type.
df['Amount'] = df['Amount'].astype(float)
Now you can separately calculate Total Transactions and Total Amount:
result = df.groupby('To')['Amount'].agg(['count','sum']).reset_index()
total_transactions = result['count'].sum()
total_amount = result['sum'].sum()
Now the full block of code with these changes will look like this:
df = pd.DataFrame(list(zip(pmtAccount, pmtAmount)), columns =['To', 'Amount'])
df['Amount'] = df['Amount'].astype(float)
result = df.groupby('To')['Amount'].agg(['count','sum']).reset_index()
total_transactions = result['count'].sum()
total_amount = result['sum'].sum()
result['count'] = result['count'].astype(str) ' Transactions'
result = result.sort_values(by=['sum', 'count'], ascending=[False, False])
print(result)
print("Total Transactions: {} Total Amount: {}".format(total_transactions, total_amount))
Output for this:
To count sum
3 ABC123ABC 5 Transactions 1000.00
5 XYXY12XYX 2 Transactions 750.00
0 123ABC123 3 Transactions 350.00
1 199SJFW1K 1 Transactions 100.00
2 1OPR12DRT 1 Transactions 41.00
4 KSJDHW812 1 Transactions 0.19
Total Transactions: 13 Total Amount: 2241.19
CodePudding user response:
IIUC: you may need this:
import pandas as pd
df=pd.DataFrame([
['199SJFW1K',100.0000000],
['ABC123ABC',100.0000000],
['ABC123ABC',50.0000000],
['ABC123ABC',300.0000000],
['123ABC123',200.0000000],
['123ABC123',50.0000000],
['123ABC123',100.0000000],
['ABC123ABC',300.0000000],
['ABC123ABC',250.0000000],
['XYXY12XYX',250.0000000],
['XYXY12XYX',500.0000000],
['1OPR12DRT',41.0000000],
['KSJDHW812',0.1900000],
], columns=['to','amount'])
And you want to group by the to
column
df2=df.groupby('to').size()\
.reset_index(name='transaction')\
.sort_values('transaction', ascending=False)
df2
groupby('to')
will group by the to
column
size()
will count each group
reset_index(name='transaction')
will set the count column to transaction
sort_values('transaction', ascending=False)
will sort values by the transaction
column and ascending=False
mean order from max to min
This is the output
to | transaction | |
---|---|---|
3 | ABC123ABC | 5 |
0 | 123ABC123 | 3 |
5 | XYXY12XYX | 2 |
1 | 199SJFW1K | 1 |
2 | 1OPR12DRT | 1 |
4 | KSJDHW812 | 1 |
and the total should use the code:
print(f'Total transaction: {df2.transaction.sum()}')
Then the output will be
Total transaction: 13
Additional, if you want to sum the amount, try this
df3=df.groupby('to')\
.sum()\
.reset_index()\
.rename(columns={'amount':'sum_of_amount'})
df3
This is the output
to | sum_of_amount | |
---|---|---|
0 | 123ABC123 | 350.00 |
1 | 199SJFW1K | 100.00 |
2 | 1OPR12DRT | 41.00 |
3 | ABC123ABC | 1000.00 |
4 | KSJDHW812 | 0.19 |
5 | XYXY12XYX | 750.00 |
And if you want to merge 2 dataframe, df2 and df3 to show to transaction and a sum_of_amount
pd.merge(df2,df3,on='to')
This is the final output
to | transaction | sum_of_amount | |
---|---|---|---|
0 | ABC123ABC | 5 | 1000.00 |
1 | 123ABC123 | 3 | 350.00 |
2 | XYXY12XYX | 2 | 750.00 |
3 | 199SJFW1K | 1 | 100.00 |
4 | 1OPR12DRT | 1 | 41.00 |
5 | KSJDHW812 | 1 | 0.19 |