The Situation:
I grouped my dataframe data and did some aggregation. When I'm printing, I get correct data, but while writing it into CSV file, I am getting only 3 columns coming in CSV which is in aggregation part.
Also I want to add one more column named AVG
which will come from deviation:
AMOUNT/CUST_NUMB
My code:
final=pd.DataFrame(gg.groupby(['Phone','BID', 'P_NAME',
'STATE','VILLAGE', 'TALUK', 'DISTRICT'])
.agg({'AMOUNT':'sum', 'CUST_NUMB': 'count','DATE': 'max'}))
final.to_csv("spend.csv",index=False)
OUTPUT I AM GETTING FROM MY CODE
The Question:
So i want to write whole DF in one CSV with that extra column named :
avg_spend
Please could you help me with this.
CodePudding user response:
Can you try this one:
final.to_csv("spend.csv")
If this is not what you expect please provide more details
CodePudding user response:
Here is a minimal demo reproducing your problem:
import pandas as pd
data = {
'Phone': ['x', 'x', 'x'],
'BID': ['x', 'x', 'x'],
'P_NAME': ['x', 'x', 'x'],
'STATE': ['x', 'x', 'x'],
'VILLAGE': ['x', 'x', 'x'],
'TALUK': ['1', '2', '1'],
'DISTRICT': ['x', 'x', 'x'],
'AMOUNT': [3, 4, 5],
'CUST_NUMB': ['12', '13', '14'],
'DATE': ['20211025', '20211025', '20211025']
}
gg = pd.DataFrame(data)
final = pd.DataFrame(gg.groupby(['Phone', 'BID', 'P_NAME',
'STATE', 'VILLAGE', 'TALUK', 'DISTRICT'])
.agg({'AMOUNT': 'sum', 'CUST_NUMB': 'count', 'DATE': 'max'}))
print(final)
# AMOUNT CUST_NUMB DATE
# Phone BID P_NAME STATE VILLAGE TALUK DISTRICT
# x x x x x 1 x 8 2 20211025
# 2 x 4 1 20211025
Use reset_index()
to convert the groups to columns:
final = final.reset_index()
print(final)
# Phone BID P_NAME STATE VILLAGE TALUK DISTRICT AMOUNT CUST_NUMB DATE
# 0 x x x x x 1 x 8 2 20211025
# 1 x x x x x 2 x 4 1 20211025
Add new column avg_spend
and write in CSV
:
final['avg_spend'] = final['AMOUNT'] / final['CUST_NUMB']
print(final)
# Phone BID P_NAME STATE ... AMOUNT CUST_NUMB DATE avg_spend
# 0 x x x x ... 8 2 20211025 4.0
# 1 x x x x ... 4 1 20211025 4.0
final.to_csv("spend.csv", index=False)