I have a pandas DataFrame like this (it represents an investment portfolio):
import pandas as pd

data = {'category': ['stock', 'bond', 'cash', 'stock', 'cash'],
        'name': ['AA', 'BB', 'CC', 'DD', 'EE'],
        'quantity': [2, 2, 10, 4, 3],
        'price': [10, 15, 4, 2, 4],
        'value': [20, 30, 40, 8, 12]}
df = pd.DataFrame(data)
I would like to generate a report in a text file that looks like this:
Stock: Total: 28
Name quantity price value
AA 2 10 20
DD 4 2 8
Bond: Total: 30
Name quantity price value
BB 2 15 30
Cash: Total: 52
Name quantity price value
CC 10 4 40
EE 3 4 12
I found a way to do this by looping through a list of DataFrames, but it is kind of ugly. I think there should be a way with iterrows or iteritems, but I can’t make it work.
Thank you for your help!
CodePudding user response:
You could try the pandas-profiling library (since renamed ydata-profiling), which creates an HTML report of your dataset with descriptive statistics.
You can install it with:
pip install pandas-profiling
And then use it with:
from pandas_profiling import ProfileReport
profile = ProfileReport(df, title="Pandas Profiling Report")
profile.to_file("your_report.html")
CodePudding user response:
You can loop over the groupby object and write a custom header before each group's data:
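# Each group is appended to out.csv: a header line with the total, then the rows.
for i, g in df.groupby('category', sort=False):
    with open('out.csv', 'a') as f:
        f.write(f'{i}: Total: {g["value"].sum()}\n')
        # pandas >= 1.5 spells this keyword lineterminator (line_terminator before that)
        (g.drop('category', axis=1)
          .to_csv(f, index=False, mode='a', sep='\t', lineterminator='\n'))
        f.write('\n')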
Output:
stock: Total: 28
name quantity price value
AA 2 10 20
DD 4 2 8
bond: Total: 30
name quantity price value
BB 2 15 30
cash: Total: 52
name quantity price value
CC 10 4 40
EE 3 4 12
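If you want the capitalized headers from the question and space-aligned columns for a plain text file, here is a rough variant of the same groupby loop. It is only a sketch: build_report and save_report are hypothetical helper names, 'report.txt' is a placeholder path, and it swaps to_csv for to_string, so the columns are padded with spaces instead of separated by tabs.

```python
import io

import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'category': ['stock', 'bond', 'cash', 'stock', 'cash'],
    'name': ['AA', 'BB', 'CC', 'DD', 'EE'],
    'quantity': [2, 2, 10, 4, 3],
    'price': [10, 15, 4, 2, 4],
    'value': [20, 30, 40, 8, 12],
})


def build_report(frame):
    """Return the grouped report as one string (hypothetical helper)."""
    out = io.StringIO()
    # sort=False keeps categories in order of first appearance.
    for category, group in frame.groupby('category', sort=False):
        # Header line: capitalized category plus the sum of its 'value' column.
        out.write(f"{category.capitalize()}: Total: {group['value'].sum()}\n")
        # to_string pads the columns with spaces, which reads well as plain text.
        out.write(group.drop(columns='category').to_string(index=False))
        out.write('\n\n')
    return out.getvalue()


def save_report(frame, path):
    """Write the report to path, overwriting any existing file (hypothetical helper)."""
    with open(path, 'w') as f:
        f.write(build_report(frame))


report = build_report(df)
print(report)
```

Calling save_report(df, 'report.txt') then writes the same text to disk. The file is opened once in 'w' mode, so running the script again replaces the report instead of appending to it.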