I have a question on pivot_table python pandas.
I have a dataframe like this
Agent Detail Value
report1 General Section YESS
report1 jobID 558
report1 Priority normal
report1 Run As Owner's Credentials
report1 Schedule Section
report1 disabled TRUE
report1 timeZoneId None
report1 startImmediately FALSE
report1 repeatMinuteInterval None
report1 start date None
report1 start time None
report1 Email Recipient [email protected]
report1 Email Recipient [email protected]
report2 General Section YESS
report2 jobID 559
report2 Priority normal
report2 Run As Owner's Credentials
report2 Schedule Section
report2 disabled TRUE
report2 timeZoneId None
report2 startImmediately FALSE
report2 repeatMinuteInterval None
report2 start date None
report2 start time None
report2 Email Recipient [email protected]
report2 Email Recipient [email protected]
i am trying to pivot the dataframe and convert all the detail values as columns . Index being agent field which is a report name . Each report can have multiple recipients . I need to have each row for recipient for each report . sample output as below:
[enter image description here]
my current code is below:
import csv
import pandas as pd
resultsFile = 'C:\\Oracle\\testfile.csv' #input to transpose file
df=pd.read_csv(resultsFile,skip_blank_lines=True)
df2=df.pivot_table(index='Agent',columns='Detail',values='Value',aggfunc='sum')
df2
which is concatenating the email addresses in single field and that is not what I am looking for ? How can I pivot a df with duplicate columns values and transform them to mulitple lines ?
Thanks for your help
CodePudding user response:
You can group your df by agent
and pivot the groups (with original index as index). You'll have to fill NaN values and drop duplicates since you'll get one line per value:
reports = []
for a, sub_df in df.groupby('Agent'):
rep = sub_df.pivot(None, 'Detail', 'Value').ffill().bfill().drop_duplicates()
rep.insert(0, 'Agent', a)
reports.append(rep)
result = pd.concat(reports).reset_index()
print(result)
Output:
Detail Agent Email Recipient General Section Priority Run As ... repeatMinuteInterval start date start time startImmediately timeZoneId
0 report1 [email protected] YESS normal Owner's Credentials ... None None None FALSE None
1 report1 [email protected] YESS normal Owner's Credentials ... None None None FALSE None
2 report2 [email protected] YESS normal Owner's Credentials ... None None None FALSE None
3 report2 [email protected] YESS normal Owner's Credentials ... None None None FALSE None