I need help on how to properly transform my df from this:
df_installation = pd.DataFrame({'InstallationID': ["Item 1", "Item 2", "Item 3","Item 1", "Item 2", "Item 3"],
'Type': ["Metric", "Metric","Metric", "Imperial","Imperial","Imperial"],
'Measure 1': [11998, 11076,12025,129145,119221,129],
'Measure 2': [12000,12001,12002,129168,129178.764,129189.528],
'Measure Type': ["Morning","Afternoon","Evening","Morning","Afternoon","Evening"],
})
to this: --> (i attached an image sample too because the sample final df, I'm not sure how to create a json sample so i made it look like text so it will be accepted in a column)
df_installation_new = pd.DataFrame({'InstallationID': ['ID no1', "ID no2", "ID no3"],
'Metric': ["""{"text 1": {"label": "Measure Time","value": "Morning"},"text 2": {"label": "Measure 1","value": 11998},"text 3": {"label": "Measure 2","value": 12000}}""",
"""{"text 1": {"label": "Measure Time","value": "Afternoon"},"text 2": {"label": "Measure 1","value": 11076},"text 3": {"label": "Measure 2","value": 12001}}""",
"""{"text 1": {"label": "Measure Time","value": "Evening"},"text 2": {"label": "Measure 1","value": 12025},"text 3": {"label": "Measure 2","value": 12002}}"""
],
'Imperial': ["""{"text 1": {"label": "Measure Time","value": "Morning"},"text 2": {"label": "Measure 1","value": 129145},"text 3": {"label": "Measure 2","value": 129168}}""",
"""{"text 1": {"label": "Measure Time","value": "Afternoon"},"text 2": {"label": "Measure 1","value": 119221},"text 3": {"label": "Measure 2","value": 129178.764}}""",
"""{"text 1": {"label": "Measure Time","value": "Evening"},"text 2": {"label": "Measure 1","value": 129},"text 3": {"label": "Measure 2","value": 129189.528}}"""
],
})
CodePudding user response:
Given your input dataframe:
df = pd.DataFrame({'InstallationID': ["Item 1", "Item 2", "Item 3","Item 1", "Item 2", "Item 3"],
'Type': ["Metric", "Metric","Metric", "Imperial","Imperial","Imperial"],
'Measure 1': [11998, 11076,12025,129145,119221,129],
'Measure 2': [12000,12001,12002,129168,129178.764,129189.528],
'Measure Type': ["Morning","Afternoon","Evening","Morning","Afternoon","Evening"],
})
You could do the following:
1- Create a function to take the measure type, measure 1, and measure 2 to convert them into the needed format:
def to_my_format(mt, m1, m2):
return(f"""{{"text 1": {{"label": "Measure Time","value": "{mt}"}},"text 2": {{"label": "Measure 1","value": {m1}}},"text 3": {{"label": "Measure 2","value": {m2}}}}}""")
2- Apply this function to all rows of your dataframe:
df['json'] = df.apply(lambda x: to_my_format(x['Measure Type'], x['Measure 1'], x['Measure 2']), axis=1)
3- Now, pivot your dataframe on the first occurrence with ID as the index and Type as the columns:
df.pivot_table(index="InstallationID", columns="Type", values="json", aggfunc="first")
Output:
Type Imperial Metric
InstallationID
Item 1 {"text 1": {"label": "Measure Time","value": "... {"text 1": {"label": "Measure Time","value": "...
Item 2 {"text 1": {"label": "Measure Time","value": "... {"text 1": {"label": "Measure Time","value": "...
Item 3 {"text 1": {"label": "Measure Time","value": "... {"text 1": {"label": "Measure Time","value": "...
Please, for the future, provide some attempt on solving your problem.