I have a pandas dataframe:
id val label
"a1" "ab" "first"
"a1" "aa" "second"
"a1" "ca" "third"
"b1" "cc" "first"
"b1" "kf" "second"
"b1" "ff" "third"
"c1" "wer" "first"
"c1" "iid" "second"
"c1" "ff" "third"
I want to transform it into dictionary wwhere key will be values from columns "id" and values will be dictionaries with keys "label" and values from column "val". so the output must be:
{"a1": {"first": {"ab"}, "second": {"aa"}, "third": {"ca"}},
"b1": {"first": {"cc"}, "second": {"kf"}, "third": {"ff"}},
"c1": {"first": {"wer"}, "second": {"iid"}, "third": {"ff"}},
}
how could I do that?
CodePudding user response:
You can groupby
on id
with a lambda function to convert the label
/val
pairs into a dict, then to_dict
on the result to get your desired output:
df.groupby('id').apply(lambda x:dict(zip(x['label'], x['val']))).to_dict()
Output for your sample data:
{
'a1': {'first': 'ab', 'second': 'aa', 'third': 'ca'},
'b1': {'first': 'cc', 'second': 'kf', 'third': 'ff'},
'c1': {'first': 'wer', 'second': 'iid', 'third': 'ff'}
}
CodePudding user response:
First I recreated your dataframe like
import pandas as pd
import io
text = """
id val label
"a1" "ab" "first"
"a1" "aa" "second"
"a1" "ca" "third"
"b1" "cc" "first"
"b1" "kf" "second"
"b1" "ff" "third"
"c1" "wer" "first"
"c1" "iid" "second"
"c1" "ff" "third"
"""
df = pd.read_csv(io.StringIO(text), sep="\s ")
Then I initialized an empty dict
reordered_dict = {}
then I calculated unique values in df.id
and iterated over them to populate values:
for unique_id in df.id.unique():
all_matched_items = df.where(df["id"] == unique_id).dropna(axis=0, how="all") # only contains rows where id matches
reordered_dict[unique_id] = all_matched_items.set_index("label")["val"].to_dict() # change index to label and export as dict
print(reordered_dict)