How to transform pandas dataframe into specific dictionary?-CodePudding

I have a pandas dataframe:

 id      val       label
"a1"    "ab"      "first"
"a1"    "aa"      "second"
"a1"    "ca"      "third"
"b1"    "cc"      "first"
"b1"    "kf"      "second"
"b1"    "ff"      "third"
"c1"    "wer"     "first"
"c1"    "iid"     "second"
"c1"    "ff"      "third"

I want to transform it into dictionary wwhere key will be values from columns "id" and values will be dictionaries with keys "label" and values from column "val". so the output must be:

{"a1": {"first": {"ab"}, "second": {"aa"}, "third": {"ca"}},
 "b1": {"first": {"cc"}, "second": {"kf"}, "third": {"ff"}},
 "c1": {"first": {"wer"}, "second": {"iid"}, "third": {"ff"}},
}

how could I do that?

CodePudding user response：

You can groupby on id with a lambda function to convert the label/val pairs into a dict, then to_dict on the result to get your desired output:

df.groupby('id').apply(lambda x:dict(zip(x['label'], x['val']))).to_dict()

Output for your sample data:

{
 'a1': {'first': 'ab', 'second': 'aa', 'third': 'ca'},
 'b1': {'first': 'cc', 'second': 'kf', 'third': 'ff'},
 'c1': {'first': 'wer', 'second': 'iid', 'third': 'ff'}
}

CodePudding user response：

First I recreated your dataframe like

import pandas as pd
import io


text = """
 id      val       label
"a1"    "ab"      "first"
"a1"    "aa"      "second"
"a1"    "ca"      "third"
"b1"    "cc"      "first"
"b1"    "kf"      "second"
"b1"    "ff"      "third"
"c1"    "wer"     "first"
"c1"    "iid"     "second"
"c1"    "ff"      "third"
"""

df = pd.read_csv(io.StringIO(text), sep="\s ")

Then I initialized an empty dict

reordered_dict = {}

then I calculated unique values in df.id and iterated over them to populate values:

for unique_id in df.id.unique():
    all_matched_items = df.where(df["id"] == unique_id).dropna(axis=0, how="all") # only contains rows where id matches
    reordered_dict[unique_id] = all_matched_items.set_index("label")["val"].to_dict() # change index to label and export as dict

print(reordered_dict)