How to create a dictionary with Id as key and a list of column values from a DataFrame

How can I create a Python dictionary from the data below (Df1)?

Id	mail-id
1	xyz@gm
1	ygzbb
2.	Ghh.
2.	Hjkk.

I want the result to be:

{1: ["xyz@gm", "ygzbb"], 2: ["Ghh", "Hjkk"]}

CodePudding user response：

One option is to groupby the Id column and turn the mail-id into a list in a dictionary comprehension:

{k:v["mail-id"].values.tolist() for k,v in df.groupby("Id")}
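For reference, here is a minimal self-contained sketch of this approach. The DataFrame construction is an assumption about how the question's table is loaded (trailing dots dropped, column names taken from the question).

```python
import pandas as pd

# Assumed reconstruction of the question's table
df = pd.DataFrame({
    "Id": [1, 1, 2, 2],
    "mail-id": ["xyz@gm", "ygzbb", "Ghh", "Hjkk"],
})

# Iterating over a groupby yields (key, sub-DataFrame) pairs,
# so each sub-frame's "mail-id" column becomes one list
result = {k: v["mail-id"].tolist() for k, v in df.groupby("Id")}
print(result)
```

Calling .tolist() directly on the column should give the same lists as .values.tolist().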

CodePudding user response：

Something like this?

data = [
    [1, "xyz@gm"],
    [1, "ygzbb"],
    [2, "Ghh"],
    [2, "Hjkk"],
]

dataDict = {}

for k, v in data:
    if k not in dataDict:
        dataDict[k] = []
    dataDict[k].append(v)

print(dataDict)
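A variation on the loop above, sketched with collections.defaultdict from the standard library so the explicit "if k not in" check is not needed. The names grouped and data_dict are only illustrative.

```python
from collections import defaultdict

data = [
    [1, "xyz@gm"],
    [1, "ygzbb"],
    [2, "Ghh"],
    [2, "Hjkk"],
]

# defaultdict(list) creates an empty list the first time a key is seen
grouped = defaultdict(list)
for key, value in data:
    grouped[key].append(value)

data_dict = dict(grouped)  # convert back to a plain dict
print(data_dict)  # {1: ['xyz@gm', 'ygzbb'], 2: ['Ghh', 'Hjkk']}
```

If the data is already in a DataFrame, the same loop could iterate over zip(df["Id"], df["mail-id"]) instead of a list of pairs.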

CodePudding user response：

One option is to iterate over the set of unique Ids and filter the matching rows for each one:

>>> import pandas as pd
>>> _d = {}
>>> df = pd.DataFrame({"Id":[1,1,2,2],"mail-id":["xyz@gm","ygzbb","Ghh","Hjkk"]})
>>> for x in set(df["Id"]):
...     # column names are case-sensitive: "Id" and "mail-id"
...     _d.update({x: df[df["Id"]==x]["mail-id"].tolist()})

However, this scans the whole DataFrame once per Id, so it is usually faster to combine a dictionary comprehension with the built-in pandas DataFrame.groupby. A quick look at the official documentation:

A groupby operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.

DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=NoDefault.no_default, observed=False, dropna=True)

As @fsimonjetz pointed out, this code is sufficient:

>>> df = pd.DataFrame({"Id":[1,1,2,2],"mail-id":["xyz@gm","ygzbb","Ghh","Hjkk"]})
>>> {k:v["mail-id"].values.tolist() for k,v in df.groupby("Id")}
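To check the speed difference on your own data, a rough timing sketch could look like the following. The function names and the synthetic data are hypothetical, and the timings will vary with the number of rows and distinct Ids, so treat it as a way to measure rather than as a benchmark result.

```python
import timeit

import pandas as pd

# Synthetic data (an assumption): 2,000 rows spread over 200 ids
df = pd.DataFrame({
    "Id": [i % 200 for i in range(2000)],
    "mail-id": [f"user{i}" for i in range(2000)],
})


def with_boolean_masks(frame):
    # One boolean scan of the whole frame per distinct Id
    return {x: frame[frame["Id"] == x]["mail-id"].tolist() for x in set(frame["Id"])}


def with_groupby(frame):
    # A single grouping pass over the frame
    return {k: v["mail-id"].tolist() for k, v in frame.groupby("Id")}


# Both approaches should build the same dictionary
assert with_boolean_masks(df) == with_groupby(df)

print("masks:  ", timeit.timeit(lambda: with_boolean_masks(df), number=5))
print("groupby:", timeit.timeit(lambda: with_groupby(df), number=5))
```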

CodePudding user response：

You can do:

df.groupby('Id').agg(list).to_dict()['mail-id']

Output:

{1: ['xyz@gm', 'ygzbb'], 2: ['Ghh.', 'Hjkk.']}
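A possible variation, sketched under the assumption that the DataFrame may contain other columns besides "mail-id": selecting the column before aggregating means only that column is turned into lists, and to_dict() then returns the mapping directly. The DataFrame below is an assumed reconstruction that keeps the trailing dots shown in the output above.

```python
import pandas as pd

# Assumed reconstruction of the question's table, dots kept as in the output above
df = pd.DataFrame({
    "Id": [1, 1, 2, 2],
    "mail-id": ["xyz@gm", "ygzbb", "Ghh.", "Hjkk."],
})

# Select the column first so only "mail-id" is aggregated
result = df.groupby("Id")["mail-id"].agg(list).to_dict()
print(result)
```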