How can I create a Python dictionary from the data below? Df1:
Id | mail-id |
---|---|
1 | xyz@gm |
1 | ygzbb |
2. | Ghh. |
2. | Hjkk. |
I want it as
{1:[xyz@gm,ygzbb], 2:[Ghh,Hjkk]}
CodePudding user response:
One option is to groupby the Id column and turn the mail-id values into a list in a dictionary comprehension:
{k:v["mail-id"].values.tolist() for k,v in df.groupby("Id")}
CodePudding user response:
Something like this?
data = [
[1, "xyz@gm"],
[1, "ygzbb"],
[2, "Ghh"],
[2, "Hjkk"],
]
dataDict = {}
for k, v in data:
    if k not in dataDict:
        dataDict[k] = []
    dataDict[k].append(v)
print(dataDict)
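The same loop can be written a little more compactly with collections.defaultdict from the standard library. This is only a sketch of an alternative, not part of the answer above, and the name grouped is illustrative:

```python
from collections import defaultdict

data = [
    [1, "xyz@gm"],
    [1, "ygzbb"],
    [2, "Ghh"],
    [2, "Hjkk"],
]

# Missing keys start out as an empty list, so no membership check is needed.
grouped = defaultdict(list)
for key, value in data:
    grouped[key].append(value)

# Convert back to a plain dict for display.
print(dict(grouped))
```

The trade-off is that a defaultdict silently creates an entry on any lookup of a missing key, so converting to a plain dict once the grouping is done can be the safer choice.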
CodePudding user response:
One option is to iterate over the set of unique ids and filter the rows for each one:
>>> import pandas as pd
>>> _d = {}
>>> df = pd.DataFrame({"Id":[1,1,2,2],"mail-id":["xyz@gm","ygzbb","Ghh","Hjkk"]})
>>> for x in set(df["Id"]):
...     _d[x] = df.loc[df["Id"] == x, "mail-id"].tolist()
But it is usually faster, and more idiomatic, to use a dictionary comprehension with the built-in pandas DataFrame.groupby. A quick look at the official documentation:
A groupby operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.
DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=NoDefault.no_default, observed=False, dropna=True)
As @fsimonjetz pointed out, this code is sufficient:
>>> df = pd.DataFrame({"Id":[1,1,2,2],"mail-id":["xyz@gm","ygzbb","Ghh","Hjkk"]})
>>> {k:v["mail-id"].values.tolist() for k,v in df.groupby("Id")}
CodePudding user response:
You can do:
df.groupby('Id').agg(list).to_dict()['mail-id']
Output:
{1: ['xyz@gm', 'ygzbb'], 2: ['Ghh.', 'Hjkk.']}
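If the frame has more columns than Id and mail-id, selecting the column before aggregating avoids building lists for columns that are then thrown away. A minimal sketch, assuming pandas is installed and using sample data modelled on the question (the variable name result is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Id": [1, 1, 2, 2],
    "mail-id": ["xyz@gm", "ygzbb", "Ghh", "Hjkk"],
})

# Pick the single column first, then collect each group's values into a list.
result = df.groupby("Id")["mail-id"].agg(list).to_dict()
print(result)
```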