Convert dict to a dataframe with keys repeating for each value?

Time:02-22

Given a dict:

{1: [1,2,3,4,5], 2: [55,22,112]}

I want to build a dataframe:

key_id ids
1      1
1      2 
1      3
1      4
1      5
2      55
2      22
2      112

How can I do this? I tried pd.DataFrame.from_dict(), but it does not seem to be the right approach. I also tried iterating over the dict and building a list that repeats each key once per element of its value list. Is there an efficient way to do this?

CodePudding user response:

I think a simple list comprehension would suffice here:

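import pandas as pd

d = {1: [1, 2, 3, 4, 5], 2: [55, 22, 112]}
# One (key, value) row per element of each list
pd.DataFrame(
    [(k, i) for k, v in d.items() for i in v],
    columns=['key_id', 'ids']
)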

   key_id  ids
0       1    1
1       1    2
2       1    3
3       1    4
4       1    5
5       2   55
6       2   22
7       2  112
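
The question also mentions repeating each key to match the length of its list. A minimal sketch of that idea, assuming NumPy is available (np.repeat and itertools.chain are my choice here, not something the answer above uses):

```python
from itertools import chain

import numpy as np
import pandas as pd

d = {1: [1, 2, 3, 4, 5], 2: [55, 22, 112]}

# Number of elements in each value list, in dict order
lengths = [len(v) for v in d.values()]

df = pd.DataFrame({
    # Repeat each key once per element of its list
    'key_id': np.repeat(list(d.keys()), lengths),
    # Flatten all value lists into one sequence
    'ids': list(chain.from_iterable(d.values())),
})
```

This builds both columns up front instead of creating one tuple per row, which may help on large inputs, though that is worth timing on your own data.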

CodePudding user response:

You could use a Series and explode:

import pandas as pd

d = {1: [1, 2, 3, 4, 5], 2: [55, 22, 112]}
df = (
    pd.Series(d, name='ids')      # index = dict keys, values = lists
      .explode()                  # one row per list element
      .rename_axis('key_ids')
      .reset_index()
)

Output:

   key_ids  ids
0        1    1
1        1    2
2        1    3
3        1    4
4        1    5
5        2   55
6        2   22
7        2  112
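
One thing to watch with explode: the exploded column comes back with object dtype, even when every element is an integer. A small sketch of the same chain with a cast added, assuming all list elements are integers:

```python
import pandas as pd

d = {1: [1, 2, 3, 4, 5], 2: [55, 22, 112]}

df = (
    pd.Series(d, name='ids')
      .explode()            # result has object dtype
      .astype('int64')      # cast back; assumes every element is an int
      .rename_axis('key_ids')
      .reset_index()
)
```

Without the cast, numeric operations on ids may be slower or behave unexpectedly because the values are stored as Python objects.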

CodePudding user response:

Here's one solution I just thought of (probably not the best):

df = (pd.json_normalize(d).T
        .reset_index()
        .explode(0)
        .reset_index(drop=True)
        .set_axis(['key_id', 'ids'], axis=1)
     )

Output:

>>> df
   key_id  ids
0       1    1
1       1    2
2       1    3
3       1    4
4       1    5
5       2   55
6       2   22
7       2  112

CodePudding user response:

Another way is to use a list comprehension with assign:

df = pd.concat(
    [pd.DataFrame({'ids': v}).assign(key_ids=k) for k, v in d.items()]
)[['key_ids', 'ids']]

print(df)

   key_ids  ids
0        1    1
1        1    2
2        1    3
3        1    4
4        1    5
0        2   55
1        2   22
2        2  112
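
Note that the printed index restarts at 0 for each key, because pd.concat keeps each sub-frame's own index. If a continuous 0..7 index is wanted, passing ignore_index=True should do it. A sketch of that variant:

```python
import pandas as pd

d = {1: [1, 2, 3, 4, 5], 2: [55, 22, 112]}

df = pd.concat(
    [pd.DataFrame({'ids': v}).assign(key_ids=k) for k, v in d.items()],
    ignore_index=True,  # discard per-frame indexes and renumber from 0
)[['key_ids', 'ids']]
```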
