Given a dict:
{1: [1,2,3,4,5], 2: [55,22,112]}
I want to build a dataframe:
key_id  ids
1       1
1       2
1       3
1       4
1       5
2       55
2       22
2       112
How can I do this?
I tried pd.DataFrame.from_dict(), but it does not seem to be the right approach.
I also tried looping over the dict and building a list that repeats each key as many times as its value list is long.
Is there an efficient way to do this?
CodePudding user response:
I think a simple list comprehension would suffice here:
pd.DataFrame(
    [(k, i) for k, v in d.items() for i in v],
    columns=['key_id', 'ids']
)
   key_id  ids
0       1    1
1       1    2
2       1    3
3       1    4
4       1    5
5       2   55
6       2   22
7       2  112
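If the lists are long, the idea from the question (repeat each key as many times as its list is long) can also be written without building one tuple per row. This is only a sketch: np.repeat and itertools.chain are my own substitution and not part of the answer above, and whether it is actually faster on your data is something you would need to time.

```python
from itertools import chain

import numpy as np
import pandas as pd

d = {1: [1, 2, 3, 4, 5], 2: [55, 22, 112]}

# Number of elements in each value list, in dict order.
lengths = [len(v) for v in d.values()]

df = pd.DataFrame({
    # Repeat every key once per element of its list.
    'key_id': np.repeat(list(d.keys()), lengths),
    # Flatten all value lists into one sequence, in the same order.
    'ids': list(chain.from_iterable(d.values())),
})
print(df)
```

Both columns are built in dict order, so the keys and values stay aligned row by row.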
CodePudding user response:
You could use a Series and explode:
d = {1: [1,2,3,4,5], 2: [55,22,112]}
df = (
    pd.Series(d, name='ids')
      .explode()
      .rename_axis('key_ids').reset_index()
)
Output:
   key_ids  ids
0        1    1
1        1    2
2        1    3
3        1    4
4        1    5
5        2   55
6        2   22
7        2  112
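One thing to keep in mind with this approach: a Series of lists has object dtype, and explode() keeps that dtype, so the 'ids' column is not an integer column even though it prints like one. A minimal sketch of casting it back, assuming every value really is an integer (the astype step is my addition to the answer's chain):

```python
import pandas as pd

d = {1: [1, 2, 3, 4, 5], 2: [55, 22, 112]}

df = (
    pd.Series(d, name='ids')
      .explode()                  # one row per list element, dtype stays object
      .rename_axis('key_ids')
      .reset_index()
      .astype({'ids': 'int64'})   # cast the exploded values back to integers
)
print(df.dtypes)
```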
CodePudding user response:
Here's one solution I just thought of (probably not the best):
df = (pd.json_normalize(d).T
      .reset_index()
      .explode(0)
      .reset_index(drop=True)
      .set_axis(['key_id', 'ids'], axis=1)
)
Output:
>>> df
   key_id  ids
0       1    1
1       1    2
2       1    3
3       1    4
4       1    5
5       2   55
6       2   22
7       2  112
CodePudding user response:
Another way is to use a list comprehension with assign, then concatenate the pieces:
df = pd.concat(
    [pd.DataFrame({'ids': v}).assign(key_ids=k) for k, v in d.items()]
)[['key_ids', 'ids']]
print(df)
   key_ids  ids
0        1    1
1        1    2
2        1    3
3        1    4
4        1    5
0        2   55
1        2   22
2        2  112
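Note that the index in this output restarts at 0 for each key, because pd.concat keeps each piece's own index. If you want a continuous 0 to 7 index like the other answers, a small variation should do it. This is a sketch: the ignore_index=True argument and the intermediate frames name are my additions.

```python
import pandas as pd

d = {1: [1, 2, 3, 4, 5], 2: [55, 22, 112]}

# One small frame per key, each tagged with its key.
frames = [pd.DataFrame({'ids': v}).assign(key_ids=k) for k, v in d.items()]

# ignore_index=True discards the per-frame indexes and renumbers from 0.
df = pd.concat(frames, ignore_index=True)[['key_ids', 'ids']]
print(df)
```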