Dictionary from columns' elements and indexes

Time:04-06

I have a table like this:

Column A Column B
a [1, 2, 3]
b [4, 1, 2]

And I want to create a dictionary like this using NumPy:

{1: [a, b],
 2: [a, b],
 3: [a],
 4: [b]}

Is there a more or less simple way to do this?

CodePudding user response:

Let us try with explode:

d = df.explode('col2').groupby('col2')['col1'].agg(list).to_dict()
Out[206]: {1: ['a', 'b'], 2: ['a', 'b'], 3: ['a'], 4: ['b']}
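
For a self-contained version of the one-liner above, here is a minimal sketch. It assumes the columns are named col1 and col2 (the question calls them Column A and Column B) and rebuilds the sample table by hand; the intermediate name exploded is only for illustration.

```python
import pandas as pd

# Sample data from the question; the column names col1/col2 are assumed.
df = pd.DataFrame({"col1": ["a", "b"], "col2": [[1, 2, 3], [4, 1, 2]]})

# explode() gives each list element its own row, repeating col1 alongside it.
exploded = df.explode("col2")

# Group by the element and collect the col1 labels of the rows that held it.
d = exploded.groupby("col2")["col1"].agg(list).to_dict()
print(d)
```

Splitting the chain into two steps makes it easier to inspect the exploded frame before grouping.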

CodePudding user response:

As far as I know, NumPy doesn't provide a dictionary type; it works with arrays (NumPy arrays), as you can see here.

But there are many ways to create a dict from a pandas DataFrame. Looping over a DataFrame directly is not good practice, as you can see in this answer, so we can convert it with DataFrame.to_numpy and loop over the resulting array instead:

import pandas as pd
import numpy as np

d = {'col1': ['a', 'b'], 'col2': [[1,2,3], [4,1,2]]}
df = pd.DataFrame(data=d)

my_dict = {}
np_array = df.to_numpy()
for row in np_array:
    # row[0] is the col1 label, row[1] is the col2 list
    my_dict[row[0]] = row[1]

Output:

>my_dict: {'a': [1, 2, 3], 'b': [4, 1, 2]}

This is different from the output you wanted, but I couldn't see the pattern in it. Could you clarify?

UPDATED

To achieve the output you want, one possible way is to iterate over each row then over the values in the list, like this:

my_dict = {}  # start from an empty dict, not the one built earlier
for row in np_array:
    for item in row[1]:
        if item in my_dict:
            my_dict[item].append(row[0])
        else:
            my_dict[item] = [row[0]]
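
The same inversion can also be sketched without pandas or NumPy, using collections.defaultdict from the standard library so the if/else branch is not needed. The names rows, inverted and result are only illustrative, and the data is the sample from the question written as plain (label, values) pairs.

```python
from collections import defaultdict

# Sample data from the question, as (label, values) pairs.
rows = [("a", [1, 2, 3]), ("b", [4, 1, 2])]

# A missing key starts out as an empty list, so we can append directly.
inverted = defaultdict(list)
for label, values in rows:
    for item in values:
        inverted[item].append(label)

# Convert back to a plain dict, sorted by key for readable output.
result = dict(sorted(inverted.items()))
print(result)
```

With a DataFrame, the pairs could be produced with zip(df['col1'], df['col2']) instead of being typed by hand.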