Convert dictionary of sets to pandas dataframe

Time:02-24

I have a dictionary of sets in Python, something like:

{'   ': {'---', '--0', '-00', '0--', '00-', '000'}, '0  ': {' --', ' 0-', '---', '--0', '-00', '0--', '00-', '000'}}

and I want to convert this into a pandas DataFrame with two columns: the first holding the keys of the dictionary and the second holding the set of strings. When I try to do this with DataFrame.from_dict, pandas creates as many columns as the maximum number of strings in a set.

CodePudding user response:

You can use explode to get one row per string; d is your dict here:

import pandas as pd
d = {'   ': {'---', '--0', '-00', '0--', '00-', '000'}, '0  ': {' --', ' 0-', '---', '--0', '-00', '0--', '00-', '000'}}
out = pd.Series(d).explode().reset_index(name='value')
out
Out[306]: 
   index value
0          ---
1          00-
2          -00
3          000
4          --0
5          0--
6    0     ---
7    0      0-
8    0      --
9    0     00-
10   0     -00
11   0     000
12   0     --0
13   0     0--

Or, to keep each set whole in a single cell, just:

pd.Series(d).reset_index(name='value')
Out[310]: 
  index                                     value
0                  {---, 00-, -00, 000, --0, 0--}
1   0    {---,  0-,  --, 00-, -00, 000, --0, 0--}
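Because sets are unordered, the row order in the exploded output above can differ between runs. A minimal sketch that sorts each set into a list first, so the rows come out in a stable order (the names sorted_lists and long_df and the column name key are illustrative, not from the question):

```python
import pandas as pd

d = {
    "   ": {"---", "--0", "-00", "0--", "00-", "000"},
    "0  ": {" --", " 0-", "---", "--0", "-00", "0--", "00-", "000"},
}

# Turn every set into a sorted list so explode() yields rows in a stable order.
sorted_lists = {key: sorted(values) for key, values in d.items()}

# One row per (key, string) pair; the old index becomes a regular column.
long_df = (
    pd.Series(sorted_lists)
    .explode()
    .reset_index(name="value")
    .rename(columns={"index": "key"})
)
print(long_df)
```

This keeps the same long layout as the explode answer; only the ordering step is added.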

CodePudding user response:

I think you should wrap each value in the dict in a list.

import pandas as pd
test_dict = {
    "   ": {"---", "--0", "-00", "0--", "00-", "000"},
    "0  ": {" --", " 0-", "---", "--0", "-00", "0--", "00-", "000"},
}
for key, value in test_dict.items():
    test_dict[key] = [value]
print(test_dict)

Then your dict changes to this:

{
    "   ": [{"00-", "-00", "---", "0--", "--0", "000"}],
    "0  ": [{"00-", "-00", "---", " --", "0--", " 0-", "--0", "000"}],
}

Finally, use from_dict:

test_df = pd.DataFrame.from_dict(test_dict, orient="index").reset_index()
print(test_df)

This is the result:

  index                                         0
0                  {00-, -00, ---, 0--, --0, 000}
1   0    {00-, -00, ---,  --, 0--,  0-, --0, 000}
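The loop above overwrites test_dict in place. As a possible variation, a dict comprehension builds a separate wrapped dict and leaves the original untouched, and the columns argument of from_dict (usable with orient="index") names the set column. The names wrapped and value are assumptions made for this sketch:

```python
import pandas as pd

test_dict = {
    "   ": {"---", "--0", "-00", "0--", "00-", "000"},
    "0  ": {" --", " 0-", "---", "--0", "-00", "0--", "00-", "000"},
}

# Wrap each set in a one-element list without modifying test_dict itself.
wrapped = {key: [value] for key, value in test_dict.items()}

# Each key becomes a row; the single list element becomes the "value" column.
test_df = pd.DataFrame.from_dict(
    wrapped, orient="index", columns=["value"]
).reset_index()
print(test_df)
```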

CodePudding user response:

If you want the set to remain whole, you can try:

import pandas as pd

data = {'   ': {'---', '--0', '-00', '0--', '00-', '000'}, '0  ': {' --', ' 0-', '---', '--0', '-00', '0--', '00-', '000'}}

pd.DataFrame([data.keys(), data.values()]).T

     0                                         1
0                 {--0, -00, ---, 00-, 0--, 000}
1  0    {--0, -00,  --, ---, 00-, 0--, 000,  0-}
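The result above has the default column labels 0 and 1. A sketch of a closely related approach that builds the frame from the (key, set) pairs directly and names the columns in the same step; the column names key and value and the variable pairs_df are chosen here for illustration:

```python
import pandas as pd

data = {
    "   ": {"---", "--0", "-00", "0--", "00-", "000"},
    "0  ": {" --", " 0-", "---", "--0", "-00", "0--", "00-", "000"},
}

# Each (key, set) tuple becomes one row, so no transpose is needed.
pairs_df = pd.DataFrame(list(data.items()), columns=["key", "value"])
print(pairs_df)
```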

CodePudding user response:

I would imagine that manually mapping your keys and values to named columns would suit your problem:

import pandas as pd

a = {'   ': {'---', '--0', '-00', '0--', '00-', '000'}, '0  ': {' --', ' 0-', '---', '--0', '-00', '0--', '00-', '000'}}

pd.DataFrame({'key': a.keys(), 'value': a.values()})
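If one row per string is needed later, a frame built this way can still be expanded with DataFrame.explode. A minimal sketch, assuming a pandas version that supports the ignore_index argument of explode; it wraps the dict views in list() as a precaution, and wide_df and long_df are illustrative names:

```python
import pandas as pd

a = {
    "   ": {"---", "--0", "-00", "0--", "00-", "000"},
    "0  ": {" --", " 0-", "---", "--0", "-00", "0--", "00-", "000"},
}

# One row per key, with the whole set stored in a single cell.
wide_df = pd.DataFrame({"key": list(a.keys()), "value": list(a.values())})

# Expand each set into one row per string and renumber the rows from 0.
long_df = wide_df.explode("value", ignore_index=True)
print(long_df)
```

The order of the strings within each key follows the set's iteration order, so it is not guaranteed to be the same on every run.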