I have a dictionary of sets in Python, something like:
{' ': {'---', '--0', '-00', '0--', '00-', '000'}, '0 ': {' --', ' 0-', '---', '--0', '-00', '0--', '00-', '000'}}
and I want to convert this into a pandas DataFrame with two columns: the first holding the dictionary keys and the second holding the corresponding set of strings. When I try to do this with DataFrame.from_dict, pandas creates as many columns as there are strings in the largest set.
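For illustration, the behaviour described above can be reproduced with a minimal sketch like the one below. It assumes the call used orient='index' (the question does not say which orient was used), and the names d and wide are only placeholders.

```python
import pandas as pd

d = {
    ' ': {'---', '--0', '-00', '0--', '00-', '000'},
    '0 ': {' --', ' 0-', '---', '--0', '-00', '0--', '00-', '000'},
}

# Assumption: orient='index' turns each key into a row. pandas then unpacks
# every set across columns and pads the shorter row with missing values.
wide = pd.DataFrame.from_dict(d, orient='index')
print(wide.shape)  # one row per key, one column per element of the largest set
```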
CodePudding user response:
You can use explode (d is your dict here):
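import pandas as pd
d = {' ': {'---', '--0', '-00', '0--', '00-', '000'}, '0 ': {' --', ' 0-', '---', '--0', '-00', '0--', '00-', '000'}}
# explode() puts each set element on its own row; reset_index() turns the keys into a column
out = pd.Series(d).explode().reset_index(name='value')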
Out[306]:
index value
0 ---
1 00-
2 -00
3 000
4 --0
5 0--
6 0 ---
7 0 0-
8 0 --
9 0 00-
10 0 -00
11 0 000
12 0 --0
13 0 0--
Or, if you want to keep each set whole in a single cell, just
pd.Series(d).reset_index(name='value')
Out[310]:
index value
0 {---, 00-, -00, 000, --0, 0--}
1 0 {---, 0-, --, 00-, -00, 000, --0, 0--}
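As a variation on the two snippets above, the key column can be given a name before the index is reset. This is a minimal sketch: rename_axis and the column names 'key' and 'value' are choices made here, not part of the original answer. Set iteration order is arbitrary, so the row order within each key may differ between runs.

```python
import pandas as pd

d = {
    ' ': {'---', '--0', '-00', '0--', '00-', '000'},
    '0 ': {' --', ' 0-', '---', '--0', '-00', '0--', '00-', '000'},
}

# Name the index first so that reset_index() produces a 'key' column
# instead of the default 'index' column.
long_df = pd.Series(d).rename_axis('key').explode().reset_index(name='value')
print(long_df)

# Same idea without explode(): one row per key, the whole set in one cell.
whole_df = pd.Series(d).rename_axis('key').reset_index(name='value')
print(whole_df)
```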
CodePudding user response:
I think you should wrap each value in the dict in a list.
import pandas as pd
test_dict = {
" ": {"---", "--0", "-00", "0--", "00-", "000"},
"0 ": {" --", " 0-", "---", "--0", "-00", "0--", "00-", "000"},
}
for key, value in test_dict.items():
    test_dict[key] = [value]  # wrap each set in a one-element list
print(test_dict)
Then your dict changes to this:
{
" ": [{"00-", "-00", "---", "0--", "--0", "000"}],
"0 ": [{"00-", "-00", "---", " --", "0--", " 0-", "--0", "000"}],
}
Finally, use from_dict:
test_df = pd.DataFrame.from_dict(test_dict, orient="index").reset_index()
print(test_df)
This is the result:
index 0
0 {00-, -00, ---, 0--, --0, 000}
1 0 {00-, -00, ---, --, 0--, 0-, --0, 000}
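A possible variation on this answer, sketched under the assumption that you would rather not modify test_dict in place: build the wrapped dict with a comprehension and name the columns directly. The names wrapped, 'key' and 'value' are illustrative only.

```python
import pandas as pd

test_dict = {
    " ": {"---", "--0", "-00", "0--", "00-", "000"},
    "0 ": {" --", " 0-", "---", "--0", "-00", "0--", "00-", "000"},
}

# Wrap each set in a one-element list without touching the original dict.
wrapped = {key: [value] for key, value in test_dict.items()}

# columns= is accepted by from_dict when orient="index"; rename_axis names
# the index so that reset_index() yields a "key" column.
test_df = (
    pd.DataFrame.from_dict(wrapped, orient="index", columns=["value"])
    .rename_axis("key")
    .reset_index()
)
print(test_df)
```

Because the set sits inside a list, pandas treats it as a single cell rather than spreading its elements across columns.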
CodePudding user response:
If you want the set to remain whole, you can try:
data = {' ': {'---', '--0', '-00', '0--', '00-', '000'}, '0 ': {' --', ' 0-', '---', '--0', '-00', '0--', '00-', '000'}}
pd.DataFrame([data.keys(), data.values()]).T
0 1
0 {--0, -00, ---, 00-, 0--, 000}
1 0 {--0, -00, --, ---, 00-, 0--, 000, 0-}
CodePudding user response:
Manually mapping your keys and values to named columns should be suitable for your problem.
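import pandas as pd
a = {' ': {'---', '--0', '-00', '0--', '00-', '000'}, '0 ': {' --', ' 0-', '---', '--0', '-00', '0--', '00-', '000'}}
# list() turns the dict views into plain lists before handing them to pandas
pd.DataFrame({'key': list(a.keys()), 'value': list(a.values())})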
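To check that each cell still holds a real Python set after this kind of construction, something like the following can be used. It is only a small sketch; the name df and the use of map(len) are illustrative additions, not part of the answer above.

```python
import pandas as pd

a = {
    ' ': {'---', '--0', '-00', '0--', '00-', '000'},
    '0 ': {' --', ' 0-', '---', '--0', '-00', '0--', '00-', '000'},
}

df = pd.DataFrame({'key': list(a.keys()), 'value': list(a.values())})

# Every cell in 'value' is still a set, so per-row set operations work.
print(all(isinstance(cell, set) for cell in df['value']))
print(df['value'].map(len).tolist())  # [6, 8]
```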