I have the following subsample of a dictionary of lists of dictionaries (taken from a larger dictionary with millions of items):
bool_dict = {0: [{0: 4680}, {1: 1185}],
1: [{0: 172}, {1: 9}],
2: [{0: 149}, {1: 1282}],
3: [{0: 20}, {1: 127}],
4: [{0: 0}, {1: 0}]}
which I converted to a dataframe of the form:
           0          1
0  {0: 4680}  {1: 1185}
1   {0: 172}     {1: 9}
2   {0: 149}  {1: 1282}
3    {0: 20}   {1: 127}
4     {0: 0}     {1: 0}
by doing the following:
test = pd.DataFrame(bool_dict.values(), columns=['0', '1'], index=bool_dict.keys()).sort_index()
The problem is that I only need each cell's value, not the key, in the dataframe. So, the desired output is:
      0     1
0  4680  1185
1   172     9
2   149  1282
3    20   127
4     0     0
I tried the following:
test['0'] = test['0'].apply(lambda x: x[0])
but then I get a key error on what I thought was a dictionary.
To make sure it indeed was a dictionary, I then tried
from ast import literal_eval
test['0']=test['0'].apply(lambda x: literal_eval(str(x)))
then tried this again
test['0'] = test['0'].apply(lambda x: x[0])
with no success (I also tried the key as '0').
I could do the hacky thing of converting each cell to a string, splitting on the : character, and then removing the extraneous stuff, but that just feels wrong for so many reasons.
CodePudding user response:
One way is to convert the inner list into a dictionary then pass it to the DataFrame constructor:
bool_dict_flattened = {i: {k:v for d in lst for k,v in d.items()} for i, lst in bool_dict.items()}
df = pd.DataFrame.from_dict(bool_dict_flattened, orient='index')
Another option is to apply the str accessor to each column, using the fact that the column name and the dictionary key match within each column:
out = pd.DataFrame.from_dict(bool_dict, orient='index').apply(lambda x: x.str[x.name])
Output:
      0     1
0  4680  1185
1   172     9
2   149  1282
3    20   127
4     0     0
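For anyone who wants to try both approaches above on the sample data from the question, here is a minimal self-contained sketch. The names df_flat and df_str are illustrative only. It assumes each inner dictionary has exactly one key and that the key matches the column position, as in the sample.

```python
import pandas as pd

bool_dict = {0: [{0: 4680}, {1: 1185}],
             1: [{0: 172}, {1: 9}],
             2: [{0: 149}, {1: 1282}],
             3: [{0: 20}, {1: 127}],
             4: [{0: 0}, {1: 0}]}

# Approach 1: merge each row's list of single-item dicts into one dict,
# so every row becomes {column_key: value, ...}.
flattened = {i: {k: v for d in lst for k, v in d.items()}
             for i, lst in bool_dict.items()}
df_flat = pd.DataFrame.from_dict(flattened, orient='index')

# Approach 2: build the frame of dicts first, then use the .str accessor,
# which also indexes into dicts; col.name is the integer column label.
df_str = pd.DataFrame.from_dict(bool_dict, orient='index').apply(
    lambda col: col.str[col.name])

print(df_flat)
print(df_str)
```

The first approach never creates a frame of dictionary objects, so it is probably the better fit when the source dictionary is very large.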
CodePudding user response:
You can iterate over each column with the outer lambda, iterate over each cell in that column with the inner lambda, and read the value of each dictionary. Note that apply returns a new DataFrame, so the result has to be assigned back:
df = pd.DataFrame(bool_dict).T
df = df.apply(lambda x: x.apply(lambda y: list(y.values())[0]))
df
      0     1
0  4680  1185
1   172     9
2   149  1282
3    20   127
4     0     0
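The same "take the only value of each dictionary" idea can also be applied in plain Python before the DataFrame is built, which avoids the nested apply calls. This is a sketch rather than part of the answer above: the name values_only is hypothetical, and it assumes every inner dictionary holds exactly one item.

```python
import pandas as pd

bool_dict = {0: [{0: 4680}, {1: 1185}],
             1: [{0: 172}, {1: 9}],
             2: [{0: 149}, {1: 1282}],
             3: [{0: 20}, {1: 127}],
             4: [{0: 0}, {1: 0}]}

# Replace each single-item dict with its only value, whatever its key is.
values_only = {i: [next(iter(d.values())) for d in lst]
               for i, lst in bool_dict.items()}

# Each key becomes a row label; list positions become the column labels.
df = pd.DataFrame.from_dict(values_only, orient='index').sort_index()
print(df)
```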
CodePudding user response:
test['0'] = test['0'].apply(lambda x: x[0])
but then I get a key error on what I thought was a dictionary.
You get the key error because your column name is an integer, but you are accessing it with a string. Try
test[0] = test[0].apply(lambda x: x[0])
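The difference between string and integer labels can be reproduced with the small sketch below. It assumes the frame was built without explicit string column names (here via from_dict, which is an assumption and not necessarily how the frame in the question was built), so the column labels are the integers 0 and 1. Note also that the dictionaries in column 1 are keyed by 1, not 0.

```python
import pandas as pd

bool_dict = {0: [{0: 4680}, {1: 1185}],
             1: [{0: 172}, {1: 9}],
             2: [{0: 149}, {1: 1282}],
             3: [{0: 20}, {1: 127}],
             4: [{0: 0}, {1: 0}]}

# No column names are passed, so pandas assigns integer labels.
df = pd.DataFrame.from_dict(bool_dict, orient='index')
print(df.columns.tolist())

# Looking up the string '0' fails because only the integer 0 exists.
raised = False
try:
    df['0']
except KeyError:
    raised = True
print('KeyError raised for string label:', raised)

# Use the integer label, and the key that matches each column.
df[0] = df[0].apply(lambda d: d[0])
df[1] = df[1].apply(lambda d: d[1])
print(df)
```

Checking df.columns.tolist() first is a quick way to see which kind of label a given frame actually uses.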