I need to convert a list of nested dictionaries into a Pandas DataFrame. My list is the following:
data = [{"2016-09-24":{"totalRevenue":123, "netIncome":456, "ebit":789}}, {"2015-09-24":{"totalRevenue":789, "netIncome":456, "ebit":123}}]
I want to transform the list into a DataFrame where the dates are the column headers and the rows are labeled by the keys of the nested dicts.
I have tried different things, e.g.: https://www.tutorialspoint.com/python-convert-list-of-nested-dictionary-into-pandas-dataframe
But I can't seem to fix my problem.
I hope this makes sense and thanks for your help :-)
Update: I have found a solution :-)
CodePudding user response:
Thanks for the notice on how to write questions @HarryPlotter and thanks for the suggested solution @Geoffrey.
I found an answer to my problem:
pd.concat([pd.DataFrame(d) for d in data], axis=1)
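For reference, here is a minimal sketch of that one-liner applied to the `data` list from the question. The names `frames` and `result` are only for illustration.

```python
import pandas as pd

data = [
    {"2016-09-24": {"totalRevenue": 123, "netIncome": 456, "ebit": 789}},
    {"2015-09-24": {"totalRevenue": 789, "netIncome": 456, "ebit": 123}},
]

# Each list element becomes a one-column frame: the date is the column
# header and the inner keys (totalRevenue, netIncome, ebit) are the index.
frames = [pd.DataFrame(d) for d in data]

# axis=1 puts the frames side by side and aligns rows on the index labels.
result = pd.concat(frames, axis=1)
print(result)
```

The dates end up as columns and the inner keys as row labels, which is the layout asked for in the question.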
CodePudding user response:
Here's my solution.
The for loops could probably be vectorized, but you must watch for correct arrangement of keys. Dictionaries are stored using hash maps, and their insertion order is only guaranteed from Python 3.7 onward. On older versions you might run some code and see the keys always returned in the same order, but that behavior is not guaranteed, so I chose to use for loops that look up each value by its key.
Also, you must:
import pandas as pd
In case that wasn't already clear...
# Function can be iterated over list of multiple nested dictionaries.
def toPandasDF(listOfDicts, listEl):
    # Grab first key of outer dictionary.
    firstKey = next(iter(listOfDicts[listEl]))
    # Extract rows (indices) of df and headers of df using first key.
    indices = list(listOfDicts[listEl][firstKey].keys())
    headers = list(listOfDicts[listEl].keys())
    # Initialize df.
    df = pd.DataFrame(columns=headers, index=indices)
    # Store relevant information in respective df elements. (could be vectorized)
    for row in range(len(indices)):
        for col in range(len(headers)):
            df.iat[row, col] = listOfDicts[listEl][headers[col]][indices[row]]
    # Return df
    return df
One more thing, here's how I'd iterate over a list of dicts and extract multiple data frames:
for k in range(len(listOfDicts)):
    df = toPandasDF(listOfDicts, k)
but without an example, it's tough to tell if this would work for your application.
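On the key-order concern: as far as I know, pandas aligns on index labels rather than on position when it combines frames, so the order of keys inside each nested dict should not change which value lands in which row. A small sketch, reusing the question's numbers with the second dict deliberately reordered (`a`, `b` and `combined` are illustrative names):

```python
import pandas as pd

a = pd.DataFrame(
    {"2016-09-24": {"totalRevenue": 123, "netIncome": 456, "ebit": 789}}
)
# Same kind of data, but the inner keys appear in a different order.
b = pd.DataFrame(
    {"2015-09-24": {"ebit": 123, "totalRevenue": 789, "netIncome": 456}}
)

# Rows are matched by label, not by position, when concatenating along axis=1.
combined = pd.concat([a, b], axis=1)

# Look values up by label so the result does not depend on row order.
print(combined.loc["ebit", "2015-09-24"])
print(combined.loc["totalRevenue", "2016-09-24"])
```

Looking values up with `.loc` by label, rather than by position with `.iat`, keeps the result independent of whatever order the rows end up in.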