I have a list of dictionaries. Each dictionary contains a single key-value pair. I want to convert this list into a pandas DataFrame with a single column called "time", where each row holds the value from one dictionary and the row label is that dictionary's key.
As an example, I will show only the first two elements from the list:
list_example = [{'companies_info_5000_5100': 121.20147228240967},
                {'companies_info_5100_5200': 116.49221062660217}]
From this list_example, I want to create a DataFrame like this one:
| | time |
|---|---|
| companies_info_5000_5100 | 121.201472 |
| companies_info_5100_5200 | 116.492211 |
I searched for possible solutions and came up with my own, which looks like this:
import pandas as pd
df_list = []
for d in list_example:
    d_df = pd.DataFrame.from_dict(d, orient="index", columns=["time"])
    df_list.append(d_df)
df = pd.concat(df_list, axis=0)
With this code I get what I want, but I am sure there must be a way to do this without the for loop. For example, if I just run df = pd.DataFrame(list_example), it creates a DataFrame, but the dictionary keys are used as columns. I think there must be some variation of this call that tells pandas to use the keys as row labels. I am looking for a simpler, more elegant and Pythonic solution.
I searched here but could not find an answer.
CodePudding user response:
You can use:
df = (pd.concat(map(pd.Series, list_example))
.to_frame('time')
)
Output:
time
companies_info_5000_5100 121.201472
companies_info_5100_5200 116.492211
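To show what each part of that one-liner does, here is a minimal sketch that unpacks it into separate steps, using the two sample entries from the question. The intermediate names `series_list` and `combined` are only illustrative and are not part of the answer above.

```python
import pandas as pd

list_example = [
    {'companies_info_5000_5100': 121.20147228240967},
    {'companies_info_5100_5200': 116.49221062660217},
]

# pd.Series(dict) uses the dict keys as the index and the dict values as data.
series_list = [pd.Series(d) for d in list_example]

# Concatenating the one-element Series gives a single, longer Series.
combined = pd.concat(series_list)

# to_frame turns the Series into a DataFrame and names its only column.
df = combined.to_frame('time')
print(df)
```

The `map(pd.Series, list_example)` call in the answer does the same thing as the list comprehension here, just lazily.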
CodePudding user response:
Try this:
# build a nested dict from list_example and build df
df = pd.DataFrame.from_dict({k: {'time': v} for d in list_example for k,v in d.items()}, orient='index')
print(df)
time
companies_info_5000_5100 121.201472
companies_info_5100_5200 116.492211
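A slightly shorter variant of the same idea, as a sketch: merge all the dictionaries into one flat dict and let a Series supply the row labels. This assumes the keys are unique across the whole list; as with the nested-dict version, a repeated key would silently keep only the last value. The name `merged` is just for illustration.

```python
import pandas as pd

list_example = [
    {'companies_info_5000_5100': 121.20147228240967},
    {'companies_info_5100_5200': 116.49221062660217},
]

# Flatten the list of single-pair dicts into one dict.
# If a key appears more than once, the last occurrence wins.
merged = {k: v for d in list_example for k, v in d.items()}

# A Series built from a dict uses the keys as its index;
# to_frame converts it to a DataFrame with the column named 'time'.
df = pd.Series(merged).to_frame('time')
print(df)
```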
CodePudding user response:
One possible solution is to:
- create a Series from each dictionary,
- concatenate them (so far the result is still a Series),
- convert it to a DataFrame, setting the name of the (only) column.
The code to do it is:
result = pd.concat([ pd.Series(d.values(), index=d.keys())
for d in list_example ]).to_frame('time')
For your sample data I got:
time
companies_info_5000_5100 121.201472
companies_info_5100_5200 116.492211
CodePudding user response:
Pandas approach:
pd.DataFrame(list_example).stack().droplevel(0).to_frame('time')
time
companies_info_5000_5100 121.201472
companies_info_5100_5200 116.492211
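A note on why this works, with a hedged sketch. `pd.DataFrame(list_example)` produces one column per key, with NaN wherever a row's dictionary does not contain that key, and `stack()` then moves the column labels into the index. The classic `stack()` drops those NaN entries by default. As far as I know, the newer implementation (opted into with `future_stack=True`) keeps NaN values that are present in the input, so the sketch below adds an explicit `dropna()` as a precaution. That version-dependent detail is my assumption and worth checking against your pandas version. The name `wide` is only illustrative.

```python
import pandas as pd

list_example = [
    {'companies_info_5000_5100': 121.20147228240967},
    {'companies_info_5100_5200': 116.49221062660217},
]

# One row per dict, one column per key; cells for missing keys are NaN.
wide = pd.DataFrame(list_example)

df = (
    wide.stack()        # move column labels into an inner index level
        .dropna()       # precaution: remove NaN cells if stack kept them
        .droplevel(0)   # drop the outer 0, 1, ... row-number level
        .to_frame('time')
)
print(df)
```

Note that `dropna()` would also discard entries whose value is genuinely NaN in the original dictionaries, so this variant only fits data without missing values.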