More efficient way of converting a list of a single dictionary in pandas rows to column titles and c

Time:10-16

I have a pandas dataframe column in which each row holds a list containing a single dictionary (created by a machine learning algorithm run on other data in the dataframe). My goal is to turn the dictionary keys into the titles of new dataframe columns, with the dictionary values as the row values. I've searched through a number of posts on Stack Overflow, but have not found a definitive answer.

The list containing a single dictionary looks like this in each row of the existing column:

[{'label': 'POSITIVE', 'score': 0.9969509840011597}] 

I know I can break the task into multiple steps: a function that converts each list of dictionaries into a single dictionary, followed by converting those keys into the new column titles, like so:

import swifter  # registers the .swifter accessor on pandas objects

# function to merge a list holding a single dictionary into one dictionary
def make_dict(my_list):
    d1 = {}
    for i in my_list:
        d1.update(i)
    return d1

analyze['sentiment'] = analyze['sentiment'].swifter.apply(make_dict)

The output in the 'sentiment' column in the dataframe is what you would expect:

{'label': 'POSITIVE', 'score': 0.9969509840011597} 

From here I would just assign the keys to column names and the values to the column data. However, is there a more efficient way, using an existing pandas method, to extract the keys into columns and the values into rows without the intermediate function?
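For reference, the two-step approach described above could be sketched end to end roughly as follows. The sample frame, its `text` column, and the second row are made up for illustration (only the first `sentiment` value comes from the question), and plain `apply` stands in for `swifter.apply` so the snippet runs without extra dependencies.

```python
import pandas as pd

# Hypothetical sample data; only the first 'sentiment' value is from the question
analyze = pd.DataFrame({
    "text": ["great product", "terrible service"],
    "sentiment": [
        [{"label": "POSITIVE", "score": 0.9969509840011597}],
        [{"label": "NEGATIVE", "score": 0.98}],
    ],
})

def make_dict(my_list):
    """Merge every dictionary in the list into one dictionary."""
    d1 = {}
    for i in my_list:
        d1.update(i)
    return d1

# Step 1: list holding one dictionary -> plain dictionary
dicts = analyze["sentiment"].apply(make_dict)

# Step 2: dictionary keys -> column names, dictionary values -> row values
expanded = pd.DataFrame(dicts.tolist(), index=analyze.index)

# Attach the new columns next to the original ones
result = analyze.join(expanded)
print(result[["label", "score"]])
```

Passing `index=analyze.index` keeps the new columns aligned with the original rows even if the dataframe does not use a default integer index.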

CodePudding user response:

If each list has only one dictionary, try:

analyze[['label','score']] = analyze['sentiment'].agg(pd.Series)[0].agg(pd.Series)

The first agg(pd.Series) expands each list into its own columns, so column 0 holds the dictionaries. Since each list has only one dictionary, calling agg(pd.Series) again on that column expands the dictionary keys into columns. Output:

       label    score
0   POSITIVE    0.996951
1   POSITIVE    0.996951
2   POSITIVE    0.996951
3   POSITIVE    0.996951
4   POSITIVE    0.996951
5   POSITIVE    0.996951
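As a possible alternative, the same columns can be built without the nested agg(pd.Series) calls by pulling the first element out of each list and handing the resulting dictionaries to the DataFrame constructor. This is only a sketch: the sample rows are invented for illustration, and it assumes every list is non-empty and every dictionary has the same two keys. Expanding row by row with pd.Series tends to be slow on large frames, and recent pandas versions may warn when a callable such as pd.Series is passed to Series.agg, so this form may be worth trying if either becomes a problem.

```python
import pandas as pd

# Hypothetical sample data in the same shape as the question's column
analyze = pd.DataFrame({
    "sentiment": [
        [{"label": "POSITIVE", "score": 0.9969509840011597}],
        [{"label": "NEGATIVE", "score": 0.98}],
    ],
})

# Take the first (and only) dictionary out of each list
first_dicts = [lst[0] for lst in analyze["sentiment"]]

# Build the new columns in one pass; reuse the original index for alignment
analyze[["label", "score"]] = pd.DataFrame(first_dicts, index=analyze.index)

print(analyze[["label", "score"]])
```

If some lists could be empty, the list comprehension would need a guard (for example `lst[0] if lst else {}`), which would leave missing values in those rows.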