Pandas explode dictionary to row while maintaining multi-index-CodePudding

Having now checked a multitude of Stack Overflow threads on this, I'm struggling to apply the answers to my particular use case so hoping someone can help me on my specific problem.

I'm trying to explode data out of a dictionary into two separate columns while maintaining a multi-index.

Here is what I currently have:

| short_url | platform | css_problem_files                                                    |
|-----------|----------|----------------------------------------------------------------------|
| /url_1/   | desktop  | {css_file_1: css_value, css_file_2: css_value, css_fle_3: css_value} |
|           | mobile   | {css_file_1: css_value, css_file_2: css_value, css_fle_3: css_value} |
| /url_2/   | desktop  | {css_file_1: css_value, css_file_2: css_value, css_fle_3: css_value} |
|           | mobile   | {css_file_1: css_value, css_file_2: css_value, css_fle_3: css_value} |

and here is what I would like to achieve:

| short_url | platform | css_file   | css_value |
|-----------|----------|------------|-----------|
| /url_1/   | desktop  | css_file_1 | css_value |
|           |          | css_file_2 | css_value |
|           |          | css_file_3 | css_value |
|           | mobile   | css_file_1 | css_value |
|           |          | css_file_2 | css_value |
|           |          | css_file_3 | css_value |
| /url_2/   | desktop  | css_file_1 | css_value |
|           |          | css_file_2 | css_value |
|           |          | css_file_3 | css_value |
|           | mobile   | css_file_1 | css_value |
|           |          | css_file_2 | css_value |
|           |          | css_file_3 | css_value |

The only thing I've come up with that's remotely close to what I need is the below, however this is creating over 200K rows when I'd expect it to be only in the thousands (and I've not included platform yet):

m = pd.DataFrame([*df['css_problem_files']], df.index).stack()\
      .rename_axis([None,'css_files']).reset_index(1, name='pct usage')

out = df[['short_url']].join(m)

Any assistance or a point in the right direction would be greatly appreciated

CodePudding user response：

If you turn the dictionaries into lists of key-value pairs, you can explode them and then transform the result into two new columns with .apply(pd.Series) (and rename them to your liking) like so:

df = (df
      .css_problem_files.apply(dict.items) # turn into key value list
      .explode() # explode
      .apply(pd.Series) # turn into columns
      .rename(columns={0: "css_file", 1: "css_value"}) # rename
      )

CodePudding user response：

It was annoying to parse your data frame from a Markdown table. It would have been much easier if it were in a DataFrame constructor.

df = pd.read_csv(io.StringIO('''short_url | platform | css_problem_files                                                   
/url_1/   | desktop  | {'css_file_1': 'css_value', 'css_file_2': 'css_value', 'css_fle_3': 'css_value'}
/url_1/   | mobile   | {'css_file_1': 'css_value', 'css_file_2': 'css_value', 'css_fle_3': 'css_value'}
/url_2/   | desktop  | {'css_file_1': 'css_value', 'css_file_2': 'css_value', 'css_fle_3': 'css_value'}
/url_2/   | mobile   | {'css_file_1': 'css_value', 'css_file_2': 'css_value', 'css_fle_3': 'css_value'}'''), sep=r'\s \|\s ', index_col=[0, 1])
df['css_problem_files'] = df['css_problem_files'].apply(eval)

pd.DataFrame.apply(pd.Series) is the most straightforward way to make this happen, though it is really inefficient (see answer for illustration thereof). This uses the DataFrame constructor directly, which avoids that performance issue. It preserves the index by copying it over to the new data frame and provides columns explicitly:

temp_df = df['css_problem_files'].apply(dict.items).explode()
pd.DataFrame(temp_df.to_list(), index=temp_df.index, columns=['css_file', 'css_value'])

                      css_file  css_value
short_url platform                       
/url_1/   desktop   css_file_1  css_value
          desktop   css_file_2  css_value
          desktop    css_fle_3  css_value
          mobile    css_file_1  css_value
          mobile    css_file_2  css_value
          mobile     css_fle_3  css_value
/url_2/   desktop   css_file_1  css_value
          desktop   css_file_2  css_value
          desktop    css_fle_3  css_value
          mobile    css_file_1  css_value
          mobile    css_file_2  css_value
          mobile     css_fle_3  css_value