Home > Software engineering >  How can I append each key and value of dictionary to columns?
How can I append each key and value of dictionary to columns?

Time:08-24

In Python, I have a dataframe that has json variables.

For example, if keys and values of a column are like:

{'a': 3, 'b': 2, 'c': 1}
{'d': 287, 'e': 56, 'f': 17, 'g': 9}
{'h': 22, 'i': 15, 'a': 7}
{'g': 17, 'd': 15, 'e': 15, 'f': 7, 'h': 15}

I want to append columns to be like

key_1 value_1 key_2 value_2 key_3 value_3 key_4 value_4 key_5 value_5
a      3       b      2       c      1
d      287     e      56      f      17     g      9
h      22      i      15      a       7       
g      17      d      15      e      15     f      7      h      15

i.e.,

  1. One pair generates two columns. (key_, value_)

  2. Each row has different numbers of key-value pairs.

  3. Though keys are the same, the values of the keys can be different.

  4. Though keys are different, the values of the keys can be the same.

  5. I need to make only up to 600 columns (300 pairs * 2).

CodePudding user response:

Try:

# if type of the column is string, convert it to python object first:

#from ast import literal_eval
#df["column"] = df["column"].apply(literal_eval)

x = (
    df["column"]
    .apply(
        lambda x: {
            f"{k}{i}": v
            for i, t in enumerate(x.items(), 1)
            for k, v in zip(["key_", "value_"], t)
        },
    )
    .to_list()
)

print(pd.DataFrame(x).fillna(""))

Prints:

  key_1  value_1 key_2  value_2 key_3  value_3 key_4 value_4 key_5 value_5
0     a        3     b        2     c        1                            
1     d      287     e       56     f       17     g     9.0              
2     h       22     i       15     a        7                            
3     g       17     d       15     e       15     f     7.0     h    15.0

Dataframe used:

                                         column
0                      {'a': 3, 'b': 2, 'c': 1}
1          {'d': 287, 'e': 56, 'f': 17, 'g': 9}
2                    {'h': 22, 'i': 15, 'a': 7}
3  {'g': 17, 'd': 15, 'e': 15, 'f': 7, 'h': 15}

CodePudding user response:

For fun (and as a "challenge", I would have used something similar to @Andrej's approach in real life that is more concise and efficient), here is a pure pandas solution:

s = df['col'].apply(list).explode().rename('key')
idx = pd.MultiIndex.from_arrays([s.index, s])

(pd
 .json_normalize(df['col']).stack()
 .to_frame(name='value')
 .reindex(idx)
 .assign(pos=lambda d: d.groupby(level=0).cumcount().add(1))
 .reset_index()
 .pivot(index='level_0', columns='pos')
 .sort_index(level=1, axis=1, sort_remaining=False)
 .pipe(lambda d: d.set_axis(d.columns.map(lambda x: f'{x[0]}_{x[1]}'), axis=1))
)

output:

        key_1  value_1 key_2  value_2 key_3  value_3 key_4  value_4 key_5  value_5
level_0                                                                           
0           a      3.0     b      2.0     c      1.0   NaN      NaN   NaN      NaN
1           d    287.0     e     56.0     f     17.0     g      9.0   NaN      NaN
2           h     22.0     i     15.0     a      7.0   NaN      NaN   NaN      NaN
3           g     17.0     d     15.0     e     15.0     f      7.0     h     15.0
  • Related