Converting to dataframe, beginner question

Time:10-18

I have a piece of data that looks like this:

    my_data[:5]

returns:

    [{'key': ['Aaliyah', '2', '2016'], 'values': ['10']},
     {'key': ['Aaliyah', '2', '2017'], 'values': ['26']},
     {'key': ['Aaliyah', '2', '2018'], 'values': ['21']},
     {'key': ['Aaliyah', '2', '2019'], 'values': ['26']},
     {'key': ['Aaliyah', '2', '2020'], 'values': ['15']}]

The key holds the name, gender, and year; the value is the number. I can't manage to build a DataFrame with the columns name, gender, year, and number.

Can you help me?

CodePudding user response:

Here is one way, using a generator:

    from itertools import chain
    import pandas as pd

    pd.DataFrame.from_records((dict(zip(['name', 'gender', 'year', 'number'],
                                        chain(*e.values())))
                               for e in my_data))

Without itertools (the := operator needs Python 3.8 or later):

    pd.DataFrame(((E:=list(e.values()))[0] + E[1] for e in my_data),
                 columns=['name', 'gender', 'year', 'number'])

output:

      name gender  year number
0  Aaliyah      2  2016     10
1  Aaliyah      2  2017     26
2  Aaliyah      2  2018     21
3  Aaliyah      2  2019     26
4  Aaliyah      2  2020     15
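
To see what the generator in the first snippet builds for each record, here is a small sketch on a single element. The names `record`, `columns`, `flat`, and `row` are only for illustration and do not appear in the answer above:

```python
from itertools import chain

# One element of my_data, copied from the question
record = {'key': ['Aaliyah', '2', '2016'], 'values': ['10']}
columns = ['name', 'gender', 'year', 'number']

# chain(*record.values()) flattens ['Aaliyah', '2', '2016'] and ['10']
# into a single sequence of four strings
flat = list(chain(*record.values()))

# zip pairs each column name with the value in the same position
row = dict(zip(columns, flat))
print(row)
```

Note that this relies on 'key' coming before 'values' in each dict, which holds for the sample data because dicts keep insertion order.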

CodePudding user response:

You can iterate over the list of dicts, take the values of each dict, flatten them with chain so that each dict becomes one row, and then convert the result to a DataFrame, like below:

>>> from itertools import chain

>>> table = [chain.from_iterable(m.values()) for m in my_data]

>>> pd.DataFrame(table, columns=['name', 'gender', 'year', 'number'])
      name gender  year number
0  Aaliyah      2  2016     10
1  Aaliyah      2  2017     26
2  Aaliyah      2  2018     21
3  Aaliyah      2  2019     26
4  Aaliyah      2  2020     15

# For more explanation, this is what each row looks like as a list:
>>> [list(chain.from_iterable(m.values())) for m in my_data]
[['Aaliyah', '2', '2016', '10'],
 ['Aaliyah', '2', '2017', '26'],
 ['Aaliyah', '2', '2018', '21'],
 ['Aaliyah', '2', '2019', '26'],
 ['Aaliyah', '2', '2020', '15']]
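
Both approaches in this thread depend on each dict listing 'key' before 'values'. A minimal sketch that reads the two fields by name instead, so the order does not matter. The integer conversion at the end is an assumption: the question calls the value a number but does not say which dtypes are wanted.

```python
import pandas as pd

# Sample copied from the question
my_data = [
    {'key': ['Aaliyah', '2', '2016'], 'values': ['10']},
    {'key': ['Aaliyah', '2', '2017'], 'values': ['26']},
    {'key': ['Aaliyah', '2', '2018'], 'values': ['21']},
    {'key': ['Aaliyah', '2', '2019'], 'values': ['26']},
    {'key': ['Aaliyah', '2', '2020'], 'values': ['15']},
]

columns = ['name', 'gender', 'year', 'number']

# Concatenate the two lists by name, so dict ordering is irrelevant
rows = [d['key'] + d['values'] for d in my_data]
df = pd.DataFrame(rows, columns=columns)

# Assumption: year and number should be integers rather than strings
df = df.astype({'year': int, 'number': int})
print(df.dtypes)
```

If the string columns are fine as they are, the astype line can simply be dropped.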