I have a list of dictionaries that looks like this:
my_data[:5]
returns:
[{'key': ['Aaliyah', '2', '2016'], 'values': ['10']},
{'key': ['Aaliyah', '2', '2017'], 'values': ['26']},
{'key': ['Aaliyah', '2', '2018'], 'values': ['21']},
{'key': ['Aaliyah', '2', '2019'], 'values': ['26']},
{'key': ['Aaliyah', '2', '2020'], 'values': ['15']}]
Each key holds the name, gender, and year, and the value is a number. I can't manage to build a DataFrame with the columns name, gender, year, and number.
Can you help me?
CodePudding user response:
Here is one way, using a generator:
from itertools import chain
import pandas as pd
# chain(*e.values()) flattens the 'key' list and the 'values' list into one row
pd.DataFrame.from_records(dict(zip(['name', 'gender', 'year', 'number'],
                                   chain(*e.values())))
                          for e in my_data)
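Without itertools (needs Python 3.8+ for the := operator):
# E[0] is the 'key' list and E[1] is the 'values' list; + concatenates them
pd.DataFrame(((E := list(e.values()))[0] + E[1] for e in my_data),
             columns=['name', 'gender', 'year', 'number'])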
output:
      name gender  year number
0  Aaliyah      2  2016     10
1  Aaliyah      2  2017     26
2  Aaliyah      2  2018     21
3  Aaliyah      2  2019     26
4  Aaliyah      2  2020     15
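Both variants above depend on each dict storing 'key' before 'values'. If that ordering is not guaranteed, a minimal sketch that looks the two entries up by name could look like the following. The names `columns` and `records` are my own, and the sample data is just the first two records from the question.

```python
# Sample records copied from the question.
my_data = [
    {'key': ['Aaliyah', '2', '2016'], 'values': ['10']},
    {'key': ['Aaliyah', '2', '2017'], 'values': ['26']},
]

columns = ['name', 'gender', 'year', 'number']

# Look up 'key' and 'values' by name instead of relying on dict ordering,
# then pair each field with its column name.
records = [dict(zip(columns, d['key'] + d['values'])) for d in my_data]
```

A list of dicts like `records` can be passed directly to pd.DataFrame(records) to get the same table.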
CodePudding user response:
You can iterate over the list of dicts, take all the values of each dict, and flatten them with chain.from_iterable so that each dict becomes one row. Then convert the rows to a DataFrame like below:
>>> from itertools import chain
>>> import pandas as pd
>>> table = [chain.from_iterable(m.values()) for m in my_data]
>>> pd.DataFrame(table, columns=['name', 'gender', 'year', 'number'])
      name gender  year number
0  Aaliyah      2  2016     10
1  Aaliyah      2  2017     26
2  Aaliyah      2  2018     21
3  Aaliyah      2  2019     26
4  Aaliyah      2  2020     15
# For more explanation, this is what each flattened row looks like as a list:
>>> [list(chain.from_iterable(m.values())) for m in my_data]
[['Aaliyah', '2', '2016', '10'],
['Aaliyah', '2', '2017', '26'],
['Aaliyah', '2', '2018', '21'],
['Aaliyah', '2', '2019', '26'],
['Aaliyah', '2', '2020', '15']]
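Note that every field in these rows is a string, so the resulting columns hold strings too. If year and number are meant to be numeric (an assumption on my part, the question does not say), one possible follow-up is to cast them with astype. A rough sketch, using three of the rows above as sample input:

```python
import pandas as pd

# Rows in the flattened form shown above (every field is a string).
rows = [
    ['Aaliyah', '2', '2016', '10'],
    ['Aaliyah', '2', '2017', '26'],
    ['Aaliyah', '2', '2018', '21'],
]

df = pd.DataFrame(rows, columns=['name', 'gender', 'year', 'number'])

# Assumption: year and number should be integers for later arithmetic.
df = df.astype({'year': int, 'number': int})

# With integer columns, numeric operations such as sum() behave as expected.
total = df['number'].sum()
```

Without the cast, df['number'].sum() would concatenate the strings instead of adding the numbers.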