This question is about reshaping data from a CSV file using Python. I have a CSV file containing a wide table of values. Each row represents an organization and each column contains the value of a different variable.
How can I reshape this data so that each row represents a tuple of (date, orgID, variable, value)?
Original shape:
Date | Org ID | A | B | C | D |
---|---|---|---|---|---|
6/30/2022 | 04815 | 10 | 15 | 20 | 30 |
6/30/2022 | 01712 | 4 | 8 | 9 | 14 |
Desired shape:
Date | Org ID | Variable | Value |
---|---|---|---|
6/30/2022 | 04815 | A | 10 |
6/30/2022 | 04815 | B | 15 |
6/30/2022 | 04815 | C | 20 |
6/30/2022 | 04815 | D | 30 |
6/30/2022 | 01712 | A | 4 |
6/30/2022 | 01712 | B | 8 |
6/30/2022 | 01712 | C | 9 |
6/30/2022 | 01712 | D | 14 |
CodePudding user response:
You can use melt:
res = df.melt(id_vars=['Date','Org ID'])
Output:
        Date  Org ID  variable  value
0  6/30/2022    4815         A     10
1  6/30/2022    1712         A      4
2  6/30/2022    4815         B     15
3  6/30/2022    1712         B      8
4  6/30/2022    4815         C     20
5  6/30/2022    1712         C      9
6  6/30/2022    4815         D     30
7  6/30/2022    1712         D     14
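The output above shows 4815 rather than 04815, which is what happens when Org ID is parsed as an integer. Here is a minimal sketch of reading that column as text and naming the melted columns to match the desired shape. The inline CSV text is a stand-in for the real file, and the `Variable` / `Value` names are just taken from the question's desired table:

```python
import io
import pandas as pd

# Hypothetical CSV contents mirroring the table in the question.
csv_text = """Date,Org ID,A,B,C,D
6/30/2022,04815,10,15,20,30
6/30/2022,01712,4,8,9,14
"""

# Reading Org ID as str keeps the leading zero (04815 instead of 4815).
df = pd.read_csv(io.StringIO(csv_text), dtype={"Org ID": str})

# var_name / value_name replace the default 'variable' / 'value' headers.
long_df = df.melt(
    id_vars=["Date", "Org ID"],
    var_name="Variable",
    value_name="Value",
)
print(long_df)
```

With a real file you would pass its path to `pd.read_csv` in place of the `io.StringIO` object.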
CodePudding user response:
You can use pd.melt, then call DataFrame.sort_values on the 'Org ID' column with ascending=False.
df = pd.melt(df, id_vars=['Date', 'Org ID']).sort_values('Org ID', ascending=False)
print(df)
        Date  Org ID  variable  value
0  6/30/2022    4815         A     10
2  6/30/2022    4815         B     15
4  6/30/2022    4815         C     20
6  6/30/2022    4815         D     30
1  6/30/2022    1712         A      4
3  6/30/2022    1712         B      8
5  6/30/2022    1712         C      9
7  6/30/2022    1712         D     14
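Sorting by Org ID in descending order reproduces the requested order here only because 4815 happens to be larger than 1712. Below is a rough sketch of one way to keep the rows grouped in their original file order instead. It assumes pandas 1.1 or later (for the `ignore_index` parameter of `melt`), and the sample frame is rebuilt by hand purely for illustration:

```python
import pandas as pd

# Hand-built copy of the question's table (Org ID kept as text).
df = pd.DataFrame(
    {
        "Date": ["6/30/2022", "6/30/2022"],
        "Org ID": ["04815", "01712"],
        "A": [10, 4],
        "B": [15, 8],
        "C": [20, 9],
        "D": [30, 14],
    }
)

ordered = (
    # ignore_index=False keeps each source row's index on its melted rows.
    df.melt(id_vars=["Date", "Org ID"], ignore_index=False)
    # A stable sort on that index groups rows by source row while
    # leaving the A, B, C, D order within each group untouched.
    .sort_index(kind="mergesort")
    .reset_index(drop=True)
)
print(ordered)
```

The stable sort matters: the default sort algorithm does not promise to preserve the relative order of rows that share the same index value.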