I am trying to use pd.melt()
to melt a dataframe and change it from pivot wide to pivot long. But since the number of columns in the dataframe is quite large, I was wondering if there is a way to input the range of the columns or any other easy way than to write the column header names in the pd.melt()
statement.
Attached is the data frame which I am using:
{'Key': {0: 'DE|12345', 1: 'DE|292956', 2: 'DE|339231', 3: 'DE|281014', 4: 'DE|121274', 5: 'DE|924952', 6: 'DE|935053', 7: 'DE|945154', 8: 'DE|955255', 9: 'DE|965356', 10: 'DE|975457', 11: 'DE|985558', 12: 'DE|995659'}, '2019-01-01': {0: 51, 1: 29, 2: 33, 3: 28, 4: 12, 5: 92, 6: 93, 7: 94, 8: 95, 9: 96, 10: 97, 11: 98, 12: 99}, '2019-02-01': {0: 5, 1: 29, 2: 92, 3: 10, 4: 12, 5: 49, 6: 50, 7: 51, 8: 52, 9: 53, 10: 54, 11: 55, 12: 56}, '2019-03-01': {0: 41, 1: 56, 2: 31, 3: 14, 4: 74, 5: 52, 6: 53, 7: 54, 8: 55, 9: 56, 10: 57, 11: 58, 12: 59}, '2019-04-01': {0: 64, 1: 93, 2: 25, 3: 145, 4: 40, 5: 60, 6: 61, 7: 62, 8: 63, 9: 64, 10: 65, 11: 66, 12: 67}, '2019-05-01': {0: 81, 1: 15, 2: 30, 3: 85, 4: 38, 5: 41, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-06-01': {0: 31, 1: 21, 2: 9, 3: 58, 4: 8, 5: 26, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-07-01': {0: 61, 1: 79, 2: 20, 3: 121, 4: 21, 5: 42, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-08-01': {0: 64, 1: 22, 2: 24, 3: 108, 4: 21, 5: 45, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-09-01': {0: 29, 1: 50, 2: 20, 3: 85, 4: 25, 5: 37, 6: 74, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-10-01': {0: 88, 1: 32, 2: 45, 3: 214, 4: 53, 5: 117, 6: 170, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-11-01': {0: 77, 1: 61, 2: 159, 3: 692, 4: 139, 5: 332, 6: 444, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-12-01': {0: 52, 1: 38, 2: 98, 3: 338, 4: 85, 5: 198, 6: 409, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2020-01-01': {0: 46, 1: 48, 2: 78, 3: 191, 4: 66, 5: 89, 6: 237, 7: 0, 8: 21, 9: 13, 10: 8, 11: 0, 12: 0}, '2020-02-01': {0: 46, 1: 40, 2: 45, 3: 124, 4: 34, 5: 49, 6: 98, 7: 45, 8: 143, 9: 46, 10: 11, 11: 0, 12: 0}, '2020-03-01': {0: 42, 1: 50, 2: 30, 3: 96, 4: 26, 5: 29, 6: 50, 7: 17, 8: 46, 9: 19, 10: 5, 11: 0, 12: 0}, '2020-04-01': {0: 62, 1: 104, 2: 37, 3: 128, 4: 41, 5: 52, 6: 111, 7: 55, 8: 10, 9: 17, 10: 6, 11: 0, 12: 0}, '2020-05-01': {0: 50, 1: 70, 2: 40, 3: 50, 4: 39, 5: 32, 6: 81, 7: 28, 8: 0, 9: 17, 10: 10, 11: 0, 12: 0}, '2020-06-01': {0: 55, 1: 50, 2: 55, 3: 50, 4: 0, 5: 3, 6: 48, 7: 8, 8: 20, 9: 11, 10: 6, 11: 0, 12: 0}, '2020-07-01': {0: 41, 1: 1, 2: 32, 3: 51, 4: 0, 5: 0, 6: 48, 7: 16, 8: 0, 9: 18, 10: 9, 11: 0, 12: 5}, '2020-08-01': {0: 34, 1: 117, 2: 11, 3: 77, 4: 0, 5: 0, 6: 59, 7: 17, 8: 14, 9: 9, 10: 10, 11: 0, 12: 5}, '2020-09-01': {0: 46, 1: 84, 2: 11, 3: 37, 4: 10, 5: 7, 6: 70, 7: 41, 8: 119, 9: 62, 10: 13, 11: 0, 12: 9}, '2020-10-01': {0: 16, 1: 93, 2: 33, 3: 115, 4: 77, 5: 9, 6: 85, 7: 44, 8: 183, 9: 27, 10: 32, 11: 14, 12: 71}, '2020-11-01': {0: 29, 1: 55, 2: 43, 3: 67, 4: 25, 5: 105, 6: 87, 7: 93, 8: 235, 9: 72, 10: 21, 11: 9, 12: 31}, '2020-12-01': {0: 16, 1: 78, 2: 53, 3: 462, 4: 7, 5: 118, 6: 152, 7: 59, 8: 190, 9: 88, 10: 18, 11: 1, 12: 64}, '2021-01-01': {0: 17, 1: 58, 2: 134, 3: 303, 4: 94, 5: 118, 6: 196, 7: 57, 8: 124, 9: 50, 10: 18, 11: 31, 12: 8}, '2021-02-01': {0: 25, 1: 42, 2: 46, 3: 83, 4: 81, 5: 98, 6: 79, 7: 36, 8: 60, 9: 31, 10: 16, 11: 26, 12: 34}, '2021-03-01': {0: 14, 1: 74, 2: 6, 3: 106, 4: 43, 5: 55, 6: 92, 7: 25, 8: 147, 9: 46, 10: 32, 11: 33, 12: 51}, '2021-04-01': {0: 46, 1: 142, 2: 45, 3: 105, 4: 22, 5: 40, 6: 108, 7: 119, 8: 210, 9: 43, 10: 26, 11: 21, 12: 40}, '2021-05-01': {0: 43, 1: 49, 2: 20, 3: 52, 4: 16, 5: 14, 6: 7, 7: 63, 8: 75, 9: 21, 10: 20, 11: 19, 12: 34}, '2021-06-01': {0: 31, 1: 83, 2: 15, 3: 40, 4: 19, 5: 1, 6: 24, 7: 84, 8: 209, 9: 38, 10: 16, 11: 21, 12: 18}, '2021-07-01': {0: 22, 1: 36, 2: 8, 3: 31, 4: 9, 5: 2, 6: 43, 7: 28, 8: 115, 9: 9, 10: 20, 11: 24, 12: 27}, '2021-08-01': {0: 15, 1: 42, 2: 24, 3: 26, 4: 28, 5: 2, 6: 45, 7: 28, 8: 89, 9: 9, 10: 7, 11: 27, 12: 26}, '2021-09-01': {0: 13, 1: 29, 2: 6, 3: 32, 4: 12, 5: 12, 6: 30, 7: 19, 8: 72, 9: 5, 10: 16, 11: 16, 12: 22}, '2021-10-01': {0: 25, 1: 20, 2: 21, 3: 26, 4: 38, 5: 6, 6: 67, 7: 23, 8: 107, 9: 10, 10: 9, 11: 36, 12: 30}, '2021-11-01': {0: 34, 1: 30, 2: 73, 3: 121, 4: 62, 5: 66, 6: 245, 7: 48, 8: 201, 9: 23, 10: 20, 11: 23, 12: 77}, '2021-12-01': {0: 30, 1: 41, 2: 210, 3: 321, 4: 95, 5: 169, 6: 278, 7: 42, 8: 191, 9: 57, 10: 51, 11: 51, 12: 72}, '2022-01-01': {0: 16, 1: 20, 2: 80, 3: 180, 4: 96, 5: 113, 6: 348, 7: 25, 8: 136, 9: 49, 10: 11, 11: 74, 12: 59}, '2022-02-01': {0: 11, 1: 23, 2: 48, 3: 72, 4: 45, 5: 56, 6: 129, 7: 20, 8: 113, 9: 3, 10: 6, 11: 47, 12: 33}, '2022-03-01': {0: 6, 1: 54, 2: 37, 3: 47, 4: 29, 5: 27, 6: 75, 7: 43, 8: 117, 9: 10, 10: 3, 11: 22, 12: 15}, '2022-04-01': {0: 6, 1: 17, 2: 9, 3: 29, 4: 7, 5: 9, 6: 40, 7: 20, 8: 81, 9: 4, 10: 3, 11: 12, 12: 10}, '2022-05-01': {0: 12, 1: 29, 2: 7, 3: 10, 4: 3, 5: 5, 6: 33, 7: 59, 8: 65, 9: 19, 10: 12, 11: 2, 12: 3}, '2022-06-01': {0: 12, 1: 34, 2: 21, 3: 28, 4: 19, 5: 16, 6: 122, 7: 67, 8: 68, 9: 12, 10: 5, 11: 5, 12: 9}, '2022-07-01': {0: 3, 1: 13, 2: 2, 3: 14, 4: 2, 5: 1, 6: 9, 7: 11, 8: 17, 9: 1, 10: 2, 11: 7, 12: 5}, '2022-08-01': {0: 3, 1: 42, 2: 2, 3: 19, 4: 1, 5: 7, 6: 26, 7: 28, 8: 26, 9: 42, 10: 2, 11: 5, 12: 9}, '2022-09-01': {0: 2, 1: 7, 2: 4, 3: 9, 4: 2, 5: 3, 6: 22, 7: 1, 8: 13, 9: 6, 10: 2, 11: 12, 12: 2}}
Visually the dataframe looks like this:
I need to melt the columns such that the columns are in rows.
The code I have written is:
df2 = pd.melt(df2,id_vars='Key',value_vars=[])
I am not sure how to add all the column names into value_vars here.? Can someone help me out.
CodePudding user response:
By combing everything which the community has mentioned on the chat, also reading the fact that you need a column name as well after melting, I believe both the codes shared below would work.
df2 = pd.melt(df2,id_vars='Key',value_vars=list(df1.columns[1:]),var_name ='ds')
Also we could get the same in another easier way which is:
df2 = pd.melt(df2, id_vars='Key',var_name='ds')
Very simple as our friend mentioned above, just added the column name by including var_name='ds'
.
CodePudding user response:
Considering that the dataframe df
looks like the following
Key 2019-01-01 ... 2022-07-01 2022-08-01 2022-09-01
0 DE|12345 51 ... 3 3 2
1 DE|292956 29 ... 13 42 7
2 DE|339231 33 ... 2 2 4
3 DE|281014 28 ... 14 19 9
4 DE|121274 12 ... 2 1 2
5 DE|924952 92 ... 1 7 3
6 DE|935053 93 ... 9 26 22
7 DE|945154 94 ... 11 28 1
8 DE|955255 95 ... 17 26 13
9 DE|965356 96 ... 1 42 6
10 DE|975457 97 ... 2 2 2
11 DE|985558 98 ... 7 5 12
12 DE|995659 99 ... 5 9 2
And given OP's goal
melt a dataframe and change it from pivot wide to pivot long.
This would be the difference between wide and long (
In order to do what OP wants, one can use pandas.melt
as follows
df_new = pd.melt(df, id_vars=['Key'], value_vars=df.columns[1:]).rename(columns={'variable': 'date'})
[Out]:
Key date value
0 DE|12345 2019-01-01 51
1 DE|292956 2019-01-01 29
2 DE|339231 2019-01-01 33
3 DE|281014 2019-01-01 28
4 DE|121274 2019-01-01 12
.. ... ... ...
580 DE|955255 2022-09-01 13
581 DE|965356 2022-09-01 6
582 DE|975457 2022-09-01 2
583 DE|985558 2022-09-01 12
584 DE|995659 2022-09-01 2
Note:
- As after the operation the column
date
was getting the namevariable
, one had to pass.rename()
to change the column name todate
.