Home > Software design >  Pivot Wide to Long without using all Column names in Python
Pivot Wide to Long without using all Column names in Python

Time:09-29

I am trying to use pd.melt() to melt a dataframe and change it from pivot wide to pivot long. But since the number of columns in the dataframe is quite large, I was wondering if there is a way to input the range of the columns or any other easy way than to write the column header names in the pd.melt() statement.

Attached is the data frame which I am using:

{'Key': {0: 'DE|12345', 1: 'DE|292956', 2: 'DE|339231', 3: 'DE|281014', 4: 'DE|121274', 5: 'DE|924952', 6: 'DE|935053', 7: 'DE|945154', 8: 'DE|955255', 9: 'DE|965356', 10: 'DE|975457', 11: 'DE|985558', 12: 'DE|995659'}, '2019-01-01': {0: 51, 1: 29, 2: 33, 3: 28, 4: 12, 5: 92, 6: 93, 7: 94, 8: 95, 9: 96, 10: 97, 11: 98, 12: 99}, '2019-02-01': {0: 5, 1: 29, 2: 92, 3: 10, 4: 12, 5: 49, 6: 50, 7: 51, 8: 52, 9: 53, 10: 54, 11: 55, 12: 56}, '2019-03-01': {0: 41, 1: 56, 2: 31, 3: 14, 4: 74, 5: 52, 6: 53, 7: 54, 8: 55, 9: 56, 10: 57, 11: 58, 12: 59}, '2019-04-01': {0: 64, 1: 93, 2: 25, 3: 145, 4: 40, 5: 60, 6: 61, 7: 62, 8: 63, 9: 64, 10: 65, 11: 66, 12: 67}, '2019-05-01': {0: 81, 1: 15, 2: 30, 3: 85, 4: 38, 5: 41, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-06-01': {0: 31, 1: 21, 2: 9, 3: 58, 4: 8, 5: 26, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-07-01': {0: 61, 1: 79, 2: 20, 3: 121, 4: 21, 5: 42, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-08-01': {0: 64, 1: 22, 2: 24, 3: 108, 4: 21, 5: 45, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-09-01': {0: 29, 1: 50, 2: 20, 3: 85, 4: 25, 5: 37, 6: 74, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-10-01': {0: 88, 1: 32, 2: 45, 3: 214, 4: 53, 5: 117, 6: 170, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-11-01': {0: 77, 1: 61, 2: 159, 3: 692, 4: 139, 5: 332, 6: 444, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2019-12-01': {0: 52, 1: 38, 2: 98, 3: 338, 4: 85, 5: 198, 6: 409, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}, '2020-01-01': {0: 46, 1: 48, 2: 78, 3: 191, 4: 66, 5: 89, 6: 237, 7: 0, 8: 21, 9: 13, 10: 8, 11: 0, 12: 0}, '2020-02-01': {0: 46, 1: 40, 2: 45, 3: 124, 4: 34, 5: 49, 6: 98, 7: 45, 8: 143, 9: 46, 10: 11, 11: 0, 12: 0}, '2020-03-01': {0: 42, 1: 50, 2: 30, 3: 96, 4: 26, 5: 29, 6: 50, 7: 17, 8: 46, 9: 19, 10: 5, 11: 0, 12: 0}, '2020-04-01': {0: 62, 1: 104, 2: 37, 3: 128, 4: 41, 5: 52, 6: 111, 7: 55, 8: 10, 9: 17, 10: 6, 11: 0, 12: 0}, '2020-05-01': {0: 50, 1: 70, 2: 40, 3: 50, 4: 39, 5: 32, 6: 81, 7: 28, 8: 0, 9: 17, 10: 10, 11: 0, 12: 0}, '2020-06-01': {0: 55, 1: 50, 2: 55, 3: 50, 4: 0, 5: 3, 6: 48, 7: 8, 8: 20, 9: 11, 10: 6, 11: 0, 12: 0}, '2020-07-01': {0: 41, 1: 1, 2: 32, 3: 51, 4: 0, 5: 0, 6: 48, 7: 16, 8: 0, 9: 18, 10: 9, 11: 0, 12: 5}, '2020-08-01': {0: 34, 1: 117, 2: 11, 3: 77, 4: 0, 5: 0, 6: 59, 7: 17, 8: 14, 9: 9, 10: 10, 11: 0, 12: 5}, '2020-09-01': {0: 46, 1: 84, 2: 11, 3: 37, 4: 10, 5: 7, 6: 70, 7: 41, 8: 119, 9: 62, 10: 13, 11: 0, 12: 9}, '2020-10-01': {0: 16, 1: 93, 2: 33, 3: 115, 4: 77, 5: 9, 6: 85, 7: 44, 8: 183, 9: 27, 10: 32, 11: 14, 12: 71}, '2020-11-01': {0: 29, 1: 55, 2: 43, 3: 67, 4: 25, 5: 105, 6: 87, 7: 93, 8: 235, 9: 72, 10: 21, 11: 9, 12: 31}, '2020-12-01': {0: 16, 1: 78, 2: 53, 3: 462, 4: 7, 5: 118, 6: 152, 7: 59, 8: 190, 9: 88, 10: 18, 11: 1, 12: 64}, '2021-01-01': {0: 17, 1: 58, 2: 134, 3: 303, 4: 94, 5: 118, 6: 196, 7: 57, 8: 124, 9: 50, 10: 18, 11: 31, 12: 8}, '2021-02-01': {0: 25, 1: 42, 2: 46, 3: 83, 4: 81, 5: 98, 6: 79, 7: 36, 8: 60, 9: 31, 10: 16, 11: 26, 12: 34}, '2021-03-01': {0: 14, 1: 74, 2: 6, 3: 106, 4: 43, 5: 55, 6: 92, 7: 25, 8: 147, 9: 46, 10: 32, 11: 33, 12: 51}, '2021-04-01': {0: 46, 1: 142, 2: 45, 3: 105, 4: 22, 5: 40, 6: 108, 7: 119, 8: 210, 9: 43, 10: 26, 11: 21, 12: 40}, '2021-05-01': {0: 43, 1: 49, 2: 20, 3: 52, 4: 16, 5: 14, 6: 7, 7: 63, 8: 75, 9: 21, 10: 20, 11: 19, 12: 34}, '2021-06-01': {0: 31, 1: 83, 2: 15, 3: 40, 4: 19, 5: 1, 6: 24, 7: 84, 8: 209, 9: 38, 10: 16, 11: 21, 12: 18}, '2021-07-01': {0: 22, 1: 36, 2: 8, 3: 31, 4: 9, 5: 2, 6: 43, 7: 28, 8: 115, 9: 9, 10: 20, 11: 24, 12: 27}, '2021-08-01': {0: 15, 1: 42, 2: 24, 3: 26, 4: 28, 5: 2, 6: 45, 7: 28, 8: 89, 9: 9, 10: 7, 11: 27, 12: 26}, '2021-09-01': {0: 13, 1: 29, 2: 6, 3: 32, 4: 12, 5: 12, 6: 30, 7: 19, 8: 72, 9: 5, 10: 16, 11: 16, 12: 22}, '2021-10-01': {0: 25, 1: 20, 2: 21, 3: 26, 4: 38, 5: 6, 6: 67, 7: 23, 8: 107, 9: 10, 10: 9, 11: 36, 12: 30}, '2021-11-01': {0: 34, 1: 30, 2: 73, 3: 121, 4: 62, 5: 66, 6: 245, 7: 48, 8: 201, 9: 23, 10: 20, 11: 23, 12: 77}, '2021-12-01': {0: 30, 1: 41, 2: 210, 3: 321, 4: 95, 5: 169, 6: 278, 7: 42, 8: 191, 9: 57, 10: 51, 11: 51, 12: 72}, '2022-01-01': {0: 16, 1: 20, 2: 80, 3: 180, 4: 96, 5: 113, 6: 348, 7: 25, 8: 136, 9: 49, 10: 11, 11: 74, 12: 59}, '2022-02-01': {0: 11, 1: 23, 2: 48, 3: 72, 4: 45, 5: 56, 6: 129, 7: 20, 8: 113, 9: 3, 10: 6, 11: 47, 12: 33}, '2022-03-01': {0: 6, 1: 54, 2: 37, 3: 47, 4: 29, 5: 27, 6: 75, 7: 43, 8: 117, 9: 10, 10: 3, 11: 22, 12: 15}, '2022-04-01': {0: 6, 1: 17, 2: 9, 3: 29, 4: 7, 5: 9, 6: 40, 7: 20, 8: 81, 9: 4, 10: 3, 11: 12, 12: 10}, '2022-05-01': {0: 12, 1: 29, 2: 7, 3: 10, 4: 3, 5: 5, 6: 33, 7: 59, 8: 65, 9: 19, 10: 12, 11: 2, 12: 3}, '2022-06-01': {0: 12, 1: 34, 2: 21, 3: 28, 4: 19, 5: 16, 6: 122, 7: 67, 8: 68, 9: 12, 10: 5, 11: 5, 12: 9}, '2022-07-01': {0: 3, 1: 13, 2: 2, 3: 14, 4: 2, 5: 1, 6: 9, 7: 11, 8: 17, 9: 1, 10: 2, 11: 7, 12: 5}, '2022-08-01': {0: 3, 1: 42, 2: 2, 3: 19, 4: 1, 5: 7, 6: 26, 7: 28, 8: 26, 9: 42, 10: 2, 11: 5, 12: 9}, '2022-09-01': {0: 2, 1: 7, 2: 4, 3: 9, 4: 2, 5: 3, 6: 22, 7: 1, 8: 13, 9: 6, 10: 2, 11: 12, 12: 2}}

Visually the dataframe looks like this:

enter image description here

I need to melt the columns such that the columns are in rows.

The code I have written is:

df2 =  pd.melt(df2,id_vars='Key',value_vars=[])

I am not sure how to add all the column names into value_vars here.? Can someone help me out.

CodePudding user response:

By combing everything which the community has mentioned on the chat, also reading the fact that you need a column name as well after melting, I believe both the codes shared below would work.

df2 = pd.melt(df2,id_vars='Key',value_vars=list(df1.columns[1:]),var_name ='ds')

Also we could get the same in another easier way which is:

df2 = pd.melt(df2, id_vars='Key',var_name='ds')

Very simple as our friend mentioned above, just added the column name by including var_name='ds'.

CodePudding user response:

Considering that the dataframe df looks like the following

          Key 2019-01-01  ... 2022-07-01 2022-08-01 2022-09-01
 0   DE|12345         51  ...          3          3          2
 1  DE|292956         29  ...         13         42          7
 2  DE|339231         33  ...          2          2          4
 3  DE|281014         28  ...         14         19          9
 4  DE|121274         12  ...          2          1          2
 5  DE|924952         92  ...          1          7          3
 6  DE|935053         93  ...          9         26         22
 7  DE|945154         94  ...         11         28          1
 8  DE|955255         95  ...         17         26         13
 9  DE|965356         96  ...          1         42          6
10  DE|975457         97  ...          2          2          2
11  DE|985558         98  ...          7          5         12
12  DE|995659         99  ...          5          9          2

And given OP's goal

melt a dataframe and change it from pivot wide to pivot long.

This would be the difference between wide and long (enter image description here

In order to do what OP wants, one can use pandas.melt as follows

df_new = pd.melt(df, id_vars=['Key'], value_vars=df.columns[1:]).rename(columns={'variable': 'date'})

[Out]:
           Key        date value
0     DE|12345  2019-01-01    51
1    DE|292956  2019-01-01    29
2    DE|339231  2019-01-01    33
3    DE|281014  2019-01-01    28
4    DE|121274  2019-01-01    12
..         ...         ...   ...
580  DE|955255  2022-09-01    13
581  DE|965356  2022-09-01     6
582  DE|975457  2022-09-01     2
583  DE|985558  2022-09-01    12
584  DE|995659  2022-09-01     2

Note:

  • As after the operation the column date was getting the name variable, one had to pass .rename() to change the column name to date.
  • Related