Home > front end >  Convert multiple columns into a single column in Python
Convert multiple columns into a single column in Python

Time:12-04

Here is the dataset I have:

Employee Jul Aug Sep Oct Nov Dec Jan Feb Mar Apr May Jun Year Index Program Status Bid Category Account
E_123456789 #missing #missing #missing #missing 1 1 1 1 1 1 1 1 FY22 I_1234 NA Entity Final Janitor 1 Overhead
E_123456799 #missing 0 0 0 0 0 0 .5 .9 .1 0 0 FY22 I_1255 P12345123 Extended Cook 3 Utilization

This is what I'd like to have:

Employee Month Year Index Program Status Bid Category Account Percentage
E_123456789 Jul FY22 I_1234 NA Entity Final Janitor 1 Overhead #missing
E_123456789 Aug FY22 I_1234 NA Entity Final Janitor 1 Overhead #missing
E_123456789 Sep FY22 I_1234 NA Entity Final Janitor 1 Overhead #missing
E_123456789 Oct FY22 I_1234 NA Entity Final Janitor 1 Overhead #missing
E_123456789 Nov FY22 I_1234 NA Entity Final Janitor 1 Overhead 1
E_123456789 Dec FY22 I_1234 NA Entity Final Janitor 1 Overhead 1
E_123456789 Jan FY22 I_1234 NA Entity Final Janitor 1 Overhead 1
E_123456789 Feb FY22 I_1234 NA Entity Final Janitor 1 Overhead 1
E_123456789 Mar FY22 I_1234 NA Entity Final Janitor 1 Overhead 1
E_123456789 Apr FY22 I_1234 NA Entity Final Janitor 1 Overhead 1
E_123456789 May FY22 I_1234 NA Entity Final Janitor 1 Overhead 1
E_123456789 Jun FY22 I_1234 NA Entity Final Janitor 1 Overhead 1
E_123456799 Jul FY22 I_1255 P12345123 Extended Cook 3 Utilization #missing
E_123456799 Aug FY22 I_1255 P12345123 Extended Cook 3 Utilization 0
E_123456799 Sep FY22 I_1255 P12345123 Extended Cook 3 Utilization 0
E_123456799 Oct FY22 I_1255 P12345123 Extended Cook 3 Utilization 0
E_123456799 Nov FY22 I_1255 P12345123 Extended Cook 3 Utilization 0
E_123456799 Dec FY22 I_1255 P12345123 Extended Cook 3 Utilization 0
E_123456799 Jan FY22 I_1255 P12345123 Extended Cook 3 Utilization 0
E_123456799 Feb FY22 I_1255 P12345123 Extended Cook 3 Utilization .5
E_123456799 Mar FY22 I_1255 P12345123 Extended Cook 3 Utilization .9
E_123456799 Apr FY22 I_1255 P12345123 Extended Cook 3 Utilization .1
E_123456799 May FY22 I_1255 P12345123 Extended Cook 3 Utilization 0
E_123456799 Jun FY22 I_1255 P12345123 Extended Cook 3 Utilization 0

I found steps that seemed to do the reverse of this (convert the "Month" column into Jul/Aug/Sep/etc) but I can't find anything related to this. I am very new to Python (read: a couple weeks of use) and am sure there is a relatively easy way to do this that I just don't know.

Here is my file:

import pandas as pd

data = pd.read_csv('SMART_Export_Test.csv')

Using pivot does the opposite of what I want (it spreads the values in the Jul column into new column names):

data.pivot(columns = 'Jul', values='Jul')

Any help/direction would be greatly appreciated!

CodePudding user response:

Try with melt:

>>> df.melt(["Employee", "Year", "Index", "Program", "Status", "Bid Category", "Account"], 
            ["Jan", "Feb", "Mar", "Apr", "May", "Jun", 
             "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
            "Percentage")

       Employee  Year   Index  ...      Account Percentage     value
0   E_123456789  FY22  I_1234  ...     Overhead        Jan         1
1   E_123456799  FY22  I_1255  ...  Utilization        Jan         0
2   E_123456789  FY22  I_1234  ...     Overhead        Feb       1.0
3   E_123456799  FY22  I_1255  ...  Utilization        Feb       0.5
4   E_123456789  FY22  I_1234  ...     Overhead        Mar       1.0
5   E_123456799  FY22  I_1255  ...  Utilization        Mar       0.9
6   E_123456789  FY22  I_1234  ...     Overhead        Apr       1.0
7   E_123456799  FY22  I_1255  ...  Utilization        Apr       0.1
8   E_123456789  FY22  I_1234  ...     Overhead        May         1
9   E_123456799  FY22  I_1255  ...  Utilization        May         0
10  E_123456789  FY22  I_1234  ...     Overhead        Jun         1
11  E_123456799  FY22  I_1255  ...  Utilization        Jun         0
12  E_123456789  FY22  I_1234  ...     Overhead        Jul  #missing
13  E_123456799  FY22  I_1255  ...  Utilization        Jul  #missing
14  E_123456789  FY22  I_1234  ...     Overhead        Aug  #missing
15  E_123456799  FY22  I_1255  ...  Utilization        Aug         0
16  E_123456789  FY22  I_1234  ...     Overhead        Sep  #missing
17  E_123456799  FY22  I_1255  ...  Utilization        Sep         0
18  E_123456789  FY22  I_1234  ...     Overhead        Oct  #missing
19  E_123456799  FY22  I_1255  ...  Utilization        Oct         0
20  E_123456789  FY22  I_1234  ...     Overhead        Nov         1
21  E_123456799  FY22  I_1255  ...  Utilization        Nov         0
22  E_123456789  FY22  I_1234  ...     Overhead        Dec         1
23  E_123456799  FY22  I_1255  ...  Utilization        Dec         0

[24 rows x 9 columns]
  • Related