Home > Software design >  Transforming unstructured data from excel into structured in python
Transforming unstructured data from excel into structured in python

Time:09-29

This is basically a "current output vs desired output" question. I'm sure this type of question has been asked before but I did not know what keywords to look up.

Currently, I have data like:

| CostCenter |Monday|Tuesday|Wednesday|
|:-----------|:-----------|:-----------|:-----------|
|A21|$1000|$2000|$3000|
|B21|$1000|$2000|$3000|
|C21|$1000|$2000|$3000|
Desired output is:
| CostCenter |DAY|VALUE| 
|:-----------|:-----------|:-----|
|A21|Monday|$1000|
|A21|Tuesday|$2000|
|A21|Wednesday|$3000|
|B21|Monday|$1000|
|B21|Tuesday|$2000|
|B21|Wednesday|$3000|

etc... Any ideas on how to achieve this using Python Pandas would be appreciated! Thank you! I was thinking of using crossjoin to densify the data and transposing the columns with days but haven't succeed in coding it.

CodePudding user response:

stack and reset the index after setting the index for CostCenter. then reset_index and rename columns

(df.set_index('CostCenter')
    .stack()
    .reset_index()
 .rename(columns={'level_1': 'Day', 0:'Value'})
)
CostCenter  Day         Value
0   A21     Monday      $1000
1   A21     Tuesday     $2000
2   A21     Wednesday   $3000
3   B21     Monday      $1000
4   B21     Tuesday     $2000
5   B21     Wednesday   $3000
6   C21     Monday      $1000
7   C21     Tuesday     $2000
8   C21     Wednesday   $3000
  • Related