Home > Back-end >  Reformatting Pandas DataFrame with multiple date and value columns
Reformatting Pandas DataFrame with multiple date and value columns

Time:09-17

I have data in the format:

data = [
    ["true","penguin","canary","01/01/2000","3","01/02/2000","10","01/03/2000", "4", "01/04/2000","181"],
    ["false","tiger","prod","02/01/2000","9","02/02/2000","101","02/03/2000","43","02/04/2000","11"]
]

df = pd.DataFrame(
    data,
    columns=[
       "status","team","env","date_1","value_1","date_2","value_2","date_3","value_3","date_4","value_4"])

And I'd like to get it into the format:

Status team env Date Value
true penguin canary 01/01/2000 3
false tiger prod 02/01/2000 9
true penguin canary 01/02/2000 10
false tiger prod 02/02/2000 101
true penguin canary 01/03/2000 4
false tiger prod 02/03/2000 43
true penguin canary 01/04/2000 181
false tiger prod 02/04/2000 11

I've been trying to use melt() and stack() and unstack(), but the closest I've come has been: df = df.melt(id_vars=["status", "team", "env"]) Which leaves me with a variable column containing 'date_1','date_1','value_1','value_1' and a value column with their respective values.

CodePudding user response:

You can try pandas.wide_to_long

out = (pd.wide_to_long(df, ['date_', 'value_'], i=['status','team','env'], j='idx')
       .rename(columns={'date_': 'Date', 'value_': 'Value'})
       .reset_index()
       .drop(columns='idx'))
print(out)

  status     team     env        Date Value
0   true  penguin  canary  01/01/2000     3
1   true  penguin  canary  01/02/2000    10
2   true  penguin  canary  01/03/2000     4
3   true  penguin  canary  01/04/2000   181
4  false    tiger    prod  02/01/2000     9
5  false    tiger    prod  02/02/2000   101
6  false    tiger    prod  02/03/2000    43
7  false    tiger    prod  02/04/2000    11
  • Related