Home > Software engineering >  Creating t-1 column with lag values in Pandas
Creating t-1 column with lag values in Pandas

Time:03-03

I have a dataframe

{'Year': {0: 2017, 1: 2018, 2: 2019, 3: 2017, 4: 2018, 5: 2019, 6: 2017, 7: 2018, 8: 2019}, 'Total_articles_per_year': {0: 2, 1: 1, 2: 0, 3: 4, 4: 7, 5: 3, 6: 2, 7: 3, 8: 1}, 'Total_articles_2017': {0: 2, 1: 2, 2: 2, 3: 4, 4: 4, 5: 4, 6: 2, 7: 2, 8: 2}, 'Total_articles_2018': {0: 1, 1: 1, 2: 1, 3: 7, 4: 7, 5: 7, 6: 3, 7: 3, 8: 3}, 'Total_articles_2019 ': {0: 0, 1: 0, 2: 0, 3: 3, 4: 3, 5: 3, 6: 1, 7: 1, 8: 1}}

I want to create a new column t-1.

Condition for this column will be:

  • If df[Year'] == 2017, df[t-1] = NaN
  • If df[Year'] == 2018, for value if df[t-1] take value from Total_articles_per_year2017
  • If df[Year'] == 2019, for value if df[t-1] take value from Total_articles_per_year2018

Expected output:

{'Year': {0: 2017, 1: 2018, 2: 2019, 3: 2017, 4: 2018, 5: 2019, 6: 2017, 7: 2018, 8: 2019}, 'Total_articles_per_year': {0: 2, 1: 1, 2: 0, 3: 4, 4: 7, 5: 3, 6: 2, 7: 3, 8: 1}, 'Total_articles_2017': {0: 2, 1: 2, 2: 2, 3: 4, 4: 4, 5: 4, 6: 2, 7: 2, 8: 2}, 'Total_articles_2018': {0: 1, 1: 1, 2: 1, 3: 7, 4: 7, 5: 7, 6: 3, 7: 3, 8: 3}, 'Total_articles_2019 ': {0: 0, 1: 0, 2: 0, 3: 3, 4: 3, 5: 3, 6: 1, 7: 1, 8: 1}, 't-1': {0: nan, 1: 2.0, 2: 1.0, 3: nan, 4: 4.0, 5: 7.0, 6: nan, 7: 2.0, 8: 3.0}}

CodePudding user response:

You can reshape values by DataFrame.melt, so possible compare Year subtracted by 1 for previus year with add Total_articles_:

melt = df.melt(['Year'], ignore_index=False)
mask = 'Total_articles_'   melt['Year'].sub(1).astype(str) == melt['variable']
df['t-1'] = melt.loc[mask, 'value']
print (df)
   Year  Total_articles_per_year  Total_articles_2017  Total_articles_2018  \
0  2017                        2                    2                    1   
1  2018                        1                    2                    1   
2  2019                        0                    2                    1   
3  2017                        4                    4                    7   
4  2018                        7                    4                    7   
5  2019                        3                    4                    7   
6  2017                        2                    2                    3   
7  2018                        3                    2                    3   
8  2019                        1                    2                    3   

   Total_articles_2019   t-1  
0                     0  NaN  
1                     0  2.0  
2                     0  1.0  
3                     3  NaN  
4                     3  4.0  
5                     3  7.0  
6                     1  NaN  
7                     1  2.0  
8                     1  3.0  
    
  • Related