I am trying to extract the ending year (YY) from a fiscal date string in the format YYYY-YY. For example, the ending year of '1999-00' would be 2000.
My current code covers every case except this one.
import pandas as pd
import numpy as np

test_df = pd.DataFrame(data={
    'Season': ['1996-97', '1997-98', '1998-99',
               '1999-00', '2000-01', '2001-02',
               '2002-03', '2003-04', '2004-05',
               '2005-06', '2006-07', '2007-08',
               '2008-09', '2009-10', '2010-11', '2011-12'],
    'Height': np.random.randint(20, size=16),
    'Weight': np.random.randint(40, size=16),
})
I need logic to handle the case where the season spans the end of a century: there, my apply method should increment the first two digits. I believe this is the only case I am missing.
Current code is as follows:
test_df['Season'] = test_df['Season'].apply(lambda x: x[0:2] + x[5:7])
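To make the failing case concrete, here is a minimal sketch that applies the same slicing to plain strings, without pandas. The helper name `naive_end_year` is hypothetical and exists only for this illustration.

```python
def naive_end_year(season: str) -> str:
    # Hypothetical helper: same slicing as the lambda above,
    # i.e. the century digits glued onto the two-digit suffix.
    return season[0:2] + season[5:7]

print(naive_end_year("1998-99"))  # 1999
print(naive_end_year("1999-00"))  # 1900, but 2000 is wanted
```

The century digits are always taken from the starting year, so any season that crosses a century boundary comes out 100 years too early.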
CodePudding user response:
This should work too:
pd.to_numeric(test_df['Season'].str.split('-').str[0]) + 1
Output:
0 1997
1 1998
2 1999
3 2000
4 2001
5 2002
6 2003
7 2004
8 2005
9 2006
10 2007
11 2008
12 2009
13 2010
14 2011
15 2012
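This approach ignores the two-digit suffix entirely and assumes every season ends exactly one year after it starts. If the data might contain malformed rows, a small check along these lines could catch them. It is only a sketch in plain Python, and `end_year` is a hypothetical name, not part of pandas.

```python
def end_year(season: str) -> int:
    # Hypothetical helper: start year + 1, cross-checked against the suffix.
    start, suffix = season.split("-")
    end = int(start) + 1
    # The last two digits of the computed year should match the YY part.
    if end % 100 != int(suffix):
        raise ValueError(f"suffix does not match start year + 1: {season!r}")
    return end

print(end_year("1999-00"))  # 2000
print(end_year("2009-10"))  # 2010
```

It can be used with `test_df['Season'].apply(end_year)` in the same way as the lambda in the question.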
CodePudding user response:
Here you go! Use the following function instead of the lambda:
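def get_season(string):
    century = int(string[:2])
    preyear = int(string[2:4])
    postyear = int(string[5:7])
    # A suffix smaller than the start year's last two digits (e.g. 99 -> 00)
    # means the season crossed into the next century
    if postyear < preyear:
        century += 1
    # zfill is so that "1" becomes "01"
    return str(century).zfill(2) + str(postyear).zfill(2)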
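If an integer result is more convenient than a string, the same rollover rule can be written with integer arithmetic. This is a sketch under the assumption that a season never spans 100 years or more, and `season_end_year` is a hypothetical name.

```python
def season_end_year(season: str) -> int:
    # Hypothetical integer variant of the century-rollover rule.
    start = int(season[:4])    # e.g. 1999
    suffix = int(season[5:7])  # e.g. 0 for "00"
    century = start // 100
    # A suffix below the start year's last two digits means a new century.
    if suffix < start % 100:
        century += 1
    return century * 100 + suffix

print(season_end_year("1999-00"))  # 2000
print(season_end_year("1998-01"))  # 2001
```

Unlike adding 1 to the starting year, this rule also handles strings whose suffix is more than one year after the start, such as '1998-01'.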
CodePudding user response:
I use the fiscalyear module.
import numpy as np
import pandas as pd
import fiscalyear as fy
...
test_df['Season'] = test_df['Season'].apply(lambda x: fy.FiscalYear(int(x[0:4]) + 1).fiscal_year)
print(test_df)