I have a dataframe of NBA players from the years 1996 to 2012. The season column is in the format "1996-97", but I would like to convert it to "1996 - 1997". Further down, the season starting in 1999 appears as "1999-00", which does not look good. I tried the following code:
all_seasons['Date'] = pd.to_datetime(all_seasons['Date'], format='%Y-%Y').dt.strftime('%Y-%Y')
It returned an error: redefinition of group name 'Y' as group 2; was group 1.
I also used this code
all_seasons['Date'] = all_seasons['Date'].dt.strftime('%Y-%Y')
but it returned the error
AttributeError: Can only use .dt accessor with datetimelike values
What do I do?
CodePudding user response:
You can convert the last 2 digits to datetimes and then append the result to the first 4 digits:
print (df)
Date
0 1996-97
1 1999-00
df['Date'] = (df.Date.str[:4] +
              pd.to_datetime(df.Date.str[-2:], format='%y').dt.strftime(' - %Y'))
print (df)
Date
0 1996 - 1997
1 1999 - 2000
Or, if you use the %Y-%y format, it parses the second year, which is then appended to the first 4 digits:
df['Date'] = df.Date.str[:4] + pd.to_datetime(df.Date, format='%Y-%y').dt.strftime(' - %Y')
print (df)
Date
0 1996 - 1997
1 1999 - 2000
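For reference, the first approach can be written as a self-contained sketch. The sample rows and the Season column name are assumptions made for illustration. Note that %y follows the usual two-digit pivot (00-68 becomes 20xx, 69-99 becomes 19xx), which is fine for seasons from 1996 to 2012.

```python
import pandas as pd

# Sample data (assumed for illustration): same "YYYY-YY" layout as the question.
df = pd.DataFrame({'Date': ['1996-97', '1999-00', '2011-12']})

# Parse the two-digit suffix as a year, then format it with four digits.
end_year = pd.to_datetime(df['Date'].str[-2:], format='%y').dt.strftime('%Y')

# Join the four-digit start year and the expanded end year.
df['Season'] = df['Date'].str[:4] + ' - ' + end_year

print(df['Season'].tolist())
```

Writing to a separate column keeps the original strings around, so the result can be checked before overwriting Date.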
CodePudding user response:
You can use a regex with str.replace and a custom function. For example, here we replace xx with 19xx if xx is ≥ 60, else 20xx:
df = pd.DataFrame({'Date': ['1996-97', '1999-00']})
df['Date2'] = df['Date'].str.replace(r'(\d\d)$',
                                     lambda m: ('19' if int(m.group()) >= 60 else '20') + m.group(),
                                     regex=True)
output (as separate column Date2 for clarity):
Date Date2
0 1996-97 1996-1997
1 1999-00 1999-2000
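A variant of the same regex idea, sketched under the assumption that every season spans two consecutive calendar years: instead of guessing the century from the two-digit suffix, it computes the end year as the start year plus one. It also produces the spaced "1996 - 1997" form asked for in the question. The helper name expand_season and the Season column are hypothetical, not part of the answer above.

```python
import pandas as pd

# Sample data (assumed for illustration).
df = pd.DataFrame({'Date': ['1996-97', '1999-00', '2011-12']})

def expand_season(m):
    # Hypothetical helper: m.group(1) is the four-digit start year.
    # Assumption: the season always ends in the following calendar year.
    start = int(m.group(1))
    return f"{start} - {start + 1}"

# Match the whole "YYYY-YY" string and rebuild it from the start year.
df['Season'] = df['Date'].str.replace(r'^(\d{4})-\d{2}$', expand_season, regex=True)

print(df['Season'].tolist())
```

Strings that do not match the YYYY-YY pattern are left unchanged, so malformed rows are easy to spot afterwards.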