Convert date format from '1996-97' to '1996-1997'

Time:08-10

I have a dataframe of NBA players from 1996 to 2012. The season column is in the format "1996-97", but I would like to convert it to '1996 - 1997'. Further down, the 1999 season appears as "1999-00", which does not look good. I tried the following code:

all_seasons['Date'] = pd.to_datetime(all_seasons['Date'], format='%Y-%Y').dt.strftime('%Y-%Y')

It returned an error: redefinition of group name 'Y' as group 2; was group 1.

I also used this code

all_seasons['Date'] = all_seasons['Date'].dt.strftime('%Y-%Y')

but it returned the error

AttributeError: Can only use .dt accessor with datetimelike values

What do I do?

CodePudding user response:

You can convert the last 2 digits to datetimes and then append the resulting year to the first 4 digits:

print (df)
      Date
0  1996-97
1  1999-00

df['Date'] = (df.Date.str[:4] +
              pd.to_datetime(df.Date.str[-2:], format='%y').dt.strftime(' - %Y'))
print (df)
          Date
0  1996 - 1997
1  1999 - 2000

Or, if you use the %Y-%y format, the second year is the one that gets parsed, so it can be appended to the first 4 digits:

df['Date'] = df.Date.str[:4] + pd.to_datetime(df.Date, format='%Y-%y').dt.strftime(' - %Y')
print (df)
          Date
0  1996 - 1997
1  1999 - 2000
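
If every season ends in the year right after it starts (an assumption about the data, not something pandas checks), the datetime parsing can be skipped altogether. This is a different technique from the two above: a minimal sketch that reads the first 4 digits as an integer and adds 1. The column name Season is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Date': ['1996-97', '1999-00']})

# First 4 characters are the starting year of the season
start = df['Date'].str[:4].astype(int)

# Assumption: a season always ends in the year after it starts
df['Season'] = start.astype(str) + ' - ' + (start + 1).astype(str)
print(df)
```

This ignores the last 2 digits entirely, so it would silently produce a wrong result for a value such as '1996-98'.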

CodePudding user response:

You can use a regex with str.replace and a custom function.

For example, here we replace xx with 19xx if xx is ≥ 60, else 20xx:

df = pd.DataFrame({'Date': ['1996-97', '1999-00']})

df['Date2'] = df['Date'].str.replace(r'(\d\d)$',
              lambda m: ('19' if int(m.group())>=60 else '20') + m.group(),
              regex=True)

output (as separate column Date2 for clarity):

      Date      Date2
0  1996-97  1996-1997
1  1999-00  1999-2000
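
A variation on the same idea, sketched here as one possibility: instead of a fixed cutoff of 60, take the century from the starting year and roll it over when the 2-digit year would otherwise come before the start. The helper name expand_season is hypothetical, and the pattern assumes every value looks exactly like 'YYYY-YY'. It also writes the ' - ' separator the question asked for.

```python
import pandas as pd

def expand_season(m):
    """Turn a 'YYYY-YY' match into 'YYYY - YYYY' (hypothetical helper)."""
    start = int(m.group(1))
    # Reuse the century of the starting year for the 2-digit ending year
    end = start - start % 100 + int(m.group(2))
    # e.g. 1999-00: 1900 would precede 1999, so move to the next century
    if end < start:
        end += 100
    return f"{start} - {end}"

df = pd.DataFrame({'Date': ['1996-97', '1999-00']})
df['Date2'] = df['Date'].str.replace(r'^(\d{4})-(\d{2})$', expand_season, regex=True)
print(df)
```

Values that do not match the pattern are left unchanged by str.replace.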