Change strings in a dataframe-column by inserting a char on a specific position

Time:09-29

In the 'date' column I have strings in the format yyyymm.

I want to change these to the format yyyy0mm.

The code below solves the problem, but is there a shorter, better way to do this?

import pandas as pd
data = {
  "date": ['202201','202202','202203'],
  "duration": [50, 40, 45]
}
df = pd.DataFrame(data)
def change_date(row):
    # insert "0" after the four-digit year (avoid shadowing the built-in str)
    value = row['date']
    return value[:4] + "0" + value[4:]
df['date'] = df.apply(change_date, axis=1)

CodePudding user response:

Using a regex:

# insert a 0 before the last 2 digits
df['date'] = df['date'].str.replace(r'(?=\d{2}$)', '0', regex=True)
# variant (use instead of the line above, not in addition): insert after the first 4 digits
df['date'] = df['date'].str.replace(r'(?<=^\d{4})', '0', regex=True)

Using concatenation:

# concatenate the first 4, 0, the rest
df['date'] = df['date'].str[:4] + '0' + df['date'].str[4:]

output:

      date  duration
0  2022001        50
1  2022002        40
2  2022003        45
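
To see where the two regex patterns above actually match, here is a minimal sketch using Python's built-in re module on plain strings rather than on the dataframe. The names samples, before_last_two and after_first_four are only illustrative and are not part of the original answer.

```python
import re

samples = ["202201", "202202", "202203"]

# (?=\d{2}$) is a zero-width lookahead: it matches the empty position
# that is followed by two digits and then the end of the string.
before_last_two = [re.sub(r"(?=\d{2}$)", "0", s) for s in samples]

# (?<=^\d{4}) is a zero-width lookbehind: it matches the empty position
# that is preceded by the start of the string and four digits.
after_first_four = [re.sub(r"(?<=^\d{4})", "0", s) for s in samples]

print(before_last_two)   # ['2022001', '2022002', '2022003']
print(after_first_four)  # ['2022001', '2022002', '2022003']
```

Both patterns match an empty position, so the replacement inserts the '0' without consuming any of the original characters. For six-character yyyymm strings the two positions coincide, which is why either variant gives the same result.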


CodePudding user response:

Try this:

df["date"] = df["date"].apply(lambda x: x[:-2] + "0" + x[-2:])
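
If you would rather avoid apply with a lambda, another option (not mentioned in the answers above, so treat it as a suggestion) is pandas' Series.str.slice_replace. A minimal sketch, assuming the same example data as in the question:

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["202201", "202202", "202203"],
    "duration": [50, 40, 45],
})

# start == stop selects an empty slice at index 4, so "0" is inserted
# there and none of the existing characters are replaced
df["date"] = df["date"].str.slice_replace(start=4, stop=4, repl="0")

print(df["date"].tolist())  # ['2022001', '2022002', '2022003']
```

This does the same thing as the slicing-and-concatenation version, just with the insert position spelled out as a single call.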