Remove '-' and two following characters using regex?

I have a set of laboratory data where the date column looks like this:

12-15.11.12
19-22.11.12
26-29.11.12
03-06.12.12
10-13.12.12
17-20.12.12
19-23.12.12
27-30.12.12
02-05.01.13

I only want the first value (the day of sampling) so that I can convert the column into a pandas datetime series and continue working with the data.

I know I can delete it manually in Excel, but I would like to do it in code. For example, my goal is:

12-15.11.12 -> 12.11.2012, '-15' gets deleted.

CodePudding user response：

You can use re.sub with the pattern -\d+ (regex101):

import re

data = '''\
12-15.11.12
19-22.11.12
26-29.11.12
03-06.12.12
10-13.12.12
17-20.12.12
19-23.12.12
27-30.12.12
02-05.01.13'''

data = re.sub(r'-\d+', '', data)
print(data)

Prints:

12.11.12
19.11.12
26.11.12
03.12.12
10.12.12
17.12.12
19.12.12
27.12.12
02.01.13
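The question asks for a full date such as 12.11.2012, so the cleaned strings still need parsing. Here is a minimal sketch using only the standard library. It assumes every value has a two-digit year that belongs to the 2000s; the helper name to_full_date is made up for illustration.

```python
import re
from datetime import datetime

def to_full_date(raw):
    """Drop the '-DD' end-of-range part and return the date as DD.MM.YYYY."""
    # '12-15.11.12' -> '12.11.12' (only the first '-DD' is removed)
    first_day = re.sub(r"-\d+", "", raw, count=1)
    # %y parses a two-digit year, so '12' becomes 2012
    parsed = datetime.strptime(first_day, "%d.%m.%y")
    return parsed.strftime("%d.%m.%Y")

print(to_full_date("12-15.11.12"))  # 12.11.2012
```

With pandas, the same idea should work on a whole column: Series.str.replace(r'-\d+', '', regex=True) followed by pandas.to_datetime(..., format='%d.%m.%y').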

CodePudding user response：

import re

dates = [ "12-15.11.12", "19-22.11.12", "26-29.11.12", "03-06.12.12", "10-13.12.12", "17-20.12.12", "19-23.12.12", "27-30.12.12", "02-05.01.13" ]

cleaned_dates = []
for date in dates:
    # keep the leading day (group 1) and drop the '-DD' part
    date = re.sub(r"(\d+)-\d+", r"\1", date)
    cleaned_dates.append(date)

print(cleaned_dates)
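Both answers remove the same text; the second just keeps the leading day through a capture group. A stricter variant, sketched below, anchors the pattern to the start of the string and requires exactly two digits on each side of the dash, so values without a range are left untouched. The name RANGE_DAY and the range-free sample "07.02.13" are assumptions added for illustration; they are not from the original data.

```python
import re

# Anchored pattern: two-digit day, a dash, then the two-digit end of the range.
RANGE_DAY = re.compile(r"^(\d{2})-\d{2}")

# The last entry is a hypothetical value with no range, to show it passes through.
dates = ["12-15.11.12", "02-05.01.13", "07.02.13"]

cleaned = [RANGE_DAY.sub(r"\1", d) for d in dates]
print(cleaned)  # ['12.11.12', '02.01.13', '07.02.13']
```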