I have a list of dates, some of which are missing the day. The list is stored as a pandas Series. Here's what it looks like:
0 11-04-2022
1 -03-2022
2 -03-2022
3 -03-2022
4 -03-2022
5 12-04-2022
6 -11-2021
7 9-04-2022
8 8-04-2022
9 8-04-2022
10 -03-2022
11 -02-2022
12 -11-2021
13 -11-2021
14 -11-2021
15 7-04-2022
16 6-04-2022
17 5-04-2022
I'm using the following code to replace the missing days with '01':
for xyz in dates_wo_space:
    if xyz.startswith(('-' ' ')):
        dates_final = ('01') + str(dates_wo_space)
print(dates_final)
And here's the error I get:
NameError: name 'dates_final' is not defined
Can someone please show me how I can add '01' to rows with missing days?
CodePudding user response:
Assuming s is the input Series of strings, you could use a regex (^- matches a - only at the beginning of the string):
s = s.str.replace(r'^-', '01-', regex=True)
You can take the opportunity to zero-pad the single-digit days as well (a complete date is 10 characters long):
s = s.str.replace(r'^-', '01-', regex=True).str.zfill(10)
NB. If there may be spaces before the -, use:
s = s.str.replace(r'^\s*-', '01-', regex=True)
Output (with zfill applied):
0 11-04-2022
1 01-03-2022
2 01-03-2022
3 01-03-2022
4 01-03-2022
5 12-04-2022
6 01-11-2021
7 09-04-2022
8 08-04-2022
9 08-04-2022
10 01-03-2022
11 01-02-2022
12 01-11-2021
13 01-11-2021
14 01-11-2021
15 07-04-2022
16 06-04-2022
Name: date, dtype: object
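For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same regex approach. The sample values and the name fixed are made up for illustration; only str.replace and str.zfill come from the answer above.

```python
import pandas as pd

# Hypothetical sample mirroring the question's data (one value has a leading space)
s = pd.Series(["11-04-2022", "-03-2022", " -11-2021", "9-04-2022"], name="date")

# Replace a leading "-" (and any whitespace before it) with "01-",
# then left-pad with zeros to 10 characters so "9-04-2022" becomes "09-04-2022"
fixed = s.str.replace(r"^\s*-", "01-", regex=True).str.zfill(10)

print(fixed.tolist())
```

Values that already have a two-digit day pass through both steps unchanged, so the chain is safe to run on the whole Series.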
CodePudding user response:
Prepend "01" to values that start with "-":
>>> srs.where(~srs.str.startswith("-"), "01" + srs)
0 11-04-2022
1 01-03-2022
2 01-03-2022
3 01-03-2022
4 01-03-2022
5 12-04-2022
6 01-11-2021
7 9-04-2022
8 8-04-2022
9 8-04-2022
10 01-03-2022
11 01-02-2022
12 01-11-2021
13 01-11-2021
14 01-11-2021
15 7-04-2022
16 6-04-2022
17 5-04-2022
dtype: object
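As a side note on the NameError in the question: as the loop is shown there, dates_final is only assigned inside the if, and ('-' ' ') is not a tuple but the single string '- ' (adjacent string literals are concatenated). No value starts with '- ', so the assignment never runs before print is reached. Also, str(dates_wo_space) converts the whole Series rather than the current element. A minimal sketch of a loop-style fix, using made-up sample data:

```python
import pandas as pd

# Hypothetical sample standing in for the question's dates_wo_space
dates_wo_space = pd.Series(["11-04-2022", "-03-2022", "9-04-2022", "-11-2021"])

# Produce a value for every element, so the result always exists:
# prepend "01" only to elements that start with "-"
dates_final = pd.Series(
    ["01" + d if d.startswith("-") else d for d in dates_wo_space],
    index=dates_wo_space.index,
)

print(dates_final.tolist())
```

This gives the same result as the where version above, which is usually preferable on a Series because it avoids the explicit Python loop.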
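Alternatively, with pd.to_datetime:
# Parse full dates first; values without a day become NaT and are filled from a second, month-year parse
srs = pd.to_datetime(srs, format="%d-%m-%Y", errors="coerce").fillna(pd.to_datetime(srs, format="-%m-%Y", errors="coerce"))
# Convert back to strings if needed
srs = srs.dt.strftime("%d-%m-%Y")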
>>> srs
0 11-04-2022
1 01-03-2022
2 01-03-2022
3 01-03-2022
4 01-03-2022
5 12-04-2022
6 01-11-2021
7 09-04-2022
8 08-04-2022
9 08-04-2022
10 01-03-2022
11 01-02-2022
12 01-11-2021
13 01-11-2021
14 01-11-2021
15 07-04-2022
16 06-04-2022
17 05-04-2022
dtype: object
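Here is a step-by-step sketch of the two-pass parse, assuming both passes use errors="coerce" so that strings which do not match a format become NaT. The sample data and the intermediate names full, month_only and result are hypothetical.

```python
import pandas as pd

# Hypothetical sample mirroring the question's data
srs = pd.Series(["11-04-2022", "-03-2022", "9-04-2022", "-11-2021"])

# Pass 1: complete dates parse; values without a day become NaT
full = pd.to_datetime(srs, format="%d-%m-%Y", errors="coerce")

# Pass 2: only the "-MM-YYYY" values parse; the day defaults to 1
month_only = pd.to_datetime(srs, format="-%m-%Y", errors="coerce")

# Fill the gaps from pass 1 with the results of pass 2, then format as strings
result = full.fillna(month_only).dt.strftime("%d-%m-%Y")

print(result.tolist())
```

Going through real datetimes also zero-pads the single-digit days, and any value that matches neither format ends up as a missing value instead of slipping through unnoticed.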