pandas: Can't convert class 'pandas.core.series.Series' to date error

I'm trying to add a third column to my dataset that indicates whether each start_date is a weekday or not.

So I have the following dataset:

start_date	count
2018-10-01	1043
2018-10-02	1062
2018-10-03	1068
2018-10-04	1003
2018-10-05	1122
2021-12-27	1053

I used the code below to generate this third column:

from bdateutil import isbday
import holidays
df1['business_day']=isbday(df1["start_date"], holidays=holidays.US())

But I'm getting the following error:

TypeError: Can't convert <class 'pandas.core.series.Series'> to date.

I've already tried the following code to adjust the start_date format, but I still can't get it to work.

df1['start_date'] = pd.to_datetime(df1['start_date']).dt.date

df1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1188 entries, 0 to 1187
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   start_date  1188 non-null   object
 1   count       1188 non-null   int64 
dtypes: int64(1), object(1)

and this one:

df1['start_date'] = pd.to_datetime(df1['start_date'])
df1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1188 entries, 0 to 1187
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype         
---  ------      --------------  -----         
 0   start_date  1188 non-null   datetime64[ns]
 1   count       1188 non-null   int64         
dtypes: datetime64[ns](1), int64(1)

And I still get the same error.

CodePudding user response：

The error message you received is giving you a big clue.

TypeError: Can't convert <class 'pandas.core.series.Series'> to date.

It tells you that the function raising the error, which you should be able to find in the stack trace, accepts only certain data types, and pandas.core.series.Series is not one of them. My guess is that isbday (imported from bdateutil) expects a single value, such as a date object or perhaps a string representation of a date. It is worth noting that bdateutil has not been updated since 2014 and the repo is not available on PyPI, which suggests the package may no longer be maintained.

If you are able, load everything in an IPython session and evaluate df1["start_date"] at the REPL. It returns a Series object, not a single value. To operate on each value in the Series, you need to call the function once per element. There are a few ways to do this. One option is calling apply() with an anonymous function (also called a lambda function) and assigning the result to a new column.

df1['business_day'] = df1["start_date"].apply(lambda x: isbday(x, holidays=holidays.US()))
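To see the difference between passing a whole column and passing one element at a time, here is a minimal sketch that does not require bdateutil. The function is_weekday is a hypothetical stand-in for isbday (it only checks the day of the week and ignores holidays), and the three dates are sample values chosen for illustration.

```python
from datetime import date

import pandas as pd


def is_weekday(d):
    """Hypothetical stand-in for isbday: accepts one date, not a Series."""
    if not isinstance(d, date):
        raise TypeError(f"Can't convert {type(d)} to date.")
    return d.weekday() < 5  # Monday is 0, Sunday is 6


# Sample data: a Monday, a Friday, and a Saturday.
df1 = pd.DataFrame({
    "start_date": pd.to_datetime(["2018-10-01", "2018-10-05", "2018-10-06"]),
})

# Passing the whole column fails, much like the error in the question.
try:
    is_weekday(df1["start_date"])
    failed = False
except TypeError:
    failed = True

# apply() calls the function once per element, so each call
# receives a single Timestamp (a datetime subclass).
df1["business_day"] = df1["start_date"].apply(is_weekday)
print(df1)
```

The same pattern should carry over to isbday, assuming it accepts the Timestamp values that apply() passes in.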

You could also iterate over the rows of the DataFrame, build a new Series object, and merge or concatenate it onto the DataFrame. One thing you do not want to do is modify the object you are iterating over; doing so is bad practice. I hope that helps solve your problem and gives you a better grasp of working with pandas DataFrame objects.

CodePudding user response：

isbday takes a single date as a parameter, not a Series, so call it once per element:

from bdateutil import isbday
import holidays

us_holidays = holidays.US()  # build the holiday calendar once, not once per row
df1['business_day'] = df1["start_date"].apply(lambda x: isbday(x, holidays=us_holidays))
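If depending on bdateutil is a concern, a pandas-only alternative is possible. The sketch below swaps isbday for a vectorized weekday check plus an explicit holiday list; it is not what bdateutil does internally. The holiday_dates list holds a single made-up entry for illustration (a real list would come from a calendar source), and the last two dates in the sample frame are not from the question.

```python
import pandas as pd

# Sample data: Monday, Friday, Saturday, and Christmas Day 2018 (a Tuesday).
df1 = pd.DataFrame({
    "start_date": ["2018-10-01", "2018-10-05", "2018-10-06", "2018-12-25"],
})
df1["start_date"] = pd.to_datetime(df1["start_date"])

# Hypothetical holiday list, for illustration only.
holiday_dates = pd.to_datetime(["2018-12-25"])

# dayofweek is 0 for Monday through 6 for Sunday.
is_weekday = df1["start_date"].dt.dayofweek < 5
is_holiday = df1["start_date"].isin(holiday_dates)

# A business day is a weekday that is not in the holiday list.
df1["business_day"] = is_weekday & ~is_holiday
print(df1)
```

Because both checks operate on the whole column at once, no apply() or lambda is needed here.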