I have a pandas data frame that has a date column. Each row in the frame is considered a record.
I have 10000 records, with 10000 dates spanning 10 years.
I want to create another column that will contain a certain string value for the corresponding date range.
For example:
If the record is between 2008-01-03 and 2012-03-23, I want the new column to contain 'person a'.
If the record is between 2012-03-24 and 2014-05-07, I want the new column to contain 'person b', etc.
My date column is in DateTime format.
Currently, what I have done is created a new column for each person, and marked true or false if it fell within the range. But this is becoming difficult to do analysis on.
I know there is a way to do this, but I am new to pandas. Thanks!
CodePudding user response:
It is very easy
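import numpy as np
# One boolean condition per date range, in the same order as the labels (the dates are the ranges from the question)
conditions = [df['date'].between('2008-01-03', '2012-03-23'), df['date'].between('2012-03-24', '2014-05-07')]
df['new'] = np.select(conditions, ['person a', 'person b'], default='unknown')  # 'unknown' is a placeholder for rows outside every range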
np.select takes a list of boolean conditions and a list of values of the same length, and you can read more about it in the NumPy documentation.
You can also use a for loop for this, but it is not the optimal solution.
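If there are many people, it may be easier to keep the ranges in one list and build the conditions from it. The following is only a sketch: the sample frame, the column name `date`, the `ranges` list, and the `'unassigned'` fallback label are assumptions made for illustration, and only the two date ranges come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; replace with your own frame.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2008-01-03", "2010-06-15", "2012-03-23",
        "2012-03-24", "2014-05-07", "2015-01-01",
    ])
})

# (start, end, label) for each person; both endpoints are inclusive.
ranges = [
    (pd.Timestamp("2008-01-03"), pd.Timestamp("2012-03-23"), "person a"),
    (pd.Timestamp("2012-03-24"), pd.Timestamp("2014-05-07"), "person b"),
]

# One boolean mask per range, in the same order as the labels.
conditions = [df["date"].between(start, end) for start, end, _ in ranges]
labels = [label for _, _, label in ranges]

# Rows that match no range get the fallback label.
df["new"] = np.select(conditions, labels, default="unassigned")
print(df)
```

Note that if the dates also carry a time of day, a record at 2012-03-23 10:00 would fall after the first range's end (midnight on 2012-03-23), so the end points may need adjusting in that case.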