I need to subtract dates with integer as the return so I can reference that integer fo another calculated column.
Orignal "io" csv file provides columns "ID", "CreatedDate", and "IFSPDate". I need to add two calculated columns.
Calculated Column 1: "Days". I used the code below to subtract dates in a calculated column
io['CreatedDate'] = pd.to_datetime(io.CreatedDate)
io['IFSPDate'] = pd.to_datetime(io.IFSPDate)
for ind, row in io.iterrows():
dataset.loc[ind,"Days to Table"] = row['CreatedDate'] - row['IFSPDate']
I get this outcome
| ID | CreatedDate | IFSPDate | Days |
|----|-------------|------------|------------------|
| 1 | 2021-09-17 | 2021-09-17 | 0 days 00:00:00 |
| 2 | 2021-08-05 | 2021-01-13 | 204 days 00:00:00|
| 3 | 2021-09-03 | 2041-08-31 | 3 days 00:00:00 |
| 4 | 2021-09-16 | 2021-07-27 | 51 days 00:00:00 |
Calculated Column 2: "Timeliness" In this column I want to answer the question "is cell in days column <= 4" then return 1 if yes and 0 if no.
The problem: Using the current code, the outcome of "Days" column is not an integer, so I can't use <= operator in "Timeliness" column to return 0 or 1.
The outcome I want
| ID | CreatedDate | IFSPDate | Days | Timeliness |
|----|-------------|------------|------|------------|
| 1 | 2021-09-17 | 2021-09-17 | 0 | 1 |
| 2 | 2021-08-05 | 2021-01-13 | 204 | 0 |
| 3 | 2021-09-03 | 2041-08-31 | 3 | 1 |
| 4 | 2021-09-16 | 2021-07-27 | 51 | 0 |
UPDATE:
Using io["Days"] = (io["Created_Date"] - io["IFSP_Date"]).dt.days returned an integer when subtracting the dates
Using io["Timeliness"] = np.where(io['Days'] <= 4, '1', '0') created the new column and returned a 1 or 0.
CodePudding user response:
No need to iterate. This should work, but I can't confirm without sample data:
io["Days"] = (io["CreatedDate"] - io["IFSPDate"]).dt.days
io["Timelines"] = (io["Days"] <= 4).astype(int)
I assume your "created" and "ifsp" columns are proper Pandas Timestamp
or datetime columns.