Home > Software design >  Subtracting dates and returning 1 or 0
Subtracting dates and returning 1 or 0

Time:09-21

I need to subtract dates with integer as the return so I can reference that integer fo another calculated column.

Orignal "io" csv file provides columns "ID", "CreatedDate", and "IFSPDate". I need to add two calculated columns.

Calculated Column 1: "Days". I used the code below to subtract dates in a calculated column

io['CreatedDate'] = pd.to_datetime(io.CreatedDate)

io['IFSPDate'] = pd.to_datetime(io.IFSPDate)

for ind, row in io.iterrows():
    dataset.loc[ind,"Days to Table"] = row['CreatedDate'] - row['IFSPDate']

I get this outcome

| ID | CreatedDate | IFSPDate   | Days             |
|----|-------------|------------|------------------|
| 1  | 2021-09-17  | 2021-09-17 | 0 days 00:00:00  |
| 2  | 2021-08-05  | 2021-01-13 | 204 days 00:00:00|
| 3  | 2021-09-03  | 2041-08-31 | 3 days 00:00:00  |
| 4  | 2021-09-16  | 2021-07-27 | 51 days 00:00:00 |

Calculated Column 2: "Timeliness" In this column I want to answer the question "is cell in days column <= 4" then return 1 if yes and 0 if no.

The problem: Using the current code, the outcome of "Days" column is not an integer, so I can't use <= operator in "Timeliness" column to return 0 or 1.

The outcome I want

| ID | CreatedDate | IFSPDate   | Days | Timeliness |
|----|-------------|------------|------|------------|
| 1  | 2021-09-17  | 2021-09-17 | 0    | 1          |
| 2  | 2021-08-05  | 2021-01-13 | 204  | 0          |
| 3  | 2021-09-03  | 2041-08-31 | 3    | 1          |
| 4  | 2021-09-16  | 2021-07-27 | 51   | 0          |

UPDATE:

Using io["Days"] = (io["Created_Date"] - io["IFSP_Date"]).dt.days returned an integer when subtracting the dates

Using io["Timeliness"] = np.where(io['Days'] <= 4, '1', '0') created the new column and returned a 1 or 0.

CodePudding user response:

No need to iterate. This should work, but I can't confirm without sample data:

io["Days"] = (io["CreatedDate"] - io["IFSPDate"]).dt.days
io["Timelines"] = (io["Days"] <= 4).astype(int)

I assume your "created" and "ifsp" columns are proper Pandas Timestamp or datetime columns.

  • Related