Home > Enterprise >  Pandas how to delete a cell in a column and shift the column up?
Pandas how to delete a cell in a column and shift the column up?

Time:04-21

I have a following dataframe:

Start Finish
16.11.2021 21:36:31 16.11.2021 21:37:31
16.11.2021 21:39:31 16.11.2021 21:40:28
16.11.2021 21:41:39 16.11.2021 21:40:31
16.11.2021 21:56:31 16.11.2021 21:48:31
16.11.2021 21:50:35
16.11.2021 21:58:56

I have errors in my dataset - finish time occurs before start time in some rows due to bad data. In rows where that occurs, I'd like to remove the cell in the 'Finish' column and shift other cells up until start time <= finish time. Therefore, the expected output would be:

Start Finish
16.11.2021 21:36:31 16.11.2021 21:37:31
16.11.2021 21:39:31 16.11.2021 21:40:28
16.11.2021 21:41:39 16.11.2021 21:48:31
16.11.2021 21:56:31 16.11.2021 21:58:56

My attempt:

df['start_time'] = pd.to_datetime(df['start_time'])
df['finish_time'] = pd.to_datetime(df['finish_time'])
    
    
for i,row in df.iterrows():
    if row["start_time"] > row["finish_time"]:
        row["finish_time"] = ""
        row.dropna()
    df.at[i, 'finish_time'] = row["finish_time"]

I appreciate any help in advance.

CodePudding user response:

You can use pandas.merge_asof.

This is assuming that the values are sorted in both columns (if not, add a sort_values() step).

df['Start'] = pd.to_datetime(df['Start'])
df['Finish'] = pd.to_datetime(df['Finish'])


pd.merge_asof(df['Start'].dropna(), df['Finish'],
              left_on='Start', right_on='Finish', direction='forward')

Output:

                Start              Finish
0 2021-11-16 21:36:31 2021-11-16 21:37:31
1 2021-11-16 21:39:31 2021-11-16 21:40:28
2 2021-11-16 21:41:39 2021-11-16 21:48:31
3 2021-11-16 21:56:31 2021-11-16 21:58:56
  • Related