Add a row to a dataframe based on a condition

Time:12-21

I have a table of dates and a condition:

Start       End         Condition
03.10.2022  03.10.2022  0
03.10.2022  04.10.2022  1
03.10.2022  03.10.2022  0

How can I add an extra row wherever the condition is 1, so that each date in the range gets its own row, like this:

Start       End         
03.10.2022  03.10.2022  
03.10.2022  03.10.2022
04.10.2022  04.10.2022  
03.10.2022  03.10.2022  

My idea was to use DataFrame.explode, so first I need to add a new column that holds a list of values for each row.

I've tried something like this, which resulted in shape errors:

df["new_col"] = np.where(df['Condition'] == 1, 
                         df[['Start', 'End']].values.tolist(),
                         df['Start'])
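
The shape error most likely comes from np.where trying to broadcast a (3, 2) list of pairs against length-3 columns. A minimal sketch of the explode idea, assuming the dates are plain dd.mm.yyyy strings and the column names from the table above; the names new_col and out are only illustrative:

```python
import pandas as pd

# Sample data mirroring the table in the question (dates kept as strings)
df = pd.DataFrame({
    "Start": ["03.10.2022", "03.10.2022", "03.10.2022"],
    "End": ["03.10.2022", "04.10.2022", "03.10.2022"],
    "Condition": [0, 1, 0],
})

# Build the list column row by row: a list comprehension can put a list
# of any length in each cell, which np.where cannot do here.
df["new_col"] = [
    [start, end] if cond == 1 else [start]
    for start, end, cond in zip(df["Start"], df["End"], df["Condition"])
]

# explode() turns every list element into its own row
exploded = df.explode("new_col", ignore_index=True)

# Both output columns take the exploded value
out = pd.DataFrame({"Start": exploded["new_col"], "End": exploded["new_col"]})
print(out)
```

This only covers ranges where Start and End are at most one day apart, as in the example table.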

CodePudding user response:

This is what I came up with:

import pandas as pd

# df initialization
data = pd.to_datetime('2022/10/03')
df = pd.DataFrame({'start': [data]*3,
                   'end': [data, data + pd.DateOffset(days=1), data],
                   'condition': [0, 1, 0]})

# number of rows each original row should become:
# 1 by default, plus the day span when condition == 1
s = (df.end - df.start).dt.days * df.condition + 1

# repeat each row according to s
df = df.loc[df.index.repeat(s)].copy()

# shift each repeated row by its position within its group
add = pd.to_timedelta(df.groupby(level=0).cumcount(), unit='D')
df['start'] = df["start"].add(add)
df['end'] = df["start"]

# reset to the default index
df = df.reset_index(drop=True)
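
Because the repeat count is derived from the day difference, the same approach should also cover ranges longer than one day. A small sketch with made-up data (the three-day range is an assumption, not from the question):

```python
import pandas as pd

# Hypothetical data: the second row spans three days and has condition == 1
start = pd.to_datetime("2022-10-03")
df = pd.DataFrame({
    "start": [start, start],
    "end": [start, start + pd.DateOffset(days=2)],
    "condition": [0, 1],
})

# Number of rows each original row should become
s = (df["end"] - df["start"]).dt.days * df["condition"] + 1

# Repeat rows, then shift each copy by its position within its group
out = df.loc[df.index.repeat(s)].copy()
offset = pd.to_timedelta(out.groupby(level=0).cumcount(), unit="D")
out["start"] = out["start"] + offset
out["end"] = out["start"]
out = out.reset_index(drop=True)
print(out[["start", "end"]])
```

The second row becomes three single-day rows, one per date in its range.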

CodePudding user response:

If your table is in CSV format, you can use the pandas library to read it and extract the columns you need.

import pandas as pd

df = pd.read_csv('table.csv')
df2 = df.iloc[:, 0:2].copy() # keep only the first two columns; copy to avoid SettingWithCopyWarning
df2.loc[len(df2)] = ['03.10.2022', '03.10.2022'] # append a new row at the end of the dataframe

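df2 now holds the Start and End columns with the new row appended at the end. Note that this does not take the Condition column into account.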
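
If the dates need to be compared or shifted after loading, it may help to parse them explicitly. A minimal sketch, assuming a comma-separated file with the headers from the question; the in-memory string stands in for table.csv and its contents are hypothetical:

```python
import io
import pandas as pd

# In-memory stand-in for 'table.csv' (hypothetical contents)
csv_text = """Start,End,Condition
03.10.2022,03.10.2022,0
03.10.2022,04.10.2022,1
03.10.2022,03.10.2022,0
"""

df = pd.read_csv(io.StringIO(csv_text))

# Parse day-first date strings explicitly so 03.10 is read as 3 October
for col in ["Start", "End"]:
    df[col] = pd.to_datetime(df[col], format="%d.%m.%Y")

print(df.dtypes)
```

With real datetime columns, date arithmetic such as End minus Start works directly.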
