I have a table of dates and a condition:
Start End Condition
03.10.2022 03.10.2022 0
03.10.2022 04.10.2022 1
03.10.2022 03.10.2022 0
How can I add an extra row wherever the condition is 1 and adjust the dates, like this:
Start End
03.10.2022 03.10.2022
03.10.2022 03.10.2022
04.10.2022 04.10.2022
03.10.2022 03.10.2022
My idea was to use DataFrame.explode, so first I need to add a new column that holds a list of values.
I've tried something like this, which resulted in shape errors:
df["new_col"] = np.where(df['Condition'] == 1,
df[['Start', 'End']].values.tolist(),
df['Start'])
CodePudding user response:
This is what I came up with:
import pandas as pd

# df initialization
data = pd.to_datetime('2022/10/03')
df = pd.DataFrame({'start': [data]*3, 'end': [data, data + pd.DateOffset(days=1), data], 'condition': [0, 1, 0]})
# number of rows each original row should produce:
# 1 by default, plus one per extra day when condition == 1
s = (df['end'] - df['start']).dt.days * df['condition'] + 1
#repeat index
df = df.loc[df.index.repeat(s)].copy()
# shift each repeated row by its position within its original row (0, 1, ... days)
add = pd.to_timedelta(df.groupby(level=0).cumcount(), unit='d')
df['start'] = df["start"].add(add)
df['end'] = df["start"]
#default index
df = df.reset_index(drop=True)
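The explode idea from the question can also be made to work. The following is a minimal sketch, assuming the dates are parsed as day-first datetimes and that rows with Condition 0 should keep only their Start date; the helper column name "day" is made up for illustration.

```python
import pandas as pd

# Sample data mirroring the table in the question
df = pd.DataFrame({
    "Start": pd.to_datetime(["03.10.2022"] * 3, format="%d.%m.%Y"),
    "End": pd.to_datetime(["03.10.2022", "04.10.2022", "03.10.2022"], format="%d.%m.%Y"),
    "Condition": [0, 1, 0],
})

# One list of days per row: the full Start..End range when Condition == 1,
# otherwise just the Start date ("day" is a hypothetical helper column)
df["day"] = [
    list(pd.date_range(s, e, freq="D")) if c == 1 else [s]
    for s, e, c in zip(df["Start"], df["End"], df["Condition"])
]

# explode turns every list element into its own row
out = df.explode("day", ignore_index=True)

# Each resulting row covers a single day, so Start and End are the same date
out["Start"] = out["End"] = pd.to_datetime(out["day"])
out = out[["Start", "End"]]
```

Building the list column with a plain list comprehension avoids the shape error from np.where, which tries to broadcast the 3x2 list of pairs against the length-3 Series.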
CodePudding user response:
If your table is in CSV format, you can use the pandas library to read it and keep only the columns you need.
import pandas as pd
df = pd.read_csv('table.csv')
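df2 = df.iloc[:, 0:2].copy() # keep only the first two columns (drops Condition)
df2.loc[len(df2.index)] = ['03.10.2022', '03.10.2022'] # append a new row at the end of the dataframe
df2 now contains the extra row, although it is hard-coded and appended at the end rather than derived from Condition.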
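When reading such a table from a file, it may help to parse the dd.mm.yyyy strings into real datetimes so that date arithmetic works afterwards. This is a rough sketch: the in-memory text stands in for 'table.csv', and the whitespace-separated layout is an assumption based on how the table appears in the question.

```python
import io
import pandas as pd

# Stand-in for 'table.csv'; the whitespace-separated layout is an assumption
csv_text = """Start End Condition
03.10.2022 03.10.2022 0
03.10.2022 04.10.2022 1
03.10.2022 03.10.2022 0
"""

df = pd.read_csv(io.StringIO(csv_text), sep=r"\s+")

# Parse the dates with an explicit day-first format,
# so 03.10.2022 is read as 3 October rather than 10 March
for col in ["Start", "End"]:
    df[col] = pd.to_datetime(df[col], format="%d.%m.%Y")
```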