I have a data frame as follows:
mydata_frame:
IDs days f1 f2 f3 ... fn
Id1 day1 5 6 21 ... 5
Id1 delta_day0&day1 0 0 0 ... 0
Id1 day2 21 1 4 ... 2
Id1 delta_day1&day2 16 5 17 ... 3
Id1 day3 30 2 5 ... 2
Id1 delta_day2&day3 9 1 1 ... 0
... ... .. .. .. ... ...
Id2 day1 8 9 10 ... 2
Id2 delta_day0&day1 0 0 0 ... 0
Id2 day2 12 10 7 ... 6
Id2 delta_day1&day2 4 1 3 ... 4
Id2 day3 22 25 3 ... 1
Id2 delta_day2&day3 10 15 4 ... 5
I would like to transfer the value of all the rows which includes delta into columns. So my desired output should be like this:
output:
IDs days f1 del_f1 f2 del_f2 f3 del_f3 ... fn del_fn
Id1 day1 5 0 6 0 21 0 ... 5 0
Id1 day2 21 16 1 5 4 17 ... 2 3
Id1 day3 30 9 2 1 5 1 ... 2 0
... ... .. .. .. ... ...
Id2 day1 8 0 9 0 10 0 ... 2 0
Id2 day2 12 4 10 1 7 3 ... 6 4
Id2 day3 22 10 25 15 3 4 ... 1 5
Any idea to do this conversion in an efficient way?
CodePudding user response:
You can start by building a Dataframe without delay (only days) i.e. keeping only every two rows starting from index 0.
Then insert new columns (from the 4th column i.e. index 3) with delay values of the preceding column (every two rows starting from 1):
df2 = df.iloc[::2]
new_index = df2.index
i = 3
for col in df.columns[2:]:
new_col = "del_" col
series = pd.Series(df[col].iloc[1::2])
series.index = new_index
df2.insert(i, new_col, series)
i =2
print(df2)
CodePudding user response:
You could do:
df1 = df[df.days.str.startswith("day")].reset_index(drop=True)
df2 = (
df.loc[df.days.str.startswith("delta"), [f"f{i}" for i in range(1, n 1)]]
.rename(columns={f"f{i}": f"del_f{i}" for i in range(1, n 1)})
.reset_index(drop=True)
)
df = pd.concat([df1, df2], axis="columns")
Result for n == 3
and the following sample dataframe df
IDs days f1 f2 f3
0 Id1 day1 5 6 21
1 Id1 delta_day0&day1 0 0 0
2 Id1 day2 21 1 4
3 Id1 delta_day1&day2 16 5 17
4 Id1 day3 30 2 5
5 Id1 delta_day2&day3 9 1 1
6 Id2 day1 8 9 10
7 Id2 delta_day0&day1 0 0 0
8 Id2 day2 12 10 7
9 Id2 delta_day1&day2 4 1 3
10 Id2 day3 22 25 3
11 Id2 delta_day2&day3 10 15 4
is
IDs days f1 f2 f3 del_f1 del_f2 del_f3
0 Id1 day1 5 6 21 0 0 0
1 Id1 day2 21 1 4 16 5 17
2 Id1 day3 30 2 5 9 1 1
3 Id2 day1 8 9 10 0 0 0
4 Id2 day2 12 10 7 4 1 3
5 Id2 day3 22 25 3 10 15 4