I need to duplicate each row 3 times and add two new columns. The new column values are different for each row.
import pandas as pd
df = {'A': [ 8,9,12],
'B': [ 1,11,3],
'C': [ 7,9,13],
'D': [81,92,121]}
df = pd.DataFrame(df)
#####################################################
#input
A B C D
8 1 7 81
9 11 9 92
12 3 13 121
####################################################
#expected output
A B C D E F
8 1 7 81 9 8 E=A 1, F= C 1
8 1 7 81 8 7 E=A, F= C
8 1 7 81 7 6 E=A-1, F= C-1
9 11 9 92 10 10
9 11 9 92 9 9
9 11 9 92 8 8
12 3 13 121 13 14
12 3 13 121 12 13
12 3 13 121 11 12
CodePudding user response:
To repeat the DataFrame you can use np.repeat()
.
Afterwards you can create a list to add to "A"
and "C"
.
df = pd.DataFrame(np.repeat(df.to_numpy(), 3, axis=0), columns=df.columns)
extra = [1,0, -1]*3
df['E'] = df['A'] extra
df['F'] = df['C'] extra
This gives:
A B C D E F
0 8 1 7 81 9 8
1 8 1 7 81 8 7
2 8 1 7 81 7 6
3 9 11 9 92 10 10
4 9 11 9 92 9 9
5 9 11 9 92 8 8
6 12 3 13 121 13 14
7 12 3 13 121 12 13
8 12 3 13 121 11 12
CodePudding user response:
Use Index.repeat
with DataFrame.loc
for repeat rows, then repeat integers [1,0,-1] by numpy.tile
and create new columns E, F
:
df1 = df.loc[df.index.repeat(3)]
g = np.tile([1,0,-1], len(df))
df1[['E','F']] = df1[['A','C']].add(g, axis=0).to_numpy()
df1 = df1.reset_index(drop=True)
print (df1)
A B C D E F
0 8 1 7 81 9 8
1 8 1 7 81 8 7
2 8 1 7 81 7 6
3 9 11 9 92 10 10
4 9 11 9 92 9 9
5 9 11 9 92 8 8
6 12 3 13 121 13 14
7 12 3 13 121 12 13
8 12 3 13 121 11 12