When replicating a dataframe using concat while keeping the original index (see example here), is there a way to assign a count variable for each iteration in a new column c (where column c is the count variable)?
Orig df:
| | a | b |
|---|---|---|
| 0 | 1 | 2 |
| 1 | 2 | 3 |
df replicated with pd.concat([df] * 5) and with an additional column c:
| | a | b | c |
|---|---|---|---|
| 0 | 1 | 2 | 1 |
| 1 | 2 | 3 | 1 |
| 0 | 1 | 2 | 2 |
| 1 | 2 | 3 | 2 |
| 0 | 1 | 2 | 3 |
| 1 | 2 | 3 | 3 |
| 0 | 1 | 2 | 4 |
| 1 | 2 | 3 | 4 |
| 0 | 1 | 2 | 5 |
| 1 | 2 | 3 | 5 |
This is a multi-row dataframe where the count variable would have to be applied to multiple rows.
Thanks for your thoughts!
Answer:
You could use np.arange and np.repeat:
import numpy as np
import pandas as pd
N = 5
new_df = pd.concat([df] * N)
new_df['c'] = np.repeat(np.arange(N), df.shape[0]) + 1  # each of 1..N repeated once per row of df
Output:
>>> new_df
a b c
0 1 2 1
1 2 3 1
0 1 2 2
1 2 3 2
0 1 2 3
1 2 3 3
0 1 2 4
1 2 3 4
0 1 2 5
1 2 3 5
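
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (the values are taken from the tables above) and shows the np.repeat approach next to one possible alternative, tagging each copy with `DataFrame.assign` before concatenating. The names `rep`, `tagged`, and `flat` are just illustrative, and the alternative is a suggestion rather than part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's table.
df = pd.DataFrame({"a": [1, 2], "b": [2, 3]})
N = 5

# Variant 1: replicate first, then repeat each iteration number
# (1..N) once per row of the original frame.
rep = pd.concat([df] * N)
rep["c"] = np.repeat(np.arange(1, N + 1), len(df))

# Variant 2 (alternative sketch): tag each copy with its iteration
# number before concatenating.
tagged = pd.concat([df.assign(c=i) for i in range(1, N + 1)])

# Both variants keep the repeated 0, 1 index from the question;
# reset it only if a unique index is wanted.
flat = tagged.reset_index(drop=True)

print(rep)
```

The first variant does a single concatenation and one array assignment, so it is probably the better fit for large N. The second may read more naturally when each copy needs other per-iteration changes as well.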