I have following dataframe(short sample):
cond_ID tow1_ID tow2_ID
0 10 0 4
1 17 6 12
3 13 14 15
4 13 16 16
5 13 17 18
I want to extend it based on range between tow1_ID
and tow2_ID
. For instance, I want to add records with values 1,2,3 and 4 below value 0. Here is desired output:
cond_ID tow1_ID
0 10 0
0 10 1
0 10 2
0 10 3
0 10 4
1 17 6
1 17 7
1 17 8
1 17 9
1 17 10
1 17 11
1 17 12
1 13 14
1 13 15
1 13 16
1 13 17
1 13 18
How can I do this with vectorized approach ( without using apply ) ? Any help is highly appreciated.
CodePudding user response:
Try this:
df.assign(tow1_ID=[np.arange(s,f 1) for s, f in zip(df['tow1_ID'], df['tow2_ID'])])\
.explode('tow1_ID')\
.drop(['tow2_ID'], axis=1)
Output:
cond_ID tow1_ID
0 10 0
0 10 1
0 10 2
0 10 3
0 10 4
1 17 6
1 17 7
1 17 8
1 17 9
1 17 10
1 17 11
1 17 12
3 13 14
3 13 15
4 13 16
5 13 17
5 13 18
CodePudding user response:
def foo(r):
return pd.DataFrame({"cond_ID": r.cond_ID,
"tow_ID": range(r.tow1_ID, r.tow2_ID 1),
"index": r.name}).set_index("index")
print(pd.concat([foo(r) for _, r in df.iterrows()]))
# cond_ID tow_ID
# index
# 0 10 0
# 0 10 1
# 0 10 2
# 0 10 3
# 0 10 4
# 1 17 6
# 1 17 7
# 1 17 8
# 1 17 9
# 1 17 10
# 1 17 11
# 1 17 12
# 3 13 14
# 3 13 15
# 4 13 16
# 5 13 17
# 5 13 18