I am trying to label rows based on the value before the decimal point (the integer part). I have a data frame like the one below:
data flow
1.5 parallel
1.2 parallel
1.3 parallel
2 sequence
2.5 parallel
2.4 parallel
2.8 parallel
3 sequence
3.2 parallel
3.1 parallel
3.5 parallel
4 sequence
4.1 parallel
4.5 parallel
4.3 parallel
1 sequence
5 sequence
6 sequence
Expected output:
data flow
1.5 Parallel1
1.2 Parallel1
1.3 Parallel1
2 sequence
2.5 Parallel2
2.4 Parallel2
2.8 Parallel2
3 sequence
3.2 Parallel3
3.1 Parallel3
3.5 Parallel3
4 sequence
4.1 Parallel4
4.5 Parallel4
4.3 Parallel4
1 sequence
5 sequence
6 sequence
How can I achieve this using pandas?
CodePudding user response:
If data is a string:
df.loc[df['flow'].ne('sequence'), 'flow'] += df['data'].str.extract(r'(\d+)', expand=False)
If it is a float:
df.loc[df['flow'].ne('sequence'), 'flow'] += df['data'].astype(int).astype(str)
output:
data flow
0 1.5 parallel1
1 1.2 parallel1
2 1.3 parallel1
3 2.0 sequence
4 2.5 parallel2
5 2.4 parallel2
6 2.8 parallel2
7 3.0 sequence
8 3.2 parallel3
9 3.1 parallel3
10 3.5 parallel3
11 4.0 sequence
12 4.1 parallel4
13 4.5 parallel4
14 4.3 parallel4
15 1.0 sequence
16 5.0 sequence
17 6.0 sequence
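For anyone who wants to try the float case end to end, here is a minimal self-contained sketch. It assumes `data` is stored as floats and rebuilds the sample frame from the question by hand; the `mask` variable name is just for illustration and is not part of the original answer.

```python
import pandas as pd

# Rebuild the sample frame from the question (data assumed to be float).
df = pd.DataFrame({
    "data": [1.5, 1.2, 1.3, 2, 2.5, 2.4, 2.8, 3, 3.2, 3.1, 3.5,
             4, 4.1, 4.5, 4.3, 1, 5, 6],
    # Four groups of three "parallel" rows plus one "sequence" row,
    # followed by two trailing "sequence" rows.
    "flow": (["parallel"] * 3 + ["sequence"]) * 4 + ["sequence"] * 2,
})

# Select every row that is not a "sequence" row.
mask = df["flow"].ne("sequence")

# Append the integer part of `data` (truncated, e.g. 2.8 -> 2) to the label.
df.loc[mask, "flow"] = df.loc[mask, "flow"] + df.loc[mask, "data"].astype(int).astype(str)

print(df)
```

This produces lowercase labels such as parallel1. If the capitalised form from the expected output (Parallel1) is wanted, one option would be to concatenate onto the literal string "Parallel" instead of the existing column value.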