A dataframe has 3 Columns
A B C
^0hand(%s)leg$ 27;30 42;54
^-(%s)hand0leg 39;30 47;57
^0hand(%s)leg$ 24;33 39;54
So column A has regex patterns like this if those patterns are similar for example now row 1 and row 3 is similar so it has to merge the two rows and output only the maximum as below:
Output:
A B C
^0hand(%s)leg$ 27;33 42;54
^-(%s)hand0leg 39;30 47;57
Any leads will be helpful
CodePudding user response:
You could use:
(df.set_index('A').stack()
.str.extract('(\d );(\d )').astype(int)
.groupby(level=[0,1]).agg(max).astype(str)
.assign(s=lambda d: d[0] ';' d[1])['s'] # OR # .apply(';'.join, axis=1)
.unstack(1)
.loc[df['A'].unique()] ## only if the order of rows matters
.reset_index()
)
output:
A B C
0 ^0hand(%s)leg$ 27;33 42;54
1 ^-(%s)hand0leg 39;30 47;57