Pandas dataframe: change string values based on condition with regex

Time:09-23

I have a dataframe with numbers represented as strings. I need to remove the parentheses, if they exist, and add a negative sign. For example, (30) should become -30.

Positive numbers should not change.

import pandas as pd

df = pd.DataFrame({'a':['19','(30)','(1000)'],
                   'b':['(202)','200', '100'],
                   'c':['101','(30)', '40']})

        a      b     c
0      19  (202)   101
1    (30)    200  (30)
2  (1000)    100    40

For the regular expression, I can do this on a single value:

import re

neg_pat = r'\((\d+)\)'
num = '(30)'
new_num = '-' + re.search(neg_pat, num).group(1)
print(new_num)
-30

Now, how can I apply this to the dataframe? I've used apply() and lambda expressions before but I'm stuck on putting this together.

CodePudding user response:

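You can do this with DataFrame.replace and regex=True, using a capture group for the digits and a backreference in the replacement: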

out = df.replace({r'\((.*)\)': r'-\1'}, regex=True).astype(int)
Out[280]: 
      a    b    c
0    19 -202  101
1   -30  200  -30
2 -1000  100   40
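
Since the question mentions apply() and lambda, the same conversion can also be sketched that way. This is only a minimal sketch, not the method from the answer above: the helper name to_signed is hypothetical, and anchoring the pattern with ^ and $ is an added assumption so that only values made up entirely of parenthesised digits are rewritten.

```python
import re

import pandas as pd

df = pd.DataFrame({'a': ['19', '(30)', '(1000)'],
                   'b': ['(202)', '200', '100'],
                   'c': ['101', '(30)', '40']})

# Anchored pattern (an assumption): match only a whole value of the form "(digits)".
neg_pat = re.compile(r'^\((\d+)\)$')


def to_signed(value):
    """Hypothetical helper: turn '(30)' into '-30', leave other strings unchanged."""
    return neg_pat.sub(r'-\1', value)


# apply() passes each column (a Series) to the lambda; map() then calls
# to_signed on every string in that column. astype(int) converts the result.
out = df.apply(lambda col: col.map(to_signed)).astype(int)
print(out)
```

Compared with the `.*` pattern, `\d+` with anchors is stricter: a value such as '(1,000)' is left unchanged, so astype(int) raises a ValueError instead of silently converting unexpected input.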