Home > Net >  Python - remove '$' and ','
Python - remove '$' and ','

Time:09-04

I know this is a very basic question but I just can't do it.

My data has 20k rows, 10 rows have '$' and ','.

1797       5500
1798        255
1799       3600
1800         $0
1801     $2,620
1802     $6,000
1803     $2,200
1804       $390
1805     $2,688
1806    $13,069
1807     $7,000
1808     $5,000
1809     $2,089
1810        936
1811       5468
1812      20000

I tried the following code but its taking way too long. And is giving me SettingWithCopyWarning warning.

i = 0
df['R3_v2'] = ''

while i < len(df['ID'])   1:
    if "$" in df['R3'][i]:
        df['R3_v2'][i] = df['R3'][i].replace('$', '')
    if "," in df['R3'][i]:
        df['R3_v2'][i] = df['R3'][i].replace(',', '')
    if "$" or "," not in df['R3'][i]:
        df['R3_v2'][i] = df['R3'][i]
    i = i   1
    if i == len(df['ID']):
        break

If I remove the last "if" statement (ie, not in), then the code will run fast. But my R3_v2 for those rows would be empty unless I replace it directly.

Can someone tell me what's wrong? Thanks

CodePudding user response:

Use regex with character class:

df['R3_v2'] = df['R3'].str.replace('[$,]', '', regex=True)

Output:

     idx       R3  R3_v2
0   1797     5500   5500
1   1798      255    255
2   1799     3600   3600
3   1800       $0      0
4   1801   $2,620   2620
5   1802   $6,000   6000
6   1803   $2,200   2200
7   1804     $390    390
8   1805   $2,688   2688
9   1806  $13,069  13069
10  1807   $7,000   7000
11  1808   $5,000   5000
12  1809   $2,089   2089
13  1810      936    936
14  1811     5468   5468
15  1812    20000  20000

CodePudding user response:

Try this:

df['R3'] = df['R3'].str.replace('$', '')
df['R3'] = df['R3'].str.replace(',', '')

Also covered here.

  • Related