apply a function to a column in pandas giving issues

Time:03-07

I am trying to convert a column in my DataFrame from string to int.

The amount column contains values like these:

123,123
(343,344)

I want to convert them to:

123123
343344

For that I wrote this code:

def strToInt(str2):
    # print(type(str2))
    if isinstance(str2, str):
        # Strip parentheses and thousands separators before parsing
        temp = str2.replace("(", "").replace(")", "").replace(",", "")
        # print("GOT :" + str2 + "     RETURN :" + str(int(temp)))
        if checkInt(temp):
            return int(temp)
    return None

def checkInt(s):
    # Allow one leading sign character before the digits
    if s[0] in ('-', '+'):
        return s[1:].isdigit()
    return s.isdigit()



print(df['amount'])

df['amount'] = df[['amount']].apply(lambda a: strToInt(a))
print(df['amount'])
print(df.columns)

print(df['amount'])

But I am getting all null values. When I call strToInt on individual values, it gives the correct output, but after apply every value is NaN.

Before:

0            45,105 
1            24,250 
2          8,35,440 
3          3,00,900 
4          1,69,920 

After:

0         NaN
1         NaN
2         NaN
3         NaN
4         NaN

Need your help please.

CodePudding user response:

You can probably use a regex to make things more efficient:

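```
import pandas as pd

df = pd.DataFrame({'amount': ['123,456', '(123,456)', '-123,465', '(-123,456)']})
# Keep only digits and the minus sign, then cast to int
df['amount'] = df['amount'].str.replace(r'[^-\d]', '', regex=True).astype(int)
print(df['amount'])
```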
output:
```
0    123456
1    123456
2   -123465
3   -123456
Name: amount, dtype: int64
```
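
The `astype(int)` step raises if any cleaned value is still not a valid integer. If the column may contain such entries, one possible variation (a sketch, not part of the answer above) is to parse with `pd.to_numeric(..., errors='coerce')` and keep missing values in the nullable `Int64` dtype. The sample data here is made up for illustration.

```python
import pandas as pd

# Made-up sample: the last entry is deliberately not a single number
s = pd.Series(['45,105', '(24,250)', '-8,35,440', '100-200'])

# Keep only digits and the minus sign, as in the regex above
cleaned = s.str.replace(r'[^-\d]', '', regex=True)

# errors='coerce' turns unparseable strings (here '100-200') into NaN
# instead of raising
numbers = pd.to_numeric(cleaned, errors='coerce')

# The nullable Int64 dtype keeps integers while allowing missing values
result = numbers.astype('Int64')
print(result)
```

Whether a bad entry should become a missing value or stop the script with an error depends on the data, so treat the coercion as a choice rather than a default.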

CodePudding user response:

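Pass the function to the column df['amount'] (a Series), not to the one-column DataFrame df[['amount']]. DataFrame.apply hands each whole column to the function as a Series, so the string check in strToInt fails and it returns None: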

df['amount'] = df['amount'].apply(strToInt)

print (df)
   amount
0   45105
1   24250
2  835440
3  300900
4  169920
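
To see the difference directly, it can help to check what each form of apply actually passes to the function. This is a minimal sketch with a made-up two-row frame; `describe` is a hypothetical helper that only reports the type of its argument.

```python
import pandas as pd

df = pd.DataFrame({'amount': ['45,105', '24,250']})  # made-up sample data

def describe(x):
    # Hypothetical helper: report the type of whatever apply passes in
    return type(x).__name__

# Series.apply calls the function once per element
per_element = df['amount'].apply(describe)

# DataFrame.apply (default axis=0) calls the function once per column,
# passing the whole column as a Series
per_column = df[['amount']].apply(describe)

print(per_element.tolist())  # ['str', 'str']
print(per_column.tolist())   # ['Series']
```

The per-column result is indexed by column name rather than by row label, so assigning it back to df['amount'] aligns on the index, finds no matching rows, and fills the column with NaN.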

A solution from the comments that also works with a negative sign:

import numpy as np
# Record each row's sign first, because \D also strips the minus sign
a = np.where(df['amount'].str.startswith('-'), -1, 1)
df['amount'] = df['amount'].str.replace(r'\D', '', regex=True).astype(int).mul(a)