converting only the floats of a column to integer

Time:10-01

I have a DataFrame:

import pandas as pd

df = pd.DataFrame({"ID": [1,2,3,4,5,6,7], "value": [10,10.00,"123JK",20,"10-11",11.00,12.00]})
ID   value
1      10
2    10.00
3    123JK
4     20
5    10-11
6    11.00
7    12.00

I want to convert only the floating-point values to integers, such that:

ID   value
1      10
2      10
3    123JK
4      20
5    10-11
6      11
7      12

I tried the following code:

df['ID'] = df['ID'].apply(pd.to_numeric, errors='ignore')
df['ID'].astype(np.int64,errors='ignore')

But it does not convert all the floating-point values to integers.

CodePudding user response:

If you need integers only for whole-number floats like 10.0, use a custom function:

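def f(x):
    try:
        # pd.to_numeric raises ValueError for non-numeric strings such as "123JK"
        num = pd.to_numeric(x)
        # Return a plain int only when the value is whole, e.g. 10.0 -> 10
        if int(num) == num:
            return int(num)
        return num
    except (ValueError, TypeError, OverflowError):
        # Leave anything that cannot be parsed (or is NaN/inf) unchanged
        return x


df['value'] = df['value'].apply(f)
print(df)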
   ID  value
0   1     10
1   2     10
2   3  123JK
3   4     20
4   5  10-11
5   6     11
6   7     12

CodePudding user response:

Assuming that the value column is text, you could just do a regex replacement here:

df["value"] = df["value"].str.replace(r'^(\d+)\.\d+$', r'\1', regex=True)
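As a rough illustration of this regex approach, the sketch below builds a text-only version of the question's column (an assumption, since the original column mixes numbers and strings) and casts with astype(str) to be safe. It also narrows the pattern to an all-zero fractional part, which is a deliberate change from the answer's pattern so that a value such as 10.5 would not be truncated.

```python
import pandas as pd

# Text-only version of the question's data (assumed for this sketch)
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "value": ["10", "10.00", "123JK", "20", "10-11", "11.00", "12.00"],
})

# regex=True is needed on pandas >= 2.0, where str.replace matches literally by default.
# The pattern drops a trailing ".0", ".00", ... and keeps the integer part.
df["value"] = df["value"].astype(str).str.replace(r"^(\d+)\.0+$", r"\1", regex=True)

result = df["value"].tolist()
print(result)
```

The values stay strings after this step; they are only reformatted, not converted to a numeric type.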

CodePudding user response:

I prefer pd.to_numeric with fillna:

df['value'] = pd.to_numeric(df['value'], errors='coerce').fillna(df['value'])

This converts the column to a numeric dtype and replaces non-numeric entries with NaN. The NaNs are then filled with the original strings from the column, so the numeric data stays numeric. Note that the numeric entries come back as floats (for example 10.0), not integers.
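A possible variant of this idea that ends up with real integers, offered as a minimal sketch rather than the answerer's method: coerce the column to numbers, build a mask of whole-number entries, and convert only those. The names numeric and whole are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "value": [10, 10.00, "123JK", 20, "10-11", 11.00, 12.00],
})

# Non-numeric entries ("123JK", "10-11") become NaN
numeric = pd.to_numeric(df["value"], errors="coerce")

# True only where the entry parsed as a number with no fractional part
whole = numeric.notna() & (numeric % 1 == 0)

# Convert whole numbers to Python ints; keep every other entry as it was
df["value"] = [
    int(n) if is_whole else v
    for v, n, is_whole in zip(df["value"], numeric, whole)
]

result = df["value"].tolist()
print(result)
```

The column keeps the object dtype because it still mixes integers and strings.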
