Cleaning a string column that contains float numbers

Time:08-03

I am trying to remove the trailing point and zero (.0) from every float-like value in this dataset:

  index     CIP
    1        DF5TY34
    2        12342.0
    3        de44dW

(CIP is cast as a string)

I wrote this line to solve the problem, but it isn't doing anything. I'm receiving only a warning, no errors:

 pro1[pro1['CIP'].str.contains('\..')]["CIP"] = pro1.loc[pro1['CIP'].str.contains('\..')]["CIP"].astype(float).astype(int).astype(str)

This is the warning:

/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
"""Entry point for launching an IPython kernel.

CodePudding user response:

To strip only a literal trailing .0, you can use removesuffix (available from pandas 1.4):

df['CIP'] = df['CIP'].str.removesuffix('.0')

For a more flexible approach, use a regex with str.replace:

df['CIP'] = df['CIP'].str.replace(r'\.0*$', '', regex=True)

output:

   index      CIP
0      1  DF5TY34
1      2    12342
2      3   de44dW

regex:

\.   # match a dot
0*   # match any number of 0 (including none)
$    # match end of string
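
To see which values the pattern touches, here is a small sketch on a made-up Series. The extra values ("12342.00", "12.50", "7.") are not from the question and are only there to show the edge cases.

```python
import pandas as pd

# Hypothetical values chosen to exercise the pattern's edge cases.
s = pd.Series(["DF5TY34", "12342.0", "12342.00", "12.50", "7."])

# Remove a dot followed only by zeros (possibly none) at the end of the string.
cleaned = s.str.replace(r"\.0*$", "", regex=True)

print(cleaned.tolist())
# ['DF5TY34', '12342', '12342', '12.50', '7']
```

Because 0* also matches zero characters, a bare trailing dot is removed as well. If that is not wanted, using 0+ instead of 0* would require at least one zero.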

CodePudding user response:

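Use the code below if you want to convert float values to strings. Note that it only affects entries that are actual float objects; a string such as '12342.0' is left unchanged.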

pro1['CIP'] = pro1['CIP'].apply(lambda x: (str(int(x)) if isinstance(x, float) else x))

If you don't want to convert to a string, remove the str():

pro1['CIP'] = pro1['CIP'].apply(lambda x: (int(x) if isinstance(x, float) else x))

To round instead of truncating:

pro1['CIP'] = pro1['CIP'].apply(lambda x: (round(x) if isinstance(x, float) else x))

You can apply any operation to the float values by selecting them this way.
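
As a rough illustration of the isinstance selection, the sketch below uses a mixed-type column. The frame is hypothetical: unlike the question's data, it contains one real float alongside the strings.

```python
import pandas as pd

# Hypothetical mixed column: one real float, the rest strings.
pro1 = pd.DataFrame({"CIP": ["DF5TY34", 12342.0, "de44dW", "12342.0"]})

# Only real float objects are converted; strings pass through untouched.
pro1["CIP"] = pro1["CIP"].apply(lambda x: str(int(x)) if isinstance(x, float) else x)

print(pro1["CIP"].tolist())
# ['DF5TY34', '12342', 'de44dW', '12342.0']
```

The last value stays '12342.0' because it is a string, which is why this approach does not help when the whole column has already been cast to str. A NaN entry is also a float and would make int() raise, so missing values would need to be handled first.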
