Remove characters from a cell and divide the remaining float by 2 in Python pandas

I have a dataframe with multiple columns where some cells contain the characters "DL" and a float. The other cells contain floats only.

For example:

	Column1	Column2
row1	DL10.4	5.6
row2	4.7	DL8.8

I want to use Python to remove the characters "DL" and divide the remaining floats by 2. The cells without characters should be left unchanged and not divided by 2.

Expected result:

	Column1	Column2
row1	5.2	5.6
row2	4.7	4.4

CodePudding user response：

Use Series.str.extract to get the values after DL, divide them by 2, and fill the non-DL values back in from the original DataFrame:

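# Extract the number after "DL"; cells without "DL" become NaN
df1 = df.apply(lambda x: x.str.extract(r'DL(\d+\.\d+)', expand=False))
# Halve the extracted values, then fill the NaN cells from the original DataFrame
df = df1.astype(float).div(2).fillna(df).astype(float)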
print (df)
      Column1  Column2
row1      5.2      5.6
row2      4.7      4.4
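
Note that the .str accessor only works on string columns, so the line above raises an error if any column is already numeric. A minimal sketch of a mask-based variant that tolerates that case, assuming every cell is either a plain number or "DL" followed by a number (the helper name halve_dl is made up for illustration, and the sample frame is rebuilt from the question):

```python
import pandas as pd

# Sample data from the question, stored as strings (assumption)
df = pd.DataFrame(
    {"Column1": ["DL10.4", "4.7"], "Column2": ["5.6", "DL8.8"]},
    index=["row1", "row2"],
)

def halve_dl(col):
    """Hypothetical helper: halve cells prefixed with "DL", keep the rest."""
    text = col.astype(str)                 # also covers numeric columns
    has_dl = text.str.startswith("DL")     # True where the cell had "DL"
    numbers = pd.to_numeric(text.str.replace("DL", "", regex=False))
    # Keep the number where there was no "DL", otherwise use half of it
    return numbers.where(~has_dl, numbers / 2)

result = df.apply(halve_dl)
print(result)
```

This is a different technique from str.extract: it strips the prefix everywhere and uses a boolean mask to decide which cells to halve.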

CodePudding user response：

I will assume you have a way of looping through each row in the dataset. With that in mind, something like this should work for you:

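import re

result = df.astype(object)  # object-typed copy so cells can hold floats
for idx, row in df.iterrows():
    for col, col_value in row.items():
        # Match "DL" followed by a float, e.g. "DL10.4"
        reg_match = re.match(r"^DL([0-9]+\.[0-9]+)", str(col_value))
        if reg_match:
            # Drop "DL" and halve the remaining number
            result.at[idx, col] = float(reg_match.group(1)) / 2
result = result.astype(float)

Assigning to col_value inside the loop only rebinds the loop variable and leaves the DataFrame unchanged, so the new values are written into a separate DataFrame (result) instead. Cells that do not match are copied over as they are and only converted to float at the end.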
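
The same per-cell logic can also be sketched without an explicit row loop by mapping a small function over every column. This is only an illustration: the names DL_PATTERN and convert_cell are made up here, the sample frame is rebuilt from the question, and the pattern assumes the number after "DL" is a plain decimal with an optional fractional part.

```python
import re
import pandas as pd

# Hypothetical pattern: "DL" followed by digits and an optional fraction
DL_PATTERN = re.compile(r"DL(\d+(?:\.\d+)?)")

def convert_cell(value):
    """Return half the number for "DL..." cells, the plain float otherwise."""
    match = DL_PATTERN.fullmatch(str(value).strip())
    if match:
        return float(match.group(1)) / 2
    return float(value)

# Sample data from the question, stored as strings (assumption)
df = pd.DataFrame(
    {"Column1": ["DL10.4", "4.7"], "Column2": ["5.6", "DL8.8"]},
    index=["row1", "row2"],
)

# Apply the helper to every cell, column by column
result = df.apply(lambda col: col.map(convert_cell))
print(result)
```

A cell that is neither a number nor "DL" plus a number makes float() raise a ValueError, which may be preferable to silently passing bad data through.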