Lets say i have a dataframe that looks like this:
Col1
0 100,000,000
1 234,343,000
2 NaN
3 213,564,545
4 NaN
5 678,000
6 200
how can divide the non NaN values in Col1 with 1000 and have this desirable output:
Col1 Col2
0 100,000,000 100,000
1 234,343,000 234,343
2 NaN NaN
3 213,564,545 213,564
4 NaN NaN
5 678,000 678
6 200 0.2
I tried to do df['Col1'].div(1000)
but it is not working because of the NaN values
CodePudding user response:
It looks like you have strings, so dividing by 1000 is just removing the last comma and last 3 digits. You can use a regex:
df['Col2'] = df['Col1'].str.replace(',?\d{,3}$', '', regex=True)
output:
Col1 Col2
0 100,000,000 100,000
1 234,343,000 234,343
2 NaN NaN
3 213,564,545 213,564
4 NaN NaN
5 678,000 678
emulating a true division:
only small numbers as float
df['Col2'] = df['Col1'].str.replace(r'(.*?)(,?(\d{1,3}))$',
lambda m: m.group(1) if m.group(1) else ('0.' m.group(3).zfill(3)),
regex=True)
output:
Col1 Col2
0 100,000,000 100,000
1 234,343,000 234,343
2 NaN NaN
3 213,564,545 213,564
4 NaN NaN
5 678,000 678
6 23 0.023
all floats
df['Col2'] = df['Col1'].str.replace(r'(.*?)(,?(\d{1,3}))$',
lambda m: (m.group(1) if m.group(1) else '0') '.' m.group(3).zfill(3),
regex=True)
output:
Col1 Col2
0 100,000,000 100,000.000
1 234,343,000 234,343.000
2 NaN NaN
3 213,564,545 213,564.545
4 NaN NaN
5 678,000 678.000
6 23 0.023