I need to build a function named format_adjustment
that removes the "%" character and multiplies any value greater than 0 by 100. Negative values should remain unchanged.
Example dataset:
df = pd.DataFrame({'col1': ['A','B','C'], 'col2': ['-0.42%','0.091','0.0023%'], 'col3': [30, 10, 20]})
  col1     col2  col3
0    A   -0.42%    30
1    B    0.091    10
2    C  0.0023%    20
Expected outcome for col2 would look like:
df = pd.DataFrame({'col1': ['A','B','C'], 'col2': [-0.42, 9.1, 0.23], 'col3': [30, 10, 20]})
  col1   col2  col3
0    A  -0.42    30
1    B    9.1    10
2    C   0.23    20
CodePudding user response:
import pandas as pd

def format_adjustment(col2):
    # col2 is a single cell value (a string); remove % and convert to float
    col2 = float(col2.replace('%', ''))
    # multiply by 100 if > 0; zero and negative values are returned unchanged
    if col2 > 0:
        col2 *= 100
    return col2

df = pd.DataFrame({'col1': ['A','B','C'], 'col2': ['-0.42%','0.091','0.0023%'], 'col3': [30, 10, 20]})
# apply format_adjustment to every value in col2
df['col2'] = df['col2'].apply(format_adjustment)
Output:
>>> df
  col1  col2  col3
0    A -0.42    30
1    B  9.10    10
2    C  0.23    20
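If col2 might hold real numbers alongside the strings (the question's original constructor suggests this could happen), calling .replace on a float would raise an AttributeError. Below is a minimal sketch of a more tolerant variant. It assumes each cell is either a number or a numeric string with an optional trailing "%". The mixed-type sample frame is hypothetical and not taken from the question.

```python
import pandas as pd


def format_adjustment(value):
    """Strip a trailing '%' and multiply values greater than 0 by 100."""
    # str() lets the function accept plain numbers as well as strings
    number = float(str(value).strip().rstrip('%'))
    # Only values greater than 0 are scaled; zero and negatives pass through
    return number * 100 if number > 0 else number


# Hypothetical mixed-type column: two strings and one plain float
df = pd.DataFrame({'col1': ['A', 'B', 'C'],
                   'col2': ['-0.42%', 0.091, '0.0023%'],
                   'col3': [30, 10, 20]})
df['col2'] = df['col2'].apply(format_adjustment)
print(df)
```

Because the scaling is ordinary floating-point multiplication, compare results with a tolerance rather than exact equality.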
CodePudding user response:
import numpy as np

df['col2'] = df['col2'].astype('str').str.replace('%', '', regex=False).astype('float')
df['col2'] *= np.where(df['col2'] > 0, 100, 1)
Output:
>>> df
  col1  col2  col3
0    A -0.42    30
1    B  9.10    10
2    C  0.23    20
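The same vectorized idea can also be written with pandas alone. The sketch below is one possible alternative, not the answer's own method: it uses pd.to_numeric with errors='coerce' so that unparseable entries become NaN instead of raising, and Series.mask to scale only the positive values. Whether bad entries should be coerced or rejected is an assumption about the data.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'B', 'C'],
                   'col2': ['-0.42%', '0.091', '0.0023%'],
                   'col3': [30, 10, 20]})

# Strip a trailing '%' and parse; errors='coerce' turns unparseable entries into NaN
values = pd.to_numeric(df['col2'].astype(str).str.rstrip('%'), errors='coerce')
# mask() replaces entries where the condition is True with the scaled value
df['col2'] = values.mask(values > 0, values * 100)
print(df)
```

NaN entries fail the `values > 0` comparison, so they are left as NaN rather than scaled.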