I have the following dataframe:
import pandas as pd

# dictionary with list objects as values
details = {
'Name' : ['D', 'C', 'F', 'G','A','N'],
'values' : ['21%','45%','10%',12,14,15],
}
df = pd.DataFrame(details)
The values column holds percentages; however, some were originally saved as strings with the % symbol and some as numbers. I would like to get rid of the % and have them all as int type. For that I used str.replace and then astype. However, when I replace the '%', the values that don't have a % change to NaN:
df['values']=df['values'].str.replace('%', '')
df
>>> Name values
0 D 21
1 C 45
2 F 10
3 G NaN
4 A NaN
5 N NaN
My required output should be:
>>> Name values
0 D 21
1 C 45
2 F 10
3 G 12
4 A 14
5 N 15
My question is: how can I get rid of the % and get the column with the values without getting these NaN values? And why is this happening?
CodePudding user response:
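The column mixes strings and numbers, and the .str accessor returns missing values (NaN) for any element that is not a string, which is why the numeric rows are lost. One possible solution is Series.replace with regex=True, which replaces substrings and leaves the numeric elements untouched. Because the result still mixes strings and numbers, convert it to integers with astype(int):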
df['values']=df['values'].replace('%', '', regex=True).astype(int)
print (df)
Name values
0 D 21
1 C 45
2 F 10
3 G 12
4 A 14
5 N 15
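To see why the original attempt produced NaN, here is a minimal sketch that reuses the data from the question as a standalone Series (the names s, stripped and fixed are only illustrative). It applies the .str accessor to the mixed column, checks which rows come back missing, and compares that with Series.replace:

```python
import pandas as pd

# Same mixed content as the 'values' column in the question
s = pd.Series(['21%', '45%', '10%', 12, 14, 15])

# .str methods only operate on string elements; non-strings become NaN
stripped = s.str.replace('%', '', regex=False)
print(stripped.isna().tolist())

# Series.replace with regex=True only rewrites string elements and
# leaves the numeric ones as they are, so nothing is lost
fixed = s.replace('%', '', regex=True).astype(int)
print(fixed.tolist())
```

The first print shows that the three numeric rows are the ones flagged as missing, while the second shows all six values surviving as integers.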
Or keep your solution and fill the resulting missing values back in from the original column:
df['values']=df['values'].str.replace('%', '').fillna(df['values']).astype(int)
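A further variation, not part of the answer above and offered only as a sketch, is to cast every element to a string first so that the .str accessor applies to all rows. It assumes the % sign only ever appears at the end of a value, which holds for the sample data:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['D', 'C', 'F', 'G', 'A', 'N'],
    'values': ['21%', '45%', '10%', 12, 14, 15],
})

# Convert every element to a string so .str works on all rows,
# then drop a trailing '%' and cast the result to int
df['values'] = df['values'].astype(str).str.rstrip('%').astype(int)
print(df['values'].tolist())
```

This avoids the fillna step because no element is skipped by the string method in the first place.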