Create a function that removes commas from any given column
Essentially, I need a piece of code that removes commas from all the values within a column. I also want to wrap that code in a function, so that the end user can specify which column names to run it on.
My code so far:
df = df.apply(lambda x: x.replace(',', ''))
print (df)
The code above is as far as I have gotten. Python accepts this piece of code, but when I print the df, the commas still show.
Once I get this to work, my next battle is understanding how I can target just one specific column rather than the whole dataset, and make this an interchangeable function for the end-user.
Bearing in mind that I am very new to Python coding, any explanations would be much appreciated.
Thanks!
CodePudding user response:
Assuming this toy example:
import pandas as pd
df = pd.DataFrame([['123,2', 'ab,c', 'd', ',']], columns=list('ABCD'))
       A     B  C  D
0  123,2  ab,c  d  ,
You can use str.replace (replace would only replace if the full content of the cell is ,):
df = df.apply(lambda col: col.str.replace(',', ''))
output:
      A    B  C  D
0  1232  abc  d
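To see why the original attempt left the commas in place, here is a small sketch comparing the two methods on a single column. The Series s and the variable names are made up for illustration; regex=False is passed only to make the literal match explicit.

```python
import pandas as pd

# Hypothetical two-row column: one value containing a comma, one that IS a comma
s = pd.Series(['123,2', ','])

# Series.replace matches whole cell values, so only the second row changes
whole = s.replace(',', '')

# Series.str.replace works on substrings inside each string
sub = s.str.replace(',', '', regex=False)

# Series.replace can also do substring replacement if asked to treat
# the pattern as a regular expression
regex_way = s.replace(',', '', regex=True)

print(whole.tolist())      # ['123,2', '']
print(sub.tolist())        # ['1232', '']
print(regex_way.tolist())  # ['1232', '']
```

Note that .str methods only work on columns holding strings, so a numeric column would need converting first.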
To target only one column:
df['A'] = df['A'].str.replace(',', '')
output:
      A     B  C  D
0  1232  ab,c  d  ,
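For the second part of the question (letting the end user pick the columns), the single-column line above could be wrapped in a function. This is only a sketch: the name remove_commas and its parameters are made up for this example, and it assumes the chosen columns contain strings.

```python
import pandas as pd

def remove_commas(df, columns):
    """Return a copy of df with commas removed from the given column(s).

    Hypothetical helper: `columns` may be one column name or a list of names.
    """
    if isinstance(columns, str):
        columns = [columns]          # allow remove_commas(df, 'A')
    result = df.copy()               # leave the caller's DataFrame untouched
    for col in columns:
        # Literal (non-regex) substring replacement on each value
        result[col] = result[col].str.replace(',', '', regex=False)
    return result

df = pd.DataFrame([['123,2', 'ab,c', 'd', ',']], columns=list('ABCD'))
cleaned = remove_commas(df, ['A', 'B'])
print(cleaned)
```

Here only columns A and B are cleaned; column D keeps its comma, and the original df is unchanged because the function works on a copy.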