How can I remove '\n 1' from a DataFrame?

Time:06-12

I have the following dataframe:

Senior Location
False Warszawa
True Warszawa\n 1

I am trying to remove that "\n 1", which looks like a hidden character to me. At first, I tried:

df['Location']=df['Location'].str.replace('Warszawa\n 1','Warszawa')

but nothing happened.

I managed to remove those characters manually with a long chain of splits and replaces, but that is not a viable solution, because it produces odd results later in the program: although both rows of the df show "Warszawa", they are treated as two different locations when there is only one.

What I want is this:

Senior Location
False Warszawa
True Warszawa

How can I correctly remove that "\n 1"? And what character is it?

CodePudding user response:

df['Location'] = df['Location'].str.replace('Warszawa\n   1', 'Warszawa', regex=False)
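
To check what the cell really contains, it can help to print its repr() before replacing. Here is a minimal sketch, assuming the value holds a real newline followed by three spaces and "1" (the sample frame is made up for illustration and may not match your data exactly):

```python
import pandas as pd

# Hypothetical sample data: assumes a real newline plus three spaces before "1".
df = pd.DataFrame({
    "Senior": [False, True],
    "Location": ["Warszawa", "Warszawa\n   1"],
})

# repr() exposes the newline and the exact number of spaces in the cell.
print(repr(df.loc[1, "Location"]))

# Literal (non-regex) replacement of the exact substring.
df["Location"] = df["Location"].str.replace("Warszawa\n   1", "Warszawa", regex=False)

# Both rows should now count as a single location.
print(df["Location"].nunique())
```

If the repr() shows a different number of spaces than the string you pass to str.replace, the literal replacement will silently do nothing, which may be why the first attempt in the question had no effect.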

CodePudding user response:

When using str.replace(), the regex parameter was True by default in pandas versions before 2.0 (from 2.0 onward the default is False). Since you just want to replace a literal string, you can either do what @Amir Py has done and pass regex=False, or use the replace() method, which does a literal replacement of whole cell values. The regex parameter in replace() is False by default.

Code:

df['Location'] = df['Location'].replace('Warszawa\n   1', 'Warszawa')

It can also be useful if you have similar issues in other columns of your dataframe. For more information, there is a great question and answer on Stack Overflow: str.replace v replace
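
To illustrate the difference between the two methods, here is a small sketch. The sample values, including the "Kraków" row, and the regex pattern are assumptions for illustration, not taken from the question's data:

```python
import pandas as pd

# Hypothetical sample data; the second and third values are assumptions.
s = pd.Series(["Warszawa", "Warszawa\n   1", "Kraków\n 2"])

# Series.replace with a plain string only matches whole cell values,
# so passing just the fragment "\n   1" leaves the data unchanged.
whole = s.replace("\n   1", "")

# Series.str.replace works on substrings; with regex=True the pattern
# strips a newline, any surrounding whitespace, and the trailing digits.
cleaned = s.str.replace(r"\s*\n\s*\d+$", "", regex=True)

print(whole.tolist())
print(cleaned.tolist())
```

The regex variant may be the more convenient choice when the number of spaces after the newline varies between rows, since it does not depend on the exact whitespace.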
