I have a DataFrame of shape (1000, 3). I am interested in one of the columns: it contains strings, but they include a part that I want to remove, "\n". For example:
df
A B C
1 d\n aa
2 \nc gg
3 m\nm hh
I want the outcome to be
df
A B C
1 d aa
2 c gg
3 mm hh
I have tried the following
df['B'] = df['B'].replace('\n', '')
and
df['B'] = df['B'].str.replace(r'\n', '', regex=False)
I put a breakpoint after it to inspect the outcome, but nothing changes with either method. Then I tried
df['B'] = df['B'].replace('\n', '', regex=True)
however, when I hit the breakpoint I get the following error
File "path\pandas\core\internals\managers.py", line 304, in apply
applied = getattr(b, f)(**kwargs)
File "path\pandas\core\internals\blocks.py", line 761, in _replace_regex
replace_regex(new_values, rx, value, mask)
File "path\pandas\core\array_algos\replace.py", line 153, in replace_regex
f = np.vectorize(re_replacer, otypes=[np.object_])
File "path\numpy\lib\function_base.py", line 2261, in __init__
otypes = ''.join([_nx.dtype(x).char for x in otypes])
File "path\numpy\lib\function_base.py", line 2261, in <listcomp>
otypes = ''.join([_nx.dtype(x).char for x in otypes])
TypeError: 'NoneType' object is not callable
(I replaced the actual path with the word "path".)
The code still runs, though. Of course I can run the code, save the outcome to a CSV file and check from there, but I do not understand the problem.
CodePudding user response:
I suspect you don't have newlines, but rather a literal \n (a backslash followed by the letter n).
You should try:
df['B'] = df['B'].str.replace(r'\n', '', regex=False)
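If you are not sure which case you have, a quick check along these lines might help. The printed DataFrame can look the same in both cases, so this sketch inspects the repr of each value and tests for each variant explicitly. The sample data here is made up for illustration: row 0 holds a real newline and row 1 holds a backslash followed by n.

```python
import pandas as pd

# Hypothetical sample data (not the asker's real column):
# row 0 ends with a real newline, row 1 ends with a backslash and the letter n.
df = pd.DataFrame({"B": ["d\n", "c\\n"]})

# True where the string contains an actual newline character
has_newline = df["B"].str.contains("\n", regex=False)

# True where the string contains the two characters backslash + n
has_literal = df["B"].str.contains("\\n", regex=False)

# repr() makes the difference visible: a real newline appears as \n,
# a literal backslash appears doubled as \\n
print(df["B"].map(repr).tolist())
print(has_newline.tolist(), has_literal.tolist())
```

Whichever check comes back True tells you whether to pass '\n' or r'\n' to str.replace.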
CodePudding user response:
Just a quick update: based on your feedback I found the solution. It was as simple as
df['B'] = df['B'].str.replace('\n', ' ', regex=False)
Thank you very much! Just posted it as an answer in case someone faces the same issue in the future!
CodePudding user response:
In [20]: import pandas as pd
In [21]: d = {'A': [1, 2, 3], 'B': ['d\n', '\nc', 'm\nm']}
In [22]: df = pd.DataFrame(data=d)
In [23]: df
Out[23]:
A B
0 1 d\n
1 2 \nc
2 3 m\nm
In [24]: df['B'].replace('\n', '', regex=True)
Out[24]:
0 d
1 c
2 mm
Name: B, dtype: object
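As for why the first attempt in the question appeared to do nothing: without regex=True, Series.replace only replaces cells whose whole value equals the search string, whereas Series.str.replace works on substrings. A minimal sketch of the difference, assuming the column holds real newline characters as in the session above:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": ["d\n", "\nc", "m\nm"]})

# Series.replace without regex=True compares whole cell values.
# No cell is exactly "\n", so nothing is replaced.
unchanged = df["B"].replace("\n", "")

# Series.replace with regex=True substitutes inside each string.
via_regex = df["B"].replace("\n", "", regex=True)

# Series.str.replace also works on substrings; regex=False treats
# the pattern as plain text rather than a regular expression.
via_str = df["B"].str.replace("\n", "", regex=False)

print(unchanged.tolist())
print(via_regex.tolist())
print(via_str.tolist())
```

Either of the last two forms gives the cleaned column; the result still has to be assigned back to df['B'] to take effect.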