input:
"""""""NW_020998607.1""" 397418
"""""""NW_020998607.1""" 2583299
"""""""NW_020998607.1""" 2742463
"""""""NW_020998607.1""" 9131893
"""""""NW_020998607.1""" 11763556
"""""""NW_020998607.1""" 11763572
expected output:
NW_020998607.1 397418
NW_020998607.1 2583299
NW_020998607.1 2742463
NW_020998607.1 9131893
NW_020998607.1 11763556
NW_020998607.1 11763572
output:
"""""""NW_020998607.1""" 397418
"""""""NW_020998607.1""" 2583299
"""""""NW_020998607.1""" 2742463
"""""""NW_020998607.1""" 9131893
"""""""NW_020998607.1""" 11763556
"""""""NW_020998607.1""" 11763572
code:
import pandas as pd
with open(input, 'r') as aaa:
    lines_1 = [line.rstrip('\n').split('\t') for line in aaa]
df = pd.DataFrame(lines_1)
df_replace[0] = df.replace[0]('"', '')
I tried to replace '"' with '', but nothing happened in pandas. Could you help me remove the double quotation marks?
CodePudding user response:
You can use pandas.Series.str.strip("\"").
>>> import pandas as pd
>>>
>>> with open("input.txt") as f:
...     df = pd.read_csv(f, sep=r"\s+", header=None)
...     df[0] = df[0].str.strip("\"")
...     print(df)
...
                0         1
0  NW_020998607.1    397418
1  NW_020998607.1   2583299
2  NW_020998607.1   2742463
3  NW_020998607.1   9131893
4  NW_020998607.1  11763556
5  NW_020998607.1  11763572
Note: You can use pd.read_csv to read the data directly from a file object, with \s+ (one or more whitespace characters) as the separator.
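For anyone who wants to try this without an input file, here is a minimal sketch of the same idea. It assumes a few of the question's rows embedded in a string via io.StringIO (a stand-in for the real file), and it passes quoting=csv.QUOTE_NONE, which is an addition not in the answer above, so that the parser leaves the quote characters alone and the .str call is the only step that removes them. The names raw, stripped and replaced are illustrative.

```python
import csv
import io

import pandas as pd

# Hypothetical stand-in for the input file: three rows from the question.
raw = (
    '"""""""NW_020998607.1""" 397418\n'
    '"""""""NW_020998607.1""" 2583299\n'
    '"""""""NW_020998607.1""" 11763572\n'
)

# QUOTE_NONE treats '"' as ordinary text, so column 0 still holds the quotes.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None, quoting=csv.QUOTE_NONE)

# Element-wise string methods live under the .str accessor of a Series.
stripped = df[0].str.strip('"')                      # drops leading/trailing quotes
replaced = df[0].str.replace('"', "", regex=False)   # drops every quote

df[0] = stripped
print(df[0].tolist())
print(df[1].tolist())
```

Both calls give the same result here because the quotes only appear at the ends of the value. The line in the question most likely fails because df.replace is a method, so indexing it with [0] raises an error, and the result would have been assigned to df_replace, which is never defined.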
CodePudding user response:
You can use string replace methods.
name = '"""""""NW_020998607.1""" 397418'
print(name.replace("\"",""))
output:
NW_020998607.1 397418
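To apply the same replacement to every line rather than a single string, something like the sketch below should work. The helper name remove_quotes and the sample list are hypothetical, used only for illustration.

```python
def remove_quotes(lines):
    """Return the given lines with newlines trimmed and every '"' removed."""
    return [line.rstrip("\n").replace('"', "") for line in lines]


# Two of the rows from the question, as they would come out of a file.
sample = [
    '"""""""NW_020998607.1""" 397418\n',
    '"""""""NW_020998607.1""" 2583299\n',
]

cleaned = remove_quotes(sample)
for line in cleaned:
    print(line)
```

For a real file, an open file object can be passed in place of sample, since iterating over it yields one line at a time.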