I have the following results in a variable called results:
0 a b this is my first file
1 c d this is my second file
2 e f this is my third file
3 g h this is my fourth file
4 i j this is my fifth file
I want to parse the results into a pandas DataFrame. The result I want is:
0 | a | b | this is my first file |
1 | c | d | this is my second file |
2 | e | f | this is my third file |
Instead, when I call read_csv(StringIO(results), delim_whitespace=True), I get:
0 | a | b | this | is | my | first | file |
1 | c | d | this | is | my | second | file |
2 | e | f | this | is | my | third | file |
Is there any way to specify the maximum number of delimiters to split on when using delim_whitespace?
CodePudding user response:
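import pandas as pd

# Data (here without the leading index column)
results = """a b this is my first file
c d this is my second file
e f this is my third file
g h this is my fourth file
i j this is my fifth file"""
# Split each line on whitespace at most twice, so the trailing text stays in one column
lines = results.split("\n")
words = [line.split(maxsplit=2) for line in lines]
df = pd.DataFrame(words)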
CodePudding user response:
You can use:
import pandas as pd

data = """0 a b this is my first file
1 c d this is my second file
2 e f this is my third file
3 g h this is my fourth file
4 i j this is my fifth file"""
# Split each line on a single space at most 3 times: index, two letters, then the rest
data = [line.split(' ', 3) for line in data.split('\n')]
df = pd.DataFrame(data=data)
OUTPUT:
   0  1  2                       3
0  0  a  b   this is my first file
1  1  c  d  this is my second file
2  2  e  f   this is my third file
3  3  g  h  this is my fourth file
4  4  i  j   this is my fifth file
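The same limited split can also be done with pandas string methods instead of a list comprehension. This is only a minimal sketch: it assumes the data is the string from the question, and the column names idx, col1, col2 and text are made up for illustration.

```python
import pandas as pd

results = """0 a b this is my first file
1 c d this is my second file
2 e f this is my third file
3 g h this is my fourth file
4 i j this is my fifth file"""

# One Series element per line of input
lines = pd.Series(results.splitlines())

# Split on whitespace at most 3 times; expand=True returns a DataFrame
df = lines.str.split(n=3, expand=True)

# Hypothetical column names, chosen only for this example
df.columns = ["idx", "col1", "col2", "text"]
print(df)
```

Here n=3 plays the role of the "maximum number of delimiters": everything after the third split stays together in the last column.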