Specify max delimiter with delim_whitespace, read_csv

Time:10-19

I have the following results in a variable called results:

0 a b this is my first file
1 c d this is my second file
2 e f this is my third file
3 g h this is my fourth file
4 i j this is my fifth file

I want to parse the results into a pandas DataFrame. The result I want is:

Calling read_csv
0 a b this is my first file
1 c d this is my second file
2 e f this is my third file

Instead, when I call

read_csv(StringIO(results), delim_whitespace=True)

I get:

0 a b this is my first file
1 c d this is my second file
2 e f this is my third file
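
For reference, a minimal sketch that reproduces the behaviour. It assumes header=None (so the first line is treated as data) and uses sep=r"\s+", which should behave the same as delim_whitespace=True; as far as I know, newer pandas releases deprecate delim_whitespace in favour of this form. Every run of whitespace is treated as a delimiter, so each word of the sentence lands in its own column:

```python
from io import StringIO

import pandas as pd

results = """0 a b this is my first file
1 c d this is my second file
2 e f this is my third file
3 g h this is my fourth file
4 i j this is my fifth file"""

# sep=r"\s+" splits on every run of whitespace, with no upper limit
df = pd.read_csv(StringIO(results), sep=r"\s+", header=None)

# One column per word instead of one column for the whole sentence
print(df.shape)
```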

Is there any way to specify the maximum number of delimiters when using delim_whitespace?

CodePudding user response:

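import pandas as pd

# Data (here without the leading index column)
results = """a b this is my first file
c d this is my second file
e f this is my third file
g h this is my fourth file
i j this is my fifth file"""

# Split each line on whitespace at most twice, so the rest of the line
# stays together as the third field
lines = results.splitlines()
words = [line.split(maxsplit=2) for line in lines]
df = pd.DataFrame(words)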
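
The same idea can be wrapped in a small helper that also names the columns. This is only a sketch: the function name parse_fixed_prefix and the column names key1, key2 and description are made up for illustration and do not come from the question.

```python
import pandas as pd

results = """a b this is my first file
c d this is my second file
e f this is my third file
g h this is my fourth file
i j this is my fifth file"""


def parse_fixed_prefix(text, n_prefix, columns):
    """Split each non-empty line on whitespace at most n_prefix times.

    The first n_prefix fields become separate columns; whatever is left
    of the line is kept together as the last column.
    """
    rows = [
        line.split(maxsplit=n_prefix)
        for line in text.splitlines()
        if line.strip()  # skip blank lines
    ]
    return pd.DataFrame(rows, columns=columns)


# Hypothetical column names, chosen only for this example
df = parse_fixed_prefix(results, 2, ["key1", "key2", "description"])
print(df)
```

Because str.split is called without an explicit separator, runs of several spaces or tabs between fields are treated as a single delimiter, which is close to what delim_whitespace does.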

CodePudding user response:

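You can split each line on the first three spaces yourself and build the DataFrame from the resulting lists:

import pandas as pd

data = """0 a b this is my first file
1 c d this is my second file
2 e f this is my third file
3 g h this is my fourth file
4 i j this is my fifth file"""

# Split on a single space at most 3 times: index, two keys, then the rest of the line
data = [line.split(' ', 3) for line in data.split('\n')]
df = pd.DataFrame(data=data)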

OUTPUT:

   0  1  2                       3
0  0  a  b   this is my first file
1  1  c  d  this is my second file
2  2  e  f   this is my third file
3  3  g  h  this is my fourth file
4  4  i  j   this is my fifth file
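
If you would rather stay inside pandas, a possible variant is Series.str.split with n=3 and expand=True, which caps the number of splits in the same way. The sketch below also turns the first field into the index; the column names idx, c1, c2 and text are placeholders picked for this example.

```python
import pandas as pd

data = """0 a b this is my first file
1 c d this is my second file
2 e f this is my third file
3 g h this is my fourth file
4 i j this is my fifth file"""

# One string per line
lines = pd.Series(data.splitlines())

# Split on whitespace at most 3 times; expand=True returns a DataFrame
df = lines.str.split(n=3, expand=True)

# Placeholder column names, then use the first field as an integer index
df.columns = ["idx", "c1", "c2", "text"]
df = df.astype({"idx": int}).set_index("idx")
print(df)
```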