Split into columns using pandas

I have a data file with 4 visible columns that I'm trying to split using pandas, but I get ParserError: Error tokenizing data. C error: Expected 3 fields in line 3, saw 8

This is my data:

0.001155672            259,439      branch-instructions                                         
 0.001155672          1,266,239      instructions              #    1.10  insn per cycle         
 0.001155672             24,148      cache-references                                            
 0.001155672             11,586      cache-misses              #   47.979 % of all cache refs    
 0.001155672          1,150,999      cpu-cycles                                                  
 0.001155672              8,888      branch-misses             #    3.43% of all branches        
 0.002370509            381,074      branch-instructions                                         
 0.002370509          1,908,560      instructions              #    1.12  insn per cycle         
 0.002370509             29,034      cache-references                                            
 0.002370509             15,362      cache-misses              #   52.910 % of all cache refs

I've tried data = pd.read_table('repg.txt', sep='\s+', header=None, thousands=',') and also delim_whitespace=True.

CodePudding user response:

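Just add comment='#' as a parameter of pd.read_table. Everything from the # to the end of a line is then ignored, so the annotated rows are left with the same 3 fields as the others: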

import pandas as pd

data = pd.read_table('repg.txt', sep='\s+', header=None, thousands=',', comment='#')
print(data)

# Output
          0        1                    2
0  0.001156   259439  branch-instructions
1  0.001156  1266239         instructions
2  0.001156    24148     cache-references
3  0.001156    11586         cache-misses
4  0.001156  1150999           cpu-cycles
5  0.001156     8888        branch-misses
6  0.002371   381074  branch-instructions
7  0.002371  1908560         instructions
8  0.002371    29034     cache-references
9  0.002371    15362         cache-misses
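
To try this without the original file, here is a minimal self-contained sketch that parses a few of the rows from the question out of an in-memory string. The inline sample and the column names time, count and event are assumptions for illustration only; they are not part of the original data or answer. It uses pd.read_csv with an explicit whitespace separator, which should behave like the read_table call above.

```python
import io

import pandas as pd

# Inline stand-in for 'repg.txt': three rows copied from the question.
SAMPLE = """\
 0.001155672            259,439      branch-instructions
 0.001155672          1,266,239      instructions              #    1.10  insn per cycle
 0.001155672             11,586      cache-misses              #   47.979 % of all cache refs
"""

data = pd.read_csv(
    io.StringIO(SAMPLE),
    sep=r"\s+",                        # one or more whitespace characters (raw string)
    header=None,                       # the file has no header row
    names=["time", "count", "event"],  # hypothetical column names
    thousands=",",                     # parse 1,266,239 as the number 1266239
    comment="#",                       # drop everything from '#' to the end of the line
)

print(data)
print(data.dtypes)
```

Naming the columns up front makes later selections such as data["event"] easier to read than the default integer labels 0, 1 and 2.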