Reading funny delimited txt file

Time:10-06

I have a txt file that looks like the sample below. When I read it with pandas, everything ends up in a single column. I have tried many different values for sep=, including the attempt shown after the sample.

$Eid, X, Y, Z, Mass
  856395   3.4694275e-01  -9.7051837e-02   6.4922004e+00   7.3136240e-03
  856396   3.4694746e-01  -9.7053476e-02   6.5071974e+00   7.3139570e-03
  856397   3.4695095e-01  -9.7054794e-02   6.5221949e+00   7.3139421e-03
  856398   3.4695303e-01  -9.7055703e-02   6.5371923e+00   7.3139500e-03

df_data = pd.read_csv("ElCEntroid kopi-kopi.txt", skiprows=2, sep="t")

CodePudding user response:

I imagine your separator is not the letter t but rather tabs (\t) or runs of whitespace.

You can try:

df_data = pd.read_csv('ElCEntroid kopi-kopi.txt',
                      skiprows=1, header=None,
                      sep=r'\s+'                   # or sep='\t'
                     )

output:

        0         1         2         3               4
0  856395  0.346943 -0.097052  6.492200   7.3136240e-03
1  856396  0.346947 -0.097053  6.507197   7.3139570e-03
2  856397  0.346951 -0.097055  6.522195   7.3139421e-03
3  856398  0.346953 -0.097056  6.537192   7.3139500e-03

Btw, if you're interested in the header, you could also use:

df_data = pd.read_csv('ElCEntroid kopi-kopi.txt', sep=r",?\s+", engine='python')

output:

     $Eid         X         Y         Z            Mass
0  856395  0.346943 -0.097052  6.492200   7.3136240e-03
1  856396  0.346947 -0.097053  6.507197   7.3139570e-03
2  856397  0.346951 -0.097055  6.522195   7.3139421e-03
3  856398  0.346953 -0.097056  6.537192   7.3139500e-03
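
To check the whitespace-separator approach without touching the real file, here is a minimal sketch. It assumes the file starts with the header line followed directly by the data rows and that the exponents read e+00 in the real file. The `sample` string and the column names passed to `names=` are only for illustration; `io.StringIO` stands in for the file on disk.

```python
import io

import pandas as pd

# Two rows copied from the question; StringIO stands in for the real file.
sample = (
    "$Eid, X, Y, Z, Mass\n"
    "  856395   3.4694275e-01  -9.7051837e-02   6.4922004e+00   7.3136240e-03\n"
    "  856396   3.4694746e-01  -9.7053476e-02   6.5071974e+00   7.3139570e-03\n"
)

df_data = pd.read_csv(
    io.StringIO(sample),
    skiprows=1,                            # skip the comma-separated header line
    header=None,                           # the remaining rows are all data
    sep=r"\s+",                            # one or more whitespace characters
    names=["Eid", "X", "Y", "Z", "Mass"],  # illustrative column names
)

print(df_data.dtypes)
```

If the dtypes come out as one integer column and four float columns, the separator is splitting the rows as intended.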

CodePudding user response:

Is the file a fixed-width file? If so, you can use pd.read_fwf() instead. This is really helpful if you know the width of each column in advance. I'm assuming here that Eid, X, Y, Z, and Mass have widths 8, 16, 16, 16, and 16 respectively, but you can change them if yours are different.

df = pd.read_fwf("ElCEntroid kopi-kopi.txt", skiprows=2, header=None, widths=[8, 16, 16, 16, 16])
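
The fixed-width idea can be tried on the sample rows from the question with the sketch below. It assumes a single header line, so it uses skiprows=1 rather than 2, and it assumes the exponents read e+00 in the real file. The `sample` string and the names passed to `names=` are illustrative only.

```python
import io

import pandas as pd

# Two rows copied from the question; each row is 8 + 4 * 16 = 72 characters wide.
sample = (
    "$Eid, X, Y, Z, Mass\n"
    "  856395   3.4694275e-01  -9.7051837e-02   6.4922004e+00   7.3136240e-03\n"
    "  856396   3.4694746e-01  -9.7053476e-02   6.5071974e+00   7.3139570e-03\n"
)

df = pd.read_fwf(
    io.StringIO(sample),
    skiprows=1,                            # skip the header, which is not fixed width
    header=None,                           # do not consume a data row as column names
    widths=[8, 16, 16, 16, 16],            # field widths assumed in the answer above
    names=["Eid", "X", "Y", "Z", "Mass"],  # illustrative column names
)

print(df)
```

If the widths do not match the real file, digits get cut off or merged into neighbouring columns, so it is worth comparing a few parsed values against the raw text.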