Pandas reads the txt file as 1 column except header

Time:02-23

I have a txt file with the following data:

"uid" "raum_code" "tuerschild" "raumbezeichung" "raumflaeche" "raumumfang" "raumhoehe" "Comments"
"2" "222" "0.111" "Büro" "19.11" "17.96" "2.85" ""
"3" "333" "0.123" "Besprechungsraum" "42.79" "26.68" "2.85" ""
"4" "444" "0.105" "Büro" "17.28" "16.50" "2.85" ""
"5" "555" "0.106" "Büro" "18.74" "19.78" "2.88" ""
"6" "666" "0.107" "Fernmeldetechnik" "5.79" "9.93" "2.88" ""

I am trying to read the file with pandas using the following code:

df = pd.read_csv("test.txt",sep="\t",quotechar='"',quoting=csv.QUOTE_NONE)

When I check the columns with room_file.columns, it prints this:

Index(['"uid" "raum_code" "tuerschild" "raumbezeichung" "raumflaeche" "raumumfang" "raumhoehe" "Comments"'], dtype='object')

And the df output shows just 1 column:

    "uid" "raum_code" "tuerschild" "raumbezeichung" "raumflaeche" "raumumfang" "raumhoehe" "Comments"
0   "2" "222" "0.111" "Büro" "19.11" "17.96" "2.8...
1   "3" "333" "0.123" "Besprechungsraum" "42.79" ...
2   "4" "444" "0.105" "Büro" "17.28" "16.50" "2...
3   "5" "555" "0.106" "Büro" "18.74" "19.78" "2...
4   "6" "666" "0.107" "Fernmeldetechnik" "5.79"...

5 rows × 1 columns

But it is supposed to be 5 rows × 8 columns.

I have already tried:

Solution 1 Solution 2

I also tried it like this:

df = pd.read_csv("test.txt",sep='\t',header=0)

But the result is the same. Can you please help me read the txt file as a dataframe?

CodePudding user response:

Let the content of file.txt be

"uid" "raum_code" "tuerschild" "raumbezeichung" "raumflaeche" "raumumfang" "raumhoehe" "Comments"
"2" "222" "0.111" "Büro" "19.11" "17.96" "2.85" ""
"3" "333" "0.123" "Besprechungsraum" "42.79" "26.68" "2.85" ""
"4" "444" "0.105" "Büro" "17.28" "16.50" "2.85" ""
"5" "555" "0.106" "Büro" "18.74" "19.78" "2.88" ""
"6" "666" "0.107" "Fernmeldetechnik" "5.79" "9.93" "2.88" ""

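Observe that the fields are separated by spaces rather than tabs, that every value (string and numeric alike) is quoted, and that the first line is the header. With sep="\t" pandas never finds a delimiter, so each line ends up in a single column. Taking this into account, a suitable way to read the file is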

import csv
import pandas as pd
df = pd.read_csv("file.txt",sep=' ',quotechar='"',quoting=csv.QUOTE_ALL)
print(df.shape)  # (5, 8)
print(df)

Output:

   uid  raum_code  tuerschild    raumbezeichung  raumflaeche  raumumfang  raumhoehe  Comments
0    2        222       0.111              Büro        19.11       17.96       2.85       NaN
1    3        333       0.123  Besprechungsraum        42.79       26.68       2.85       NaN
2    4        444       0.105              Büro        17.28       16.50       2.85       NaN
3    5        555       0.106              Büro        18.74       19.78       2.88       NaN
4    6        666       0.107  Fernmeldetechnik         5.79        9.93       2.88       NaN
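To make the difference between the two separators visible without touching the disk, here is a minimal sketch that parses an inline copy of the header and the first two data rows. The name SAMPLE and the use of io.StringIO are assumptions made for illustration; the read_csv arguments are the ones from the question and from this answer's reasoning.

```python
import csv
import io

import pandas as pd

# Inline copy of the header and the first two data rows from the question.
SAMPLE = (
    '"uid" "raum_code" "tuerschild" "raumbezeichung" '
    '"raumflaeche" "raumumfang" "raumhoehe" "Comments"\n'
    '"2" "222" "0.111" "Büro" "19.11" "17.96" "2.85" ""\n'
    '"3" "333" "0.123" "Besprechungsraum" "42.79" "26.68" "2.85" ""\n'
)

# The call from the question: the data contains no tab characters,
# so every line is kept as one single field.
df_tab = pd.read_csv(
    io.StringIO(SAMPLE), sep="\t", quotechar='"', quoting=csv.QUOTE_NONE
)

# Space as the separator: quoted fields are split and unquoted.
df_space = pd.read_csv(io.StringIO(SAMPLE), sep=" ", quotechar='"')

print(df_tab.shape)    # one column
print(df_space.shape)  # eight columns
print(list(df_space.columns))
```

For the real file, replace io.StringIO(SAMPLE) with the path to the txt file.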

CodePudding user response:

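I tried this and it shows the correct result:

import pandas as pd

# The fields are separated by single spaces and wrapped in double quotes.
df = pd.read_csv("test.txt", header=0, quotechar="\"", sep=" ")

print(df.columns)  # the eight column names from the header line
print(df.shape)    # (5, 8) for the sample file
print(df)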
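If it is unclear which separator a file uses, one option is to look at the first line before calling read_csv. The sketch below is a naive heuristic, not part of pandas: guess_separator is a hypothetical helper that simply picks the candidate character occurring most often in the header, and it can be fooled when quoted values contain spaces themselves.

```python
def guess_separator(first_line, candidates=("\t", " ", ",", ";")):
    """Return the candidate character that occurs most often in first_line.

    Hypothetical helper for illustration only; it does not understand quoting.
    """
    return max(candidates, key=first_line.count)


# Header line from the question's file.
HEADER = (
    '"uid" "raum_code" "tuerschild" "raumbezeichung" '
    '"raumflaeche" "raumumfang" "raumhoehe" "Comments"'
)

sep = guess_separator(HEADER)
print(repr(sep))  # shows which character was picked
```

The returned value can then be passed as sep to pd.read_csv; printing df.shape afterwards is a quick check that the guess was right.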