Reading double quoted string values from text in Pandas

Time:03-28

I have a text file with values formatted as follows. Note that there is no separator between the values.

"col1""col2""col3""col4"
"val1""date1""number1""val1"
"val2""date2""number2""val2"
"val3""date3""number3""val3"
"val4""date4""number4""val4"

How do I read this file as CSV in pandas?

CodePudding user response:

Here's a working solution - not the most elegant, but does the job:

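import pandas as pd

mylist = []
with open("filename.csv", 'r') as infile:
    for line in infile:
        # Drop the newline and the outer quotes, then split on the doubled quote.
        mylist.append(line.strip()[1:-1].split('""'))
# The first row holds the column names.
data = pd.DataFrame(mylist[1:], columns=mylist[0])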
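
To see what the slicing and splitting step does, here is a minimal sketch on two sample lines in the question's format (the `lines` list is a made-up stand-in for the file). It uses strip() rather than a fixed end offset, so a last line without a trailing newline should not lose a character:

```python
# Hypothetical sample lines; the second one has no trailing newline.
lines = ['"val3""date3""number3""val3"\n', '"val4""date4""number4""val4"']

parsed = []
for line in lines:
    # strip() removes the newline if present, [1:-1] drops the outer quotes,
    # and split('""') cuts at each closing-quote/opening-quote pair.
    fields = line.strip()[1:-1].split('""')
    parsed.append(fields)
    print(fields)
```

This assumes the values themselves never contain a doubled quote; if they can, the split would cut inside a value.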

CodePudding user response:

You can do this by using the python engine for read_csv and specifying the regex "+ (i.e. one or more double quotes) as the separator. Because every line starts and ends with a quote, this yields a DataFrame with empty first and last columns, which can be removed with DataFrame.iloc:

import pandas as pd

df = pd.read_csv('test.csv', sep='"+', engine='python').iloc[:, 1:-1]
df

Output:

   col1   col2     col3  col4
0  val1  date1  number1  val1
1  val2  date2  number2  val2
2  val3  date3  number3  val3
3  val4  date4  number4  val4
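
For anyone who wants to try this without creating a file, here is a self-contained sketch of the same approach. The in-memory `sample` buffer is an assumption standing in for test.csv:

```python
import io

import pandas as pd

# Hypothetical in-memory copy of the data from the question.
sample = io.StringIO(
    '"col1""col2""col3""col4"\n'
    '"val1""date1""number1""val1"\n'
    '"val2""date2""number2""val2"\n'
)

# A separator longer than one character is treated as a regex by the python
# engine, so '"+' splits on every run of one or more double quotes.
raw = pd.read_csv(sample, sep='"+', engine='python')

# The leading and trailing quote on each line produce an empty first and
# last column; slice them away.
df = raw.iloc[:, 1:-1]
print(df)
```

One caveat, as far as I can tell: since "+ treats any run of quotes as a single separator, an empty value (written as "") would be swallowed and the remaining values on that line would shift left. If empty values can occur, splitting on the exact pair "" may be the safer route.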