Is it possible to read a csv with `\r\n` line terminators in pandas?

Time:04-27

I'm using pandas==1.1.5 to read a CSV file. I'm running the following code:

import pandas as pd
import csv

csv_kwargs = dict(
    delimiter="\t",
    lineterminator="\r\n",
    quoting=csv.QUOTE_MINIMAL,
    escapechar="!",
)
pd.read_csv("...", **csv_kwargs)

It raises the following error: ValueError: Only length-1 line terminators supported.
The pandas documentation confirms that the line terminator must have length 1 (a single character, I suppose).

Is there any way to read this CSV with pandas, or should I read it some other way?
Note that the docs state the length-1 restriction for the C parser, so maybe I can plug in some other parser?

EDIT: Not specifying the line terminator raises a parse error in the middle of the file. Specifically, it raises ParserError: Error tokenizing data. The parser expects a certain number of fields but finds too many.

EDIT2: I'm confident the kwargs above were used to create the csv file I'm trying to read.

CodePudding user response:

The problem might be in the escapechar, since ! is a common text character.

Python's csv module defines a very strict use of escapechar:

A one-character string used by the writer to escape the delimiter if quoting is set to QUOTE_NONE and the quotechar if doublequote is False.

but it's possible that pandas interprets it differently:

One-character string used to escape other characters.

It's possible that you have a row that contains something like:

...\t"some important text!"\t...

In that case the ! would escape the closing quote character, and the parser would keep reading the following text into the same column.
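
To check this idea, here is a minimal sketch that reads the same small sample twice, once without and once with escapechar="!". The sample data and the variable names (raw, plain, escaped_matches) are made up for illustration and are not taken from the asker's file. It also leaves out lineterminator entirely, since the default parser accepts "\r\n" line endings on its own.

```python
import csv
import io

import pandas as pd

# Hypothetical sample: tab-separated, "\r\n" line endings, and one quoted
# field that ends in "!" right before its closing quote.
raw = 'a\tb\tc\r\n1\t"text!"\t2\r\n3\t"x"\t4\r\n'

# No lineterminator argument: "\r\n" is handled by the default parser.
plain = pd.read_csv(io.StringIO(raw), delimiter="\t", quoting=csv.QUOTE_MINIMAL)

# Same call with escapechar="!": the "!" can now escape the closing quote,
# so the result is either a ParserError or a frame that differs from `plain`.
try:
    escaped = pd.read_csv(
        io.StringIO(raw),
        delimiter="\t",
        quoting=csv.QUOTE_MINIMAL,
        escapechar="!",
    )
    escaped_matches = escaped.shape == plain.shape and escaped.equals(plain)
except pd.errors.ParserError:
    escaped_matches = False

print(plain)
print("same result with escapechar='!':", escaped_matches)
```

If the real file behaves like this sample, dropping escapechar (and lineterminator) from the read_csv call may be enough, provided the data was not actually written with ! as an escape for delimiters or quotes.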
