How to use record separator as delimiter in Pandas

Time:12-17

I am trying to use the record separator (0x1E) as the separator in the Pandas read_table() function, but instead it seems to be splitting on \n (0x0A).

This is my code:

import pandas

df = pandas.read_table( "separator.log", sep = "[\x1E]", engine = 'python' )
print( df )

This is my input file (separator.log):

{
"a": 1
}{
"b": 2
}{
"c": 3
}

The record separator is after each closing brace, but may not show up in your browser.

The output looks like this:

           {
"a": 1      
}          {
"b": 2  None
}          {
"c": 3  None
}       None

When I try

df = pandas.read_table( "separator.log", sep = chr(0x1E), engine = 'python' )

I get the error: '' expected after '"'. The first pair of quotes contains the record separator character, which does not show up in the Stack Overflow editor.

Is there a way to force read_table to use 0x1E for the delimiter?

CodePudding user response:

It sounds like you want to separate records/lines on \x1e (or chr(30)).

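sep is used to separate (delimit) the columns of the table, and lineterminator is used to separate (delimit) the rows of the table. Note that lineterminator must be a single character and is only supported by the default C parser, so leave out engine = 'python'.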

Try:

import pandas as pd

pd.read_table("separator.log", lineterminator=chr(30))
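
Because each record in the sample file looks like a JSON object, another option is to skip read_table and split on the record separator yourself. The following is only a minimal sketch under that assumption: the helper name parse_rs_records and the inline sample string are made up for illustration, and it uses nothing beyond the standard library.

```python
import json

RS = "\x1e"  # ASCII record separator (0x1E)


def parse_rs_records(text):
    """Split text on the record separator and parse each non-empty chunk as JSON.

    Hypothetical helper for illustration; assumes every record is valid JSON.
    """
    records = []
    for chunk in text.split(RS):
        chunk = chunk.strip()
        if chunk:  # skip the empty piece left after the final separator
            records.append(json.loads(chunk))
    return records


# Sample data mirroring separator.log from the question
sample = '{\n"a": 1\n}\x1e{\n"b": 2\n}\x1e{\n"c": 3\n}\x1e'
records = parse_rs_records(sample)
print(records)  # [{'a': 1}, {'b': 2}, {'c': 3}]
```

To use it on the real file, pass in the file's contents, for example open("separator.log", encoding="utf-8").read(). If a table is needed afterwards, the resulting list of dicts can be handed to pandas.DataFrame.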