I had a method that detects line endings:
def getLineEnding(filename):
    ret = "\r\n"
    with open(filename, 'r') as f:
        f.readline()
        ret = f.newlines
    return ret
In order to be able to test it without using real files, I changed it to:
def getLineEnding(filehandle):
    filehandle.readline()
    return filehandle.newlines
And this works fine with files. But when I do this:
from io import StringIO

f = StringIO()
f.write('test\r\n')
f.seek(0)
f.readline()
print(f.newlines)
I get None.
The reason I'm checking the line ending is that I'm writing a program that processes a text file, and I want to keep the original line endings.
CodePudding user response:
To answer your question, the default value of the newline parameter is different for io.StringIO than for io.TextIOWrapper (which is returned by open(..., 'r')). For StringIO the default is '\n' while for TextIOWrapper the default is None. The documentation explains the behavior:
newline controls how line endings are handled. It can be None, '', '\n', '\r', and '\r\n'. It works as follows:
- When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If newline is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If newline has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
- [...]
So that means that TextIOWrapper will translate the line endings while StringIO will not, by default. Then the documentation of the newlines attribute is:
A string, a tuple of strings, or None, indicating the newlines translated so far. [...]
Hence, with the default newline='\n', StringIO does not run in universal newlines mode, no translation is performed, and this attribute is not set.
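The difference in defaults can be seen side by side. The following is a minimal sketch, not part of the original answer: the helper name newlines_after_first_line and the file name sample.txt are made up for illustration, and the file is written in binary mode so that the '\r\n' is stored exactly as given regardless of platform.

```python
import os
import tempfile
from io import StringIO


def newlines_after_first_line(filehandle):
    """Hypothetical helper: read one line, return (line, filehandle.newlines)."""
    line = filehandle.readline()
    return line, filehandle.newlines


# Real file: write raw bytes so the '\r\n' ends up on disk unchanged.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.txt")  # illustrative file name
    with open(path, "wb") as raw:
        raw.write(b"test\r\n")
    with open(path, "r") as f:  # TextIOWrapper, newline defaults to None
        file_line, file_newlines = newlines_after_first_line(f)

# In-memory stream: StringIO, newline defaults to '\n'.
s = StringIO()
s.write("test\r\n")
s.seek(0)
sio_line, sio_newlines = newlines_after_first_line(s)

print(repr(file_line), repr(file_newlines))
print(repr(sio_line), repr(sio_newlines))
```

With the file, the line comes back translated and newlines records what was seen; with the default StringIO, the line comes back untouched and newlines stays None.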
The solution is to construct the StringIO object by passing newline=None, i.e.
f = StringIO(newline=None)
Then the behavior w.r.t. line endings will be similar to TextIOWrapper.
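As a quick check, here is the snippet from the question rerun with newline=None, reusing the getLineEnding(filehandle) function defined there. Treat it as a sketch: the variable names ending and stored are only for illustration.

```python
from io import StringIO


def getLineEnding(filehandle):
    # Same function as in the question: read one line, then report
    # which line endings the stream has seen so far.
    filehandle.readline()
    return filehandle.newlines


f = StringIO(newline=None)  # universal newlines mode with translation
f.write('test\r\n')
f.seek(0)

ending = getLineEnding(f)
stored = f.getvalue()

print(repr(ending))  # the detected line ending
print(repr(stored))  # the text as StringIO holds it after translation
```

Note that in this mode the text held by the StringIO is already translated to '\n', so the detected ending has to be reapplied by hand when writing the output. If the input mixes several kinds of line endings, newlines is a tuple rather than a single string.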
However, if the goal is to have line endings unchanged, one can use newline='' directly to return the line endings untranslated, as explained in the above quote from the docs.
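A minimal sketch of that approach follows. The function process_keeping_endings and the upper-casing step are placeholders invented here to stand in for the real processing. The same newline='' argument can be passed to open() for real files.

```python
from io import StringIO


def process_keeping_endings(filehandle):
    """Hypothetical processing step: upper-case each line's text while
    keeping whatever line ending that line originally had."""
    out = []
    for line in filehandle:
        body = line.rstrip('\r\n')   # text without the line ending
        ending = line[len(body):]    # the original ending, possibly empty
        out.append(body.upper() + ending)
    return ''.join(out)


# newline='' keeps universal line splitting but returns endings untranslated.
src = StringIO('one\r\ntwo\r\n', newline='')
result = process_keeping_endings(src)
seen = src.newlines

print(repr(result))
print(repr(seen))
```

Because each line carries its own ending, this also copes with files that mix '\n' and '\r\n', and there is no need to detect the ending up front.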