public class StreamTestMain{
public static void main(String args[]) throws IOException{
BufferedInputStream bis=new BufferedInputStream(new FileInputStream("C:\\File\\java_StreamTestMain.txt"));
ArrayList<String> arr=new ArrayList<String>();
byte[] bys=new byte[1024];
int len;
while((len=bis.read())!=-1) {
arr.add((new String(bys,0,len)));
}
bis.close();
System.out.print(arr.toString());
}
}
- Why can't I use a stream to add String(byte) values to an ArrayList?
- Why do I need to use a Writer to do that?
CodePudding user response:
You can, certainly, but you've implemented this wrong. In particular, you haven't done anything to make sure bis reads into bys.
CodePudding user response:
Why can't I use a stream to add String(byte) values to an ArrayList?
Because there is a major bug in your code:
while ((len = bis.read()) != -1) {
arr.add((new String(bys, 0, len)));
}
should be:
while ((len = bis.read(bys)) != -1) {
arr.add((new String(bys, 0, len)));
}
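You are using the wrong read method. The no-args read method reads ONE byte and returns it as the result. It doesn't update the bys buffer. (How could it? You didn't pass it as a parameter!) So len ends up holding a byte value between 0 and 255 rather than a count, and bys stays filled with zeros.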
(But read on. The above is not correct ... for many character encodings!)
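Setting encodings aside for a moment, the difference between the two read overloads can be seen with a minimal sketch. It uses an in-memory ByteArrayInputStream instead of the file from the question; the class name ReadOverloads, the helper names, and the sample bytes are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadOverloads {
    // Returns the value of the single byte fetched by the no-arg read().
    static int readOneByte(byte[] data) throws IOException {
        try (InputStream in = new ByteArrayInputStream(data)) {
            return in.read();
        }
    }

    // Returns how many bytes read(byte[]) copied into the supplied buffer.
    static int readIntoBuffer(byte[] data, byte[] buffer) throws IOException {
        try (InputStream in = new ByteArrayInputStream(data)) {
            return in.read(buffer);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {72, 105, 33}; // ASCII for "Hi!"
        byte[] buffer = new byte[1024];
        System.out.println(readOneByte(data));            // 72: the byte's value, not a count
        System.out.println(readIntoBuffer(data, buffer)); // 3: number of bytes now in buffer
    }
}
```

Only the second call touches the buffer, which is why the loop in the question must pass bys to read.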
Why do I need to use a Writer to do that?
Well ... because fixing the above does not fix all of the bugs. There are some more subtle ones, including the following:
When you call new String(bys, 0, len) you are converting the bytes to characters using the JVM's default character encoding. That's not necessarily the correct encoding scheme. It depends on the file that you are reading. If you use the wrong encoding, you are liable to end up with garbage characters.
You are reading in chunks of 1024 bytes and converting to strings. But what happens to the last character in the buffer:
- If the character is represented by one byte you are OK
- If the character is represented by multiple bytes and all of the bytes are in the buffer, you are OK
- If the character is represented by multiple bytes and you only have some of the bytes (because the rest are in the next 1024-byte chunk), then ... BAD THINGS. (With new String you will typically get a couple of garbage replacement characters at the chunk boundary rather than an exception. Try it and see.)
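The boundary problem can be reproduced in memory. The sketch below assumes UTF-8 and shrinks the buffer to 2 bytes so that the two-byte character é straddles two chunks. ChunkBoundaryDemo and decodeInChunks are illustrative names, not part of the question's code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkBoundaryDemo {
    // Decodes each chunk separately, the way the corrected loop does.
    static String decodeInChunks(byte[] data, int chunkSize) throws IOException {
        StringBuilder sb = new StringBuilder();
        byte[] buf = new byte[chunkSize];
        try (InputStream in = new ByteArrayInputStream(data)) {
            int len;
            while ((len = in.read(buf)) != -1) {
                sb.append(new String(buf, 0, len, StandardCharsets.UTF_8));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "a" followed by é is 3 bytes in UTF-8: 0x61, 0xC3, 0xA9.
        byte[] data = "a\u00e9".getBytes(StandardCharsets.UTF_8);
        // Chunks of 2 bytes are [0x61, 0xC3] and [0xA9]: the é is cut in half.
        System.out.println(decodeInChunks(data, 2).equals("a\u00e9"));    // false
        // A buffer large enough for the whole input decodes correctly.
        System.out.println(decodeInChunks(data, 1024).equals("a\u00e9")); // true
    }
}
```

With a real file the same thing happens whenever a multi-byte character happens to sit across a 1024-byte boundary, so the corruption appears only for some inputs.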
Having your data split into a list of ~1024 character strings is not helpful ... for most use-cases.
The first problem may be solved by providing the charset as an extra argument when converting the bytes to a string, or by passing it to the FileReader constructor (the FileReader constructors that take a charset need Java 11 or later).
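As a rough illustration of why the charset argument matters, the sketch below decodes the same bytes twice. It assumes the data is really UTF-8; ExplicitCharset and decode are hypothetical names, and the lengths are printed instead of the text so the result does not depend on the console encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharset {
    // Decodes the first len bytes using the charset the caller names,
    // rather than whatever the JVM default happens to be.
    static String decode(byte[] bytes, int len, Charset charset) {
        return new String(bytes, 0, len, charset);
    }

    public static void main(String[] args) {
        // "café" is 5 bytes in UTF-8 because é takes two bytes.
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        int n = utf8Bytes.length;
        System.out.println(decode(utf8Bytes, n, StandardCharsets.UTF_8).length());      // 4
        System.out.println(decode(utf8Bytes, n, StandardCharsets.ISO_8859_1).length()); // 5: é became two wrong characters
    }
}
```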
The second problem is difficult to solve using the InputStream API, but the Reader API will just handle it for you. This is one reason you should use the Reader API. (Not Writer. You are not writing to the file ...)
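A minimal sketch of the Reader approach, assuming UTF-8 input: an InputStreamReader decodes bytes into characters before handing them over, so even a one-character buffer never sees half a character. ReaderChunks and readAll are illustrative names, and the in-memory stream stands in for the FileInputStream from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class ReaderChunks {
    // Reads everything through a Reader, chunkChars characters at a time.
    static String readAll(byte[] data, int chunkChars) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[chunkChars];
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int len;
            while ((len = reader.read(buf)) != -1) {
                sb.append(buf, 0, len); // buf holds whole chars, never partial bytes
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "a\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(data, 1).equals("a\u00e9")); // true, even with a 1-char buffer
    }
}
```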
The third problem requires you to consider your actual use-case and what you are going to do with the data next. If you use the BufferedReader API you can read the file as lines instead. And there are other, more concise ways to load all lines or all characters in a file; e.g. see the Files.readAll... methods.
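Both options might look roughly like this. The sketch assumes a UTF-8 text file and writes a temporary file so that it can run anywhere; ReadLinesDemo, the helper names, and the sample lines are made up for illustration, and Files.readAllLines is used as one of the concise alternatives.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadLinesDemo {
    // Line by line with BufferedReader: suits files too large to load at once.
    static List<String> readWithBufferedReader(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // One call that loads every line into memory.
    static List<String> readWithFiles(Path path) throws IOException {
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("stream-test", ".txt");
        try {
            Files.write(tmp, Arrays.asList("first line", "second line"), StandardCharsets.UTF_8);
            System.out.println(readWithBufferedReader(tmp)); // [first line, second line]
            System.out.println(readWithFiles(tmp));          // [first line, second line]
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```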
A main method that throws exceptions is sloppy, and your code would leak resources if main were called repeatedly from some other method, because bis.close() is skipped whenever an exception is thrown before it is reached. But if this is "just a test" you can ignore these issues.
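If it is more than a test, one common way to address both points is try-with-resources plus a catch block in main. The sketch below is only an illustration: TrackingStream is a hypothetical helper that records whether close() was called, and countBytes stands in for whatever the real code does with the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CloseDemo {
    // Hypothetical stream that remembers whether close() was called.
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // try-with-resources closes the stream on normal exit and on exceptions.
    static int countBytes(InputStream source) throws IOException {
        try (InputStream in = source) {
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) {
        TrackingStream stream = new TrackingStream(new byte[] {1, 2, 3});
        try {
            System.out.println(countBytes(stream)); // 3
        } catch (IOException e) {
            // Report the failure instead of letting main throw.
            System.err.println("Could not read input: " + e.getMessage());
        }
        System.out.println(stream.closed); // true
    }
}
```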