How to read a File character-by-character in reverse without running out-of-memory?

Time:07-02


The Story


I've been having a problem lately...

I have to read a file in reverse character by character without running out of memory.

I can't read it line by line and reverse it with StringBuilder, because it's a one-line file that can be up to a gigabyte (GB) in size.

Holding it all at once would take up too much of the JVM's (and the system's) memory.

I've decided to just read it character by character from end-to-start (back-to-front) so that I could process as much as I can without consuming too much memory.


What I've Tried


I know how to read a file in one go:

(MappedByteBuffer + FileChannel + Charset, which gave me an OutOfMemoryError)

and how to read a file character by character with UTF-8 support

(FileInputStream + InputStreamReader).

The problem is that FileInputStream's #read() only calls #read0() which is a native method!

Because of that I have no idea about the underlying code...

Which is why I'm here today (or at least until this is done)!

CodePudding user response:

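This will do it, but as written it is not very efficient: each pass skips forward from the start of the stream, and mark(size) makes BufferedInputStream keep the bytes read since the mark in memory so that reset() can return to them. The buffer can therefore grow toward the size of the file.

  • Skip to one position before the last location read, then read and print the character.
  • Then reset the location to the mark, reduce size, and continue.
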
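import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

File f = new File("Some File name");
int size = (int) f.length(); // the int cast limits this to files under 2 GB
int bsize = 1;
byte[] buf = new byte[bsize];
try (BufferedInputStream b =
        new BufferedInputStream(new FileInputStream(f))) {
    while (size > 0) {
        // remember the start of the stream so reset() can return to it
        b.mark(size);
        // skip() may skip fewer bytes than requested, so loop until done
        long toSkip = size - bsize;
        while (toSkip > 0) {
            long skipped = b.skip(toSkip);
            if (skipped <= 0) {
                throw new IOException("unexpected end of file while skipping");
            }
            toSkip -= skipped;
        }
        int k = b.read(buf);
        if (k <= 0) {
            break;
        }
        // the cast treats each byte as one character (single-byte text only)
        System.out.print((char) buf[0]);
        size -= k;
        b.reset();
    }
} catch (IOException ioe) {
    ioe.printStackTrace();
}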

This could be improved by increasing the buffer size and making equivalent adjustments in the mark and skip arguments.

CodePudding user response:

Note: this doesn't support multi-byte UTF-8 characters.

Using a RandomAccessFile you can easily read a file in chunks from the end to the beginning, and reverse each of the chunks.

Here's a simple example:

import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.stream.IntStream;

class Test {
    private static final int BUF_SIZE = 10;
    private static final int FILE_LINE_COUNT = 105;

    public static void main(String[] args) throws Exception {
        // create a large file
        try (FileWriter fw = new FileWriter("largeFile.txt")) {
            IntStream.range(1, FILE_LINE_COUNT).mapToObj(Integer::toString).forEach(s -> {
                try {
                    fw.write(s + "\n");
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
        }
        // reverse the file
        try (RandomAccessFile raf = new RandomAccessFile("largeFile.txt", "r")) {
            long size = raf.length();
            byte[] buf = new byte[BUF_SIZE];

            for (long i = size - BUF_SIZE; i > -BUF_SIZE; i -= BUF_SIZE) {
                long offset = Math.max(0, i);
                long readSize = Math.min(i + BUF_SIZE, BUF_SIZE);
                raf.seek(offset);
                // readFully reads exactly readSize bytes; plain read() may return fewer
                raf.readFully(buf, 0, (int) readSize);
                for (int j = (int) readSize - 1; j >= 0; j--) {
                    System.out.print((char) buf[j]);
                }
            }
        }
    }
}

This uses a very small file and very small chunks so that you can test it easily. Increase those constants to see it work on a larger scale.

The input file contains newlines to make it easy to read the output, but the reversal doesn't depend on the file "having lines".
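
The example above prints one byte at a time, so multi-byte UTF-8 characters come out scrambled. Here is a minimal sketch of one way to extend the same chunked RandomAccessFile idea to whole characters. It assumes the file is valid UTF-8. The class name ReverseUtf8Reader, the method forEachCodePointReversed, and the chunkSize parameter are made up for illustration and are not from the answers above. The idea is that UTF-8 continuation bytes always look like 10xxxxxx, so any continuation bytes at the front of a chunk are carried over and appended to the next (earlier) chunk, where their lead byte lives.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.function.IntConsumer;

// Hypothetical helper for illustration; assumes the input is valid UTF-8.
public class ReverseUtf8Reader {

    // UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    private static boolean isContinuation(byte b) {
        return (b & 0xC0) == 0x80;
    }

    // Passes each Unicode code point of the file to sink, last to first.
    // Memory use is bounded by chunkSize plus a few carried-over bytes.
    public static void forEachCodePointReversed(Path file, int chunkSize, IntConsumer sink)
            throws IOException {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long end = raf.length();     // exclusive end of the region not yet read
            byte[] carry = new byte[0];  // continuation bytes waiting for their lead byte
            while (end > 0) {
                int n = (int) Math.min(chunkSize, end);
                long start = end - n;
                // Layout: [n bytes read from the file][bytes carried from the later chunk]
                byte[] buf = new byte[n + carry.length];
                raf.seek(start);
                raf.readFully(buf, 0, n);
                System.arraycopy(carry, 0, buf, n, carry.length);

                // Leading continuation bytes belong to a character that starts
                // before this chunk, so they are deferred (unless this is the file start).
                int first = 0;
                while (start > 0 && first < buf.length && isContinuation(buf[first])) {
                    first++;
                }

                // Walk backwards, one complete character at a time.
                int i = buf.length;
                while (i > first) {
                    int j = i - 1;
                    while (j > first && isContinuation(buf[j])) {
                        j--;
                    }
                    String s = new String(buf, j, i - j, StandardCharsets.UTF_8);
                    sink.accept(s.codePointAt(0));
                    i = j;
                }
                carry = Arrays.copyOfRange(buf, 0, first);
                end = start;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReverseUtf8Reader <file>");
            return;
        }
        forEachCodePointReversed(Paths.get(args[0]), 8192,
                cp -> System.out.print(Character.toChars(cp)));
    }
}
```

Decoding each character through a temporary String keeps the sketch short but allocates a lot; for a gigabyte-sized file you would probably decode the bytes by hand or hand the code points to a buffered writer. Note that this reverses code points, not user-perceived characters, so combining marks end up in front of their base character.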
