I am running a long-running operation, say 100k jobs, and I want to write its progress to a file each time another 100 jobs have completed.
I open the file using a BufferedWriter with append mode disabled, write to it, and then close it. This happens once for every 100 completed jobs, so the file is opened and closed 1000 times. Can I optimise this further by opening and closing the file only once?
public static void writeMetaData(String writeDir, JSONObject jsonObject) throws Exception {
String filePath = writeDir.concat("/").concat("metadata.txt");
BufferedWriter metaDataWriter = Files.newBufferedWriter(Paths.get(filePath), StandardCharsets.UTF_8, StandardOpenOption.TRUNCATE_EXISTING);
metaDataWriter.write(jsonObject.toString());
IOUtils.closeQuietly(metaDataWriter);
}
for(int i = 0; i < 100000; i++) {
// do Something;
if(i % 100 == 0) {
writeMetaData(writeDir, jsonObject);
}
}
File should only have a single line.
Expected file content after 100 jobs:
progress: 100
Expected file content after 200 jobs:
progress: 200
Can this be optimised further?
CodePudding user response:
A stream does not let you go back and rewrite content. One way to achieve what you want is to use a RandomAccessFile. Its setLength() method truncates the file if you pass 0.
Here is a simple example:
import java.io.*;
public class Test
{
public static void updateFile(RandomAccessFile raf, String content) throws IOException
{
raf.setLength(0);
raf.write(content.getBytes("UTF-8"));
}
public static void main(String[] args) throws IOException
{
try(RandomAccessFile raf = new RandomAccessFile("metadata.txt", "rw"))
{
updateFile(raf, "progress: 100");
updateFile(raf, "progress: 200");
}
}
}
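Applied to the loop from the question, the same idea might look like the sketch below. The class name ProgressFile, the method runJobs, its parameters, and the "progress: N" payload are assumptions made for illustration; the real job body and JSON content would take their place.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class ProgressFile {
    // Replaces the whole file content; the file stays open between calls.
    public static void updateFile(RandomAccessFile raf, String content) throws IOException {
        raf.setLength(0); // truncates the file and moves the file pointer back to 0
        // RandomAccessFile is unbuffered, so write the encoded bytes in a single call.
        raf.write(content.getBytes(StandardCharsets.UTF_8));
    }

    // Hypothetical driver: runs `jobs` jobs and records progress every `interval` jobs.
    public static void runJobs(String path, int jobs, int interval) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            for (int i = 1; i <= jobs; i++) {
                // do something
                if (i % interval == 0) {
                    updateFile(raf, "progress: " + i);
                }
            }
        } // the file is opened and closed exactly once
    }

    public static void main(String[] args) throws IOException {
        runJobs("metadata.txt", 100_000, 100);
    }
}
```

Counting from 1 here makes the first update happen after 100 completed jobs rather than after the first one.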
CodePudding user response:
First of all, an expression like writeDir.concat("/").concat("metadata.txt") reduces both readability and performance. A straightforward writeDir + "/" + "metadata.txt" will perform better. But since you’re constructing the string merely to construct a Path, it’s even more straightforward not to do the Path’s job in your code and to use Paths.get(writeDir, "metadata.txt") instead.
You cannot rewind a BufferedWriter, but you can rewind a FileChannel. Therefore, to keep the channel open and rewind it when needed, you have to construct a new writer after each rewind:
public static void writeMetaData(FileChannel ch, JSONObject jsonObj) throws IOException {
ch.position(0);
if(ch.size() > 0) ch.truncate(0);
Writer w = Channels.newWriter(ch, StandardCharsets.UTF_8.newEncoder(), 8192);
w.write(jsonObj.toString());
w.flush();
}
try(FileChannel ch = FileChannel.open(Paths.get(writeDir, "metadata.txt"),
StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
for(int i = 0; i < 100000; i++) {
// do Something;
if(i % 100 == 0) {
writeMetaData(ch, jsonObject);
}
}
}
It’s important that each use of the Writer ends with flush(), to force all buffered data to be written, but not with close(), as that would also close the underlying channel. Note that this code does not wrap the writer in a BufferedWriter; encoding text as UTF-8 is already a buffered operation, and by requesting a larger buffer for the encoder, matching BufferedWriter’s default buffer size, we get the same buffering effect without the copying overhead.
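One way to convince yourself of the flush-versus-close point is a small self-contained check like the following. It is only a sketch: the class ChannelRewrite and the methods rewrite and demo are hypothetical names, and a plain string stands in for the JSON object.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChannelRewrite {
    // Rewinds the channel, drops the old content and writes the new content.
    public static void rewrite(FileChannel ch, String content) throws IOException {
        ch.position(0);
        ch.truncate(0);
        Writer w = Channels.newWriter(ch, StandardCharsets.UTF_8.newEncoder(), 8192);
        w.write(content);
        w.flush(); // pushes the encoded bytes to the channel but leaves it open
    }

    // Rewrites the file twice through one channel and returns what ends up on disk.
    public static String demo(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            rewrite(ch, "progress: 100");
            rewrite(ch, "progress: 200"); // works only because the channel is still open
            if (!ch.isOpen()) {
                throw new IllegalStateException("channel was closed unexpectedly");
            }
        }
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("metadata", ".txt");
        System.out.println(demo(file));
    }
}
```

Replacing w.flush() with w.close() in rewrite should make the second call fail, since the channel would already be closed.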
Since writing is not an end in itself, there’s a question left regarding your reading side. If the reader tries to read the data at some interval, there is a risk that a read overlaps with a write and gets incomplete data.
You could use
public static void writeMetaData(FileChannel ch, JSONObject jsonObj) throws IOException {
try(FileLock lock = ch.lock()) {
ch.position(0);
if(ch.size() > 0) ch.truncate(0);
Writer w = Channels.newWriter(ch, StandardCharsets.UTF_8.newEncoder(), 8192);
w.write(jsonObj.toString());
w.flush();
}
}
to lock the file during the write. But depending on the system, file locking might be advisory rather than mandatory, in which case it only affects readers that also try to acquire a lock.
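On the reading side, a counterpart might take a shared lock before reading, so that it waits while a writer in another process holds its exclusive lock. The sketch below assumes a hypothetical class MetaDataReader with a method readMetaData. It only helps if the writer locks as well, and where locks are advisory it does nothing for readers that skip the lock.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MetaDataReader {
    // Reads the whole file as UTF-8 while holding a shared lock on it.
    public static String readMetaData(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ);
             // shared = true: several readers may hold the lock at the same time
             FileLock lock = ch.lock(0L, Long.MAX_VALUE, true)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep reading until the buffer is full or the end of the file is reached
            }
            buf.flip();
            return StandardCharsets.UTF_8.decode(buf).toString();
        }
    }
}
```

File locks are held on behalf of the whole JVM, so this is meant for a reader running in a different process than the writer; requesting overlapping locks on the same file within one JVM throws an OverlappingFileLockException.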
When you use JDK 11 or newer, you may consider using
for(int i = 0; i < 100000; i++) {
// do Something;
if(i % 100 == 0) {
Files.writeString(Paths.get(writeDir, "metadata.txt"), jsonObject.toString());
}
}
which clearly wins on simplicity (yes, that’s the complete code, no additional method required). The default options already include the desired StandardCharsets.UTF_8 and StandardOpenOption.TRUNCATE_EXISTING.
While it does open and close the file internally, it has some other performance tweaks which may compensate, especially in the likely case that the string consists of ASCII characters only, as the implementation will then simply write the string’s internal array directly to the file.
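The truncating default can be checked with a short experiment such as this one, which requires JDK 11 or newer. The class WriteStringDemo and the method demo are hypothetical names, and plain strings stand in for jsonObject.toString().

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class WriteStringDemo {
    // Writes a long value, then a shorter one, and returns the final file content.
    public static String demo(Path dir) throws IOException {
        Path file = dir.resolve("metadata.txt");
        Files.writeString(file, "progress: 100000"); // creates the file if it is missing
        Files.writeString(file, "progress: 100");    // truncates first, so no stale tail remains
        return Files.readString(file);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo(Files.createTempDirectory("meta")));
    }
}
```

If the second call did not truncate, the file would still end with the leftover characters of the longer first value.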