I have a Word document (docx); I want to make changes to that document and save the result as another file, leaving the original in place. I have the following code illustrating my current problem:
package sandbox.word.doccopy;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class CopyTest
{
    public static void main(String[] args) throws Exception
    {
        String sourceFilename = "CopyTestSource.docx";
        String destinationFilename = "CopyTestResult.docx";
        CopyTest docCopy = new CopyTest();
        docCopy.copyTesting(sourceFilename, destinationFilename);
        System.out.println("done");
    }

    public void copyTesting(String source, String destination)
            throws IOException, InvalidFormatException
    {
        XWPFDocument doc = new XWPFDocument(OPCPackage.open(source));
        // for each paragraph that has runs,
        // put an exclamation at the end of the first run.
        for (XWPFParagraph par : doc.getParagraphs())
        {
            List<XWPFRun> runs = par.getRuns();
            if (runs.size() > 0)
            {
                XWPFRun run = runs.get(0);
                String text = run.getText(0);
                text = text + "!";
                run.setText(text, 0);
            }
        }
        // FileOutputStream fos = new FileOutputStream(destination);
        // doc.write(fos);
        // fos.close();
        doc.close();
    }
}
There are three ways I've run this, changing commented lines at the bottom of the class file. As you see, there are three lines that create a file output stream with the destination filename, write to it, and close it, and one line that just closes the current document.
If I comment out the 3 lines and leave the 1 line, no changes are written to the current document (and, of course, the copy document is not created).
If I leave all 4 lines uncommented, the copy document is created with changes, and the changes are also written to the source document.
If I comment out the 4th line, I get a destination document with changes, and the source document is left unchanged.
The last one is what I want, and I can write my code to do that. But I would expect that closing the document after it is changed would either change the file or not change it, and that the outcome wouldn't depend on whether I had written the changes to another file.
Can anyone shed any light on this?
Answer:
The culprit is this line: XWPFDocument doc = new XWPFDocument(OPCPackage.open(source)); and specifically this call: OPCPackage.open(source).
The static method OPCPackage.open(java.lang.String path) opens the OPCPackage from the file at the given path with read/write permission. In addition, the package stays directly connected to that underlying file. This saves some memory but has disadvantages too, as you will see now.
All changes made through the XWPFDocument are made in that OPCPackage, but at first only in memory.
Calling doc.write, which calls POIXMLDocument.write(java.io.OutputStream stream), first updates the underlying OPCPackage with those changes. Then the updated OPCPackage is saved to the destination document through the given OutputStream. So without a call to doc.write, the changes never reach any file; they stay in memory only.
When doc.close() is called, OPCPackage.close is called too. This closes the open, writable package and saves its content. And since the OPCPackage is directly connected to the underlying file, it saves the content into that file. That's why the changes are also written to the source document.
This should explain your observations.
XWPFDocument also provides the constructor XWPFDocument(java.io.InputStream is). Internally this calls OPCPackage.open(java.io.InputStream in), which opens the OPCPackage from the InputStream. The OPCPackage is then held in memory only and is independent of the source file. That uses more memory, since the whole OPCPackage needs to be in memory, but OPCPackage.close will not lead to changes in the source file.
So what I would do is:
...
// needs: import java.io.FileInputStream;
XWPFDocument doc = new XWPFDocument(new FileInputStream(source));
...
FileOutputStream fos = new FileOutputStream(destination);
doc.write(fos);
fos.close();
doc.close();
...
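
Put together, a complete version of the stream-based approach might look like the following. Treat it as a sketch rather than a definitive implementation: the class name StreamCopyTest and the method copyWithChanges are names I made up for illustration, it uses java.nio.file.Files streams in place of FileInputStream/FileOutputStream, and it assumes a POI version in which XWPFDocument is Closeable so that try-with-resources works. It also guards against run.getText(0) returning null for a run that has no text.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

// Hypothetical class name, chosen for this example only.
class StreamCopyTest {

    // Appends "!" to the first run of every paragraph that has runs and
    // writes the result to destination. The source file is only read.
    static void copyWithChanges(Path source, Path destination) throws IOException {
        // Loading through an InputStream gives an in-memory package
        // that is not connected to the source file.
        try (InputStream in = Files.newInputStream(source);
             XWPFDocument doc = new XWPFDocument(in)) {

            for (XWPFParagraph par : doc.getParagraphs()) {
                List<XWPFRun> runs = par.getRuns();
                if (!runs.isEmpty()) {
                    XWPFRun run = runs.get(0);
                    String text = run.getText(0);
                    // getText can return null when the run holds no text.
                    run.setText((text == null ? "" : text) + "!", 0);
                }
            }

            // Write the changed document to the destination file only.
            try (OutputStream out = Files.newOutputStream(destination)) {
                doc.write(out);
            }
        } // doc.close() runs here; there is no backing file to save into.
    }

    public static void main(String[] args) throws IOException {
        copyWithChanges(Paths.get("CopyTestSource.docx"),
                        Paths.get("CopyTestResult.docx"));
        System.out.println("done");
    }
}
```

With this variant all four steps (open, write, close the stream, close the document) can stay in place, and the source file should remain untouched.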