I am writing a simple script for my job that removes the comment author from .docx files. At the moment I do this manually by editing the comments.xml file in the word\ folder inside the .docx archive. I thought about a possible algorithm and came up with the following solution:
1. Extract the comments.xml file from the .docx.
2. Find and replace the text inside comments.xml.
3. Update the .docx with the adjusted comments.xml.
Steps 1 and 3 are not hard; they are simple commands. For example, the extraction step can be done like this:
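@ECHO OFF
REM Path to the WinRAR executable (quoted so the space in the path is safe)
SET "Winrar=C:\Program Files\WinRAR\WinRAR.exe"
REM Extract word\comments.xml from every .docx in the current folder
FOR %%I IN (*.docx) DO (
"%WinRAR%" e "%%I" word\comments.xml
)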
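For comparison, the same extraction step could be sketched in Python with the standard zipfile module, since a .docx file is an ordinary ZIP archive. The function name extract_comments and the choice of output folder are assumptions made for this illustration, not part of the script above.

```python
import zipfile
from pathlib import Path


def extract_comments(docx_path, out_dir):
    """Extract word/comments.xml from one .docx file.

    Returns the path of the extracted file, or None when the document
    has no comments part. (Hypothetical helper for illustration.)
    """
    member = "word/comments.xml"  # ZIP member names use forward slashes
    with zipfile.ZipFile(docx_path) as archive:
        if member not in archive.namelist():
            return None
        # extract() recreates the word/ subfolder below out_dir
        return Path(archive.extract(member, path=out_dir))
```

Calling it for every path returned by Path(".").glob("*.docx") would mirror the FOR loop above.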
I planned to use the Find and Replace (fnr) tool to edit the extracted file:
"C:\Comments\fnr.exe" --cl --dir "" --fileMask "*.xml" --find "" --replace ""
My problem is that the author information differs from file to file, so I still have to type in the exact text to be replaced each time. The relevant line in comments.xml generally looks like this:
<w:comment w:id="0" w:author="Author_Name" w:date="YYYY-MM-DDTHH:MM:SS" w:initials="AN">
and what I do manually is change w:author="Author_Name" to w:author="". Is there a way to apply a filter so that this edit is done automatically within the script? Thanks in advance!
Edit: As a temporary solution, I currently rename w:author to w:noauthor (and likewise w:date and w:initials). LibreOffice and Word then cannot read the attributes and show "No author" under the comment. The information is still present in comments.xml, however, so I would be really grateful for advice on how to remove it within the same script. The current script looks like this:
@ECHO OFF
SET Winrar=C:\Program Files\WinRAR\WinRAR.exe
SET fnr=C:\Comments\fnr.exe
FOR %%I IN (*.docx) DO (
"%WinRAR%" x "%%I" word\comments.xml
"%fnr%" --cl --dir "C:\Comments\word" --fileMask "*.xml" --find "w:author=" --replace "w:noauthor="
"%fnr%" --cl --dir "C:\Comments\word" --fileMask "*.xml" --find "w:date=" --replace "w:nodate="
"%fnr%" --cl --dir "C:\Comments\word" --fileMask "*.xml" --find "w:initials=" --replace "w:noinitials="
"%WinRAR%" u "%%I" word\
del C:\Comments\word\ /q
)
Answer:
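The final solution uses fnr's --useRegex option, so each attribute is matched whatever value it holds and is replaced with an empty value: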
@ECHO OFF
SET Winrar=C:\Program Files\WinRAR\WinRAR.exe
SET fnr=C:\Comments\fnr.exe
FOR %%I IN (*.docx) DO (
"%WinRAR%" x "%%I" word\comments.xml
"%fnr%" --cl --dir "C:\Comments\word" --fileMask "*.xml" --useRegex --find "w:author=\".*?\"" --replace "w:author=\"\""
"%fnr%" --cl --dir "C:\Comments\word" --fileMask "*.xml" --useRegex --find "w:date=\".*?\"" --replace "w:date=\"\""
"%fnr%" --cl --dir "C:\Comments\word" --fileMask "*.xml" --useRegex --find "w:initials=\".*?\"" --replace "w:initials=\"\""
"%WinRAR%" u "%%I" word\
del C:\Comments\word\ /q
)
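The same three steps can also be sketched without WinRAR or fnr. The following is only a minimal Python illustration, not the method used above. The names blank_comment_attributes and anonymize_docx are hypothetical, and the code assumes that comments.xml is UTF-8 encoded. It matches the quoted value with [^"]* rather than a lazy wildcard, which has the same effect on attribute values.

```python
import os
import re
import tempfile
import zipfile
from pathlib import Path

# Attribute name followed by any quoted value (assumes double-quoted attributes).
ATTR_VALUE = re.compile(r'(w:(?:author|date|initials)=")[^"]*(")')


def blank_comment_attributes(xml_text):
    """Return xml_text with the w:author, w:date and w:initials values emptied."""
    return ATTR_VALUE.sub(r"\1\2", xml_text)


def anonymize_docx(docx_path, member="word/comments.xml"):
    """Rewrite docx_path in place with the comment metadata blanked."""
    docx_path = Path(docx_path)
    # zipfile cannot edit a member in place, so copy every entry into a
    # temporary archive and swap it in once writing has finished.
    fd, tmp_name = tempfile.mkstemp(suffix=".docx", dir=docx_path.parent)
    os.close(fd)
    with zipfile.ZipFile(docx_path) as src, \
            zipfile.ZipFile(tmp_name, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == member:
                # Assumption: the comments part is UTF-8 encoded.
                text = data.decode("utf-8")
                data = blank_comment_attributes(text).encode("utf-8")
            dst.writestr(item, data)
    os.replace(tmp_name, docx_path)  # overwrites the original file


sample = '<w:comment w:id="0" w:author="Author_Name" w:date="YYYY-MM-DDTHH:MM:SS" w:initials="AN">'
print(blank_comment_attributes(sample))
# prints: <w:comment w:id="0" w:author="" w:date="" w:initials="">
```

Calling anonymize_docx on each .docx in a folder would correspond to the FOR loop in the batch script. It is safer to try this on copies first, because the original file is replaced.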