I'm trying to access a table in a .docx file with PowerShell. I tried using PSWriteWord to get the contents, but it only shows me the whole unformatted 'background code' like this (see the end of this post).
Is there a way I can access the table contents so that I can use them to fill a different table?
I'm trying to convert a journal I get from a tutor into the format I use in my own journal, that is, paste the contents of one table into another, but automated.
Import-Module PSWriteWord
Get-WordTable -WordDocument (Get-WordDocument C:\Users\Administrator\Desktop\Journal_VU6-FIAE_KW11_14032022-20032022.docx) -TableID 0
Paragraphs : {Normal, Normal, Normal, Normal...}
Pictures : {}
Hyperlinks : {}
RowCount : 20
ColumnCount : 5
Rows : {Xceed.Document.NET.Row, Xceed.Document.NET.Row, Xceed.Document.NET.Row, Xceed.Document.NET.Row...}
Alignment : left
AutoFit : Fixed
Design : None
Index : 224
CustomTableDesignName :
TableCaption :
TableDescription :
TableLook : Xceed.Document.NET.TableLook
ColumnWidths : {1346, 709, 4851, 1669...}
Xml : <w:tbl xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:tblPr>
<w:tblW w:w="9568" w:type="dxa" />
<w:tblBorders>
<w:top w:val="single" w:sz="4" w:space="0" w:color="auto" />
<w:left w:val="single" w:sz="4" w:space="0" w:color="auto" />
<w:bottom w:val="single" w:sz="4" w:space="0" w:color="auto" />
<w:right w:val="single" w:sz="4" w:space="0" w:color="auto" />
<w:insideH w:val="single" w:sz="4" w:space="0" w:color="auto" />
<w:insideV w:val="single" w:sz="4" w:space="0" w:color="auto" />
</w:tblBorders>
<w:tblLayout w:type="fixed" />
<w:tblCellMar>
<w:left w:w="70" w:type="dxa" />
<w:right w:w="70" w:type="dxa" />
</w:tblCellMar>
<w:tblLook w:val="0000" w:firstRow="0" w:lastRow="0" w:firstColumn="0" w:lastColumn="0" w:noHBand="0" w:noVBand="0" />
</w:tblPr>
<w:tblGrid>
<w:gridCol w:w="1346" />
<w:gridCol w:w="709" />
<w:gridCol w:w="4851" />
<w:gridCol w:w="1669" />
<w:gridCol w:w="993" />
</w:tblGrid>
<w:tr w:rsidR="00B11417" w:rsidRPr="00761CEC" w:rsidTr="009C5764">
<w:trPr>
<w:cantSplit />
<w:trHeight w:hRule="exact" w:val="600" />
</w:trPr>
<w:tc>
<w:tcPr>
<w:tcW w:w="1346" w:type="dxa" />
<w:vMerge w:val="restart" />
<w:vAlign w:val="center" />
</w:tcPr>
<w:p w:rsidR="00B11417" w:rsidRPr="00DB574D" w:rsidRDefault="00B11417">
<w:pPr>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial" />
<w:sz w:val="22" />
<w:szCs w:val="22" />
</w:rPr>
</w:pPr>
<w:r w:rsidRPr="00DB574D">
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial" />
<w:sz w:val="22" />
<w:szCs w:val="22" />
</w:rPr>
<w:t>Montag</w:t>
</w:r>
</w:p>
</w:tc>
<w:tc>
<w:tcPr>
<w:tcW w:w="709" w:type="dxa" />
<w:vAlign w:val="center" />
</w:tcPr>
<w:p w:rsidR="00B11417" w:rsidRPr="00761CEC" w:rsidRDefault="00B11417">
<w:pPr>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial" />
</w:rPr>
</w:pPr>
<w:r w:rsidRPr="00761CEC">
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial" />
</w:rPr>
<w:t>1.</w:t>
</w:r>
<w:r w:rsidR="00252AB3" w:rsidRPr="00761CEC">
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial" />
</w:rPr>
<w:t>/2.</w:t>
</w:r>
<w:r w:rsidRPr="00761CEC">
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial" />
</w:rPr>
<w:t xml:space="preserve"> UE</w:t>
</w:r>
</w:p>
</w:tc>
<w:tc>
<w:tcPr>
<w:tcW w:w="4851" w:type="dxa" />
<w:vAlign w:val="center" />
</w:tcPr>
<w:p w:rsidR="00A93021" w:rsidRDefault="00A93021" w:rsidP="00A93021">
<w:pPr>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial" />
<w:sz w:val="18" />
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial" />
<w:sz w:val="18" />
</w:rPr>
<w:t>Begrüßung / Vorstellung</w:t>
</w:r>
</w:p>
<w:p w:rsidR="00B11417" w:rsidRPr="00761CEC" w:rsidRDefault="00807D31" w:rsidP="00A93021">
<w:pPr>
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial" />
</w:rPr>
</w:pPr>
<w:r w:rsidRPr="00A93021">
<w:rPr>
<w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:cs="Arial" />
<w:sz w:val="18" />
</w:rPr>
<w:t>Strom, Stromrichtung, Reihenschaltung, Parallelschaltung</w:t>
</w:r>
</w:p>
</w:tc>
<w:tc> ....
CodePudding user response:
I don't have the PSWriteWord module, but you could use the Word.Application COM object for this if you have Word installed.
Read the data from the table and create an in-memory CSV from it:
$filename = 'C:\Users\Administrator\Desktop\Journal_VU6-FIAE_KW11_14032022-20032022.docx'
$objWord = New-Object -ComObject Word.Application
$objWord.Visible = $false
$objDocument = $objWord.Documents.Open($filename)
$objTable = $objDocument.Tables.Item(1)
$maxCols = $objTable.Columns.Count
$maxRows = $objTable.Rows.Count
# read the data from each cell and join the cell values with a comma to create CSV format
$data = for ($row = 1; $row -le $maxRows; $row++) {
    $line = for ($col = 1; $col -le $maxCols; $col++) {
        # remove the `0D07` control characters Word adds and double all quotes in the field if applicable
        '"{0}"' -f ($objTable.Cell($row,$col).Range.Text -replace '\p{Cc}+' -replace '"', '""')
    }
    $line -join ','
}
$objDocument.Close()
$objWord.Quit()
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objDocument)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objTable)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objWord)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
# finally convert the in-memory CSV data into an array of objects
$result = $data | ConvertFrom-Csv
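To illustrate what $result looks like after the conversion, here is a minimal sketch that runs without Word. The sample lines stand in for $data; the header names Tag, UE and Thema and the second data row are assumptions made up for this example, not values read from the actual document. ConvertFrom-Csv takes the first line as the header, so this only works as shown if the first table row contains unique, non-empty column titles.

```powershell
# Sample CSV lines standing in for $data (header names and second row are hypothetical)
$data = @(
    '"Tag","UE","Thema"'
    '"Montag","1./2. UE","Strom, Stromrichtung"'
    '"Dienstag","1./2. UE","Reihenschaltung"'
)

# The first line becomes the property names of the resulting objects
$result = $data | ConvertFrom-Csv

# Each row is now an object whose fields can be addressed by column name
foreach ($entry in $result) {
    '{0} | {1} | {2}' -f $entry.Tag, $entry.UE, $entry.Thema
}
```

Because every field is quoted, a comma inside a cell (as in the first data row) stays part of that field instead of splitting it.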
The regex Unicode category \p{Cc} means a character with the Unicode property “control” (an ASCII 0x00..0x1F or Latin-1 0x80..0x9F control character).
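The effect of the two -replace operations can be checked on a plain string, without opening Word. This is only a sketch: the variable names and the sample text are assumptions, and the trailing carriage return (0x0D) plus bell (0x07) simulate the end-of-cell marker Word appends to Range.Text.

```powershell
# Simulated cell text: some content followed by the 0x0D 0x07 end-of-cell marker
$cellText = 'He said "hi"' + [char]0x0D + [char]0x07

# Strip runs of control characters, then double embedded quotes for CSV
$clean = $cellText -replace '\p{Cc}+' -replace '"', '""'

# Wrap the cleaned value in quotes, as done for every cell in the loop above
$field = '"{0}"' -f $clean
$field
```

Note that \p{Cc}+ also removes the paragraph breaks inside a cell, so a cell containing several paragraphs ends up as one run of text.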