When converting a table from PDF to word, I ended up with a format similar to the following:
([space] is a space character)
Text [space.spacing 10pts] Text [space.spacing 30pts] Text
Text [space.spacing 14pts] Text [space.spacing 31pts] Text
Text [space.spacing 12pts] Text [space.spacing 33pts] Text
Instead of a regular table with 3 columns and 3 rows containing each « Text » such as below
Text | Text | Text |
---|---|---|
Text | Text | Text |
Text | Text | Text |
In other words, instead of creating a column, the PDF conversion has created a regular paragraph, mimicking columns by adjusting [spaces].spacing according to the length of the text within the column.
So my inital thought was that it should be possible to recreate a table by identifing the spacing of each space for each paragraph of the converted table, eventually replacing them with identifiable symbols so I can convert the text into a table later on.
My idea was somewhat the following :
' For each paragraph of the selected text (which is the converted table)
' Find all [space] within the paragraph range
' If a [space] is found, check its spacing
' 1st case : [space].spacing is <= 1 pts (so a normal space)
' Do nothing
' 2nd case : [space].spacing is >= 10 pts (so previous Text is supposed to be within a small column)
' insert ££ (symbol for small column)
' 3rd case [space].spacing is >= 30 pts (so previous Text is supposed to be within a small column)
' insert §§ (symbol for large column)
' Once all [space] are found within the current paragraph, do the same with the next paragraph, until the last paragraph of the selected text
My current code is the following :
Private Sub Test()
Dim RngSearch As Range
Dim RngCurrent As Range
Dim Paragraph As Paragraph
For Each Paragraph In ActiveDocument.Paragraphs
Set RngCurrent = Paragraph.Range
RngCurrent.Select 'For testing purposes
With RngCurrent.Find
.Text = " "
Do While RngCurrent.Find.Execute
RngCurrent.Select 'For testing purposes
Select Case RngCurrent.Font.Spacing
Case Is >= 30
RngCurrent.Font.Spacing = 1
RngCurrent.InsertAfter ("§§")
Case Is >= 10
RngCurrent.Font.Spacing = 1
RngCurrent.InsertAfter ("¤")
Case Else
' Do Nothing
End Select
Loop
End With
Next Paragraph
End Sub
So it kinda word with one issue : it loops infinitely. Each time the text is finished, it goes back again indefinitely.
I managed to track the issue to the following code :
With RngCurrent.Find
.Text = " "
Do While RngCurrent.Find.Execute
RngCurrent.Select
' Use Case function
Loop
End With
Without it, the looping through paragraphs works normally (it ends at the last paragraph)
For Each Paragraph In ActiveDocument.Paragraphs
Set RngCurrent = Paragraph.Range
RngCurrent.Select
' Code here
Next Paragraph
But once .find.text (" ")
is injected, it actually doesn't look within each Paragraphs.Range anymore as I supposed Do While RngCurrent.Find.Execute
should have established.
I feel like the solution is something very stupid, but I've been searching for the reason why or alternatives for 2 days now. Everytime, it stops acting as per my understading when I'm using .find(" ")
.
I already tried using .wrap = wdFindStop
, but it stops at the first match within the paragraph, and goes to the next paragraph prematurely.
With RngCurrent.Find
.Text = " "
.wrap = wdFindStop
Do While RngCurrent.Find.Execute
RngCurrent.Select
' Use Case function
Loop
End With
Strangely .wrap = wdFindAsk
doesn't ask me anything... maybe that means something.
I believe it's because there are always spaces within each paragraph ? So it can loops indefinitely?
CodePudding user response:
You're way over-complicating things:
Sub MakeTable()
Application.ScreenUpdating = False
Dim i As Single
With Selection
i = .Characters.First.Font.Size
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Wrap = wdFindStop
.Text = " "
.Replacement.Text = "^t"
.Replacement.Font.Size = i
.Font.Size = 10
.Execute Replace:=wdReplaceAll
.Font.Size = 30
.Execute Replace:=wdReplaceAll
End With
.ConvertToTable Separator:=vbTab
End With
Application.ScreenUpdating = True
End Sub
CodePudding user response:
So I finally found not exactly a solution but a workaround for anyone who may need a similar solution. Instead of using a .find =" ", I decided to go the "hard" path and check for every word in a paragraph (which in MS Word, seems to end with a [space] character). Then, I check for the last character of a word (which is often a space) if its spacing is superior to a value. It the case, do something.
For Each RngWord In Paragraph.Range.Words
Set RngChar = RngWord.Characters.Last
Select Case RngChar.Font.Spacing
Case Is > 300
RngChar.Font.Spacing = 1
RngChar.InsertAfter ("£")
Case Is > 100
RngChar.Font.Spacing = 1
RngChar.InsertAfter ("#")
Case Is > 15
RngChar.Font.Spacing = 1
RngChar.InsertAfter ("¤")
Case Else
' Do Nothing
End Select
Next RngWord
It does the job, and isn't that slow, but I guess there are better solution :)