Search paths on document and save them in array

Time:11-10

I want to search a complete HTML document for Windows paths and save them in an array. The paths can be completely different: apart from the drive letter at the start, everything that follows is uncertain.

For example my HTML:

<p>Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
<br>
<br>
C:\Users\max\Documents<br>
S:\Data\Customer<br>
<br>
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.</p>

CodePudding user response:

You could use the following regex to find matches in your document.body:

/[a-zA-Z]:[\\\/](?:[a-zA-Z0-9]+[\\\/])*[a-zA-Z0-9]+/gm

let matches = [...document.body.innerHTML.matchAll(/[a-zA-Z]:[\\\/](?:[a-zA-Z0-9]+[\\\/])*[a-zA-Z0-9]+/gm)];
console.log(matches);
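
Note that matchAll yields match objects rather than plain strings, so if you want an array of path strings you have to pick element 0 of each match. A minimal sketch of that, wrapped in a function so it can be tried on any string; the name findWindowsPaths and the sample string are only illustrative, not part of any API:

```javascript
// Hypothetical helper: collects Windows-style paths from any string.
function findWindowsPaths(text) {
  // Drive letter, colon, separator, then letter/digit segments divided by \ or /.
  const pathPattern = /[a-zA-Z]:[\\\/](?:[a-zA-Z0-9]+[\\\/])*[a-zA-Z0-9]+/g;
  // Each match object holds the full matched text at index 0.
  return Array.from(text.matchAll(pathPattern), (match) => match[0]);
}

// Illustrative input; in the page you would pass document.body.innerHTML instead.
const sample = "<br>\nC:\\Users\\max\\Documents<br>\nS:\\Data\\Customer<br>";
console.log(findWindowsPaths(sample)); // logs the two paths as plain strings
```

In the browser, findWindowsPaths(document.body.innerHTML) should give the same kind of result for the HTML from the question.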

CodePudding user response:

How about this:

// Read the page markup as one string
const body = document.body.innerHTML
// ^ with the m flag only accepts paths that begin at the start of a line
const foundPaths = body.match(/^[a-zA-Z]:[\\\/](?:[a-zA-Z0-9]+[\\\/])*[a-zA-Z0-9]+/gm)
console.log(foundPaths)

// Expected output:
//   [object Array] (2) ["C:\Users\max\Documents","S:\Data\Customer"]

This is a JavaScript snippet you could paste into the page in a <script> tag.
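
One caveat: the character class [a-zA-Z0-9] only covers letters and digits, so a path such as C:\Program Files\app.exe would be cut off at the first space. Since the question says everything after the drive letter is uncertain, a looser pattern may fit better. The following is only a sketch under the assumption that every path ends at the next tag or line break; the function name findLoosePaths is made up for illustration. The trade-off is that any prose following a path on the same line gets captured too.

```javascript
// Hypothetical helper: looser matching, assuming a path ends at a tag or line break.
function findLoosePaths(html) {
  // \b keeps URL schemes such as "http://" from being read as a drive letter.
  // After the drive letter and separator, accept any characters except the ones
  // Windows forbids in names (< > : " | ? *) and line breaks.
  const loosePattern = /\b[a-zA-Z]:[\\\/][^<>:"|?*\r\n]*/g;
  // match() returns null when nothing is found, so fall back to an empty array.
  const found = (html.match(loosePattern) || []).map((path) => path.trim());
  // A Set drops paths that appear more than once in the document.
  return [...new Set(found)];
}

console.log(findLoosePaths("C:\\Program Files\\My App\\readme-v1.txt<br>"));
```

Whether this or the stricter pattern is the better choice depends on how the paths are embedded in your pages.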

Sources:
regex for windows path
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
Useful:
https://regex101.com/
