Replacing a URL with another URL built from a value taken from the original URL

I have a markdown text file with links like this:

[Text](https://docs.google.com/document/d/unique-doc-id-here/edit)
 
or    

[Text2](https://docs.google.com/document/d/unique-doc-id-here")

I want to replace the whole href with another one: take the unique-doc-id-here, pass it to a function that returns a new href, so that my URLs end up looking something like this:

[Text](https://new-url-here.com/fragment-unique-id)

or

[Text2](https://new-url-here.com/fragment-unique-id)

I think my problem is selecting the unique-doc-id-here, and I think I have to use a regex for that.

So the solution could look like this:

text.replace(/https:\/\/docs.google.com\/document\/d\/(.*?)*/gm, (x) =>
  this.getNewHref(x)
);

However, the regex does not look quite right, because it does not match all the cases. Any ideas how to fix it?

Here is an input text example:

# Title


Text text text.


Text 1 text 1 text 1, abc.


More text

Bullet points


 - [abc]
 - [bla] 
 - [cba]

## Title 2


More text:


 - A
 - B
 - C
 - D


    Text text text text [url1](https://docs.google.com/document/d/2x2my-DRqfSidOsdve4m9bF_eEOJ7RqIWP7tk7PM4qEr) text.
    
    
    **BOLD.**
    
    
    ## Title 
    
    Text2 text1 text3 text 

[url2](https://docs.google.com/document/d/4x2mrhsqfGSidOsdve4m9bb_wEOJ7RqsWP7tk7PMPqEb/edit#bookmark=id.mbnek2bdkj8c) text.
    
    More text here
    
    
    [bla](https://docs.google.com/document/d/6an7_b4Mb0OdxNZdfD3KedfvFtdf2OeGzG40ztfDhi5o9uU/edit)

I've tried this regex, \w+:\/\/.*?(?=\s), but it also selects the trailing ) symbol.

I've applied a proposed solution by @The fourth bird:

function getNewHref(id: string) {
  const data = getText();

  const element = data.find((x: any) => x.id === id);

  if (element?.url) {
    return element.url;
  } else {
    return 'unknown-url';
  }
}

data = data.replace(
  /\[[^\][]*]\(https?:\/\/docs\.google\.com\/document\/d\/([^\s\\\/)]+)[^\s)]*\)/gm,
  (x, g1) => getNewHref(g1)
);

The problem is that replace substitutes the whole match, so what was [...](...) becomes just ./new-url or unknown-url, but it needs to be [original text](new result).

CodePudding user response：

You can make the pattern more specific, capture the [...]( prefix in group 1 and the id in group 2, and then use both group values in the replacement.

(\[[^\][]*]\()https?:\/\/docs\.google\.com\/document\/d\/([^\s\\\/)]+)[^\s)]*\)

The pattern in parts matches:

(\[[^\][]*]\() Capture group 1, match from [ up to and including ]( using a negated character class
https?:\/\/docs\.google\.com\/document\/d\/ Match the leading part of the url
( Capture group 2
  [^\s\\\/)]+ Match 1+ chars other than a whitespace char, \, / or )
) Close group 2
[^\s)]* Match optional chars other than a whitespace char or )
\) Match )
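
To see what each group captures, the pattern can be run over a couple of sample links with matchAll. This is only a sketch: the sample string, the placeholder ids and the variable names are made up for illustration.

```javascript
// Hypothetical sample input with two Google Docs links (ids are placeholders).
const sample =
  "[Text](https://docs.google.com/document/d/id-one/edit) and " +
  "[Text2](https://docs.google.com/document/d/id-two)";

const pattern =
  /(\[[^\][]*]\()https?:\/\/docs\.google\.com\/document\/d\/([^\s\\\/)]+)[^\s)]*\)/g;

// Collect [group 1, group 2] for every match.
const groups = [...sample.matchAll(pattern)].map((m) => [m[1], m[2]]);

console.log(groups);
// [ [ '[Text](', 'id-one' ], [ '[Text2](', 'id-two' ] ]
```

Group 1 keeps the link text and the opening parenthesis, so it can be put back unchanged in the replacement, while group 2 holds only the id.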


For example, a happy-case scenario where all the keys to be replaced exist (note that you can omit the /m flag, as there are no anchors in the pattern):

const text = "[Text](https://docs.google.com/document/d/unique-doc-id-here/edit)";
const regex = /(\[[^\][]*]\()https?:\/\/docs\.google\.com\/document\/d\/([^\s\\\/)]+)[^\s)]*\)/g;

// Look up the new href for a captured document id.
function getNewHref(id) {
  const replacements = {
    "unique-doc-id-here": `https://docs.google.com/document/d/${id}`
  };

  return replacements[id];
}

// g1 is the "[Text](" prefix, g2 is the document id; the closing ")" is re-appended per match.
const replacedText = text.replace(regex, (x, g1, g2) => g1 + getNewHref(g2) + ")");

console.log(replacedText);
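
The example above handles a single link with a known id. A possible extension, sketched here under assumptions, replaces every link in a longer text and falls back to 'unknown-url' when an id is not found, mirroring the fallback in the question. The urlsById map, the new-url-here.com target and the replaceDocLinks name are hypothetical.

```javascript
// Hypothetical lookup table: document id -> new URL.
const urlsById = new Map([
  ["id-one", "https://new-url-here.com/fragment-one"],
]);

const docLink =
  /(\[[^\][]*]\()https?:\/\/docs\.google\.com\/document\/d\/([^\s\\\/)]+)[^\s)]*\)/g;

// Replace each matching link target, keeping the "[text](" prefix intact.
function replaceDocLinks(markdown) {
  return markdown.replace(docLink, (match, prefix, id) => {
    // Fall back to a placeholder when the id is not in the table.
    const url = urlsById.get(id) ?? "unknown-url";
    return prefix + url + ")";
  });
}

const input =
  "See [a](https://docs.google.com/document/d/id-one/edit#bookmark=x) and " +
  "[b](https://docs.google.com/document/d/id-missing).";

console.log(replaceDocLinks(input));
// See [a](https://new-url-here.com/fragment-one) and [b](unknown-url).
```

Returning the prefix, the new URL and the closing parenthesis from inside the callback keeps every link well formed, however many links the text contains.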

CodePudding user response：

You can achieve this by extracting the href link from the string with a regex and then splitting it on the forward slash.

Try this (descriptive comments have been added to the code snippet below):

const text = '<a href="https://docs.google.com/document/d/unique-doc-id-here/edit">Text</a>';

// Get the href link using regex
const link = text.match(/"([^"]*)"/)[1];

// Split the link on the forward slash to get an array of its parts.
const linkArr = link.split('/');

// Get the unique ID from the array: it is the part right after 'd'.
const uniqueID = linkArr[linkArr.indexOf('d') + 1];

console.log(uniqueID);
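
The snippet above works on an HTML anchor, while the question uses markdown links. As a rough sketch of the same split-based idea applied to a markdown link, the URL can be taken from between the parentheses and parsed with the standard URL class, which also leaves out any #bookmark fragment. The extractDocId helper name is hypothetical.

```javascript
// Extract the document id from the first markdown link in `markdown`.
// Returns null when no link or no "/d/<id>" path segment is found.
function extractDocId(markdown) {
  // Capture the link target between "](" and ")".
  const match = markdown.match(/\]\(([^\s)]+)\)/);
  if (!match) return null;

  let segments;
  try {
    // pathname is e.g. "/document/d/unique-doc-id-here/edit"
    segments = new URL(match[1]).pathname.split("/");
  } catch {
    return null; // the target was not an absolute URL
  }

  // The id is the path segment right after "d".
  const index = segments.indexOf("d");
  return index !== -1 && index + 1 < segments.length ? segments[index + 1] : null;
}

console.log(
  extractDocId("[Text](https://docs.google.com/document/d/unique-doc-id-here/edit)")
);
// unique-doc-id-here
```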