I have a markdown text file with links like that:
[Text](https://docs.google.com/document/d/unique-doc-id-here/edit)
or
[Text2](https://docs.google.com/document/d/unique-doc-id-here")
I want to replace the whole href
with another one by taking the unique-doc-id-here
, passing that to a function that will return a new href, so in result my urls would look something like that:
[Text](https://new-url-here.com/fragment-unique-id)
or
[Text2](https://new-url-here.com/fragment-unique-id)
I think my problem is to select the unique-doc-id-here
, I think I have to use the regex for that.
So the solution could be looking like this:
text.replace(/https:\/\/docs.google.com\/document\/d\/(.*?)*/gm, (x) =>
this.getNewHref(x)
);
However it seems that the regex does not looks quite right, because it does not much all the cases. Any ideas how to fix?
Here is an input text example:
# Title
Text text text.
Text 1 text 1 text 1, abc.
More text
Bullet points
- [abc]
- [bla]
- [cba]
## Title 2
More text:
- A
- B
- C
- D
Text text text text [url1](https://docs.google.com/document/d/2x2my-DRqfSidOsdve4m9bF_eEOJ7RqIWP7tk7PM4qEr) text.
**BOLD.**
## Title
Text2 text1 text3 text
[url2](https://docs.google.com/document/d/4x2mrhsqfGSidOsdve4m9bb_wEOJ7RqsWP7tk7PMPqEb/edit#bookmark=id.mbnek2bdkj8c) text.
More text here
[bla](https://docs.google.com/document/d/6an7_b4Mb0OdxNZdfD3KedfvFtdf2OeGzG40ztfDhi5o9uU/edit)
I've try this regex \w :\/\/.*?(?=\s)
but it does select the last )
symbol
I've applied a proposed solution by @The fourth bird
:
function getNewHref(id: string) {
const data = getText();
const element = data.find((x: any) => x.id === id);
if(element?.url) {
return element.url;
} else {
return 'unknown-url'
}
}
data = data.replace(
/\[[^\][]*]\(https?:\/\/docs\.google\.com\/document\/d\/([^\s\\\/)] )[^\s)]*\)/gm,
(x, g1) => getNewHref(g1)
);
The problem is that the replace function replace the whole thing so what was [...](...)
becomes ./new-url
or unknown-url
but needs to me [original text](new result)
CodePudding user response:
You can make the pattern more specific, and then use the group 1 value.
(\[[^\][]*]\()https?:\/\/docs\.google\.com\/document\/d\/([^\s\\\/)] )[^\s)]*\)
The pattern in parts matches:
(\[[^\][]*]\()
Capture group 1, match from[...](
using a negated character classhttps?:\/\/docs\.google\.com\/document\/d\/
Match the leading part of the url(
Capture group 2[^\s\\\/)]
Match 1 chars other than a whitespace char,\
or/
)
Close group 1[^\s)]*
Match optional chars other than a whitespace char or)
\)
Match)
For example, a happy case scenario where all the keys to be replaced exist (note that you can omit the /m
flag as there are no anchors in the pattern)
const text = "[Text](https://docs.google.com/document/d/unique-doc-id-here/edit)";
const regex = /(\[[^\][]*]\()https?:\/\/docs\.google\.com\/document\/d\/([^\s\\\/)] )[^\s)]*\)/g;
function getNewHref(id) {
const replacements = {
"unique-doc-id-here": `https://docs.google.com/document/d/${id}`
}
return replacements[id];
}
const replacedText = text.replace(regex, (x, g1, g2) => g1 getNewHref(g2)) ")";
console.log(replacedText);
CodePudding user response:
You can achieve this by getting the href
link from a string by using RegEx
and then by splitting that up using forward slash.
Try this (Descriptive comments has been added in the below code snippet) :
const text = '<a href="https://docs.google.com/document/d/unique-doc-id-here/edit">Text</a>';
// Get the href link using regex
const link = text.match(/"([^"]*)"/)[1];
// Split the string and get the array of link based on the forward slash.
const linkArr = link.split('/')
// get the unique ID from an array.
const uniqueID = linkArr[linkArr.indexOf('d') 1]
console.log(uniqueID);