How to convert GitHub-style Wiki page links to Markdown-style links in a Bash script

Time:09-24

This is my first question on Stack Overflow.

I am trying to write a Bash script that converts the links GitHub Wiki generates for other internal wiki pages into conventional Markdown-style links.

The GitHub Wiki link strings look like this:

[[An example of another page]]

I want to convert it to look like this:

[An example of another page](An-example-of-another-page.htm)

Each document contains an unknown number of these links, and I don't know their text in advance.

So far I have been experimenting with one-line sed solutions given for other problems, like this one:

https://askubuntu.com/questions/1283471/inserting-text-to-existing-text-within-brackets

... with absolutely no success. I'm not even sure where to start with it.

Thanks.

CodePudding user response:

You can use bash's built-in regular expression support (the =~ operator inside [[ ... ]]) to find instances of wiki-style [[text]] links and replace them with [text](text.htm). The pattern you want to use is \[\[([^\]]*)\]\]

  • \[ and \] - escape the left and right square brackets so that they are matched literally instead of being read as the start or end of a bracket expression (character class)

  • ([^\]]*) captures all text inside the double brackets until the first right square bracket

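From there you can evaluate this regex and read the result from the $BASH_REMATCH array: element 0 holds the whole matched [[text]] and element 1 holds the captured text. Each evaluation finds only the first remaining link, so you need to repeat it until nothing matches, replacing the string inline each time with the ${var/pattern/replacement} and ${var//pattern/replacement} expansions.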
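Before the full script, here is a minimal sketch of a single match, to show what ends up in BASH_REMATCH. The input string and the variable names (sample, wiki_link_re, whole_match, link_text) are made up for illustration. The pattern is kept in a variable, which is why the bracket expression is spelled [^]] (the POSIX way of writing "any character except ]") rather than with a backslash.

```shell
#!/usr/bin/env bash

# Hypothetical input line, used only for this demonstration
sample="See [[Getting started]] first"

# The pattern stored in a variable; [^]] means "any character except ]"
wiki_link_re='\[\[([^]]*)\]\]'

if [[ "$sample" =~ $wiki_link_re ]]; then
    whole_match="${BASH_REMATCH[0]}"   # the full [[...]] span
    link_text="${BASH_REMATCH[1]}"     # only the text between the brackets
    printf 'whole match: %s\n' "$whole_match"
    printf 'captured:    %s\n' "$link_text"
fi
```

The regex variable is left unquoted on the right-hand side of =~ on purpose: quoting it would make bash compare it as a literal string instead of a regular expression.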

Here's a sample script:

#!/usr/bin/env bash

wiki_string="Now, this is [[a story]] all about how
My life [[got flipped-turned upside down]]
And I'd [[like to take a minute]]
Just [[sit]] right there
I'll [[tell you]] how I [[became the prince]] of a town called Bel-Air"

printf 'Original: %s\n' "$wiki_string"

# find the first remaining instance of [[text]] and capture the text
# inside the square brackets; on success BASH_REMATCH holds the matched
# text and the captured value, and the loop ends when nothing matches
while [[ "$wiki_string" =~ \[\[([^\]]*)\]\] ]]; do
    # the whole matched [[text]]; it is quoted in the replacement below
    # so that it is compared literally rather than as a glob pattern
    replace_text="${BASH_REMATCH[0]}"

    # Get the matched value inside the brackets
    link_text="${BASH_REMATCH[1]}"

    # store another copy of the text with the spaces replaced
    # with dashes and appending .htm
    link_target="${link_text// /-}.htm"

    # Finally, replace the matched [[text]] with [text](text.htm)
    wiki_string=${wiki_string//"$replace_text"/"[$link_text]($link_target)"}
done

printf '\nUpdated: %s\n' "$wiki_string"
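Since the question is about whole documents rather than a single string, the same loop can be wrapped in a function that filters a file line by line. This is only a sketch: convert_wiki_links is a hypothetical helper name, and it assumes a link never spans more than one line.

```shell
#!/usr/bin/env bash

# convert_wiki_links (hypothetical name) reads text on stdin and writes
# it to stdout with every [[text]] rewritten as [text](text.htm).
convert_wiki_links() {
    local line link_text link_target
    local wiki_link_re='\[\[([^]]*)\]\]'

    # "|| [[ -n $line ]]" keeps a final line that lacks a trailing newline
    while IFS= read -r line || [[ -n "$line" ]]; do
        # keep rewriting until this line has no [[text]] left
        while [[ "$line" =~ $wiki_link_re ]]; do
            link_text="${BASH_REMATCH[1]}"
            link_target="${link_text// /-}.htm"
            # the quoted pattern is matched literally, not as a glob
            line=${line//"${BASH_REMATCH[0]}"/"[$link_text]($link_target)"}
        done
        printf '%s\n' "$line"
    done
}
```

It could then be used as convert_wiki_links < Home.md > Home.converted.md, where both file names are placeholders.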

CodePudding user response:

You can try this sed command. As written, it converts one link per line and does not append the .htm extension:

$ sed -E 's/\[//;s/\]//;s/(.)(.[^]]*)(.)/\1\2\3(\2)/;s/(.[^\(]*)(\S*)\s(\S*)\s/\1\2-\3-/g' input_file
[An example of another page](An-example-of-another-page)
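For lines with several links, or link text with a different number of words, a more general variant might look like the sketch below. It is not the command above but a different approach: first rewrite every [[text]] as [text](text.htm), then loop, turning one space inside a link target into a dash per pass until none are left. The function name wiki_links_to_markdown is hypothetical, and the sketch assumes a sed that supports -E and link text that contains no parentheses. Note that it also dashes spaces in the targets of Markdown links that were already in the file.

```shell
# wiki_links_to_markdown is a hypothetical wrapper name; it passes its
# arguments (file names, or none to read stdin) straight to sed.
#   1st expression: [[text]] -> [text](text.htm), for every link on the line
#   :dash ... t dash: repeat the 3rd expression for as long as it changes something
#   3rd expression: replace the first space found inside a "](...)" target
wiki_links_to_markdown() {
    sed -E \
        -e 's/\[\[([^]]*)\]\]/[\1](\1.htm)/g' \
        -e ':dash' \
        -e 's/(\]\([^) ]*) /\1-/' \
        -e 't dash' \
        "$@"
}
```

For example, wiki_links_to_markdown input_file prints the converted text to standard output and leaves input_file untouched.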