Whitespace characters mess up a shell script grep pattern for extracting markdown links (macOS)

Time:03-02

I am working on a tool to convert markdown files to text bundles, revising a great piece of code from Zett to use on macOS since I will be porting my Apple Notes files to Craft.

I have problems parsing all links into an array using grep. No matter how hard I try using options like --null and | xargs -0, the result ends up being split by whitespace characters:

targets=($(grep '!\[.*\](.*)' "$inFile"))

An example: I have a small markdown test file containing the following:

# Allan Falk - ComicWiki

**Allan Falk - ComicWiki**

![Allan Falk - ComicWiki](images/Allan Falk - ComicWiki.png)

http://comicwiki.dk/wiki/Allan_Falk

Running the above code creates the following array, in which the markdown link is split up like so:

![Allan
Falk
-
ComicWiki](images/Allan Falk - ComicWiki.png)

How can I get complete links as individual array entries (they will be processed later individually, using sed for copying files etc.)?

CodePudding user response:

You can set IFS= (null value) and use read like this:

IFS= read -ra arr < <(grep '!\[.*\]' file)

# examine array
declare -p arr
declare -a arr='([0]="![Allan Falk - ComicWiki](images/Allan Falk - ComicWiki.png)")'

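<(grep '!\[.*\]' file) runs grep using process substitution, and the < before it redirects the output of that command to read. With IFS set to the null value, read does no word splitting, so the whole line lands in arr[0]. Note that read consumes only one line, so if the file contains several matching lines, only the first one is captured this way.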
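If the file holds more than one link, a loop around read is one way to collect every matching line as its own array entry. The following is only a sketch: the sample file and its second image line are made up for illustration, and it avoids mapfile on the assumption that the script runs under the bash 3.2 that ships with macOS.

```shell
#!/usr/bin/env bash
# Hypothetical sample file standing in for "$inFile" from the question;
# the second image line is invented for this example.
inFile=$(mktemp)
cat > "$inFile" <<'EOF'
# Allan Falk - ComicWiki

![Allan Falk - ComicWiki](images/Allan Falk - ComicWiki.png)

http://comicwiki.dk/wiki/Allan_Falk

![Second image](images/Second image.png)
EOF

targets=()
# IFS= disables word splitting for read; -r keeps backslashes literal.
# Each matching line is appended as one array element.
while IFS= read -r line; do
    targets+=("$line")
done < <(grep '!\[.*\](.*)' "$inFile")

# Show each entry in angle brackets to make the boundaries visible.
printf 'count: %d\n' "${#targets[@]}"
for t in "${targets[@]}"; do
    printf '<%s>\n' "$t"
done

rm -f "$inFile"
```

Because the loop is fed by process substitution rather than a pipe, it runs in the current shell and targets is still populated after done.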


CodePudding user response:

After doing some digging, I found out that I was missing quotation marks in my statement. So instead of writing:

targets=($(grep '!\[.*\](.*)' "$inFile"))

I needed to add quotation marks around the command substitution, inside the array parentheses:

targets=( "$(grep '!\[.*\](.*)' "$inFile")" )

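Now the array works fine for my test file, and no whitespace splitting occurs. Be aware that the quotes put the entire grep output into a single array element, so a file with several links yields one entry with the links separated by newlines rather than one entry per link.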
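To illustrate the difference, here is a small sketch that compares the quoted form with a variant that splits on newlines only. The sample file and the names joined and split are made up for illustration. The variant turns off globbing with set -f during the unquoted expansion so that characters such as * in a link are not expanded against the file system.

```shell
#!/usr/bin/env bash
# Hypothetical sample file with two links (the second one is invented).
inFile=$(mktemp)
cat > "$inFile" <<'EOF'
![Allan Falk - ComicWiki](images/Allan Falk - ComicWiki.png)
Some text in between.
![Second image](images/Second image.png)
EOF

# Quoted command substitution: all output becomes ONE element.
joined=( "$(grep '!\[.*\](.*)' "$inFile")" )

# Unquoted, but with IFS limited to newline and globbing disabled:
# each output line becomes its own element.
oldIFS=$IFS
IFS=$'\n'
set -f
split=( $(grep '!\[.*\](.*)' "$inFile") )
set +f
IFS=$oldIFS

printf 'joined: %d element(s)\n' "${#joined[@]}"   # joined: 1 element(s)
printf 'split: %d element(s)\n' "${#split[@]}"     # split: 2 element(s)

rm -f "$inFile"
```

Restoring IFS and re-enabling globbing afterwards keeps the change from leaking into the rest of the script.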
