How do I append a suffix to ids in a HTML file using standard unix commands?

Time:11-27

Assume I have an HTML file like this:

<body>
    <div id="a">
       content of div a
       <div id="b"> content of div b </div>
       <div id="c"> content of div c </div>
    </div>
    <style>
      #a {color: red; }
      #b {color: green; }
      #c {color: blue; }
    </style>
</body>

I want to append a unique suffix (say, -suffix) to all ids, which would include attributes id="..." and selectors #..., and result in a file like this:

<body>
    <div id="a-suffix">
       content of div a
       <div id="b-suffix"> content of div b </div>
       <div id="c-suffix"> content of div c </div>
    </div>
    <style>
      #a-suffix {color: red; }
      #b-suffix {color: green; }
      #c-suffix {color: blue; }
    </style>
</body>

How do I accomplish this with standard unix shell tools like sed, grep, awk in a way that would cover as many situations as possible?

My attempt:

I came up with the following sed command:

sed -e 's/id="\([-_a-zA-Z0-9]*\)"/id="\1-suffix"/g;s/#\([-_a-zA-Z0-9]*\)/#\1-suffix/g' index.html

Which is actually two commands in one:

  • s/id="\([-_a-zA-Z0-9]*\)"/id="\1-suffix"/g - substitutes id attributes id="..."
  • s/#\([-_a-zA-Z0-9]*\)/#\1-suffix/g - substitutes id selectors #...

However, it's far from perfect. First, it only supports attribute values in double quotes (id="..."), and id values are limited to those matching [-_a-zA-Z0-9]*. Second, it clashes with hex colors: a white color like #ffffff would become #ffffff-suffix. An id selector like #... should only get a suffix if a matching attribute id="..." exists.
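One way to meet the last requirement is a two-pass approach: first collect the ids that really occur in id="..." attributes, then generate a sed script that rewrites only those ids. The following is a minimal sketch, not a complete solution. The function name add_id_suffix is hypothetical, it assumes grep supports -o (true for GNU and BSD grep, but not required by POSIX), it still only handles double-quoted ids matching [-_a-zA-Z0-9]*, and the suffix must not contain / or &.

```shell
#!/bin/sh
# add_id_suffix FILE SUFFIX   (hypothetical helper name)
# Pass 1: list the ids found in double-quoted id="..." attributes.
# Pass 2: build one sed script that touches only those ids, so a color
#         such as #ffffff is left alone unless id="ffffff" really exists.
add_id_suffix() {
    file=$1
    suffix=$2
    script=$(
        grep -o 'id="[-_a-zA-Z0-9]*"' "$file" |
        sed 's/^id="//; s/"$//' |
        sort -u |
        while IFS= read -r id; do
            [ -n "$id" ] || continue
            # Rewrite the attribute itself.
            printf 's/id="%s"/id="%s%s"/g\n' "$id" "$id" "$suffix"
            # Rewrite the selector only when the id ends there, i.e. it is
            # followed by a non-id character or by the end of the line.
            printf 's/#%s\\([^-_a-zA-Z0-9]\\)/#%s%s\\1/g\n' "$id" "$id" "$suffix"
            printf 's/#%s$/#%s%s/\n' "$id" "$id" "$suffix"
        done
    )
    sed -e "$script" "$file"
}
```

Called as add_id_suffix index.html -suffix, it prints the rewritten file to standard output. Known gaps of this sketch: the pattern id="..." also matches inside attributes such as data-id="...", and a literal #a in the page text is rewritten just like a selector.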

What is the best way to accomplish this?

CodePudding user response:

There are a lot of cases in your file, as you mentioned with the colour problem. My approach would be to process the file line by line using

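# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r a; do
    # some code that modifies "$a"
    printf '%s\n' "$a" >> outputfile.html
done < inputfile.html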

This being said, you may use

b=$(expr "$a" : "regex")

to pick out precisely the part you want to modify (expr prints the text matched by \(...\) in the regex), and only then run sed on $b to get what you want and put the result back into $a.
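Putting the pieces of this answer together could look roughly like the sketch below. It is only one reading of the approach, under stated assumptions: the function name suffix_ids_per_line and the fixed -suffix string are hypothetical, the regex is the one from the question, and only the last id="..." attribute on each line is handled because expr returns a single capture.

```shell
#!/bin/sh
# suffix_ids_per_line   (hypothetical name): filter stdin to stdout.
# For each line, expr extracts an id; sed then rewrites that one attribute.
suffix_ids_per_line() {
    while IFS= read -r a || [ -n "$a" ]; do
        # The leading X keeps expr from treating the line as an operator.
        # expr exits non-zero when nothing matches, hence the fallback.
        b=$(expr "X$a" : 'X.*id="\([-_a-zA-Z0-9]*\)"') || b=''
        if [ -n "$b" ]; then
            # b holds only id characters, so it is safe inside the sed pattern.
            a=$(printf '%s\n' "$a" | sed "s/id=\"$b\"/id=\"$b-suffix\"/")
        fi
        printf '%s\n' "$a"
    done
}
```

A typical call would be suffix_ids_per_line < inputfile.html > outputfile.html. Lines without an id attribute, including CSS rules with hex colors, pass through unchanged; the #... selectors would still need a separate step.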
