I have a folder on an FTP server with about a hundred subfolders, each with its own index.html.
I want to add a <link rel="stylesheet" href="https://subdomain.domain.fr/vad/client/build/iconfont.css">
to each index.html.
The subdomain varies and can be captured from another stylesheet link, e.g.:
<link rel="stylesheet" href="https://subdomain.domain.fr/vad/client/build/theme.css">
I tried this:
find . -type f -name index.html -exec sed -i 's/<link rel="stylesheet" href="https:\/\/\(*\).domain.fr\/vad\/client\/build\/theme.css">/<link rel="stylesheet" href="https:\/\/\1.domain.fr\/vad\/client\/build\/theme.css"><link rel="stylesheet" href="https:\/\/\1.domain.fr\/vad\/client\/build\/iconfont.css">/g' {} \;
It uses a capture group and backreferences, but it is not working.
CodePudding user response:
For ease and readability, change the delimiter from / to, say, #.
You also have to escape the literal dots in the search pattern:
sed -i 's#<link rel="stylesheet" href="https://\(*\)\.domain\.fr/vad/client/build/theme\.css">#<link rel="stylesheet" href="https://\1.domain.fr/vad/client/build/theme.css"><link rel="stylesheet" href="https://\1.domain.fr/vad/client/build/iconfont.css">#g'
From there, I can see a mistake in your capturing group: you wrote \(*\),
but I suspect you meant \(.*\) :)
(In a basic regular expression, a * placed right after \( is a literal asterisk, so your group could only match the character * itself, never the subdomain.)
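Putting the corrected group to work, here is a minimal sketch, assuming GNU sed (for -i without a suffix). The helper name add_iconfont_link and the client1 subdomain are made up for the demo. Two details are my own substitutions rather than part of the command above: the group is [^.]* instead of .* so it cannot run past the first dot, and & re-emits the matched theme link instead of retyping it.

```shell
#!/bin/sh
# add_iconfont_link DIR  (hypothetical helper name)
# For every index.html under DIR, insert an iconfont.css link right after the
# theme.css link, reusing the subdomain captured from the theme link.
add_iconfont_link() {
  find "$1" -type f -name index.html -exec sed -i \
    's#<link rel="stylesheet" href="https://\([^.]*\)\.domain\.fr/vad/client/build/theme\.css">#&<link rel="stylesheet" href="https://\1.domain.fr/vad/client/build/iconfont.css">#' {} \;
}

# Demo in a scratch directory; "client1" is a made-up subdomain.
demo=$(mktemp -d)
mkdir -p "$demo/a"
printf '%s\n' '<link rel="stylesheet" href="https://client1.domain.fr/vad/client/build/theme.css">' > "$demo/a/index.html"
add_iconfont_link "$demo"
cat "$demo/a/index.html"   # shows the file after the edit
rm -rf "$demo"
```

The sketch does no idempotency check, so running it twice on the same tree adds the iconfont link twice.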
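Now, it looks like you are replacing one word with another in order to change the CSS file? Since it appears in a specific kind of line, you can perform a simple substitution restricted to lines matching that pattern ;) Note that this swaps theme for iconfont in the existing link rather than adding a second link. The < is left unescaped because \< is a word-boundary anchor in GNU sed.
sed -i '/<link rel="stylesheet" href="https:\/\/.*\.domain\.fr\/vad\/client\/build/s#theme#iconfont#'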
CodePudding user response:
Using Perl and Mojo::DOM, an HTML parser, to edit your HTML:
use strict; use warnings;
use Mojo::DOM;
# Slurp the whole HTML as a string
my $html = join "", <>;
my $dom = Mojo::DOM->new($html);
# Fetch the host name (e.g. subdomain.domain.fr) from the last stylesheet link
$_ = $dom
    ->find('link[href][rel="stylesheet"]')
    ->map(attr => 'href')
    ->last;
my ($domain) = m|^https?://([^/]+)/|
    or die "No match https?!\n";
# Find the last stylesheet link in <head> and append the new link after it
# (the question's path is /vad/client/build/iconfont.css; adjust as needed)
$dom
    ->find('head > link[href][rel="stylesheet"]')
    ->last
    ->append(
        "\n" .
        '<link rel="stylesheet" href="https://' .
        $domain .
        '/build/iconfont.css" />'
    );
# Render
print "$dom";
Output
Example of one file:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="fr" xml:lang="fr" xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
<link href="https://subdomain.domain.fr/build/iconfont.css" rel="stylesheet">
<link href="https://subdomain.domain.fr/build/iconfont.css" rel="stylesheet">
<title></title>
</head>
<body>
POUET
</body>
</html>
Usage
First, test the script against a few files without sponge.
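One way to do such a dry run is to diff each file against the script's output, so nothing is written back. This is a minimal sketch, assuming a hypothetical helper named preview_rewrite and a filter that accepts the file name as its last argument (as the Perl script above does through <>). In the demo, sed is only a stand-in for the Perl script.

```shell
#!/bin/sh
# preview_rewrite FILE CMD [ARGS...]  (hypothetical helper name)
# Runs CMD ARGS FILE and prints a unified diff against FILE.
# FILE itself is never modified.
preview_rewrite() {
  file=$1; shift
  "$@" "$file" | diff -u "$file" -
}

# Demo: sed stands in for `perl Mojo::DOM.pl`.
sample=$(mktemp)
printf '%s\n' '<link rel="stylesheet" href="theme.css">' > "$sample"
preview_rewrite "$sample" sed 's/theme/iconfont/' || true   # diff exits 1 when the files differ
rm -f "$sample"
```

With the real script, the call would look like preview_rewrite some/dir/index.html perl Mojo::DOM.pl, where some/dir is a placeholder path.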
Then, if tests are satisfactory:
#!/bin/bash
shopt -s globstar # enable recursive ** globbing
for h in **/*.html; do
    perl Mojo::DOM.pl "$h" | sponge "$h"
done
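sponge comes from the moreutils package and may not be installed everywhere. If it is missing, a temporary file can play the same role. The sketch below assumes a hypothetical helper named rewrite_in_place; it only overwrites the file when the filter succeeds and produces non-empty output, which is a safety choice of mine, not something sponge does. Again, sed stands in for the Perl script in the demo.

```shell
#!/bin/bash
# rewrite_in_place FILE CMD [ARGS...]  (hypothetical helper name)
# Runs CMD ARGS FILE into a temporary file, then replaces FILE's contents
# only if CMD succeeded and wrote something.
rewrite_in_place() {
  local file=$1; shift
  local tmp
  tmp=$(mktemp) || return 1
  if "$@" "$file" > "$tmp" && [ -s "$tmp" ]; then
    cat "$tmp" > "$file"   # overwrite contents, keeping the file's permissions
    rm -f "$tmp"
  else
    rm -f "$tmp"
    echo "skipped: $file" >&2
    return 1
  fi
}

# Demo: sed stands in for `perl Mojo::DOM.pl`.
sample=$(mktemp)
printf '%s\n' '<link rel="stylesheet" href="theme.css">' > "$sample"
rewrite_in_place "$sample" sed 's/theme/iconfont/'
cat "$sample"
rm -f "$sample"
```

In the loop above, the body would then become rewrite_in_place "$h" perl Mojo::DOM.pl.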