How can I write a regex in a shell script that would target only the targeted substring between two given values? Give the example
https://www.stackoverflow.com
How can I match only the ":" in between "https" and "//". If possible please also explain the approach.
The context is that I need to prepare a file that would fetch a config from the server and append it to the .env file. The response comes as JSON
{
"GRAPHQL_URL": "https://someurl/xyz",
"PUBLIC_TOKEN": "skml2JdJyOcrVdfEJ3Bj1bs472wY8aSyprO2DsZbHIiBRqEIPBNg9S7yXBbYkndX2Lk8UuHoZ9JPdJEWaiqlIyGdwU6O5",
"SUPER_SECRET": "MY_SUPER_SECRET"
}
so I need to adjust it to the .env
syntax. What I managed to do this far is
#!/bin/bash
CURL_RESPONSE="$(curl -s url)"
cat <<< ${CURL_RESPONSE} | jq -r '.property.source' | sed -r 's/:/=/g;s/[^a-zA-Z0-9=:_/-]//g' > .env.test
so basically I fetch the data, then extract the key I am after with jq
, and then I use sed
to first replace all ":" to "=" and after that I remove all the quotations and semicolons and white spaces that comes from JSON and leave some characters that are necessary.
I am almost there but the problem is that now my graphql url (and only other) would look like so
https=//someurl/xyz
so I need to replace this = that is in between https and // back with the colon.
Thank you very much @Nic3500 for the response, not sure why but I get error saying that
sed: 1: "s/:/=/g;s#https\(.*\)// ...": \1 not defined in the RE
I searched SO and it seems that it should work since the brackets are escaped and I use -r
flag (tried -E
but no difference) and I don't know how to apply it. To be honest I assume that the replacement block is this part
#\1#
so how can I let this know to what character should it be replaced?
This is how I tried to use it
#!/bin/bash
CURL_RESPONSE="$(curl -s url)"
cat <<< ${CURL_RESPONSE} | jq -r '.property.source' | sed -r 's/:/=/g;s#https\(.*\)//.*#\1#;s/[^a-zA-Z0-9=:_/-]//g' > .env.test
Hope with this context you would be able to help me.
CodePudding user response:
echo "https://www.stackoverflow.com" | sed 's#https\(.*\)//.*#\1#'
:
sed
operator s/regexp/replacement/- regexp:
https\(.*)//.*
. So "https" followed by something (.*
), followed by "//", followed by anything else.*
- the parenthesis are back slashed since they are not part of the pattern. They are used to group a part of the regex for the replacement part of the
s###
operator. - replacement: \1, means the first group found in the regex
\(.*\)
- I used
s###
, but the usual form iss///
. Any character can take the place of the/
with thes
operator. I used#
as using/
would have been confusing since you use/
in the url.
CodePudding user response:
The problem is that your sed
substitutions are terribly imprecise. Anyway, you want to do it in jq
instead, where you have more control over which parts you are substituting, and avoid spawning a separate process for something jq
quite easily does natively in the first place.
curl -s url |
jq -r '.property.source | to_entries[] |
"\(.key)=\"\(.value\)\""' > .env.test
Tangentially, capturing the output of curl
into a variable just so you can immediately cat
it once to standard output is just a waste of memory.