How can I extract just the domain from a URL (without the protocol:// prefix and the /path)? For example, I need something like this: echo "Hi it's my web site (https://1stsubdomain.2ndsubdomain.example.com/welcome/)" | grep "some regex"
I want to get this output: Hi it's my web site (1stsubdomain.2ndsubdomain.example.com)
The domain may have three subdomains, just one, or none at all.
The extensions I want to match: .com|net|org|ru|xyz|co|tr|uk|vn|intedu|mil|lnc|is|dev|travel|info|biz|email|build|agency|zone|bid|condos|dating|events|maiso|partners|properties|productions|social|reviews|techgov|au
CodePudding user response:
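Use grep's -o option to print only the matched (non-empty) parts of each line. Note that this prints the domain on its own, without the surrounding text.
Make sure the alternation lists every top-level domain you want: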
echo "Hi it's my web site (https://1stsubdomain.2ndsubdomain.example.com/welcome/)" | grep -Eo '[A-Za-z0-9_.-]*\.(com|net|org)'
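A plain alternation such as (com|net|org) can also match a prefix of a longer name. One possible workaround, sketched below, is to cut the host out of the URL first and only then filter on the ending. The function name extract_domains and the shortened tlds list are hypothetical, and the sketch assumes every URL starts with http:// or https:// and carries no port number.

```shell
#!/bin/sh
# Hypothetical, shortened TLD list; extend it with every extension you need.
tlds='com|net|org|ru|xyz|co|uk|bid'

# Read text on stdin and print each URL host that ends in a listed TLD,
# one host per line. (extract_domains is an illustrative name.)
extract_domains() {
    grep -Eo 'https?://[^/[:space:])]+' |  # URL up to the first "/", space or ")"
        sed -E 's#^https?://##' |          # drop the scheme
        grep -E "\.(${tlds})\$"            # keep hosts ending in a listed TLD
}

# Usage:
echo "Hi it's my web site (https://1stsubdomain.2ndsubdomain.example.com/welcome/)" | extract_domains
```

Because the last grep anchors the TLD at the end of the host, a name like example.community is dropped instead of being cut down to example.com.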
CodePudding user response:
Using sed, if applicable:
$ echo "Hi it's my web site (https://1stsubdomain.2ndsubdomain.example.bid/welcome/)" \
| sed -En 's#https://(.*\.)(com|net|org|ru|xyz|co|tr|uk|vn|intedu|mil|lnc|is|dev|travel|info|biz|email|build|agency|zone|bid|condos|dating|events|maiso|partners|properties|productions|social|reviews|techgov|au).*/#\1\2#p'
Hi it's my web site (1stsubdomain.2ndsubdomain.example.bid)
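The command above needs an https:// prefix and a trailing slash, and its greedy .* can run across several URLs on one line. A slightly more defensive variant might look like the sketch below. The function name strip_url_parts and the shortened tlds list are hypothetical, and the character class assumes a URL ends at whitespace or a closing parenthesis.

```shell
#!/bin/sh
# Hypothetical, shortened TLD list; extend it with every extension you need.
tlds='com|net|org|ru|xyz|co|uk|bid'

# Read text on stdin and replace every http(s) URL with its host name,
# leaving the surrounding text untouched. (strip_url_parts is an illustrative name.)
strip_url_parts() {
    # \1 is the host ending in a listed TLD; the optional third group
    # swallows the path up to whitespace or ")".
    sed -E "s#https?://([^/[:space:])]*\.(${tlds}))(/[^[:space:])]*)?#\1#g"
}

# Usage:
echo "Hi it's my web site (https://1stsubdomain.2ndsubdomain.example.bid/welcome/)" | strip_url_parts
```

Lines without a URL pass through unchanged, because the -n flag and the p command are left out here. A host whose last label merely starts with a listed TLD is still cut at that prefix, so this remains a sketch and not a full URL parser.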