I have a bunch of strings formatted like this:
myFolder/myUrl.com.zip
I need a bash script that takes this input, and returns
myUrl.com
How can I do this?
CodePudding user response:
If the strings in question are just paths ending in a hostname, as in the example, and each one has a single file extension at the end, then Bash's parameter expansions for removing the shortest or longest matching pattern from the beginning or end of a string should help:
${VAR#pattern} - remove shortest matching string from beginning
${VAR##pattern} - remove longest matching string from beginning
${VAR%pattern} - remove shortest matching string from end
${VAR%%pattern} - remove longest matching string from end
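To see how the four forms differ, here is a small sketch that applies each of them to one string. The sample path and the variable names are made up for illustration; the path has two slashes and two dots so that the shortest and longest matches give different results.

```bash
# "a/b/myUrl.com.zip" is a made-up sample, not from the question
sample="a/b/myUrl.com.zip"

shortest_front="${sample#*/}"    # b/myUrl.com.zip
longest_front="${sample##*/}"    # myUrl.com.zip
shortest_back="${sample%.*}"     # a/b/myUrl.com
longest_back="${sample%%.*}"     # a/b/myUrl

printf '%s\n' "$shortest_front" "$longest_front" "$shortest_back" "$longest_back"
```

In short, # and ## trim from the front, while % and %% trim from the back.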
Assuming an input variable of URI (which could be read from a file or input stream), the following removes any directories from the beginning and one file extension from the end (the result is stored back in URI):
# Remove chars from the beginning up to and including the last forward slash (folders)
URI="${URI##*/}"
# Remove chars from the end back to and including the closest dot to the end (file ext.)
URI="${URI%.*}"
See https://tldp.org/LDP/abs/html/parameter-substitution.html for more detail
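If you have many strings, the two expansions can be wrapped in a function and called in a loop. This is only a sketch: the function name strip_dir_and_ext, the inputs array, and the second sample string are assumptions for illustration, not part of the question.

```bash
#!/usr/bin/env bash

# strip_dir_and_ext is a hypothetical helper name.
# It prints its argument without leading directories and without the last extension.
strip_dir_and_ext() {
  local uri=$1
  uri="${uri##*/}"   # drop everything up to and including the last slash
  uri="${uri%.*}"    # drop the last dot and whatever follows it
  printf '%s\n' "$uri"
}

# Sample inputs; the second one is made up
inputs=("myFolder/myUrl.com.zip" "some/other/dir/example.org.zip")

for item in "${inputs[@]}"; do
  strip_dir_and_ext "$item"
done
```

A string with no dot at all is left unchanged by the second expansion, so the function then returns just the part after the last slash.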
CodePudding user response:
If the strings are located in files, then use
grep -Po "[^/]+\.com" file.txt
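A quick way to try this kind of pattern without a file is to pipe the sample string in. Note that -P is a GNU grep extension, so this sketch swaps in -E (extended regular expressions), which is enough for this pattern; -o prints only the matched part. The variable name result is an assumption for illustration.

```bash
# Match one or more non-slash characters followed by a literal ".com"
result=$(printf '%s\n' "myFolder/myUrl.com.zip" | grep -oE '[^/]+\.com')
printf '%s\n' "$result"
```

This approach only works for names ending in .com before the extension; other domains would need a different pattern.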
CodePudding user response:
$ basename -s .zip myFolder/myUrl.com.zip
myUrl.com
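The -s option is available in GNU and BSD basename. Where it is missing, the suffix can be passed as a second operand instead, which is the form POSIX specifies. A minimal sketch, with a made-up second path and a hypothetical variable name:

```bash
# The suffix to strip is given as the second operand
for path in "myFolder/myUrl.com.zip" "a/b/example.org.zip"; do
  basename "$path" .zip
done

# Capturing the result in a variable (stripped is a hypothetical name)
stripped=$(basename "myFolder/myUrl.com.zip" .zip)
```

Unlike the parameter expansions, this requires knowing the extension in advance and starts an external process for each string.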