I am trying to strip the file names from a list of URLs and keep only the directory paths.
echo -e "http://sub.domain.tld/secured/database_connect.php\nhttp://sub.domain.tld/section/files/image.jpg\nhttp://sub.domain.tld/.git/audio-files/top-secret/audio.mp3" | grep -Eio "(http|https)://[^/\"]+" | sort -u
http://sub.domain.tld
But I want the result to look like this:
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/
Is there any way to do this with sed or grep?
CodePudding user response:
Using grep
$ echo ... | grep -o '.*/'
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/
CodePudding user response:
with grep
If your grep has the -o option:
... | grep -Eio 'https?://.*/'
If there could be multiple URLs per line:
... | grep -Eio 'https?://[^[:space:]]+/'
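As a rough sketch of the multiple-URLs-per-line case, the function below wraps the same pattern and runs it on a made-up input line. The name extract_url_dirs and the sample URLs are only illustrative, and it assumes a grep that supports -o (GNU and BSD grep do; POSIX does not require it).

```shell
# extract_url_dirs is a hypothetical helper name, not part of the answer above.
# It reads text on stdin and prints each URL found, cut at its last slash.
extract_url_dirs() {
  # -o prints only the matched part, one match per output line.
  # [^[:space:]]+ is greedy, so the match extends to the last "/" in each URL.
  grep -Eio 'https?://[^[:space:]]+/'
}

# Made-up sample line containing two URLs.
echo 'see http://a.tld/x/y.jpg and https://b.tld/p/q/r.mp3 here' | extract_url_dirs
```

Because the match stops at whitespace, each URL on the line is handled separately and the surrounding words are ignored.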
with sed
If the input is always precisely one URL per line and nothing else, you can just delete the filename part:
... | sed 's/[^/]*$//'
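A minimal sketch of the sed approach applied to the URLs from the question, with sort -u added to drop duplicate directories. The function name url_dirs is hypothetical, and the sketch assumes exactly one URL per line.

```shell
# url_dirs is a hypothetical helper name, not part of the answer above.
# It reads one URL per line on stdin and prints the unique directory parts.
url_dirs() {
  # Delete the trailing run of non-slash characters (the file name),
  # then remove duplicate lines.
  sed 's/[^/]*$//' | sort -u
}

printf '%s\n' \
  'http://sub.domain.tld/secured/database_connect.php' \
  'http://sub.domain.tld/section/files/image.jpg' \
  'http://sub.domain.tld/.git/audio-files/top-secret/audio.mp3' |
  url_dirs
```

One caveat: a URL with no path at all, such as http://sub.domain.tld, would be reduced to http:// because the host name is then the last run of non-slash characters.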
CodePudding user response:
GNU Awk
$ echo ... | awk 'match($0,/.*\//,a){print a[0]}'
$ echo ... | awk '{print gensub(/(.*\/).*/,"\\1",1)}'
$ echo ... | awk 'sub(/[^/]*$/,"")'
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/
xargs
$ echo ... | xargs -I{} sh -c 'echo "$(dirname "$1")/"' _ {}
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/
CodePudding user response:
You could use awk's match function, which works in any version of awk. Briefly: pipe the output of the echo command to awk. match then matches everything up to the last occurrence of /, and substr prints the matched text without that final / (hence the -1 on RLENGTH). To keep the trailing /, as in the output requested in the question, use RLENGTH without the -1.
your_echo_command | awk 'match($0,/.*\//){print substr($0,RSTART,RLENGTH-1)}'
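For comparison, here is a small variant of the same match/substr idea that keeps the trailing slash, so the result has the form asked for in the question. The function name url_dir_awk is hypothetical. Lines without any / produce no output, because match returns 0 for them.

```shell
# url_dir_awk is a hypothetical helper name, not part of the answer above.
# It reads one URL per line on stdin and prints everything up to and
# including the last slash.
url_dir_awk() {
  # match() sets RSTART and RLENGTH for the longest leftmost match of .*/
  # Using RLENGTH (not RLENGTH-1) keeps the final "/".
  awk 'match($0, /.*\//) { print substr($0, RSTART, RLENGTH) }'
}

echo 'http://sub.domain.tld/secured/database_connect.php' | url_dir_awk
```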