How to break down paths in URL?

Time:05-04

I have a file containing a list of URLs with paths, and I want to break each path down into its parent directories using bash. An example should make this clearer.

Example:

https://sub.example.com/foo1/foo2/foo3/file.php

My desired output would be:

https://sub.example.com/foo1

https://sub.example.com/foo1/foo2

https://sub.example.com/foo1/foo2/foo3

I don't want files with extensions (such as file.php) to appear in the output.

CodePudding user response:

Try this

breakdown.sh

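#!/bin/sh
# Print every parent directory of the URL path, deepest first.
breakdown() {
    dir=$(dirname "$1")
    # Stop at the root ("/") or when there is no path left (".").
    if [ "$dir" = "/" ] || [ "$dir" = "." ]; then
        return
    fi
    # $dir already starts with "/", so no extra slash is added here.
    echo "$protocol://$hostname$dir"
    breakdown "$dir"
}

protocol=${1%%://*}               # e.g. https
pathname=${1#*://}                # e.g. sub.example.com/foo1/foo2/foo3/file.php
hostname=${pathname%%/*}          # e.g. sub.example.com
pathname=${pathname#"$hostname"}  # e.g. /foo1/foo2/foo3/file.php
# tac reverses the lines so that the shortest path is printed first.
breakdown "$pathname" | tac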

Run it as follows:

sh breakdown.sh https://sub.example.com/foo1/foo2/foo3/file.php
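
The script above handles one URL per invocation, while the question mentions a file of URLs. Below is a minimal sketch of one way to process a whole file without tac or dirname, using only POSIX parameter expansion. The function names breakdown_url and breakdown_file are hypothetical, and the sketch assumes one URL per line and that the last path segment is always a file to be dropped, as in the question's example.

```shell
#!/bin/sh
# Hypothetical helper: print the parent directories of one URL, shortest first.
breakdown_url() {
    scheme=${1%%://*}          # e.g. https
    rest=${1#*://}             # e.g. sub.example.com/foo1/foo2/foo3/file.php
    host=${rest%%/*}           # e.g. sub.example.com
    path=${rest#"$host"}       # e.g. /foo1/foo2/foo3/file.php (empty if no path)
    path=${path%/*}            # drop the last segment: /foo1/foo2/foo3
    prefix=
    while [ -n "$path" ]; do
        path=${path#/}         # strip one leading slash
        seg=${path%%/*}        # first remaining segment
        path=${path#"$seg"}    # remainder, starting with "/" or empty
        [ -n "$seg" ] || continue   # skip empty segments such as "//"
        prefix=$prefix/$seg
        printf '%s\n' "$scheme://$host$prefix"
    done
}

# Hypothetical helper: apply breakdown_url to every non-empty line of a file.
breakdown_file() {
    while IFS= read -r line || [ -n "$line" ]; do
        if [ -n "$line" ]; then
            breakdown_url "$line"
        fi
    done < "$1"
}
```

After sourcing these definitions, breakdown_file urls.txt prints the directory prefixes for each URL in turn.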

CodePudding user response:

urls.txt

https://sub.example.com/foo1/foo2/foo3/file.php
https://sub.example.com/bar1/bar2/bar3/bar4/file2.php

code

awk -F/ 'BEGIN{OFS=FS}{p="";for(i=4;i<NF;i++){p=p OFS $(i);print $1,$2,$3 p}}' urls.txt

output

https://sub.example.com/foo1
https://sub.example.com/foo1/foo2
https://sub.example.com/foo1/foo2/foo3
https://sub.example.com/bar1
https://sub.example.com/bar1/bar2
https://sub.example.com/bar1/bar2/bar3
https://sub.example.com/bar1/bar2/bar3/bar4
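
The one-liner above always drops the last path segment, which matches the sample URLs because each ends in a file. If some URLs end in a directory instead, a variant along these lines might be closer to the "no files with extensions" requirement. This is only a sketch: the wrapper name breakdown_keep_dirs is hypothetical, and it assumes that a final segment is a file exactly when it contains a dot, so a directory named something like v1.2 would be misclassified.

```shell
#!/bin/sh
# Hypothetical wrapper: reads URLs from the file given as $1.
breakdown_keep_dirs() {
    awk -F/ 'BEGIN { OFS = FS }
    {
        last = NF
        # Drop the final segment only if it is empty or contains a dot.
        if ($NF == "" || $NF ~ /\./) last = NF - 1
        p = ""
        # Fields 1-3 are "https:", "" and the host; path segments start at 4.
        for (i = 4; i <= last; i++) {
            p = p OFS $i
            print $1, $2, $3 p
        }
    }' "$1"
}
```

For the urls.txt shown above the result should be the same as the original command, since both URLs end in a .php file.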