I have a list of Python files scattered over a directory structure that looks like this:
top\py\
    html\
        docs.html
    foo\
        v1\
            a.py
            b.py
    bar\
        v1\
            c.py
            d.py
    baz\
        aws\
            sess.py
What I'm trying to do here is extract the root directory names of any directories within top/py that contain Python files, so foo, bar and baz, using a shell script. Note that there may be files other than Python code here, and I do not want those included in the output. I will admit that I'm somewhat of a novice at shell scripting, and this is what I've come up with so far:
find top/py/. -name *py -exec echo {} \; -exec sh -c "echo $@ | grep -o '[^/]*' | sed -n 3p" {} \;
However, the output isn't quite right. This is what I'm getting:
top/py/./foo/v1/a.py
foo
top/py/./foo/v1/b.py
foo
top/py/./bar/v1/c.py
foo
top/py/./bar/v1/d.py
foo
top/py/./baz/aws/sess.py
foo
It appears that the inner variable passed to grep is not being updated, but I'm not sure what to do about it. Any help would be appreciated.
CodePudding user response:
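One likely reason the inner variable looks stuck is quoting. Inside double quotes, $@ is expanded by the calling shell before find ever starts, so every -exec runs the same fixed command. In addition, the first word after the sh -c script becomes $0, not $1. The sketch below reproduces the effect without find; the value foo set with `set --` is an assumption made purely for illustration.

```shell
#!/bin/sh
# Assumption for illustration: the calling shell's positional parameters hold "foo".
set -- foo

# Double quotes: the calling shell expands $@ first, so the child shell
# always runs `echo foo`, whatever argument it is given.
dq=$(sh -c "echo $@" top/py/bar/v1/c.py)

# Single quotes: the child shell expands "$1" itself. The extra word `sh`
# fills $0, so the path arrives as $1.
sq=$(sh -c 'echo "$1"' sh top/py/bar/v1/c.py)

printf 'double-quoted: %s\nsingle-quoted: %s\n' "$dq" "$sq"
```

The same reasoning applies to the unquoted *py pattern in the question: quoting it as '*.py' keeps the calling shell from expanding it before find sees it.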
If your files don't contain newlines in their names:
find top/py -name '*.py' | awk -F / '1; { print $3 }'
Otherwise:
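find top/py -name '*.py' -exec sh -c '
  for py_path; do
    # keep the first three path components, e.g. top/py/foo
    pk_path=${py_path%/"${py_path#*/*/*/}"}
    # keep only the last of those components, e.g. foo
    pk_name=${pk_path##*/}
    printf '\''%s\n'\'' "$py_path" "$pk_name"
  done' sh {} +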
To extract the n-th path component instead of the third, replace */*/*/ with n repetitions of */.
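If the component index needs to vary, editing the pattern by hand gets tedious. A minimal sketch of a parameterised version follows; nth_component is a hypothetical helper name, not part of the commands above, and it uses a loop rather than the single nested expansion.

```shell
#!/bin/sh
# nth_component PATH N: print the N-th slash-separated component of PATH.
# Hypothetical helper for illustration; returns 1 if PATH has fewer than N components.
# Note: POSIX sh has no `local`, so `rest` and `count` are global variables.
nth_component() {
    rest=$1
    count=$2
    # Drop the leading components one at a time until the N-th is first.
    while [ "$count" -gt 1 ]; do
        case $rest in
            */*) rest=${rest#*/} ;;
            *) return 1 ;;
        esac
        count=$((count - 1))
    done
    # Keep only what precedes the next slash.
    printf '%s\n' "${rest%%/*}"
}

nth_component top/py/foo/v1/a.py 3
```

With the path above and an index of 3, the loop strips top/ and py/, and the final expansion leaves foo.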