How can I get all directories that contain more than one file with a specific extension?

I'm trying to get the names of directories that contain more than one file with the .tf extension. Given this directory tree:

.
├── docs
│   ├── README.md
│   └── diagram.png
├── project
│   └── main.py
├── Makefile
├── terraform
│   ├── environments
│   │   ├── prod
│   │   │   └── main.tf
│   │   └── staging
│   │       └── main.tf
│   └── module
│       ├── ecs.tf
│       ├── rds.tf
│       ├── s3.tf
│       ├── security_group.tf
│       ├── sqs.tf
│       └── variable.tf
├── tests
│   └── test_main.py
└── .terraform
    └── ignore_me.tf

I expect terraform/module as the result. I tried all the solutions at https://superuser.com/questions/899347/find-directories-that-contain-more-than-one-file-of-the-same-extension but none of them worked as expected.
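
For anyone who wants to try the answers against this exact layout, the tree can be recreated in a scratch directory. This is only a convenience sketch: make_sample_tree is a hypothetical helper name, and all files are created empty.

```shell
#!/bin/sh
# Recreate the sample layout from the question under the directory given as $1.
# make_sample_tree is a hypothetical helper, not part of any existing tool.
make_sample_tree() {
    root=$1
    mkdir -p "$root/docs" "$root/project" "$root/tests" "$root/.terraform" \
             "$root/terraform/environments/prod" \
             "$root/terraform/environments/staging" \
             "$root/terraform/module"
    touch "$root/docs/README.md" "$root/docs/diagram.png" \
          "$root/project/main.py" "$root/Makefile" \
          "$root/tests/test_main.py" "$root/.terraform/ignore_me.tf" \
          "$root/terraform/environments/prod/main.tf" \
          "$root/terraform/environments/staging/main.tf"
    # The six files that make terraform/module the only expected match.
    for name in ecs rds s3 security_group sqs variable; do
        touch "$root/terraform/module/$name.tf"
    done
}
```

Calling make_sample_tree "$(mktemp -d)" and changing into that directory gives a safe place to run any of the commands.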

CodePudding user response：

The solutions at that link almost do what you want, but you could try something like this:

find . -type d -exec bash -O nullglob -c 'a=("$1"/*.tf); (( ${#a[@]} > 1 ))' bash {} \; -print

Alternatively, the accepted answer at that link should give you the expected output if you just remove the -c.

CodePudding user response：

This should work fine:

find . -type d -exec sh -c '
for d; do
  set -- "$d"/*.tf
  if test $# -gt 1; then
    printf "%s\n" "$d"
  fi
done' sh {} +

CodePudding user response：

In pure bash:

#!/bin/bash

shopt -s globstar dotglob

for dir in ./ **/; do
    tf_files=("$dir"*.tf)
    (( ${#tf_files[*]} > 1 )) && echo "${dir%?}"
done

CodePudding user response：

There are many ways to achieve this. Here is one that filters using awk:

$ find /path/to/root -type d -printf "\0%p\001" -o -type f -iname '*.tf' -printf 'c' | awk 'BEGIN{RS=ORS="\0";FS="\001"}(length($2)>1){print $1}'

The idea is to build a set of records with two fields each:

The record separator is the null character \0.
The field separator is the single character \001. These separators are chosen to avoid problems with unusual filenames.

The find command then prints the directory name in the first field and fills the second field with one c character for every matching file. If we used newlines and spaces instead, the output would look like this:

dir1 c
dir2 cc
dir3
dir4 cccc

The awk code then simply filters the records by the number of characters in the second field.
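
Note that the pipeline above attributes each c to the most recently printed directory, so the result depends on the order in which find emits directories and files. As a rough sketch of the same counting idea, the variant below keys every matching file on its own parent directory instead. It assumes GNU find (for -printf and %h) and paths without newlines; dirs_with_multiple_tf is a hypothetical helper name.

```shell
#!/bin/sh
# Print every directory under $1 (default: .) that directly contains
# more than one *.tf file. Assumes GNU find and no newlines in paths.
# dirs_with_multiple_tf is a hypothetical helper name.
dirs_with_multiple_tf() {
    # %h prints the parent directory of each matching file, one per line.
    find "${1:-.}" -type f -name '*.tf' -printf '%h\n' |
        # Count occurrences per directory and keep those seen more than once.
        awk '{ count[$0]++ } END { for (d in count) if (count[d] > 1) print d }' |
        sort
}
```

Run from the root of the sample tree, dirs_with_multiple_tf . should list only ./terraform/module, since every other directory holds at most one .tf file.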