Breaking down a filename into lexicographic based folders

Time:02-22

Let's say I have thousands of images in a folder, named in the format filename_order.jpg:

  • filename is a 7-digit integer from 0000000 to 9999999
  • order is a single digit between 0 and 9

folder/
  6398305_0.jpg
  6398305_1.jpg
  6398305_2.jpg
  ...
  6399305_0.jpg

Is there an easy way to sort them into evenly distributed nested folders based on the filenames?

folder/
  6/3/9/
     8/3/0/5/
        6398305_0.jpg
        6398305_1.jpg
        6398305_2.jpg
  ...
     9/3/0/7/
        6399307_0.jpg

Is there also a way to do the reverse operation: given a nested tree structure, bring all the files back into a single level?

The goal is to be able to store millions of images in S3 efficiently.

Thank you.

CodePudding user response:

This would do it in pure Bash:

#!/usr/bin/env bash

# extglob is needed to expand the number into a series of nested folder names
shopt -s extglob

# Starting folder name
folder=folder
# Iterate all *.jpg files in folder
for file in "$folder/"*.jpg; do

  # Remove leading directory path from file to get basename
  basename="${file##*/}"

  # Remove everything from the last _ onward to keep only the digits
  numbers="${basename%_*}"

  # Insert / before each digit to build a directory path from the number
  # (the empty ?() pattern requires Bash extglob)
  dir="$folder${numbers//?()/\/}"

  # Create the directory path
  echo mkdir -p "$dir"

  # move file to its directory
  echo mv "$file" "$dir/"
done

As written, the script is a dry run that only prints the commands. Remove the two echo prefixes once the output matches your expectations.
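
For readers who find the extglob substitution opaque, here is a minimal sketch of the same digit-to-path mapping using only POSIX parameter expansion. The helper name nested_dir is an assumption made for illustration and is not part of the answer above; it treats everything before the first underscore as the number.

```shell
#!/usr/bin/env bash

# nested_dir ROOT FILE   (hypothetical helper, for illustration only)
# Print the directory under ROOT where FILE belongs, one level per character
# of the part of the file name that precedes the first underscore.
# The body runs in a subshell ( ... ) so its variables stay local.
nested_dir() (
  root=$1
  base=${2##*/}          # strip any leading directory path
  digits=${base%%_*}     # keep what comes before the first underscore
  dir=$root
  while [ -n "$digits" ]; do
    rest=${digits#?}                 # everything after the first character
    dir="$dir/${digits%"$rest"}"     # append the first character as a level
    digits=$rest
  done
  printf '%s\n' "$dir"
)
```

For example, nested_dir folder folder/6398305_0.jpg prints folder/6/3/9/8/3/0/5, which can then be passed to mkdir -p and used as the mv destination inside the loop.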

CodePudding user response:

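Nesting a flat folder:

# Work on a copy so the original stays untouched
# (assumes nested_folder does not exist yet)
cp -R flat_folder/ nested_folder/
cd nested_folder/ || exit

for f in *_[0-9].jpg
do
    filename=${f%.*}
    extension=${f##*.}
    number=${filename%_*}
    index=${filename##*_}

    # Put a / between each of the 7 digits: 6398305 -> 6/3/9/8/3/0/5
    folder=$(echo "$number" | sed 's/\(.\)\(.\)\(.\)\(.\)\(.\)\(.\)\(.\)/\1\/\2\/\3\/\4\/\5\/\6\/\7/')
    mkdir -p "$folder"
    mv "$f" "$folder/"
done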

Flattening a nested folder:

cd nested_folder/ || exit
mkdir -p ../flat_folder
find . -type f -name "*.jpg" -exec cp {} ../flat_folder/ \;
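
With millions of files, running one cp per file can be slow. The sketch below is one possible variant that hands the files to cp in batches by using find's -exec ... {} + form. The function name flatten and its two arguments are assumptions for illustration, and the destination is assumed to live outside the nested tree.

```shell
#!/usr/bin/env bash

# flatten SRC DEST   (hypothetical helper, for illustration only)
# Copy every *.jpg found anywhere under SRC directly into DEST.
# DEST should not be located inside SRC.
flatten() (
  src=$1
  dest=$2
  mkdir -p "$dest" || exit
  # "{} +" makes find pass many file names to each sh invocation.
  # Inside the inline script, "$0" is the destination and "$@" is the batch.
  find "$src" -type f -name '*.jpg' -exec sh -c 'cp -- "$@" "$0"' "$dest" {} +
)
```

It would be called as flatten nested_folder flat_folder. Because each file name already encodes its full path, copies from different subfolders are not expected to collide.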