How to put specific filenames into a specific JSON format using bash or Perl?


Suppose I'm in a folder like this:

➜  tmp.lDrLPUOF ls
1.txt 2.txt 3.txt 1.zip 2.rb

I want to put all the filenames of text files into a specific JSON format like this:

{
  "": [
    {
      "title": "",
      "file": "1"
    },
    {
      "title": "",
      "file": "2"
    },
    {
      "title": "",
      "file": "3"
    }
  ]
}

So far I only know how to list the filenames:

➜  tmp.lDrLPUOF ls *'.txt'
1.txt 2.txt 3.txt

Can I use bash or Perl to achieve this purpose? Thank you very much!


Thanks to @Charles Duffy and @Shawn for their great answers. However, I forgot to mention another important piece of information: time. I want the filenames to appear in the resulting JSON in order of their creation time.

The creation times are shown below:

➜  tmp.lDrLPUOF ls -lTr
total 0
-rw-r--r--  1 administrator  staff  0 Oct 12 09:35:05 2022 3.txt
-rw-r--r--  1 administrator  staff  0 Oct 12 09:35:08 2022 2.txt
-rw-r--r--  1 administrator  staff  0 Oct 12 09:35:12 2022 1.txt

So the resulting JSON I want should look like this:

{
  "": [
    {
      "title": "",
      "file": "3"
    },
    {
      "title": "",
      "file": "2"
    },
    {
      "title": "",
      "file": "1"
    }
  ]
}

If installed, tree can be a good alternative for listing the contents of directories, as it can encode its output as well-defined JSON, which comes in handy when dealing with strange file names (and especially when your desired output is JSON anyway).

tree -JtL 1 -P '*.txt'
  • tree -J outputs JSON
  • tree -t sorts by last modification time
  • tree -L 1 recurses only 1 level deep
  • tree -P '*.txt' reduces the list to files matching the pattern *.txt

Of course, you can also add more details, if needed, such as

  • tree -p includes file permissions
  • tree -u and tree -g include user and group names
  • tree -s includes the file size in bytes
  • tree -D --timefmt '%F %T' includes the last modification time

tree -JtL 1 -P '*.txt' -pusD --timefmt='%F %T'
  {"type":"directory","name":".","mode":"0755","prot":"drwxr-xr-x","user":"hustnzj","size":4096,"time":"2022-10-12 09:35:00","contents":[
    {"type":"file","name":"3.txt","mode":"0644","prot":"-rw-r--r--","user":"hustnzj","size":123,"time":"2022-10-12 09:35:05"},
    {"type":"file","name":"2.txt","mode":"0644","prot":"-rw-r--r--","user":"hustnzj","size":456,"time":"2022-10-12 09:35:08"},
    {"type":"file","name":"1.txt","mode":"0644","prot":"-rw-r--r--","user":"hustnzj","size":789,"time":"2022-10-12 09:35:12"}

A note regarding this comment: tree -t sorts by last modification time. There's also an option tree -c to sort by (and with tree -D to show time as) last status change instead, but there's no dedicated option (that I know of) that uses creation/birth times (if supported by the file system).

Then, using that JSON output as input, you can employ jq for further filtering and formatting:

tree … | jq --arg ext '.txt' '
  {"": (first.contents | map(
    select(.type == "file") | {title: "", file: .name | rtrimstr($ext)}
  ))}
'

{
  "": [
    {
      "title": "",
      "file": "3"
    },
    {
      "title": "",
      "file": "2"
    },
    {
      "title": "",
      "file": "1"
    }
  ]
}


Note: This includes the filter select(.type == "file") as tree would also include the names of subdirectories. Drop it if you want them included.

{ shopt -s nullglob; set -- *.txt; printf '%s\0' "$@"; } | jq -Rn '
  {"": [ input
         | split("\u0000")[]
         | select(. != "")
         | {"title": "",
            "file": . | rtrimstr(".txt")
           }
       ]
  }
'

Let's break this down into pieces.

On the bash side:

  • shopt -s nullglob tells the shell that if *.txt has no matches, it should expand to nothing at all, instead of emitting the literal string *.txt as a result.
  • set -- overwrites the argument list in the current context (because this is a block on the left-hand side of the pipeline that context is transient and won't change "$@" in code outside the pipe).
  • printf '%s\0' "$@" prints our arguments, with a NUL character after each one; if there are no arguments at all, it prints only a NUL.
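
The three shell-side points above can be checked with a small bash-only sketch. It is not part of the original answer: the scratch directory, the variable names (demo_dir, without, with, nul_count) and the sample file names are made up for illustration, and it assumes bash plus the standard mktemp, tr and wc utilities.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory (hypothetical name) so nothing else is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

set -- outer-arg   # positional parameters of the outer shell

# Without nullglob, an unmatched glob is passed through literally.
shopt -u nullglob
without=$(set -- *.txt; printf '%s\n' "$#:$1")

# With nullglob, an unmatched glob expands to nothing at all.
with=$(shopt -s nullglob; set -- *.txt; printf '%s\n' "$#")

# With three matching files, printf emits one NUL terminator per argument.
touch 1.txt 2.txt 'a b.txt'
nul_count=$({ shopt -s nullglob; set -- *.txt; printf '%s\0' "$@"; } | tr -cd '\0' | wc -c)
nul_count=$((nul_count))   # normalise any padding that wc adds

echo "without nullglob: $without"
echo "with nullglob: $with"
echo "NUL bytes for three files: $nul_count"
echo "outer arguments afterwards: $*"

cd / && rm -rf "$demo_dir"
```

Every set -- here runs inside a command substitution or on the left-hand side of a pipeline, so the outer shell's arguments are still outer-arg at the end.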

On the jq side:

  • -R specifies that the input is raw data, not JSON.
  • -n specifies that we don't automatically consume any inputs, but will instead use input or inputs to specify where input should be read.
  • split("\u0000") splits the input on NULs. (This is important because NUL is the only character that can't appear anywhere in a file path, which is why we used printf '%s\0' on the shell end; that way we work correctly with filenames containing literal quotes, whitespace, and all the other weirdness that's able to exist.)
  • select(. != "") ignores empty strings.
  • rtrimstr(".txt") removes .txt from the name.
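
To see why the separator matters, here is a minimal bash-only sketch (no jq involved, so it illustrates the delimiter choice rather than the jq filter itself). The directory, the file names and the variables newline_records and names are all hypothetical, and the filesystem is assumed to allow newlines in file names.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Three files, one of which has a newline embedded in its name.
touch 'plain.txt' 'with space.txt' $'two\nlines.txt'

# Newline-delimited: the embedded newline splits one name into two records.
newline_records=$(printf '%s\n' *.txt | wc -l)
newline_records=$((newline_records))

# NUL-delimited: read up to the next NUL, so every name stays whole.
names=()
while IFS= read -r -d '' name; do
  names+=("$name")
done < <(printf '%s\0' *.txt)

echo "newline-delimited records: $newline_records"
echo "NUL-delimited records: ${#names[@]}"

cd / && rm -rf "$demo_dir"
```

The newline-delimited count comes out one higher than the number of files, while the NUL-delimited loop recovers exactly one record per file.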

Addendum: Sorting by mtime

The jq parts don't need to be modified here: to sort by mtime you can adjust only the shell. On a system with GNU find, sort and sed, this might look like:

find . -maxdepth 1 -type f -name '*.txt' -printf '%T@ %P\0' |
  sort -zn |
  sed -z -re 's/^[[:digit:].]+ //' |
  jq -Rn '

...followed by the same jq given above.
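
As a rough check of the shell half on its own, the sketch below recreates the question's three files with known modification times and reads the NUL-delimited result back into a bash array instead of piping it to jq. It assumes GNU find, sort, sed and touch; the directory and the ordered array are illustrative names only.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Distinct, known modification times (oldest first: 3, 2, 1).
touch -d '2022-10-12 09:35:05' 3.txt
touch -d '2022-10-12 09:35:08' 2.txt
touch -d '2022-10-12 09:35:12' 1.txt
touch -d '2022-10-12 09:35:01' 1.zip   # filtered out by -name '*.txt'

# Emit "<mtime> <name>\0", sort numerically on the mtime, then strip it.
ordered=()
while IFS= read -r -d '' name; do
  ordered+=("$name")
done < <(
  find . -maxdepth 1 -type f -name '*.txt' -printf '%T@ %P\0' |
    sort -zn |
    sed -z -re 's/^[[:digit:].]+ //'
)

echo "${ordered[*]}"

cd / && rm -rf "$demo_dir"
```

The names still carry their .txt suffix at this point; removing it is left to rtrimstr on the jq side, as in the answer.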

Using just jq, any shell:

$ jq -n --args '{"": [ $ARGS.positional[] | rtrimstr(".txt") | { title: "", file: . } ] }' *.txt
{
  "": [
    {
      "title": "",
      "file": "1"
    },
    {
      "title": "",
      "file": "2"
    },
    {
      "title": "",
      "file": "3"
    }
  ]
}

The filenames passed on the command line (the expansion of *.txt) are in the jq variable $ARGS.positional. For each one, remove the .txt extension and use the rest in an object of the desired structure.

Or with a perl one-liner:

$ perl -MJSON::PP -E 'say encode_json({"" => [ map { { title => "", file => s/\.txt$//r } } @ARGV ] })' *.txt

My take:

stat -c '%Y:%n' *.txt \
| sort -t: -n \
| cut -d: -f2- \
| xargs basename -s .txt \
| jq -Rn '{"": [inputs | {title: "", file: .}]}'
