Assuming I'm in the folder like this:
➜ tmp.lDrLPUOF ls
1.txt 2.txt 3.txt 1.zip 2.rb
I want to put all the filenames of text files into a specific JSON format like this:
{
"": [
{
"title": "",
"file": "1"
},
{
"title": "",
"file": "2"
},
{
"title": "",
"file": "3"
}
]
}
So far I only know how to list the filenames:
➜ tmp.lDrLPUOF ls *'.txt'
1.txt 2.txt 3.txt
Can I use bash or Perl to achieve this purpose? Thank you very much!
Edit
Thanks to @Charles Duffy and @Shawn for the great answers. However, I forgot to mention another important piece of information: time. I want the filenames to appear in the resulting JSON in order of their creation time.
The creation times are shown below:
➜ tmp.lDrLPUOF ls -lTr
total 0
-rw-r--r-- 1 administrator staff 0 Oct 12 09:35:05 2022 3.txt
-rw-r--r-- 1 administrator staff 0 Oct 12 09:35:08 2022 2.txt
-rw-r--r-- 1 administrator staff 0 Oct 12 09:35:12 2022 1.txt
So the resulting JSON I want should look like this:
{
"": [
{
"title": "",
"file": "3"
},
{
"title": "",
"file": "2"
},
{
"title": "",
"file": "1"
}
]
}
CodePudding user response:
If installed, tree can be a good alternative for listing the contents of directories, as it can encode its output as well-defined JSON. That comes in handy when dealing with strange file names (and especially when your desired output is JSON anyway).
tree -JtL 1 -P '*.txt'
[
{"type":"directory","name":".","contents":[
{"type":"file","name":"3.txt"},
{"type":"file","name":"2.txt"},
{"type":"file","name":"1.txt"}
]}
,
{"type":"report","directories":0,"files":3}
]
tree -J outputs JSON
tree -t sorts by last modification time
tree -L 1 recurses only 1 level deep
tree -P '*.txt' reduces the list to files matching the pattern *.txt
Of course, you can also add more details, if needed, such as:
tree -p includes file permissions
tree -u and tree -g include user and group names
tree -s includes the file size in bytes
tree -D --timefmt '%F %T' includes the last modification time
tree -JtL 1 -P '*.txt' -pusD --timefmt='%F %T'
[
{"type":"directory","name":".","mode":"0755","prot":"drwxr-xr-x","user":"hustnzj","size":4096,"time":"2022-10-12 09:35:00","contents":[
{"type":"file","name":"3.txt","mode":"0644","prot":"-rw-r--r--","user":"hustnzj","size":123,"time":"2022-10-12 09:35:05"},
{"type":"file","name":"2.txt","mode":"0644","prot":"-rw-r--r--","user":"hustnzj","size":456,"time":"2022-10-12 09:35:08"},
{"type":"file","name":"1.txt","mode":"0644","prot":"-rw-r--r--","user":"hustnzj","size":789,"time":"2022-10-12 09:35:12"}
]}
,
{"type":"report","directories":0,"files":3}
]
A note regarding this comment: tree -t sorts by last modification time. There's also an option tree -c to sort by (and, with tree -D, to show the time as) last status change instead, but there's no dedicated option (that I know of) that uses creation/birth times (if supported by the file system).
Then, using that JSON output as input, you can employ jq for further filtering and formatting:
tree … | jq --arg ext '.txt' '
{"": (first.contents | map(
select(.type == "file") | {title: "", file: .name | rtrimstr($ext)}
))}
'
{
"": [
{
"title": "",
"file": "3"
},
{
"title": "",
"file": "2"
},
{
"title": "",
"file": "1"
}
]
}
Note: This includes the filter select(.type == "file") because tree would also include the names of subdirectories. Drop it if you want them included.
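To see what the select(.type == "file") filter does without having tree installed, here is a small sketch that feeds jq a hand-written sample shaped like the tree -J output above. The subdirectory named notes and the function name tree_json_to_list are made up for illustration; only jq is assumed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn `tree -J` style JSON (read from stdin) into the
# {"": [{title, file}, ...]} structure, keeping files only.
tree_json_to_list() {
  jq -c --arg ext '.txt' '
    {"": (first.contents | map(
      select(.type == "file")                      # skip directories such as "notes"
      | {title: "", file: .name | rtrimstr($ext)}  # strip the extension
    ))}
  '
}

# Hand-written sample input; "notes" is an invented subdirectory.
sample='[
  {"type":"directory","name":".","contents":[
    {"type":"file","name":"3.txt"},
    {"type":"directory","name":"notes"},
    {"type":"file","name":"1.txt"}
  ]},
  {"type":"report","directories":1,"files":2}
]'

printf '%s\n' "$sample" | tree_json_to_list
# prints {"":[{"title":"","file":"3"},{"title":"","file":"1"}]}
```

Without the select, the notes entry would show up in the list as well, with "file": "notes".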
CodePudding user response:
{ shopt -s nullglob; set -- *.txt; printf '%s\0' "$@"; } | jq -Rn '
{"": [ [inputs] | join("\n")   # re-join lines, since -R splits raw input at newlines
| split("\u0000")[]
| select(. != "")
| {"title": "",
"file": . | rtrimstr(".txt")
}
]
}
'
Let's break this down into pieces.
On the bash side:
shopt -s nullglob tells the shell that if *.txt has no matches, it should emit nothing at all, instead of emitting the string *.txt as a result.
set -- overwrites the argument list in the current context (because this is a block on the left-hand side of a pipeline, that context is transient and won't change "$@" in code outside the pipe).
printf '%s\0' "$@" prints our arguments, with a NUL character after each one; if there are no arguments at all, it prints only a NUL.
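To see what the shell half produces on its own, here is a minimal sketch. It assumes bash (for shopt); the function name emit_txt_names is made up for illustration. The NULs are translated to newlines at the end only so that the result is readable in a terminal.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every *.txt name in the current directory,
# each followed by a NUL byte.
emit_txt_names() (
  # The ( ... ) body runs in a subshell, so nullglob and the new argument
  # list do not leak into the caller.
  shopt -s nullglob      # an unmatched *.txt expands to nothing
  set -- *.txt           # replace the positional parameters with the matches
  printf '%s\0' "$@"     # one NUL after each name; a lone NUL if there are none
)

# Show the names one per line (for display only; this step is not newline-safe).
emit_txt_names | tr '\0' '\n'
```

Running the function body in a subshell has the same effect as the braces on the left-hand side of the pipeline: the caller's "$@" and shell options stay untouched.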
On the jq side:
-R specifies that the input is raw data, not JSON.
-n specifies that we don't automatically consume any inputs, but will instead use input or inputs to specify where input should be read.
split("\u0000") splits the input on NULs. (This is important because NUL is the only character that can't appear in a file path, which is why we used printf '%s\0' on the shell end; that way we work correctly with filenames containing newlines, literal quotes, whitespace, and all the other weirdness that's able to exist.)
select(. != "") ignores empty strings.
rtrimstr(".txt") removes .txt from the end of the name.
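The jq steps can be tried in isolation on a fixed, NUL-delimited list. This is only a sketch: the sample names and the function name strip_txt_names are invented, and it reads the input with [inputs] | join("\n") rather than a single input, so that a name containing a newline is not cut off by the line splitting that -R performs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read NUL-delimited names on stdin and print a JSON
# array of the names with ".txt" removed.
strip_txt_names() {
  jq -Rnc '
    [ [inputs] | join("\n")   # rebuild the raw input as one string
      | split("\u0000")[]     # one string per name, plus "" after the final NUL
      | select(. != "")       # drop that empty string
      | rtrimstr(".txt")      # "1.txt" -> "1"
    ]
  '
}

printf '%s\0' '1.txt' 'my notes.txt' | strip_txt_names
# prints ["1","my notes"]
```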
Addendum: Sorting by mtime
The jq parts don't need to be modified here: to sort by mtime you only need to adjust the shell side. On a system with GNU find, sort and sed, this might look like:
find . -maxdepth 1 -type f -name '*.txt' -printf '%T@ %P\0' |
sort -zn |
sed -z -re 's/^[[:digit:].]+ //g' |
jq -Rn '
...followed by the same jq given above.
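Put together, the mtime-sorted version might look like the sketch below. It assumes GNU find, sort and sed plus jq; the function name txt_json_by_mtime is made up for illustration. The jq part reads with [inputs] | join("\n") rather than a single input, so that an empty directory yields an empty list and a name containing a newline is not cut off by the line splitting that -R performs.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: list *.txt files in the current directory, oldest
# modification time first, as {"": [{title, file}, ...]}.
txt_json_by_mtime() {
  find . -maxdepth 1 -type f -name '*.txt' -printf '%T@ %P\0' |
    sort -zn |                            # oldest modification time first
    sed -z -r 's/^[[:digit:].]+ //' |     # drop the leading timestamp
    jq -Rn '
      {"": [ [inputs] | join("\n")        # rebuild the raw input as one string
             | split("\u0000")[]          # one string per NUL-terminated name
             | select(. != "")            # ignore the empty trailing piece
             | {title: "", file: rtrimstr(".txt")}
           ]
      }
    '
}

txt_json_by_mtime
```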
CodePudding user response:
Using just jq, any shell:
$ jq -n --args '{"": [ $ARGS.positional[] | rtrimstr(".txt") | { title: "", file: . } ] }' *.txt
{
"": [
{
"title": "",
"file": "1"
},
{
"title": "",
"file": "2"
},
{
"title": "",
"file": "3"
}
]
}
The filenames passed on the command line (the expansion of *.txt) are available in the jq variable $ARGS.positional. For each one, remove the .txt extension and use the rest in an object of the desired structure.
Or with a perl one-liner:
$ perl -MJSON::PP -E 'say encode_json({"" => [ map { { title => "", file => s/\.txt$//r } } @ARGV ] })' *.txt
{"":[{"file":"1","title":""},{"title":"","file":"2"},{"file":"3","title":""}]}
CodePudding user response:
My take:
stat -c '%Y:%n' *.txt \
| sort -t: -n \
| cut -d: -f2- \
| xargs basename -s .txt \
| jq -Rn '[inputs | {title: "", file: .}] | {"": .}'
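For the creation-time ordering asked about in the question, a variant of this pipeline could read the birth time instead of the modification time. The sketch below assumes bash and GNU coreutils, where stat's %W format is the birth time in seconds since the epoch and is reported as 0 when it is not available; the function name txt_names_by_birth is made up for illustration. It also switches to NUL delimiters so that unusual names survive.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the *.txt names in the current directory,
# NUL-delimited, oldest birth time first.
txt_names_by_birth() (
  shopt -s nullglob
  set -- *.txt
  [ "$#" -gt 0 ] || exit 0            # nothing to do without matches
  stat --printf='%W %n\0' -- "$@" |   # "<birth seconds> <name>" per record
    sort -zn |                        # numeric sort on the leading timestamp
    sed -z -r 's/^[0-9]+ //'          # drop the timestamp, keep the name
)

# Show the names one per line (for display only).
txt_names_by_birth | tr '\0' '\n'
```

The NUL-delimited output can be fed to a jq filter that splits on "\u0000", as in the earlier answer. If every file reports 0, the file system or kernel is not exposing birth times and the order is not meaningful. BSD and macOS stat use a different option syntax, so this would need adapting there.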