The JSON files look like:
{
"name": "My Collection",
"description": "This is a great collection.",
"date": 1639717379161,
"attributes": [
{
"trait_type": "Background",
"value": "Sand"
},
{
"trait_type": "Skin",
"value": "Dark Brown"
},
{
"trait_type": "Mouth",
"value": "Smile Basic"
},
{
"trait_type": "Eyes",
"value": "Confused"
}
]
}
I found a shell script that uses jq
and has this code:
i=1
for eachFile in *.json; do
cat $i.json | jq -r '.[] | {column1: .name, column2: .description} | [.[] | tostring] | @csv' > extract-$i.csv
echo "converted $i of many json files..."
((i=i 1))
done
But its output is:
jq: error (at <stdin>:34): Cannot index string with string "name"
converted 1 of many json files...
Any suggestions on how I can make this work? Thank you!
CodePudding user response:
Quick jq lesson
===========
jq filters are applied like this:
jq -r '.name_of_json_field_0 <optional filter>, .name_of_json_field_1 <optional filter>'
and so on and so forth. A single dot is the simplest filter; it leaves the data field untouched.
jq -r '.name_of_field .'
You may also leave the filter field untouched for the same effect.
In your case:
jq -r '.name, .description'
will extract the values of both those fields.
.[]
will unwrap an array to have the next piped filter applied to each unwrapped value. Example:
jq -r '.attributes | .[]
extracts all trait_types objects.
You may sometime want to repackage objects in an array by surrounding the filter in brackets:
jq -r '[.name, .description, .date]
You may sometime want to repackage data in an object by surrounding the filter in curly braces:
`jq -r '{new_field_name: .name, super_new_field_name: .description}'
playing around with these, I was able to get
jq -r '[.name, .description, .date, (.attributes | [.[] | .trait_type] | @csv | gsub(",";";") | gsub("\"";"")), (.attributes | [.[] | .value] | .[]] | @csv | gsub(",";";") | gsub("\"";""))] | @csv'
to give us:
"My Collection","This is a great collection.",1639717379161,"Background;Skin;Mouth;Eyes","Sand;Dark Brown;Smile Basic;Confused"
Name, description, and date were left as is, so let's break down the weird parts, one step at a time.
.attributes | [.[] | .trait_type]
.[]
extracts each element of the attributes array and pipes the result of that into the next filter, which says to simply extract trait_type
, where they are re-packaged in an array.
.attributes | [.[] | .trait_type] | @csv
turn the array into a csv-parsable format.
(.attributes | [.[] | .trait_type] | @csv | gsub(",";";") | gsub("\"";""))
Parens separate this from the rest of the evaluations, obviously.
The first gsub
here replaces commas with semicolons so they don't get interpreted as a separate field, the second removes all extra double quotes.