I have some log files which contain a mix of JSON and non-JSON logs. I'd like to separate them into two files, one containing only the JSON logs and the other only the non-JSON logs. I got some ideas from this on extracting JSON logs with jq; here is what I have tried, using tee to split the log into two files (usage from here & here) and jq to extract the logs:
cat $logfile | tee >(jq -R -c 'fromjson? | select(type == "object") | not') > $plain_log_file) >(jq -R -c 'fromjson? | select(type == "object")' > $json_log_file)
This extracts the JSON logs correctly, but the plain-log file ends up containing false for each JSON log (and nothing for the non-JSON lines) instead of the non-JSON log content itself.
cat $logfile | tee >(jq -R -c 'try fromjson catch .' > $plain_log_file) >(jq -R -c 'fromjson? | select(type == "object")' > $json_log_file)
This fails with a jq syntax error at "catch .".
Any suggestion on how to achieve this? Appreciate your help!
Sample input:
{ "name": "joe"}
text line, this can be multi-line too
{ "xyz": 123 }
CodePudding user response:
Assuming each JSON log item occurs on a separate line:
For the JSON logs:
jq -nR -c 'inputs|fromjson?'
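On the sample input this prints the parsed objects in compact (-c) form:
{"name":"joe"}
{"xyz":123}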
For the others, you could use:
jq -nRr 'inputs | . as $in | try (fromjson|empty) catch $in'
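Putting the two together in a single pass, a sketch reusing the tee pattern and file variables from the question (the trailing > /dev/null just discards tee's copy to stdout):
cat "$logfile" | tee >(jq -nR -c 'inputs|fromjson?' > "$json_log_file") >(jq -nRr 'inputs | . as $in | try (fromjson|empty) catch $in' > "$plain_log_file") > /dev/null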
CodePudding user response:
If you only want to separate the input line by line into different files, go with @peak's solution. But if you want to further process the lines based on some condition, you could turn them into an array using -Rn and [inputs], and go from there. For instance, if you need the corresponding line numbers (e.g. to feed them into another tool such as sed, as sketched below), use to_entries, which for arrays provides the indices in the .key field:
jq -Rn 'reduce ([inputs] | to_entries[]) as $in ({};
  .[($in.value | fromjson? | "json") // "plain"] += [$in.key]
)'
Output for the sample input:
{
"json": [
0,
2
],
"plain": [
1
]
}
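For the sed hand-off mentioned above, a sketch (assuming the log is in $logfile; to_entries indices are 0-based while sed addresses are 1-based, hence the + 1):
sed -n "$(jq -Rn '
  reduce ([inputs] | to_entries[]) as $in ({};
    .[($in.value | fromjson? | "json") // "plain"] += [$in.key])
  | (.plain // []) | map("\(. + 1)p") | join(";")
' "$logfile")" "$logfile" > "$plain_log_file"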