I have a generated file and I need to remove the last "," so that it will be valid JSON. The problem is that I can't figure out how to do it with sed, or even with grep piping to something else. I am really stumped here. Any help would be appreciated.
test.json
[
{MANY OTHER RECORDS, MAKING FILE 3.5Gig (making sed fail because of memory, so newlines were added)},
{"ID":"57705e4a-158c-4d4e-9e07-94892acd98aa","USERNAME":"jmael","LOGINTIMESTAMP":"2021-11-30"},
{"ID":"b8b67609-50ed-4cdc-bbb4-622c7e6a8cd2","USERNAME":"henrydo","LOGINTIMESTAMP":"2021-12-15"},
{"ID":"a44973d0-0ec1-4252-b9e6-2fd7566c6f7d","USERNAME":"null","LOGINTIMESTAMP":"2021-10-31"},
]
Of course, using grep with -P matches what I need to replace:
grep -Pzo '"},\n]' test.json
CodePudding user response:
Using GNU sed
$ sed -Ez 's/([^]]*),/\1/' test.json
[
{MANY OTHER RECORDS, MAKING FILE 3.5Gig (making sed fail because of memory, so newlines were added)},
{"ID":"57705e4a-158c-4d4e-9e07-94892acd98aa","USERNAME":"jmael","LOGINTIMESTAMP":"2021-11-30"},
{"ID":"b8b67609-50ed-4cdc-bbb4-622c7e6a8cd2","USERNAME":"henrydo","LOGINTIMESTAMP":"2021-12-15"},
{"ID":"a44973d0-0ec1-4252-b9e6-2fd7566c6f7d","USERNAME":"null","LOGINTIMESTAMP":"2021-10-31"}
]
CodePudding user response:
Remove last comma in a file with GNU sed:
sed -zE 's/,([^,]*)$/\1/' file
Output to stdout:
[
{MANY OTHER RECORDS, MAKING FILE 3.5Gig (making sed fail because of memory, so newlines were added)},
{"ID":"57705e4a-158c-4d4e-9e07-94892acd98aa","USERNAME":"jmael","LOGINTIMESTAMP":"2021-11-30"},
{"ID":"b8b67609-50ed-4cdc-bbb4-622c7e6a8cd2","USERNAME":"henrydo","LOGINTIMESTAMP":"2021-12-15"},
{"ID":"a44973d0-0ec1-4252-b9e6-2fd7566c6f7d","USERNAME":"null","LOGINTIMESTAMP":"2021-10-31"}
]
See: man sed and The Stack Overflow Regular Expressions FAQ
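With -z, GNU sed treats the whole input as a single record, so the file has to fit in memory, which may be a problem at 3.5 GB. As a rough sketch of a streaming alternative, the following keeps only two lines in the pattern space at a time. It assumes the last line of the file is exactly "]" and that the unwanted comma ends the line just before it; the function name strip_comma_streaming is made up for illustration.

```shell
#!/bin/sh
# Sketch: remove the comma that precedes the final "]" line while
# holding only a two-line window in memory.
# Assumption: the last line is "]" and the comma ends the previous line.
strip_comma_streaming() {
    # $!N  append the next line unless we are already on the last one
    # $s   once the last line has been read, drop the comma before "\n]"
    # P    print the first line of the window
    # D    delete that line and restart the cycle with the remainder
    sed '$!N; $s/,\(\n]\)$/\1/; P; D' "$1"
}
```

Usage would be something like strip_comma_streaming test.json > fixed.json; the result goes to stdout, so the original file is left untouched.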
CodePudding user response:
You can buffer two lines and remove the comma when reaching the end of the file:
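awk '
    # Print the older buffered line once it is known not to be among the last two.
    NR > 2 { print line0 }
    {
        line0 = line1
        line1 = $0
    }
    END {
        # line0 is the second-to-last line, line1 the last one.
        if (NR > 1) {
            sub(/,$/, "", line0)
            print line0
        }
        if (NR > 0) print line1
    }
'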
example:
printf '%s,\n' 1 2 3 4 | awk ...
1,
2,
3
4,
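To apply this to a real file rather than a pipe, one possible wrapper is sketched below. The function name strip_last_comma and the ".tmp" suffix are hypothetical, and the approach assumes there is enough free disk space for a second copy of the file.

```shell
#!/bin/sh
# Sketch: run the two-line-buffer awk program on a file, write the result
# to a temporary file next to it, and replace the original on success.
strip_last_comma() {
    src=$1
    tmp="$src.tmp"   # hypothetical temp name; pick one that cannot clash
    awk '
        NR > 2 { print prev2 }
        { prev2 = prev1; prev1 = $0 }
        END {
            if (NR > 1) { sub(/,$/, "", prev2); print prev2 }
            if (NR > 0) print prev1
        }
    ' "$src" > "$tmp" && mv "$tmp" "$src"
}
```

It would be called as strip_last_comma test.json. The mv only runs if awk exits successfully, so a failed run should leave the original in place.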
Another solution would be to use perl to read the last n bytes of the file, then find the position of the target comma and overwrite it in place with a space character:
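perl -e '
# Open for reading and writing without truncating the file.
open $fh, "+<", $ARGV[0] or die "open: $!";
$n = 16;
seek $fh, -$n, 2;                 # 2 = SEEK_END
$n = read $fh, $str, $n;
if ( $str =~ /,\s*]\s*$/s ) {
    # $-[0] is the offset of the comma within $str.
    seek $fh, -($n - $-[0]), 1;   # 1 = SEEK_CUR
    print $fh " ";
}
close $fh;
' log.json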
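The same in-place idea can be sketched with standard shell tools, for cases where perl is not available. This version is stricter than the perl one: it assumes the file ends with exactly a comma, a newline, "]" and a final newline, and does nothing otherwise. The function name blank_trailing_comma is made up for illustration.

```shell
#!/bin/sh
# Sketch: overwrite the trailing comma with a space, in place.
# Assumption: the file ends with exactly  ,  newline  ]  newline.
blank_trailing_comma() {
    f=$1
    # Arithmetic expansion strips any padding some wc versions emit.
    size=$(($(wc -c < "$f")))
    [ "$size" -ge 4 ] || return 1
    # Command substitution drops the final newline, leaving ",\n]".
    tail_bytes=$(tail -c 4 "$f")
    expected=$(printf ',\n]')
    [ "$tail_bytes" = "$expected" ] || return 1
    # Write one space at the byte offset of the comma; conv=notrunc
    # keeps the rest of the file intact.
    printf ' ' | dd of="$f" bs=1 seek=$((size - 4)) conv=notrunc 2>/dev/null
}
```

Because only one byte is rewritten, the file is never copied; the cost is that the JSON ends up with a space where the comma used to be, just as with the perl version.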
Aside: You should fix the upstream code that generates the JSON so that it at least outputs a stream of JSON objects, one per line, instead of trying to build a broken array.
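If changing the generator is not possible right away, the existing file can be turned into one JSON object per line with a streaming sed pass. This is only a sketch under the assumptions visible in the question: the first line is "[", the last line is "]", and every record sits on its own line ending in a comma. The function name to_json_lines is hypothetical.

```shell
#!/bin/sh
# Sketch: convert the broken array into one JSON object per line.
# Assumptions: line 1 is "[", the last line is "]", one record per line.
to_json_lines() {
    # 1d      drop the opening "[" line
    # $d      drop the closing "]" line
    # s/,$//  strip the comma that ends each record line
    sed -e '1d' -e '$d' -e 's/,$//' "$1"
}
```

For example, to_json_lines test.json > test.jsonl would write the records to a new file, and sed processes the input line by line, so memory use should stay small.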