I have a file (myfile.txt) which looks like this:
GYFUFGYO1 KMP-app [email protected] CODE_SMELL
GYFUFGYO2 KMP-app [email protected] CODE_SMELL
GYFUFGYG3 AFP-Login [email protected] BUG
GYFUFGYG4 AFP-Login [email protected] BUG
GYFUFGYO5 KMP-app [email protected] CODE_SMELL
GYFUFGYO6 KMP-app [email protected] CODE_SMELL
I have to write this text content to a JSON file (myfile.json). This is the expected output:
[
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},
{"ApplicationName":"AFP-Login","BuildBreakReason":"BUG","DefectAuthor": "[email protected]"},
{"ApplicationName":"AFP-Login","BuildBreakReason":"BUG","DefectAuthor": "[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"}
]
This is what I tried.
I created a file called "textconvert.sh", then wrote a shell script like this:
echo '[' >> myfile.json
echo '{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},' >> myfile.json
echo '{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},' >> myfile.json
echo '{"ApplicationName":"AFP-Login","BuildBreakReason":"BUG","DefectAuthor": "[email protected]"},' >> myfile.json
echo '{"ApplicationName":"AFP-Login","BuildBreakReason":"BUG","DefectAuthor": "[email protected]"},' >> myfile.json
echo '{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},' >> myfile.json
echo '{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"}' >> myfile.json
echo ']' >> myfile.json
But I am not allowed to hard-code it like this. What I am thinking now is to
write a loop to scan "myfile.txt", assign the column values to variables, and then write the JSON file.
Can someone help me figure this out? Thanks in advance.
CodePudding user response:
Using sed
$ sed -E 's/[^ ]* ([^ ]*) ([^ ]*) (.*)/{"ApplicationName":"\1","BuildBreakReason":"\3","DefectAuthor": "\2"},/;$s/,$/\n]/;1i[' myfile.txt
[
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},
{"ApplicationName":"AFP-Login","BuildBreakReason":"BUG","DefectAuthor": "[email protected]"},
{"ApplicationName":"AFP-Login","BuildBreakReason":"BUG","DefectAuthor": "[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"}
]
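The one-liner above appears to rely on GNU sed behaviour (the one-line 1i[ form and \n in the replacement text). As a hedged sketch of the same idea without those two features, the brackets can be printed by the shell and the comma appended to every line except the last. The function name txt_to_json is hypothetical, and the sketch assumes exactly four space-separated columns as in myfile.txt:

```shell
#!/bin/sh
# txt_to_json (hypothetical name): convert "ID App Author Reason" lines to a JSON array.
# Reads the files given as arguments, or standard input when none are given.
txt_to_json() {
    echo '['
    # First expression: drop column 1 and rearrange columns 2, 4, 3 into one JSON object.
    # Second expression: append a comma to every line except the last one.
    sed -E \
        -e 's/^[^ ]+ ([^ ]+) ([^ ]+) (.*)$/{"ApplicationName":"\1","BuildBreakReason":"\3","DefectAuthor": "\2"}/' \
        -e '$!s/$/,/' \
        "$@"
    echo ']'
}
```

To produce the file asked for in the question, it could be called as txt_to_json myfile.txt > myfile.json.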
CodePudding user response:
Using any awk:
$ cat tst.awk
BEGIN {
fmt = "%s{\"ApplicationName\":\"%s\",\"BuildBreakReason\":\"%s\",\"DefectAuthor\": \"%s\"}"
print "["
}
{ printf fmt, sep, $2, $4, $3; sep="," ORS }
END { print ORS "]" }
$ awk -f tst.awk myfile.txt
[
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},
{"ApplicationName":"AFP-Login","BuildBreakReason":"BUG","DefectAuthor": "[email protected]"},
{"ApplicationName":"AFP-Login","BuildBreakReason":"BUG","DefectAuthor": "[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor": "[email protected]"}
]
Regarding "What I am thinking now is, write a loop to scan "myfile.txt"..." from your question: no, don't do that. See "Why is using a shell loop to process text considered bad practice?".
CodePudding user response:
I would use GNU AWK for this task in the following way. Let the content of file.txt be
GYFUFGYO1 KMP-app [email protected] CODE_SMELL
GYFUFGYO2 KMP-app [email protected] CODE_SMELL
GYFUFGYG3 AFP-Login [email protected] BUG
GYFUFGYG4 AFP-Login [email protected] BUG
GYFUFGYO5 KMP-app [email protected] CODE_SMELL
GYFUFGYO6 KMP-app [email protected] CODE_SMELL
then
awk 'BEGIN{print "["}NR>1{print ","}{printf "{\"ApplicationName\":\"%s\",\"BuildBreakReason\":\"%s\",\"DefectAuthor\":\"%s\"}",$2,$4,$3}END{print "\n]"}' file.txt
gives output
[
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor":"[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor":"[email protected]"},
{"ApplicationName":"AFP-Login","BuildBreakReason":"BUG","DefectAuthor":"[email protected]"},
{"ApplicationName":"AFP-Login","BuildBreakReason":"BUG","DefectAuthor":"[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor":"[email protected]"},
{"ApplicationName":"KMP-app","BuildBreakReason":"CODE_SMELL","DefectAuthor":"[email protected]"}
]
Explanation: You need a , after every record but the last, but detecting the last line in GNU AWK is not easy, so I print , before every record but the first. I use printf to rework your whitespace-separated records into JSON; its first argument is a format string with the places to fill denoted by %s. Observe that " needs to be escaped as \" to mean a literal ". BEGIN and END are used to enclose the records in [ and ]. Disclaimer: this code does not escape characters that have special meaning in JSON, for example ".
(tested in gawk 4.2.1)
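To illustrate the disclaimer, here is a minimal sketch of the same awk program with an escaping step added. The names to_json and json_escape are hypothetical, and the helper only handles backslash and double quote; control characters, which JSON also requires to be escaped, are left untouched. It walks the string character by character rather than using gsub, because backslash handling in gsub replacement strings differs between awk implementations:

```shell
#!/bin/sh
# to_json (hypothetical name): like the awk one-liner above, but each field
# is passed through json_escape before being placed inside the JSON string.
to_json() {
    awk '
    # Prefix every backslash and double quote in s with a backslash.
    function json_escape(s,    out, i, c) {
        out = ""
        for (i = 1; i <= length(s); i++) {
            c = substr(s, i, 1)
            if (c == "\\" || c == "\"")
                out = out "\\"
            out = out c
        }
        return out
    }
    BEGIN  { print "[" }
    NR > 1 { print "," }      # separator before every record but the first
    {
        printf "{\"ApplicationName\":\"%s\",\"BuildBreakReason\":\"%s\",\"DefectAuthor\":\"%s\"}",
               json_escape($2), json_escape($4), json_escape($3)
    }
    END    { print "\n]" }
    ' "$@"
}
```

It is used the same way as before, for example to_json file.txt > myfile.json.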