I'm trying to print the happiest countries in the world for 2022 by fetching the data from https://en.wikipedia.org/wiki/World_Happiness_Report?action=raw and then displaying the first 5 countries. Here is my code:
#!/bin/bash
content=$(curl -s "https://en.wikipedia.org/wiki/World_Happiness_Report?action=raw")
lines=$(echo "$content" | grep '^\|' | sed -n '/2022/{n;p;}')
top_5=$(echo "$lines" | awk '{print $3}' | sort | head -n 5)
echo "$top_5"
However, when I run this code on Ubuntu, nothing shows up; the output is just blank, like this:
....(My computer server).....:~$ bash happy_countriesnew.sh
#(I'm expecting there to be a list here)
....(My computer server).....:~$
I'm expecting something like this instead of the blank space my terminal is displaying:
Finland
Norway
Denmark
Iceland
Switzerland
Netherlands
Canada
New Zealand
Sweden
Australia
What am I doing wrong and what should I change?
CodePudding user response:
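I guess you see this error (but you are ignoring it):
grep: empty (sub)expression
The problem is with your grep expression. In a basic regular expression, \| is not a literal pipe: depending on the grep implementation it is either rejected or treated as alternation (GNU grep does the latter). Remove the escape:
lines=$(echo "$content" | grep '^|' | sed -n '/2022/{n;p;}')
and check for errors.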
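To make the difference concrete, here is a minimal sketch of the corrected grep on its own. The sample text is a made-up stand-in for the page source, not the real content, and the variable names are only for illustration.

```shell
#!/bin/bash
# Hypothetical sample input; the real page content may differ.
sample='=== 2022 report ===
|1|{{flag|Finland}}
some prose that is not a table row
|2|{{flag|Norway}}'

# In a basic regular expression an unescaped | is a literal pipe,
# so this keeps only the lines that start with one.
table_rows=$(printf '%s\n' "$sample" | grep '^|')
printf '%s\n' "$table_rows"
```

Only the two lines beginning with | should survive the filter.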
CodePudding user response:
echo | grep | sed | awk is a bit of an anti-pattern. Typically, you want to refactor such pipelines to just be a call to awk. In your case, it looks like your code that is attempting to extract the 2022 data is flawed. The data is already sorted, so you can drop the sort and get the data you want with:
sed -n '/^=== 2022 report/,/^=/{ s/}}//; /^|[12345]|/s/.*|//p; }'
The first portion (the /^=== 2022 report/,/^=/) tells sed to only work on lines between those that match the two given patterns, which is the data you are interested in. The rest is just cleaning up and extracting just the country name, printing only those lines in which the 2nd field is exactly one of the single digits 1, 2, 3, 4, or 5.
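As a rough check of how the range address behaves, the same sed command can be run against a small inline sample. The sample below is invented to match the line shape the command expects (|rank|{{flag|Country}}) and is not the real page content; the country names are borrowed from the question's expected output.

```shell
#!/bin/bash
# Hypothetical sample input; the real page content may differ.
sample='=== 2021 report ===
|1|{{flag|Sweden}}
=== 2022 report ===
{| class="wikitable"
|1|{{flag|Finland}}
|2|{{flag|Norway}}
|3|{{flag|Denmark}}
|4|{{flag|Iceland}}
|5|{{flag|Switzerland}}
|6|{{flag|Netherlands}}
|}
=== 2020 report ===
|1|{{flag|Canada}}'

# Only lines from the 2022 heading up to the next line starting with =
# are processed; rows ranked 1 to 5 are reduced to the country name.
top5=$(printf '%s\n' "$sample" |
  sed -n '/^=== 2022 report/,/^=/{ s/}}//; /^|[12345]|/s/.*|//p; }')
printf '%s\n' "$top5"
```

The rows under the 2021 and 2020 headings fall outside the range, and the row ranked 6 fails the single-digit test, so neither appears in the output.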
Note that this is not terribly flexible, and it is difficult to modify it to print the top 7 or the top 12, so you might want something like:
sed -n '/^=== 2022 report/,/^=/{ s/}}//; /^|[[:digit:]]/s/.*|//p; }' | head -n 5
Note that it could be argued that sed | head is also a bit of an anti-pattern, but keeping track of lines of output in sed is tedious and the pipe to head is less egregious than attempting to write such code.
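If counting output lines inside the tool is wanted, awk handles it more comfortably than sed. The following is only a sketch under the same assumption about the line shape (|rank|{{flag|Country}}); the function name top_n, the variable names, and the sample data are all hypothetical.

```shell
#!/bin/bash
# top_n N: read page source on stdin, print the first N countries
# listed in the 2022 section (hypothetical helper, not from the question).
top_n() {
  awk -F'|' -v n="$1" '
    /^=== 2022 report/ { in_section = 1; next }  # start of the section
    in_section && /^=/  { exit }                 # next heading ends it
    in_section && $1 == "" && $2 ~ /^[0-9]+$/ {  # rows like |rank|...
      country = $NF                              # last |-separated field
      sub(/[}]+$/, "", country)                  # strip the trailing }}
      print country
      if (++count >= n) exit                     # stop after n rows
    }
  '
}

# Hypothetical sample input; the real page content may differ.
sample='=== 2022 report ===
{| class="wikitable"
|1|{{flag|Finland}}
|2|{{flag|Norway}}
|3|{{flag|Denmark}}
|4|{{flag|Iceland}}
|}
=== 2021 report ===
|1|{{flag|Sweden}}'

printf '%s\n' "$sample" | top_n 3
```

With the real data this would be used as curl -s "https://en.wikipedia.org/wiki/World_Happiness_Report?action=raw" | top_n 5, and changing the argument is enough to get the top 7 or the top 12.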