I have one file data.txt with content
2013-04-24;1;0.1635;1.4135;fp.3.Luci_02C06.ctg.ctg7180000085546
2013-04-24;1;0.9135;1.4135;fp.3.Luci_01A01.ctg.ctg7180000038386
2017-04-24;2;0.9135;1.4135;fp.3.Luci_02C06.ctg.ctg7180000085546
2011-04-24;2;0.9135;1.4135;fp.3.Luci_02C06.ctg.ctg7180000085546
2012-04-24;2;0.9135;1.4135;fp.3.Luci_02C06.ctg.ctg7180000085549
2016-04-24;2;0.9135;1.4135;fp.3.Luci_02C06.ctg.ctg7180000085549
2016-04-24;2;0.9135;1.4335;fp.3.Luci_02C06.ctg.ctg7180000085549
2013-04-24;1;0.9135;1.4135;fp.3.Luci_01A01.ctg.ctg7180000038386
2011-04-24;2;0.9135;1.4135;fp.3.Luci_02C06.ctg.ctg7180000085546
and another file lookup.txt with content
1;2012-04-24;2ab1e4c0-de4d-11e2-a934-0f0479162b1b;fp.3.Luci_02C06.ctg.ctg7180000085546
7;2013-04-24;2ab21e90-de4d-11e2-9ce8-d368d9512bad;fp.3.Luci_01A01.ctg.ctg7180000038386
3;2014-04-24;2ab2582e-de4d-11e2-bb5f-6b1f6c4437f8;fp.3.Luci_02C06.ctg.ctg7180000085549
I want to replace column 5 in data.txt with column 1 in lookup.txt based on matching values in column 4 in lookup.txt. The result I want is
2013-04-24;1;0.1635;1.4135;1
2013-04-24;1;0.9135;1.4135;7
2017-04-24;2;0.9135;1.4135;1
2011-04-24;2;0.9135;1.4135;1
2012-04-24;2;0.9135;1.4135;3
2016-04-24;2;0.9135;1.4135;3
2016-04-24;2;0.9135;1.4335;3
2013-04-24;1;0.9135;1.4135;7
2011-04-24;2;0.9135;1.4135;1
I another post I found the following shell script which looks promising
awk -F';' 'NR==FNR{a[$4]=$1;next}{$5=a[$5]}1' lookup.txt data.txt
But the result below leaves for some reason column 5 blank for row 5,6,7 and 9. Why? The fact that we get space separators rather than ";" is also an issue but not as important
2013-04-24 1 0.1635 1.4135 1
2013-04-24 1 0.9135 1.4135 7
2017-04-24 2 0.9135 1.4135 1
2011-04-24 2 0.9135 1.4135 1
2012-04-24 2 0.9135 1.4135
2016-04-24 2 0.9135 1.4135
2016-04-24 2 0.9135 1.4335
2013-04-24 1 0.9135 1.4135 7
2011-04-24 2 0.9135 1.4135
CodePudding user response:
You may use this awk
solution:
awk '
BEGIN{FS=OFS=";"}
{
sub(/\r$/, "")
}
NR == FNR {
map[$NF] = $1
next
}
{
$NF = map[$NF]
} 1' lookup.txt data.txt
2013-04-24;1;0.1635;1.4135;1
2013-04-24;1;0.9135;1.4135;7
2017-04-24;2;0.9135;1.4135;1
2011-04-24;2;0.9135;1.4135;1
2012-04-24;2;0.9135;1.4135;3
2016-04-24;2;0.9135;1.4135;3
2016-04-24;2;0.9135;1.4335;3
2013-04-24;1;0.9135;1.4135;7
2011-04-24;2;0.9135;1.4135;1
Explanation:
BEGIN{FS=OFS=";"}
: Sets input and output field delimiters to;
- Use
NF
consistently instead of numbered column in both files sub(/\r$/, "")
is to remove any carriage characters from line end