I'm new to awk, so I hope someone can help.
I have 55 text files like this:
Row1 3553 896 23
Row2 3766 58906 1373
...
Row53 2976 0 0
I would like to print the first column (the row names) once, followed by the third column from each of the 55 files. The output should look like this:
Row1 896 854 456 876 7864 etc.
Row2 58906 542 99 33301 4564 etc.
...
Row53 0 58 48 7816 0 etc.
I tried this code
paste * | awk 'FNR==NR{a[FNR]=$0; next} {print a[FNR],$3}' *.txt > output.txt | column -t
However, it outputs the full first file, with all its columns, and then only the third column from the second file (5 columns in total). None of the other files appear in the output. What can I do? Thanks!
CodePudding user response:
You could try this awk:
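awk -F '\t' '
    # First file only: start each row with its name (column 1).
    FNR==NR {table[FNR] = $1}
    # Every file: append the third column to the matching row.
    {table[FNR] = table[FNR] "\t" $3}
    END {
        for (i=1; i<=FNR; i++) {
            print table[i]
        }
    }' *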
The END block uses the value of FNR from the last (55th) file as the row count when printing the array, so if the files don't all have the same number of lines, you will need to handle that.
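One way to cope with files of different lengths is to remember the longest file and store each value by row number and file number. This is only a minimal sketch: the function name join_third_columns is made up for illustration, it assumes whitespace-separated fields (default awk splitting rather than -F '\t'), and it assumes no input file is completely empty, since an empty file never triggers FNR == 1 and its column would be skipped.

```shell
# Hypothetical helper: join_third_columns FILE...
# Prints the row name once, then column 3 of every file, tab-separated.
join_third_columns() {
  awk '
    FNR == 1          { nfiles++ }           # a new input file starts
    FNR > maxrow      { maxrow = FNR }       # length of the longest file so far
    !(FNR in name)    { name[FNR] = $1 }     # first row name seen for this line number
                      { val[FNR, nfiles] = $3 }
    END {
      for (r = 1; r <= maxrow; r++) {
        line = name[r]
        for (f = 1; f <= nfiles; f++)
          line = line "\t" val[r, f]         # a missing cell is left empty
        print line
      }
    }' "$@"
}
```

Called as join_third_columns *.txt > output.txt, a row that is missing from a shorter file gets an empty cell instead of shifting the later columns to the left.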
If you want to use paste, maybe something like this:
paste * |
awk '
{
    # Row name from the first file.
    printf "%s", $1
    # Third column of each file: fields 3, 7, 11, ...
    for (i=3; i<55*4; i+=4) {
        printf "\t%s", $i
    }
    printf "\n"
}'
55*4 is the number of files times the number of columns per file, and it is hard-coded here. There are various ways of computing these values if necessary.
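As one possible way to avoid the hard-coded numbers, the column count can be read from the first file and the loop can run up to NF of the pasted line. This is a sketch under assumptions rather than a definitive version: the function name paste_third_columns is hypothetical, and it assumes every file has the same number of lines and the same number of non-empty, whitespace-separated columns.

```shell
# Hypothetical helper: paste_third_columns FILE...
paste_third_columns() {
  # Columns per file, taken from the first line of the first file.
  ncols=$(awk 'NR == 1 { print NF; exit }' "$1")
  # Guard against an empty first file (a step of 0 would loop forever).
  [ "${ncols:-0}" -gt 0 ] || return 1
  paste "$@" |
  awk -v ncols="$ncols" '
    {
      printf "%s", $1
      # Column 3 of file k is field 3 + (k-1)*ncols of the pasted line.
      for (i = 3; i <= NF; i += ncols)
        printf "\t%s", $i
      printf "\n"
    }'
}
```

With this, paste_third_columns *.txt works for any number of files, because NF on each pasted line already equals the number of files times ncols.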
CodePudding user response:
Here's a Ruby solution that can handle big files, including files with differing numbers of lines:
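#!/usr/bin/env ruby
# Open every file named on the command line.
files = ARGV.map { |arg| File.open(arg) }
close_count = 0
loop do
  row_name = nil
  values = files.map { |file|
    next if file.closed?      # exhausted file: contributes an empty cell
    if line = file.gets
      fields = line.split
      row_name = fields[0] if row_name.nil?
      fields[2]               # third column
    else
      close_count += 1
      file.close              # returns nil, so the cell stays empty
    end
  }
  # Stop once every file has reached EOF.
  break if close_count == files.count
  puts "#{row_name}\t" + values.join("\t")
end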
# head file{1,2,3}.tsv
==> file1.tsv <==
Row1 1111 111 11
Row2 1112 112 12
==> file2.tsv <==
Row1 2221 221 21
Row2 2222 222 22
Row3 2223 223 23
==> file3.tsv <==
Row1 3331 331 31
# ./script.rb file{1,2,3}.tsv
Row1 111 221 331
Row2 112 222
Row3 223