Bash for loop not writing to file


I often work like this:

for skra in `ls *txt` ; do paste foo.csv <(cut -f 5 $skra) > foo.csv; done

to loop over the files in a directory via 'ls'.

Now I don't understand why this command does not add a column to foo.csv on every iteration of the loop.

What is happening under the hood? It seems like foo.csv is not saved on every iteration.

The output I get is only field 5 of the last file, and not even the original foo.csv column that I do get if I simply run 'paste foo.csv bar.txt'.

EDIT: All files are tab delimited

foo.csv is just one column in the beginning

example.txt as seen in vim with set list:

(101,6352)(11174,51391)(10000,60000)^INC_044048.1^I35000^I6253^I0.038250$
(668,7819)(23384,69939)(20000,70000)^INC_044048.1^I45000^I7153^I0.034164$
(2279,8111)(32691,73588)(30000,80000)^INC_044048.1^I55000^I5834^I0.031908$

Here is a python script that does what I want:

import pandas

rammi=[]
with open('window.list') as f:
    for line in f:
        nafn=line.strip()
        df=pandas.read_csv(nafn, header=None, names=[nafn], sep='\t', usecols=[4])
        rammi.append(df)

frame = pandas.concat(rammi, axis=1)
frame.to_csv('rammi.allra', sep='\t', encoding='utf-8')

It pastes column 5 (usecols=[4]) from all the files into one file (initially I wanted to retain one original column, but that turned out not to be necessary). The question, though, was about Bash not updating the redirected output file inside the for loop.

CodePudding user response:

As already noted in the comments, opening foo.csv for output with > truncates it before the command runs, so paste reads an already-empty foo.csv on every iteration; that is why you end up with only field 5 of the last file. (Even if that were not the case, re-opening the file and running cut and paste once per input file looks quite inefficient.)
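If you do want to keep the original loop, the usual workaround is to redirect to a temporary file and rename it afterwards. A minimal sketch (the file names and sample data here are made up, and the <( ) process substitution needs bash):

```shell
set -e
cd "$(mktemp -d)"                        # scratch directory for the demo

printf 'w1\nw2\n' > foo.csv              # the starting single-column file
printf 'a\tb\tc\td\tX1\ne\tf\tg\th\tX2\n' > one.txt
printf 'a\tb\tc\td\tY1\ne\tf\tg\th\tY2\n' > two.txt

for skra in *.txt; do
  # Write to a temp file, then rename: "> foo.csv" on its own would
  # truncate foo.csv before paste gets a chance to read it.
  paste foo.csv <(cut -f 5 "$skra") > foo.tmp && mv foo.tmp foo.csv
done

cat foo.csv    # w1 and w2, each followed by the field-5 values of one.txt and two.txt
```

The mv step is what makes each iteration safe: paste reads the old foo.csv in full before the new version replaces it.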

If you don’t mind keeping all the data in memory at one point in time, a simple AWK or Bash script can do this type of processing without any further processes such as cut or paste.

awk -F'\t' '    { lines[FNR] = lines[FNR] "\t" $5 }
            END { for (l = 1; l <= FNR; l++) print substr(lines[l], 2) }' \
    *.txt > foo.csv

(A numeric loop is used in the END block because the order of for (l in lines) is unspecified in AWK; this assumes all input files have the same number of lines.)

(The output should not be called .csv, but I’m sticking with the naming from the question nonetheless.)

Actually, one doesn’t really need awk for this; Bash will do:

#!/bin/bash
lines=()
for file in *.txt; do
  declare -i i=0
  while IFS=$'\t' read -ra line; do
    lines[i++]+=$'\t'"${line[4]}"
  done < "$file"
done
printf '%s\n' "${lines[@]/#?}" > foo.csv

(As a side note, "${lines[@]:1}" would remove the first element (i.e. the first line), not the first (\t) character of each line; this particular expansion syntax works differently for strings (scalars) and arrays in Bash. Hence "${lines[@]/#?}", another way to express the removal of the first character, which does get applied to each array element.)
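A quick toy demonstration of that difference (the array values are made up):

```shell
arr=($'\ta' $'\tb' $'\tc')   # three elements, each starting with a tab

# ${arr[@]:1} drops the first ELEMENT of the array:
printf '[%s]' "${arr[@]:1}"; echo      # prints [<tab>b][<tab>c]

# ${arr[@]/#?} drops the first CHARACTER of every element:
printf '[%s]' "${arr[@]/#?}"; echo     # prints [a][b][c]
```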
