I have 200 csv files with names "a.csv", "b.csv", "c.csv", etc.
In each csv file, there are two columns: "type" and "abundance". I'd like to change the name of the abundance column in each csv file to "a_abundance","b_abundance" etc. to match the file name, and then save the csv file with the new column names.
So far I have the following, but it doesn't work.
filenames <- list.files(pattern = ".csv")
all_files <- lapply(filenames, function(x) {
  file <- read.csv(x)
  name = sub(".*", "", x)
  colnames(file) <- paste(colnames(file), name, sep = '_')
  return(file)
})
CodePudding user response:
Something like this:
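all_files <- lapply(setNames(nm=filenames), function(fn) {
  dat <- read.csv(fn)
  # locate the column named "abundance", if there is one
  ind <- colnames(dat) == "abundance"
  if (any(ind)) {
    # prefix it with the file name minus directory and extension, e.g. "a_abundance"
    colnames(dat)[ind] <- paste0(tools::file_path_sans_ext(basename(fn)), "_abundance")
  }
  dat
})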
The above will read the data and change the one column name. (You said just one column, but your code is changing all columns ... I'll stick with just the one named "abundance".)
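From here, you can write the files back out (overwriting the originals, since the list is named by file name) with either of:
## row.names = FALSE keeps write.csv from adding a column of row numbers
Map(write.csv, all_files, names(all_files), row.names = FALSE)
## or ##
for (nm in names(all_files)) write.csv(all_files[[nm]], nm, row.names = FALSE)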
FYI, this could be done a lot faster on the command-line (bash shell or similar, as long as sed is available) with something like:
# glob directly instead of parsing ls, so names with spaces survive
for fn in *.csv ; do
  BN=$(basename "$fn" .csv)
  sed -i -E "1{s/abundance/${BN}_abundance/}" "$fn"
done
Walk-through:
- For BN, the basename removes any leading directory component, and the trailing .csv removes that extension from the filename; this should translate ./a.csv to a.
- For sed:
  - -i makes the modification in-place on the file; note, this does not store a backup of the original file; if you use -i.bak instead then it will back up the file before modifying it, perhaps safer the first time you try this, then you can remove the *.bak files
  - -E is an extended-expression thing; you should be able to get by with -e as well, it's just habit for me
  - the leading 1 means to only apply this rule on the first line of the file
  - s/from/to/ translates text from the from pattern to the to pattern, in this case prepending ${BN}_ (braces are a little defensive in bash envvar usage)
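If you want to try the -i.bak variant on throwaway copies before touching the real files, here is a rough sketch. The function name rename_abundance_header, the scratch directory, and the two sample files are made up for illustration; it assumes bash and a sed that accepts the same expression as the loop above (GNU sed does).

```shell
# rename_abundance_header is a hypothetical helper, not part of any tool.
# It applies the same sed edit as above to every .csv in the given directory,
# keeping each original as <file>.bak.
rename_abundance_header() {
  local dir="$1" fn BN
  for fn in "$dir"/*.csv ; do
    [ -e "$fn" ] || continue            # nothing to do if the glob matched no files
    BN=$(basename "$fn" .csv)
    sed -i.bak -E "1{s/abundance/${BN}_abundance/}" "$fn"
  done
}

# Trial run on made-up sample data in a scratch directory
scratch=$(mktemp -d)
printf 'type,abundance\nx,1\n' > "$scratch/a.csv"
printf 'type,abundance\ny,2\n' > "$scratch/b.csv"
rename_abundance_header "$scratch"
head -n 1 "$scratch"/*.csv              # show the new header line of each file
```

Once the headers look right, running it against the real directory and then deleting the *.bak files should leave you with just the renamed columns.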