Can't get correct colnames for table from csv in R

Time:07-29

I have a CSV file with the following structure:

 A, B, C, D
'a', 1, 2, 0
'b', 0, 1, 4
'c', 3, 1, 1
 ...

Column A is expected to be interpreted as row names. When I try to read the file via read.csv(file.choose(), header = T, row.names = 1), read.csv(file.choose(), header = F, row.names = 1), or read.csv(file.choose(), header = T), the result looks wrong to me:

    A B C D
'a' 1 2 0
'b' 0 1 4
'c' 3 1 1
 ...

What is wrong with the file? Or what's wrong with the code for reading the file?

UPD
Solved: my file had a wrong header. The example above doesn't show the full picture.

CodePudding user response:

Thanks for sharing the file.

First of all, I did this for debugging:

l <- readLines("output_file.csv")
stringr::str_count(l, ",")
#[1] 5191 5192 5192 5192 5192 5192 5192 5192

The header row has one fewer comma than the data rows, so it seems that we should treat the first column in the file as row names.
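The same check can be done in base R with count.fields(), which counts fields per line rather than commas. Here is a minimal sketch on a small made-up file; the file contents and the names tmp, n_fields and toy are assumptions for illustration, not the original data:

```r
# Toy CSV (made up for illustration): the header has 3 fields,
# while every data row has 4.
tmp <- tempfile(fileext = ".csv")
writeLines(c("B,C,D",
             "a,1,2,0",
             "b,0,1,4",
             "c,3,1,1"), tmp)

# Count the fields on every line; a header that is one short stands out.
n_fields <- count.fields(tmp, sep = ",")
n_fields
#[1] 3 4 4 4

# When the header is exactly one field short, read.csv() uses the
# first column as row names even without row.names = 1.
toy <- read.csv(tmp)
dim(toy)
#[1] 3 3
row.names(toy)
#[1] "a" "b" "c"
```

Note that count.fields() treats quotes and "#" specially by default, so on a file whose names contain quote characters the quote and comment.char arguments may need adjusting.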

dat <- read.csv("output_file.csv", row.names = 1, check.names = FALSE)

dim(dat)
#[1]    7 5192

row.names(dat)
#[1] "BC02" "BC07" "BC08" "BC18" "BC19" "BC23" "BC24"

head(names(dat))
#[1] "BARCODE"                            "'Massilia aquatica' Lu et al. 2020"
#[3] "[Bacillus] selenitireducens MLS10"  "[Bacillus] thermocloacae"          
#[5] "[Brevibacterium] frigoritolerans"   "[Clostridium] dakarense"   

If this is not what you are looking for, then something must have gone wrong when this CSV file was produced. For example, one item may be missing from the header row. I suspect this is the case, because "BC02", etc. look like barcodes, yet "BARCODE" ends up as the name of the first data column rather than labelling them.
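To illustrate that suspicion, here is a small sketch with invented data (the file contents and the names tmp2 and shifted are hypothetical). The header is one name short, so read.csv() still succeeds, but "BARCODE" labels a numeric column while the barcode-like values become row names:

```r
# Toy CSV (invented): 3 header names, 4 fields per data row.
tmp2 <- tempfile(fileext = ".csv")
writeLines(c("BARCODE,x,y",
             "BC02,1,2,3",
             "BC07,4,5,6"), tmp2)

shifted <- read.csv(tmp2, row.names = 1, check.names = FALSE)

# The barcode-like values were consumed as row names ...
row.names(shifted)
#[1] "BC02" "BC07"

# ... and the header names are applied to the remaining columns,
# so "BARCODE" now names a column of numbers.
names(shifted)
#[1] "BARCODE" "x"       "y"
shifted$BARCODE
#[1] 1 4
```

If the real file shows the same pattern, the header itself needs correcting at the source; which name is missing cannot be recovered from the field counts alone.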


Thank you very much! I figured out what the issue was and how to fix it! As you mentioned, the header was incorrectly written!
