I read a large (24,000 observations and 1900 variables) dataset into R using this command:
expression_data<-read.table("data_expression_median.txt", sep="\t", header=TRUE, fill=TRUE)
When I look at my data using View(expression_data), or pull out a limited number of rows and columns with expression_data[1:3, 1:5], all of the data shows up correctly. Also, when I use the command expression_data[3, 1:5], it prints the column headers AND the actual values (which is the expected result):
Hugo_Symbol Entrez_Gene_Id MB.0362 MB.0346 MB.0386
3 CD049690 NA 5.453928 5.454185 5.501577
However, when I try to subset an entire row using expression_data[3,] or any other command to pull out an entire row, I only get the column headers:
Hugo_Symbol Entrez_Gene_Id MB.0362 MB.0346 MB.0386
MB.0574 MB.0503 MB.0641 MB.0201 MB.0218 MB.0316 MB.0189
MB.0891 MB.0658 MB.0899 MB.0605 MB.0258 MB.0506 MB.0420
MB.0223 MB.0445 MB.0199 MB.0517 MB.0155 MB.0428 MB.0117
Why is this? What am I doing wrong? I need to do operations on a row basis so I need to be able to access the data from entire rows.
Answer:
R has printing limits, and your data are very wide. expression_data[3,] contains all the values and you can access them; they just aren't printed by default. You can adjust the print options, especially max.print, to get more printed in your console, but the R console is really the wrong tool for viewing thousands of columns of data.
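As a rough illustration of the point that the row is complete even when it is not printed, here is a minimal sketch. The data frame `wide` is a made-up stand-in (random numbers, 1900 columns), not the real expression data, and the value 10000 for max.print is an arbitrary choice. With the real data you would likely drop the two identifier columns first, e.g. expression_data[3, -(1:2)], so that the values stay numeric.

```r
# Hypothetical stand-in for the real data: 3 rows, 1900 numeric columns.
set.seed(1)
wide <- as.data.frame(matrix(rnorm(3 * 1900), nrow = 3))

# Subsetting a whole row returns a 1-row data frame holding every column,
# regardless of how much of it the console chooses to print.
row3 <- wide[3, ]
dim(row3)

# Flatten the row into a plain numeric vector to work with the values.
row3_values <- unlist(row3)
length(row3_values)
mean(row3_values)

# Inspect the current print limit, raise it temporarily, then restore it.
getOption("max.print")
old <- options(max.print = 10000)  # 10000 is an arbitrary illustrative value
options(old)
```

Checking dim() or length() is usually a quicker sanity check than printing a 1900-column row.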
If you're doing a lot of math on the rows of a data frame, consider converting it to a matrix for efficiency.
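One possible way to do the matrix conversion is sketched below. The small data frame `df` and its sample columns (MB.0001 to MB.0003) are invented for illustration; the sketch assumes, as in the question's output, that the first two columns are identifiers and the remaining columns are numeric.

```r
# Hypothetical small example with the same layout as the question's data:
# two identifier columns followed by numeric sample columns.
df <- data.frame(
  Hugo_Symbol    = c("A", "B", "C"),
  Entrez_Gene_Id = c(1L, 2L, NA),
  MB.0001        = c(5.1, 6.2, 5.4),
  MB.0002        = c(5.3, 6.0, 5.5),
  MB.0003        = c(5.2, 6.4, 5.6)
)

# Keep only the numeric columns and convert them to a matrix.
expr_mat <- as.matrix(df[, -(1:2)])
rownames(expr_mat) <- df$Hugo_Symbol

# Row-wise operations work directly on the matrix.
row_means <- rowMeans(expr_mat, na.rm = TRUE)
row_sds   <- apply(expr_mat, 1, sd, na.rm = TRUE)

# A whole row comes back as a named numeric vector.
expr_mat["C", ]
```

Leaving the identifier columns out of the conversion matters: if a character column is included, as.matrix() turns the whole matrix into character strings.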