I have a CSV file with 141 rows and several columns. I want the data sorted in ascending order by the first two columns, i.e. 'label' and 'index'. Here is my code:
final_data <- read.csv("./features.csv",
header = FALSE,
col.names = c('label','index', 'nr_pix', 'rows_with_1', 'cols_with_1',
'rows_with_3p', 'cols_with_3p', 'aspect_ratio',
'neigh_1', 'no_neigh_above', 'no_neigh_below',
'no_neigh_left', 'no_neigh_right', 'no_neigh_horiz',
'no_neigh_vert', 'connected_areas', 'eyes', 'custom'))
sorted_data_by_label <- final_data[order(label),]
sorted_data_by_index <- sorted_data_by_label[order(index),]
write.table(sorted_data_by_index, file = "./features.csv",
append = FALSE, sep = ',',
row.names = FALSE)
I read from a CSV and use write.table because my code needs to overwrite the CSV with the column names included.
Since I added a , after order(label) and order(index), the sorted data should still keep all the other rows and columns, right?
After running this code, I only get the first row out of 141 rows. Is there a way to fix this problem?
CodePudding user response:
As @akrun has mentioned briefly, what you need to do is to change
sorted_data_by_label <- final_data[order(label),]
to
sorted_data_by_label <- final_data[order(final_data$label),]
and to change
sorted_data_by_index <- sorted_data_by_label[order(index),]
to
sorted_data_by_index <- sorted_data_by_label[order(sorted_data_by_label$index),]
This is because when you write label, R looks for an object named label in the current environment (here, the global environment), not for a column of the final_data data frame. The same applies to index.
If you intend to use the index column of a data frame, you need to refer to it explicitly, e.g. final_data$index.
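A minimal sketch of how this can end up returning a single row. It assumes (this is a guess, not something stated in the question) that a length-one object called label happens to exist in the workspace; the toy data frame df and its values are made up for illustration.

```r
# Toy data frame standing in for final_data (values are invented).
df <- data.frame(label = c("b", "a", "c"), index = c(2, 1, 3))

# Hypothetical stray object with the same name as the column.
label <- "x"

# order(label) only sees the length-one object above and returns 1,
# so the subset keeps just the first row.
wrong <- df[order(label), ]

# Referring to the column explicitly orders every row.
right <- df[order(df$label), ]
```

If no object named label existed at all, R would instead stop with an "object 'label' not found" error.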
Other options
You can use with():
sorted_data_by_label <- with(final_data, final_data[order(label),])
sorted_data_by_index <- with(sorted_data_by_label, sorted_data_by_label[order(index),])
In dplyr you can use arrange():
library(dplyr)
sorted_data_by_label <- final_data %>% arrange(label)
sorted_data_by_index <- sorted_data_by_label %>% arrange(index)
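One more point, since the goal was to order by label and then by index: sorting in two separate steps leaves the result ordered primarily by index, with label only deciding ties. Passing both columns to a single order() call may be closer to what was intended. A small sketch on an invented data frame (df and its values are placeholders, not the real features.csv):

```r
# Toy data frame standing in for final_data (values are invented).
df <- data.frame(
  label  = c("b", "a", "b", "a"),
  index  = c(2, 2, 1, 1),
  nr_pix = c(10, 20, 30, 40)
)

# Sort by label first, then by index within each label.
sorted <- df[order(df$label, df$index), ]
```

With dplyr, arrange(final_data, label, index) expresses the same two-key sort.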