Ordering columns of data in R

I have a CSV file with 141 rows and several columns. I want the data sorted in ascending order by the first two columns, 'label' and 'index'. Here is my code:

final_data <- read.csv("./features.csv",
                   header = FALSE,
                   col.names = c('label','index', 'nr_pix', 'rows_with_1', 'cols_with_1',
                                 'rows_with_3p', 'cols_with_3p', 'aspect_ratio',
                                 'neigh_1', 'no_neigh_above', 'no_neigh_below',
                                 'no_neigh_left', 'no_neigh_right', 'no_neigh_horiz',
                                 'no_neigh_vert', 'connected_areas', 'eyes', 'custom'))

sorted_data_by_label <- final_data[order(label),]
sorted_data_by_index <- sorted_data_by_label[order(index),]

write.table(sorted_data_by_index, file = "./features.csv",
            append = FALSE, sep = ',',
            row.names = FALSE)

I read from a CSV and use write.table because I need to overwrite the CSV with a version that includes the column names.

Since I added a comma after order(label) and after order(index), the sorted data should still include all the other rows and columns, right?

After running this code, I only get the first row out of 141 rows. Is there a way to fix this problem?

Answer:

As @akrun briefly mentioned, you need to change

sorted_data_by_label <- final_data[order(label),]

to

sorted_data_by_label <- final_data[order(final_data$label),]

and change

sorted_data_by_index <- sorted_data_by_label[order(index),]

to

sorted_data_by_index <- sorted_data_by_label[order(sorted_data_by_label$index),]

This is because when you write label on its own, R looks for an object named label in the calling environment (here, the global environment), not for a column inside the final_data data frame. To refer to the column, you need to write final_data$label explicitly. The same applies to index: use sorted_data_by_label$index.
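As a minimal sketch of the fix, here is a made-up data frame df (hypothetical values, not the asker's features.csv). It also shows that order() accepts several keys, which may be closer to what the question wants: sorting by index in a separate second step makes index the primary key, whereas a single call with both columns sorts by label first and uses index only to break ties.

```r
# Toy data standing in for features.csv (values are made up for illustration)
df <- data.frame(
  label  = c("b", "a", "b", "a"),
  index  = c(2, 2, 1, 1),
  nr_pix = c(10, 20, 30, 40)
)

# The column is referenced explicitly with $, so R looks inside df
by_label <- df[order(df$label), ]

# Several keys in one call: label first, index breaks ties within a label
by_label_then_index <- df[order(df$label, df$index), ]
print(by_label_then_index)
```

All rows and columns are kept; only the row order changes.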

Other options

You can use with:

sorted_data_by_label <- with(final_data, final_data[order(label),])

sorted_data_by_index <- with(sorted_data_by_label, sorted_data_by_label[order(index),])
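If the goal is label first and then index within each label, the two with() calls can probably be collapsed into one. A small sketch on a made-up data frame df (hypothetical, not the original file):

```r
# Toy data (values are made up for illustration)
df <- data.frame(
  label = c("b", "a", "b", "a"),
  index = c(2, 2, 1, 1)
)

# Inside with(), label and index resolve to the columns of df
sorted <- with(df, df[order(label, index), ])
print(sorted)
```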

With dplyr (after loading it with library(dplyr)) you can use arrange():

sorted_data_by_label <- final_data %>% arrange(label)

sorted_data_by_index <- sorted_data_by_label %>% arrange(index)
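arrange() also takes several columns at once, so a single call should cover the label-then-index case. A sketch assuming the dplyr package is installed, again on a made-up data frame df rather than the asker's data:

```r
library(dplyr)  # provides arrange() and the %>% pipe

# Toy data (values are made up for illustration)
df <- data.frame(
  label = c("b", "a", "b", "a"),
  index = c(2, 2, 1, 1)
)

# Sort by label, then by index within each label
sorted <- df %>% arrange(label, index)
print(sorted)
```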