I have dozens of Excel sheets, with standardized names and in the R project folder, and I would like to select specific cells, with standardized locations between the sheets and assemble a new data frame.
Example: In cell A3, I have name; In cell C5, I have age; In cell F4, I have a profession; In cell F10 I have the city; In cell J22, I have an Opinion about a product. How can I build a new dataframe that shows me the columns:
NAME AGE PROFESSION CITY OPINION
and in each row the data extracted from each worksheet that is in the folder?
Thank you very much in advance.
CodePudding user response:
For example, the Excel sheets (i.e. saved as csv file) are at your current working directory:
Reading csv files in R, the first column is index, so A corresponding to the second column, B -> the third, and so on...
=> A3 -> [3, 2]
C5 -> [5, 4]
F4 -> [4, 7]
F10 -> [10, 7]
J22 -> [22, 11]
You can do the following:
files <- list.files(".")
df <- data.frame(matrix(NA, nrow=length(files),
ncol=5))
colnames(df) <- c("NAME", "AGE", "PROFESSION", "CITY", "OPINION")
for(i in 1:length(files)){
raw_read <- read.csv(files[i])
df[i,] <- c(raw_read[3,2],
raw_read[5,4],
raw_read[4,7],
raw_read[10,7],
raw_read[22,11])
}
Change read.csv()
to other function for other formats of file.