How do I stop R from using the first row of data as the column names?

Time:07-13

I'm extremely new to R and I keep running into an issue with my current data set. The data set consists of several .txt files, each with 30 rows and 3 columns of numerical data. However, when I read them into R, the first row of data automatically becomes the column headings, so when I try to combine the files everything gets messed up because none of them have the same column names. How do I stop this from happening? The code I've used so far is below!

setwd("U:\\filepath")
library(readr)
library(dplyr)
file.list <- list.files(pattern='*.txt')
df.list <- lapply(file.list, read_tsv)

After this point it just says that there are 29 rows and 1 column, which is not what I want! Any help is appreciated!

CodePudding user response:

Use df.list <- lapply(file.list, read_tsv, col_names = FALSE) so that the first row is read as data rather than as column names.
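
As a minimal sketch of that suggestion, assuming the files really are tab-separated and have no header line: the temporary folder and the two sample files below are made up for illustration, and the pattern is written as a regular expression, which is what list.files expects.

```r
library(readr)
library(dplyr)

# Hypothetical sample data: two header-less, tab-separated files in a temp folder
tmp <- tempfile("tsv_demo_")
dir.create(tmp)
writeLines(c("1\t2\t3", "4\t5\t6"), file.path(tmp, "a.txt"))
writeLines(c("7\t8\t9", "10\t11\t12"), file.path(tmp, "b.txt"))

# pattern is a regular expression, so "\\.txt$" matches names ending in .txt
file.list <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)

# col_names = FALSE keeps the first row as data; readr names the columns X1, X2, X3
df.list <- lapply(file.list, read_tsv, col_names = FALSE)

# Every data frame now has the same column names, so the rows stack cleanly
combined <- bind_rows(df.list)
```

Because each file gets the same default names, bind_rows lines the columns up instead of creating a new column per file.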

CodePudding user response:

You say:

After this point it just says that there are 29 rows and 1 column, which is not what I want!

That tells you the files are not tab-separated: read_tsv splits only on tabs, so each whole line ended up in a single column. The output alone does not reveal which delimiter the files actually use, only that it is not a tab. The differing column names are a separate issue, and may well mean that your files have no header line. To see what is in your files, you could do something like:

 first.lines <- lapply(file.list, function(x) readLines(x, n = 1))
 first.lines[[1]]

If there are tabs, they will reveal themselves as \t in the string printed to the console; spaces and commas appear as themselves.

Generally it is better to determine what delimiters exist by looking at the file with a text editor (but not MS Word).
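
If the inspection shows that the columns are separated by spaces rather than tabs, base R can read the files directly. The following is only a sketch under that assumption; the sample file and the variable names are hypothetical.

```r
# Hypothetical sample file: three space-separated numeric columns, no header
path <- tempfile(fileext = ".txt")
writeLines(c("1.5 2.5 3.5", "4.5 5.5 6.5"), path)

# Look at the first line only; print() would show a tab as \t
first.line <- readLines(path, n = 1)
print(first.line)

# Check for a tab explicitly and count the whitespace-separated fields
has.tab <- grepl("\t", first.line, fixed = TRUE)
n.fields <- length(strsplit(first.line, "[[:space:]]+")[[1]])

# sep = "" (the default) splits on any run of spaces or tabs;
# header = FALSE keeps the first row as data, with columns named V1, V2, V3
df <- read.table(path, header = FALSE, sep = "")
```

Applied to every file with lapply, this gives data frames with identical column names, which can then be combined with do.call(rbind, ...) or dplyr's bind_rows.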
