Problems with displaying .txt file (delimiters)-CodePudding

I have a problem with one task where I have to load some data set, and I have to make sure that missing values are read in properly and that column names are unambiguous.

The format of .txt file:

Format

At the end, data set should contain only country column and median age. I tried using read.delim, precisely this chunk:

rawdata <- read.delim("rawdata_343.txt", sep = "", stringsAsFactors = FALSE, header = TRUE)

And when I run it, I get this:

Result

It confuses me that if country has multiple words (Turks and Caicos Islands) it assigns every word to another column.

Since I am still a beginner in R, any suggestion would be very helpful for me. Thanks!

CodePudding user response：

Three points to note about your input file: (1) the first two lines at the top are not tabular and should be skipped with skip = 2, (2) your column separators are tabs and this should be specified with sep = "\t", and (c) you have no headers, so header = FALSE. Your command should be: -

rawdata <- read.delim("rawdata_343.txt", sep = "\t", stringsAsFactors = FALSE, header = FALSE, skip = 2)

UPDATE: A fourth point is that the first column includes row numbers, so row.names = 1. This also addresses the follow-up comment.

rawdata <- read.delim("rawdata_343.txt", sep = "\t", stringsAsFactors = FALSE, header = FALSE, skip = 2, row.names = 1)

CodePudding user response：

It looks like your delimiter that you are specifying in the sep= argument is telling R to consider spaces as the column delimiter. Looking at your data as a .txt file, there is no apparent delimiter (like commas that you would find in a typical .csv). If you can put the data in a tabular form in something like a .csv or .xlsx file, R is much better at reading that data as expected. As it is, you may struggle to get the .txt format to read in a tabular fashion, which is what I assume you want.

P.s. you can use read.csv() if you do end up putting the data in that format.