How to assign data frame names as the first row?

Time:11-22

I have several list objects, each containing 31 data frames, which I have named 'file1980', 'file1981', 'file1982' and so on up to 'file2010'. These were made by splitting the original dataset (11315 rows) into 31 equal-sized data frames (365 rows each) using the following:

n <- 31
dataList <- split(MainData, factor(sort(rank(row.names(MainData)) %% n)))
names(dataList) <- paste0("file", 1980:2010)

The individual data frames have the headers removed, and look like this:

1   001  -6.83  -5.83  -7.83 0.05 0.8217593   8.101852 100.0
2   002  -6.33  -4.83  -7.83 0.10 2.2453704   9.259259 100.0
3   003  -5.83  -4.83  -6.83 0.30 1.9444444   8.101852  94.7
4   004  -5.83  -4.83  -6.83 0.10 1.0416667   8.101852  97.5
5   005  -6.33  -4.83  -7.83 0.00 1.1226852   9.259259  98.5
6   006  -7.83  -5.83  -9.83 0.03 2.0949074  10.416667 100.0

They will be exported with row names removed into *.txt files for use in another piece of software. However, this software starts by reading the first row as the file name, so the first file needs to be 'file1980' and so on. I'm hoping to get something like this:

    file1980
    001  -6.83  -5.83  -7.83 0.05 0.8217593   8.101852 100.0
2   002  -6.33  -4.83  -7.83 0.10 2.2453704   9.259259 100.0
3   003  -5.83  -4.83  -6.83 0.30 1.9444444   8.101852  94.7
4   004  -5.83  -4.83  -6.83 0.10 1.0416667   8.101852  97.5
5   005  -6.33  -4.83  -7.83 0.00 1.1226852   9.259259  98.5
6   006  -7.83  -5.83  -9.83 0.03 2.0949074  10.416667 100.0
7   007  -5.33  -4.83  -5.83 0.00 1.4930556   8.101852  97.6
8   008  -7.33  -5.83  -8.83 0.00 0.9027778   9.259259 100.0
9   009  -7.33  -6.83  -7.83 0.03 0.8217593   8.101852  90.2

I've spent the last couple of days trawling for information on how to do something like this, and so far have found nothing even close in R. So is this even possible?

CodePudding user response:

I believe it's better not to edit the data frame for this sort of thing. Instead, write the required string to the file first, then append the data.

Luckily, write.table() is quite flexible:

# data prep, taking 2 chunks of iris
MainData <- head(iris, 10)
# optional if you want columns with equal width in the csv :
MainData[] <- lapply(MainData, format)
n <- 2   
dataList <- split(
  MainData, 
  factor(sort(rank(row.names(MainData)) %% n))
)
names(dataList) <- paste0("file",seq(n))
dataList

# print to file
for (file in names(dataList)) {
  path <- paste0(file, ".csv")
  # print string to file
  writeLines(file, path)
  # append the data, without headers, quotes or row names
  write.table(
    dataList[[file]], path, 
    col.names = FALSE, row.names = FALSE, quote = FALSE, append = TRUE
  )
}

# in file1.csv :

# file1
# 5.1 3.5 1.4 0.2 setosa
# 4.9 3.0 1.4 0.2 setosa
# 4.7 3.2 1.3 0.2 setosa
# 4.6 3.1 1.5 0.2 setosa
# 5.0 3.6 1.4 0.2 setosa

# cleanup
file.remove(c("file1.csv", "file2.csv"))

write.table() uses " " as the separator by default; use sep = "\t" if you want the output tab-delimited.
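Applied to the setup in the question, the same idea might look like the sketch below. The helper name write_named, the out_dir argument and the small stand-in MainData are assumptions for illustration, not part of the original post. It writes through an open file connection, so the name line and the data go out in one pass without needing append = TRUE.

```r
# Hypothetical helper: write each data frame of a named list to "<name>.txt",
# with the list name as the first line and the data below it.
write_named <- function(dataList, out_dir = tempdir(), sep = "\t") {
  paths <- file.path(out_dir, paste0(names(dataList), ".txt"))
  for (i in seq_along(dataList)) {
    con <- file(paths[i], open = "wt")
    # first line: the name of the data frame
    writeLines(names(dataList)[i], con)
    # remaining lines: the data, without headers, quotes or row names
    write.table(dataList[[i]], con, sep = sep,
                col.names = FALSE, row.names = FALSE, quote = FALSE)
    close(con)
  }
  invisible(paths)
}

# Stand-in data (assumed): 3 "years" of 4 rows each instead of 31 x 365
MainData <- data.frame(day = sprintf("%03d", 1:12), temp = seq(-6, 5))
dataList <- split(MainData, rep(1980:1982, each = 4))
names(dataList) <- paste0("file", names(dataList))

paths <- write_named(dataList)
readLines(paths[1])  # first line is "file1980"
```

Note that the day column is kept as character here; a numeric column would lose leading zeros such as 001 when written.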

CodePudding user response:

I'm not sure I fully understand what you want, but maybe this helps.

dataList <- list(iris, iris, iris)
names(dataList) <- paste0("file", 1980:1982)

# the first column must be character before the name can be stored in it
purrr::imap(dataList, ~ .x |>
              dplyr::mutate(Sepal.Length = as.character(Sepal.Length)) |>
              tibble::add_row(.before = 1, Sepal.Length = .y))

Output:

$file1980
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1       file1980          NA           NA          NA       <NA>
2            5.1         3.5          1.4         0.2     setosa

CodePudding user response:

You can use tibble::add_row(.before = 1) to insert a row at the top. For this to work, you need to convert the first column to character first, because each data.frame column must be of a single type and the table name is a string.

library(tidyverse)

l <- list(a = data.frame(matrix(1:4, 2)),
          b = data.frame(matrix(1:4, 2)))

l <- l %>% 
  map2(.x = ., 
       .y = names(.), 
       ~.x %>% 
         mutate(across(1, as.character)) %>% 
         add_row(X1 = .y, .before = 1))

l
#> $a
#>   X1 X2
#> 1  a NA
#> 2  1  3
#> 3  2  4
#> 
#> $b
#>   X1 X2
#> 1  b NA
#> 2  1  3
#> 3  2  4

Created on 2022-11-22 with reprex v2.0.2
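One thing to watch for if a data frame with an inserted name row is then exported: the other cells of that row are NA, and write.table() writes them as the string "NA" unless told otherwise. The sketch below builds the same shape in base R (rbind() stands in for add_row() so it runs without extra packages) and passes na = "" to blank them out. The variable names are made up for illustration.

```r
# Same shape as list element `a` above: name in the first cell, NA elsewhere
df <- data.frame(X1 = 1:2, X2 = 3:4)
df$X1 <- as.character(df$X1)
with_name <- rbind(data.frame(X1 = "a", X2 = NA), df)

path <- tempfile(fileext = ".txt")
write.table(with_name, path, na = "",
            col.names = FALSE, row.names = FALSE, quote = FALSE)

lines_out <- readLines(path)
lines_out
file.remove(path)
```

Even with na = "", the first line still carries the separators for the empty cells, so if the receiving software needs the name alone on that line, writing the name separately before the data is the safer route.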
