Home > Mobile >  Convert table to dataframe in R
Convert table to dataframe in R

Time:01-25

I have been struggling on how to convert a table to a dataframe by keeping the original structure.

This is the table I am talking about:

sample_size <- structure(c(0L, 2L, 2L, 0L, 3L, 1L, 3L, 9L, 13L, 0L, 0L, 0L, 
0L, 1L, 0L, 0L, 0L, 0L, 1L, 2L, 0L, 0L, 2L, 0L, 0L, 0L, 0L, 0L, 
2L, 0L, 0L, 2L, 0L, 0L, 0L, 0L, 0L, 6L, 0L, 9L, 2L, 0L, 0L, 1L, 
0L, 0L, 8L, 0L, 6L, 1L, 0L, 0L, 2L, 1L, 3L, 16L, 10L, 0L, 2L, 
0L, 0L, 6L, 33L, 4L, 30L, 18L, 3L, 0L, 14L, 1L, 12L, 40L, 1L, 
13L, 9L, 0L, 0L, 0L, 0L, 11L, 25L, 1L, 5L, 2L, 0L, 2L, 0L, 0L, 
1L, 16L, 2L, 17L, 1L, 0L, 2L, 0L, 0L, 2L, 19L, 2L, 3L, 1L, 0L, 
1L, 0L, 3L, 4L, 17L, 0L, 2L, 0L, 0L, 2L, 0L, 0L, 0L, 0L, 0L, 
2L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 2L, 0L, 0L, 2L, 0L, 0L, 0L, 
0L, 1L, 1L, 0L, 0L, 2L, 0L, 2L, 4L, 1L, 0L, 2L, 0L, 0L, 2L, 0L, 
3L, 7L, 2L, 4L, 10L, 16L, 4L, 2L, 0L, 22L, 11L, 31L, 7L, 8L, 
18L, 3L, 3L, 1L, 21L, 8L, 31L, 4L, 10L, 11L, 3L, 2L, 0L, 19L, 
5L, 29L, 4L, 6L, 16L, 2L, 1L, 2L, 11L, 12L, 27L, 6L, 7L, 14L, 
2L, 0L, 2L, 14L, 9L, 35L), dim = c(9L, 11L, 2L), dimnames = structure(list(
    c("Ardipithecus ramidus", "Australopithecus afarensis", "Australopithecus africanus", 
    "Australopithecus anamensis", "Early Homo", "Homo erectus", 
    "Homo habilis", "Paranthropus boisei", "Paranthropus robustus"
    ), c("C", "dc", "dm1", "dm2", "I1", "I2", "M1", "M2", "M3", 
    "P3", "P4"), c("L", "U")), names = c("", "", "")), class = "table")

When I see the object sample_size, it looks like this:

, ,  = L

                            
                              C dc dm1 dm2 I1 I2 M1 M2 M3 P3 P4
  Ardipithecus ramidus        0  0   1   0  0  0  3  4  1  1  2
  Australopithecus afarensis  2  0   2   2  6  8 16 30 13  5 17
  Australopithecus africanus  2  0   0   0  0  0 10 18  9  2  1
  Australopithecus anamensis  0  0   0   0  9  6  0  3  0  0  0
  Early Homo                  3  1   2   2  2  1  2  0  0  2  2
  Homo erectus                1  0   0   0  0  0  0 14  0  0  0
  Homo habilis                3  0   0   0  0  0  0  1  0  0  0
  Paranthropus boisei         9  0   0   0  1  2  6 12 11  1  2
  Paranthropus robustus      13  0   0   0  0  1 33 40 25 16 19

, ,  = U

                            
                              C dc dm1 dm2 I1 I2 M1 M2 M3 P3 P4
  Ardipithecus ramidus        2  0   0   0  1  0  4  7  4  4  6
  Australopithecus afarensis  3  2   2   2  1  2 10  8 10  6  7
  Australopithecus africanus  1  0   0   0  0  0 16 18 11 16 14
  Australopithecus anamensis  0  0   0   0  0  0  4  3  3  2  2
  Early Homo                  1  2   1   2  2  2  2  3  2  1  0
  Homo erectus                0  0   0   0  0  0  0  1  0  2  2
  Homo habilis                3  0   0   0  2  3 22 21 19 11 14
  Paranthropus boisei         4  0   0   0  4  7 11  8  5 12  9
  Paranthropus robustus      17  0   0   0  1  2 31 31 29 27 35

However, when I convert sample_size to a dataframe, it changes the structure.

sample_size_df <- as.data.frame(sample_size)

head(sample_size_df)

                        Var1 Var2 Var3 Freq
1       Ardipithecus ramidus    C    L    0
2 Australopithecus afarensis    C    L    2
3 Australopithecus africanus    C    L    2
4 Australopithecus anamensis    C    L    0
5                 Early Homo    C    L    3
6               Homo erectus    C    L    1

How can I convert the original table into a dataframe by keeping the original structure shown in the table?

CodePudding user response:

R's S3 dispatch tends to do that conversion when the original object inherits from "table"; we can force the conversion I think you want by specifying as.data.frame.matrix specifically. Since that assumes 2-dim (and your data is 3), we need to do it to each plane (using apply and its MARGIN=3) and rbind them in the end. We need to be careful, though, since your row names are duplicated in each plane, so we'll bring those into a "real" column, and then add a column so you know from which plane the data came.

This is a hybrid base/tibble/dplyr approach, there are many ways to effect each step:

apply(sample_size, 3, as.data.frame.matrix, simplify = FALSE) |>
  lapply(tibble::rownames_to_column) |>
  dplyr::bind_rows(.id = "plane")
#    plane                    rowname  C dc dm1 dm2 I1 I2 M1 M2 M3 P3 P4
# 1      L       Ardipithecus ramidus  0  0   1   0  0  0  3  4  1  1  2
# 2      L Australopithecus afarensis  2  0   2   2  6  8 16 30 13  5 17
# 3      L Australopithecus africanus  2  0   0   0  0  0 10 18  9  2  1
# 4      L Australopithecus anamensis  0  0   0   0  9  6  0  3  0  0  0
# 5      L                 Early Homo  3  1   2   2  2  1  2  0  0  2  2
# 6      L               Homo erectus  1  0   0   0  0  0  0 14  0  0  0
# 7      L               Homo habilis  3  0   0   0  0  0  0  1  0  0  0
# 8      L        Paranthropus boisei  9  0   0   0  1  2  6 12 11  1  2
# 9      L      Paranthropus robustus 13  0   0   0  0  1 33 40 25 16 19
# 10     U       Ardipithecus ramidus  2  0   0   0  1  0  4  7  4  4  6
# 11     U Australopithecus afarensis  3  2   2   2  1  2 10  8 10  6  7
# 12     U Australopithecus africanus  1  0   0   0  0  0 16 18 11 16 14
# 13     U Australopithecus anamensis  0  0   0   0  0  0  4  3  3  2  2
# 14     U                 Early Homo  1  2   1   2  2  2  2  3  2  1  0
# 15     U               Homo erectus  0  0   0   0  0  0  0  1  0  2  2
# 16     U               Homo habilis  3  0   0   0  2  3 22 21 19 11 14
# 17     U        Paranthropus boisei  4  0   0   0  4  7 11  8  5 12  9
# 18     U      Paranthropus robustus 17  0   0   0  1  2 31 31 29 27 35

If you want to keep it base-R only, then a little-less-elegantly:

tmp <- apply(sample_size, 3, as.data.frame.matrix, simplify = FALSE) |>
  lapply(tibble::rownames_to_column)
Map(transform, tmp, plane = names(tmp)) |>
  do.call(rbind.data.frame, args = _)

(Since I use args=_ with base R's |> native pipe, this requires R >= 4.2.0; if you have a lesser version, then you can either switch to magrittr::%>% and its . pronoun, or we can take further steps to mitigate, lmk.)

CodePudding user response:

As an alternative, you can work with the flattened array and reshape to wide format:

library(tidyr)

as.data.frame(sample_size) %>%
  pivot_wider(names_from = Var2, values_from = Freq)

# A tibble: 18 × 13
   Var1                       Var3      C    dc   dm1   dm2    I1    I2    M1    M2    M3    P3    P4
   <fct>                      <fct> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
 1 Ardipithecus ramidus       L         0     0     1     0     0     0     3     4     1     1     2
 2 Australopithecus afarensis L         2     0     2     2     6     8    16    30    13     5    17
 3 Australopithecus africanus L         2     0     0     0     0     0    10    18     9     2     1
 4 Australopithecus anamensis L         0     0     0     0     9     6     0     3     0     0     0
 5 Early Homo                 L         3     1     2     2     2     1     2     0     0     2     2
 6 Homo erectus               L         1     0     0     0     0     0     0    14     0     0     0
 7 Homo habilis               L         3     0     0     0     0     0     0     1     0     0     0
 8 Paranthropus boisei        L         9     0     0     0     1     2     6    12    11     1     2
 9 Paranthropus robustus      L        13     0     0     0     0     1    33    40    25    16    19
10 Ardipithecus ramidus       U         2     0     0     0     1     0     4     7     4     4     6
11 Australopithecus afarensis U         3     2     2     2     1     2    10     8    10     6     7
12 Australopithecus africanus U         1     0     0     0     0     0    16    18    11    16    14
13 Australopithecus anamensis U         0     0     0     0     0     0     4     3     3     2     2
14 Early Homo                 U         1     2     1     2     2     2     2     3     2     1     0
15 Homo erectus               U         0     0     0     0     0     0     0     1     0     2     2
16 Homo habilis               U         3     0     0     0     2     3    22    21    19    11    14
17 Paranthropus boisei        U         4     0     0     0     4     7    11     8     5    12     9
18 Paranthropus robustus      U        17     0     0     0     1     2    31    31    29    27    35

Or with base reshape() (names need slight cleaning up):

reshape(
  as.data.frame(sample_size),
  idvar = c("Var1", "Var3"),
  timevar = "Var2",
  direction = "wide"
)
  • Related