Home > Mobile >  how to convert a table into a dataframe in r?
how to convert a table into a dataframe in r?

Time:10-12

Suppose, I have a data set as follows:

State <- c("CA", "WI", "TX", "MS", "NY", "KT", "UT", "CO", "PA", "SC")
Pov_rt <- c(25, 30, 35, 40, 45, 50, 10, 15, 25, 40)


df <- data. Frame(State, Pov_rt)

I have created a new column

df[["pov_level"]] <- cut(
  df$Pov_rt, 
  breaks = c(-Inf, 10, 20, 30, 40, Inf),
  labels = c(
    "Very Low Poverty (<10%)",
    "Low Poverty (10-20%)",
    "Medium Poverty (20-30%)",
    "High Poverty (30-40%)",
    "Very High Poverty (>40%)"
  )
)

I want to, further, get a table

table(df$pov_level, df$State)
                          
                           CA CO KT MS NY PA SC TX UT WI
  Very Low Poverty (<10%)   0  0  0  0  0  0  0  0  1  0
  Low Poverty (10-20%)      0  1  0  0  0  0  0  0  0  0
  Medium Poverty (20-30%)   1  0  0  0  0  1  0  0  0  1
  High Poverty (30-40%)     0  0  0  1  0  0  1  1  0  0
  Very High Poverty (>40%)  0  0  1  0  1  0  0  0  0  0

If I want to convert this table as a dataframe, I go about it as

x <- data.frame(unclass(table(df$pov_level, df$State)))

instead of x <- as.data.frame(table(df$pov_level, df$State)) since this gives me a dataframe of an ungrouped list.

But using the first one has an issue where sometimes I'd need to mutate the rownames as a new column.

I was wondering if there is a better way of doing this.

Much thanks!

CodePudding user response:

We could use as.data.frame.matrix

df2 <- as.data.frame.matrix(table(df$pov_level, df$State))

-output

> df2
                         CA CO KT MS NY PA SC TX UT WI
Very Low Poverty (<10%)   0  0  0  0  0  0  0  0  1  0
Low Poverty (10-20%)      0  1  0  0  0  0  0  0  0  0
Medium Poverty (20-30%)   1  0  0  0  0  1  0  0  0  1
High Poverty (30-40%)     0  0  0  1  0  0  1  1  0  0
Very High Poverty (>40%)  0  0  1  0  1  0  0  0  0  0
> str(df2)
'data.frame':   5 obs. of  10 variables:
 $ CA: int  0 0 1 0 0
 $ CO: int  0 1 0 0 0
 $ KT: int  0 0 0 0 1
 $ MS: int  0 0 0 1 0
 $ NY: int  0 0 0 0 1
 $ PA: int  0 0 1 0 0
 $ SC: int  0 0 0 1 0
 $ TX: int  0 0 0 1 0
 $ UT: int  1 0 0 0 0
 $ WI: int  0 0 1 0 0

Column can be created from rownames with rownames_to_column (tibble)

library(tibble)
library(dplyr)
df2 %>%
  rownames_to_column("pov_levels")
                pov_levels CA CO KT MS NY PA SC TX UT WI
1  Very Low Poverty (<10%)  0  0  0  0  0  0  0  0  1  0
2     Low Poverty (10-20%)  0  1  0  0  0  0  0  0  0  0
3  Medium Poverty (20-30%)  1  0  0  0  0  1  0  0  0  1
4    High Poverty (30-40%)  0  0  0  1  0  0  1  1  0  0
5 Very High Poverty (>40%)  0  0  1  0  1  0  0  0  0  0

Or if we want to use tidyverse

library(tidyr)
pivot_wider(df, names_from = State, values_from = Pov_rt,
     values_fn = length, values_fill = 0)
# A tibble: 5 × 11
  pov_level                   CA    WI    TX    MS    NY    KT    UT    CO    PA    SC
  <fct>                    <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1 Medium Poverty (20-30%)      1     1     0     0     0     0     0     0     1     0
2 High Poverty (30-40%)        0     0     1     1     0     0     0     0     0     1
3 Very High Poverty (>40%)     0     0     0     0     1     1     0     0     0     0
4 Very Low Poverty (<10%)      0     0     0     0     0     0     1     0     0     0
5 Low Poverty (10-20%)         0     0     0     0     0     0     0     1     0     0

Or more directly with tabyl

library(janitor)
df %>%
   tabyl(pov_level, State)
                pov_level CA CO KT MS NY PA SC TX UT WI
  Very Low Poverty (<10%)  0  0  0  0  0  0  0  0  1  0
     Low Poverty (10-20%)  0  1  0  0  0  0  0  0  0  0
  Medium Poverty (20-30%)  1  0  0  0  0  1  0  0  0  1
    High Poverty (30-40%)  0  0  0  1  0  0  1  1  0  0
 Very High Poverty (>40%)  0  0  1  0  1  0  0  0  0  0

CodePudding user response:

Here is an alternative approach with model.matrix:

library(dplyr)

df %>%
  cbind(model.matrix(~ State   0, .) == 1) %>% 
  mutate(across(-c(1:3), ~as.integer(.)))

   State Pov_rt                pov_level StateCA StateCO StateKT StateMS StateNY StatePA StateSC StateTX StateUT StateWI
1     CA     25  Medium Poverty (20-30%)       1       0       0       0       0       0       0       0       0       0
2     WI     30  Medium Poverty (20-30%)       0       0       0       0       0       0       0       0       0       1
3     TX     35    High Poverty (30-40%)       0       0       0       0       0       0       0       1       0       0
4     MS     40    High Poverty (30-40%)       0       0       0       1       0       0       0       0       0       0
5     NY     45 Very High Poverty (>40%)       0       0       0       0       1       0       0       0       0       0
6     KT     50 Very High Poverty (>40%)       0       0       1       0       0       0       0       0       0       0
7     UT     10  Very Low Poverty (<10%)       0       0       0       0       0       0       0       0       1       0
8     CO     15     Low Poverty (10-20%)       0       1       0       0       0       0       0       0       0       0
9     PA     25  Medium Poverty (20-30%)       0       0       0       0       0       1       0       0       0       0
10    SC     40    High Poverty (30-40%)       0       0       0       0       0       0       1       0       0       0
  • Related