R Software: Creating a table of r-squares from a table with multiple data series

Time:11-23

I have a data.frame with multiple columns in it. The first column is the dependent variable and the other columns are various independent variables. I'd like to create a table of all the R-squared values, where column 1 is y and each of the other columns is a different x.

Here's an example data.frame:

df <- data.frame(
  'A' = runif(20,min=0, max=100),
  'B' = runif(20,min=0, max=100),
  'C' = runif(20,min=0, max=100),
  'D' = runif(20,min=0, max=100),
  'E' = runif(20,min=0, max=100)
)

and I'm using a function to calculate R2:

rsq <- function(x, y) summary(lm(y~x,na.action = na.omit))$r.squared

I would like the output to look like this:

          A.B         A.C         A.D         A.E 
1 0.009213715 0.009213715 0.009213715 0.009213715 

I know I could hard code the table this way:

r2_df<- data.frame(
  'A~B'=rsq(x=df$B,y=df$A),
  'A~C'=rsq(x=df$C,y=df$A),
  'A~D'=rsq(x=df$D,y=df$A),
  'A~E'=rsq(x=df$E,y=df$A)
)

But here's the kicker: my data frame will change from time to time, with different data series and a different number of columns. "A" will stay the same, but next time I pull the data I may end up with columns "A", "B", "X", "Y", "Z", "P", "O", "S". So I don't want to hard-code anything; I'd like to just set A as y and have it loop through the rest of the columns to produce the table. I'm new to R, and I'm struggling to get an apply function to produce anything.

Thank you for any help!

CodePudding user response:

One option is to loop over every column except the first with lapply, apply the rsq function to each of those columns against the 'A' column, rename the elements of the resulting list, and then coerce the list to a data.frame:

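# Regress A on each of the other columns; gives a named list of R-squared values
lst1 <- lapply(df[-1], function(x) rsq(x, df$A))
# Prefix each name with "A." to label the pair
names(lst1) <- paste0("A.", names(lst1))
# Coerce to a one-row data.frame, one column per predictor
as.data.frame(lst1)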

Output:

        A.B       A.C         A.D        A.E
1 0.1514966 0.1207118 0.003884215 0.02558644

NOTE: The values differ from those in the question because the data was generated with runif and no seed was set with set.seed.
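
If the name of the first column might change as well, the same idea can be wrapped in a small function that reads the name from the data instead of hard-coding "A". The following is only a minimal sketch: rsq_table is a hypothetical helper name (not part of base R), the three-column data frame is a shortened stand-in for the one in the question, and set.seed(42) is added purely so the example is reproducible.

```r
set.seed(42)  # assumption: seed added only to make the example reproducible

# Shortened stand-in for the data frame in the question
df <- data.frame(
  A = runif(20, min = 0, max = 100),
  B = runif(20, min = 0, max = 100),
  C = runif(20, min = 0, max = 100)
)

# Same R-squared function as in the question
rsq <- function(x, y) summary(lm(y ~ x, na.action = na.omit))$r.squared

# Hypothetical helper: the first column is y, every other column is x in turn
rsq_table <- function(data) {
  y_name <- names(data)[1]
  # vapply returns a named numeric vector, one R-squared per predictor column
  r2 <- vapply(data[-1], function(x) rsq(x, data[[1]]), numeric(1))
  # Label each value as "<y>.<x>", e.g. "A.B"
  names(r2) <- paste(y_name, names(r2), sep = ".")
  # Coerce to a one-row data.frame
  as.data.frame(as.list(r2))
}

r2_df <- rsq_table(df)
print(r2_df)
```

vapply is used here instead of lapply only because it checks that each result is a single number; the lapply version above should work just as well.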
