Reorder a Dataframe in R-CodePudding

I want to reorder some tables in r.

I extracted all the values from lmer() and confint() functions. Maybe there is another way to summarise the lmer()-results directly? (of various lmer-functions)

My tables look like this for example:

a	loc_map_column	loc_stim_column	logVelocity	confidence_lower	confidence_upper	p_value
1	RA	NSR	-0.502600888	-0.8253034	-0.17987440	4.045130e-02
2	RA	loc_stimLA	-0.034630279	-0.4379126	0.36848580	2.084996e-29
3	RA	loc_stimRA	-0.691259575	-1.0801617	-0.30230958	2.582846e-08
4	LV	NSR	-0.182731047	-0.8182401	0.46092692	5.829759e-01
5	LV	loc_stimLV	-0.236928272	-1.0013429	0.53562934	4.102663e-01
6	LV	loc_stimRA	-0.913238563	-1.6754108	-0.13692249	2.867447e-27
7	LV	loc_stimRV	-0.042520352	-0.8544059	0.77911468	1.212858e-01
8	LA	NSR	0.007310606	-0.3134600	0.32826335	9.650069e-01
9	LA	loc_stimRA	-0.340983839	-0.7728683	0.09108281	1.413686e-09
10	RV	NSR	-0.351529452	-0.8852501	0.18298935	2.710587e-01
11	RV	loc_stimLA	-0.622877246	-1.3220944	0.07870576	1.422310e-03
12	RV	loc_stimLV	-0.175692036	-0.8070004	0.45670750	4.382634e-04
13	RV	loc_stimRA	-0.535687639	-1.1715584	0.10106017	4.286051e-04

What I need is a table of this style:

loc_stim_column		RA	LA	RV	LV
NSR	logVelocity	-0.502600888	...	...	...
	conficence_upper	-0.17987440	...	...	...
	confidence_lower	-0.8253034	...	...	...
	p_value	4.045130e-02	...	...	...
loc_stimRA	logVelocity	-0.691259575	...	...	...
	conficence_upper	-0.30230958	...	...	...
	confidence_lower	-1.0801617	...	...	...
	p_value	2.582846e-08	...	...	...
loc_stimLA	logVelocity	-0.691259575	...	...	...
	conficence_upper	0.36848580	...	...	...
	confidence_lower	-0.4379126	...	...	...
	p_value	2.084996e-29	...	...	...
loc_stimRV	logVelocity	NA	...	...	...
	conficence_upper	NA	...	...	...
	confidence_lower	NA	...	...	...
	p_value	NA	...	...	...
loc_stimLV	logVelocity	NA	...	...	...
	conficence_upper	NA	...	...	...
	confidence_lower	NA	...	...	...
	p_value	NA	...	...	...

How can I do that?

This is a code for a data frame to try out:

loc_map_column <- c("RA","RA","RA","LV","LV","LV","LV","LA","LA","RV","RV","RV","RV")
loc_stim_column <- c("NSR","loc_stimLA","loc_stimRA","NSR","loc_stimLV","loc_stimRA","loc_stimRV","NSR","loc_stimRA","NSR","loc_stimLA","loc_stimLV","loc_stimRA")
logVelocity <- c(-0.502600888,-0.034630279,-0.691259575,-0.182731047,-0.236928272, -0.913238563,-0.042520352,0.007310606,-0.340983839,-0.351529452,-0.622877246,-0.175692036,-0.535687639)
confidence_lower <- c(0.2, 0.3, 0.1, 0.4, 0.22, 0.11, 0.33, 0.44, 0.63, 0.133, 0.311, 0.01, 0.001)
confidence_upper <- c(0.2, 0.3, 0.1, 0.4, 0.22, 0.11, 0.33, 0.44, 0.63, 0.133, 0.311, 0.01, 0.001)
p_value <- c(0.2, 0.3, 0.1, 0.4, 0.22, 0.11, 0.33, 0.44, 0.63, 0.133, 0.311, 0.01, 0.001)

summarydf <- data.frame(loc_map_column, loc_stim_column, logVelocity, confidence_lower, confidence_upper, p_value)

CodePudding user response：

Based on the data you've provided, we can do this in two steps with the tidyr library.

library(tidyr)

The first step is to make the data longer, we do this with the function pivot_longer, selecting to put all of the numerical columns into one column with the function where:

longer <- pivot_longer(summarydf, cols = where(is.numeric), names_to = 'measure')
longer

This gives us:

# A tibble: 52 x 4
   loc_map_column loc_stim_column measure            value
   <chr>          <chr>           <chr>              <dbl>
 1 RA             NSR             logVelocity      -0.503 
 2 RA             NSR             confidence_lower  0.2   
 3 RA             NSR             confidence_upper  0.2   
 4 RA             NSR             p_value           0.2   
 5 RA             loc_stimLA      logVelocity      -0.0346
 6 RA             loc_stimLA      confidence_lower  0.3   
 7 RA             loc_stimLA      confidence_upper  0.3   
 8 RA             loc_stimLA      p_value           0.3   
 9 RA             loc_stimRA      logVelocity      -0.691 
10 RA             loc_stimRA      confidence_lower  0.1   
# ... with 42 more rows

We now need to go wider again, we use the function pivot_wider. We want that the column loc_map_column become the names of the numeric columns. With the previous step all of the numeric columns were mapped to a column called 'value'

pivot_wider(longer,  names_from = loc_map_column, values_from = value)

We can wrap these two steps into one with the pipe operator (%>%) to get

summarydf %>%
  pivot_longer(cols = where(is.numeric), names_to = 'measure') %>%
  pivot_wider(names_from = loc_map_column, values_from = value)

Which gives the output

# A tibble: 24 x 6
   loc_stim_column measure               RA     LV       LA     RV
   <chr>           <chr>              <dbl>  <dbl>    <dbl>  <dbl>
 1 NSR             logVelocity      -0.503  -0.183  0.00731 -0.352
 2 NSR             confidence_lower  0.2     0.4    0.44     0.133
 3 NSR             confidence_upper  0.2     0.4    0.44     0.133
 4 NSR             p_value           0.2     0.4    0.44     0.133
 5 loc_stimLA      logVelocity      -0.0346 NA     NA       -0.623
 6 loc_stimLA      confidence_lower  0.3    NA     NA        0.311
 7 loc_stimLA      confidence_upper  0.3    NA     NA        0.311
 8 loc_stimLA      p_value           0.3    NA     NA        0.311
 9 loc_stimRA      logVelocity      -0.691  -0.913 NA       -0.536
10 loc_stimRA      confidence_lower  0.1     0.11  NA        0.001
# ... with 14 more rows

CodePudding user response：

It would be helpful to have some available format for experimenting with your dataframe. However, I would use a combination of group_by() and pivot_wider() from the libraries: dplyr and tidyr.

https://tidyr.tidyverse.org/reference/pivot_wider.html

Hope it helps as a start... If you provide some copypaste format I can give it a try.

CodePudding user response：

Maybe there's a way to do it clearer, but this code work to accomplish what you're looking for. You can try something like this

data <- data.frame(a = c(1:13),
                   loc_map_column = c("RA","RA","RA","LV","LV","LV","LV",
                                      "LA","LA","RV","RV","RV","RV"),
                   loc_stim_column = c("NSR","loc_stimLA","loc_stimRA","NSR",
                                       "loc_stimLV","loc_stimRA","loc_stimRV","NSR",
                                       "loc_stimRA","NSR","loc_stimLA","loc_stimLV",
                                       "loc_stimRA"),
                   logVelocity = c(-0.502600888,-0.034630279,-0.691259575,-0.182731047,
                                   -0.236928272,-0.913238563,-0.042520352,0.007310606,
                                   -0.340983839,-0.351529452,-0.622877246,-0.175692036,
                                   -0.535687639),
                   confidence_lower = c(-0.8253034,-0.034630279,-0.691259575,-0.182731047,
                                        -0.236928272,-0.913238563,-0.042520352,0.007310606,
                                        -0.340983839,-0.351529452,-0.622877246,-0.175692036,
                                        -0.535687639),
                   confidence_upper = c(-0.1798744,0.3684858,-0.30230958,0.46092692,
                                        0.53562934,-0.13692249,0.77911468,0.32826335,
                                        0.09108281,0.18298935,0.07870576,0.4567075,
                                        0.10106017),
                   p_value = c(0.0404513,2.084996E-29,0.00000002582846,0.5829759,
                               0.4102663,2.867447E-27,0.1212858,0.9650069,
                               0.000000001413686,0.2710587,0.00142231,0.0004382634,
                               0.0004286051))
data <- data[,-1]


dtalogV <- data %>% group_by(loc_stim_column) %>% 
  pivot_wider(names_from = loc_map_column,values_from = logVelocity,
              id_cols = loc_stim_column)
dtaconfL <- data %>% group_by(loc_stim_column) %>% 
  pivot_wider(names_from = loc_map_column,values_from = confidence_lower,
              id_cols = loc_stim_column)
dtaconfU <- data %>% group_by(loc_stim_column) %>% 
  pivot_wider(names_from = loc_map_column,values_from = confidence_upper,
              id_cols = loc_stim_column)
dtaPval <- data %>% group_by(loc_stim_column) %>% 
  pivot_wider(names_from = loc_map_column,values_from = p_value,
              id_cols = loc_stim_column)

varnames <- c(rep("logVelocity",length(dtalogV)),rep("confidence_lower",length(dtaconfU)),
                  rep("confidence_upper",length(dtaPval)),rep("p_value",length(dtaconfL)))


databind <-  rbind(dtalogV,dtaconfL,dtaconfU,dtaPval)
final <- data.frame(databind,varnames=varnames)
final <- final[,c(1,6,2:5)]