R - get z-score equivalents

Time:07-11

Assume we have the following numeric vector in R, whose values can range from 1 to 5:

vec <- c(4.6, 1.2, 3.5, 2.1, 3, 1.1, 4.6)

It would be pretty easy to calculate z-scores on this vector:

scale(vec)


            [,1]
[1,]  1.17822372
[2,] -1.13927417
[3,]  0.42844499
[4,] -0.52581885
[5,]  0.08763647
[6,] -1.20743587
[7,]  1.17822372
attr(,"scaled:center")
[1] 2.871429
attr(,"scaled:scale")
[1] 1.4671

However, what I want is a data frame with two columns. The first column shows the integer values that vec can take, and the second column shows the equivalent z-scores, given the observed data.

I just don't know how to construct the second column. If the data happens to contain integer numbers, such as in my example, it's easy to find that if vec = 3, vec_z = 0.08763647. However, these actual integers are rarely found in the data that I am dealing with. So what would be the fastest way to construct this?

EDIT

I already got two answers that just suggested merging vec and vec_z into a data frame. Both authors deleted their answers. This is not what I am asking. Please pay attention to the part of my question where I say that I need integers in the first column. So, given vec, how do I make a data frame like this:

A B
1 
2
3
4
5

Here the values in B would be the z-scores corresponding to each integer in A, calculated from the mean and standard deviation of the observed data in vec.

Answer:

This might be what you're looking for.

Starting with your vector:

vec <- c(4.6, 1.2, 3.5, 2.1, 3, 1.1, 4.6)

You can use scale(), which stores the mean and standard deviation it used as attributes of its result:

s <- scale(vec)
attributes(s)

$dim
[1] 7 1

$`scaled:center`
[1] 2.871429

$`scaled:scale`
[1] 1.4671
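As a quick sanity check, the two attributes should match what mean() and sd() return for the same vector when scale() is called with its defaults. The sketch below assumes exactly that; the names center and spread are only illustrative.

```r
vec <- c(4.6, 1.2, 3.5, 2.1, 3, 1.1, 4.6)
s <- scale(vec)

# Pull the stored attributes out of the scaled matrix
center <- attr(s, "scaled:center")
spread <- attr(s, "scaled:scale")

# With default arguments, these should equal the sample mean and sample SD
all.equal(center, mean(vec))
all.equal(spread, sd(vec))
```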

Then you can use those attributes to scale whatever new integers you want to include in your final data.frame.

my_int <- 1:10

data.frame(
  int = my_int,
  scaled_val = (my_int - attr(s, "scaled:center")) / attr(s, "scaled:scale")
)

Output

   int  scaled_val
1    1 -1.27559758
2    2 -0.59398055
3    3  0.08763647
4    4  0.76925350
5    5  1.45087053
6    6  2.13248755
7    7  2.81410458
8    8  3.49572160
9    9  4.17733863
10  10  4.85895566
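If only the integers 1 to 5 from the question are needed, the same idea can be sketched without going through scale() at all, by using mean() and sd() directly. The helper name z_equiv and the column names A and B are assumptions chosen to mirror the layout asked for in the question, not part of the original answer.

```r
vec <- c(4.6, 1.2, 3.5, 2.1, 3, 1.1, 4.6)

# Hypothetical helper: z-scores of arbitrary values,
# relative to the mean and SD of the observed data
z_equiv <- function(values, observed) {
  (values - mean(observed)) / sd(observed)
}

# Integers the variable can take, alongside their z-score equivalents
result <- data.frame(A = 1:5, B = z_equiv(1:5, vec))
result
```

The row for A = 3 should agree with the scaled value of the observed 3 in vec, since both are computed from the same mean and standard deviation.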