Home > Software design >  Why does my table say “No data available” instead of zero?
Why does my table say “No data available” instead of zero?

Time:08-16

I need to create multiple new data frames based on different filters that contain two variable counts “d” & “e” based of the values in columns “a”, “b” and “c”. I have created a function for this that works as long as at least one column has a value. However, sometimes certain groups will have no answer for a, b or c. I want both d and e to both return zero in the columns when this happens but instead it says “No data available in table”. I’ve added my code below.

f_calculate_net = function(data) 
{ data %>% mutate(a = ifelse("a" %in% colnames(data), a, 0)) %>% 
mutate(b = ifelse("b" %in% colnames(data), b, 0)) %>% 
mutate(c = ifelse("c" %in% colnames(data), c, 0)) %>% 
mutate(d = ifelse(a   b   c == 0, 0, ((a/(a b))*c) a)) %>% 
mutate(e = ifelse(a   b   c == 0, 0, ((b/(a b))*c) b)) %>% 
select(d,e) }

CodePudding user response:

Your main problem here is in the way you are using ifelse. The expression "a" %in% colnames(data) always returns a length-1 logical vector (either TRUE or FALSE). So the output of the expression ifelse("a" %in% colnames(data), a, 0) will also be of length 1. It will return either the first element of a or a single 0. Since this is inside a mutate call, a will either be overwritten by the first element of a, or will be created as a column of zeros. Instead of ifelse you should use

if(!"a" %in% colnames(data)) data$a <- 0

And the same for columns b and c.

You will sometimes still get a NaN entry in columns d and e here if both a and b are 0, but c isn't, since your expression ((a/(a b))*c) a) divides by the sum of a and b. You should only check whether a b == 0, since in that case you should return a 0

So the fixed function would be something like:

f_calculate_net = function(data) { 
  
  if(!"a" %in% colnames(data)) data$a <- 0
  if(!"b" %in% colnames(data)) data$b <- 0
  if(!"c" %in% colnames(data)) data$c <- 0
  
  data %>% 
  mutate(d = ifelse(a   b == 0, 0, ((a/(a b))*c) a)) %>% 
  mutate(e = ifelse(a   b == 0, 0, ((b/(a b))*c) b)) %>% 
  select(d,e) 
}

Let's create some random data to test this:

set.seed(123)

df <- data.frame(a = rpois(5, 1), b = rpois(5, 2), c = rpois(5, 1))

df
#>   a b c
#> 1 0 0 3
#> 2 2 2 1
#> 3 1 4 1
#> 4 2 2 1
#> 5 3 2 0

And we see that we get the expected output:

f_calculate_net(df)
#>     d   e
#> 1 0.0 0.0
#> 2 2.5 2.5
#> 3 1.2 4.8
#> 4 2.5 2.5
#> 5 3.0 2.0

Created on 2022-08-15 by the reprex package (v2.0.1)

CodePudding user response:

When a and b are zero a/b is NAN. If you want this case to be zero, try change a b c == 0 to (a b) == 0

Based on Allan's explanation and comment, another possibility is to make a logical vector of the same length as the number of rows:

f_calculate_net = function(data) 
{ data %>%
    mutate(a = ifelse(rep("a" %in% colnames(data), nrow(data)), a, 0)) %>% 
    mutate(b = ifelse(rep("b" %in% colnames(data), nrow(data)), b, 0)) %>% 
    mutate(c = ifelse(rep("c" %in% colnames(data), nrow(data)), c, 0)) %>% 
    mutate(d = ifelse(a   b  == 0, 0, ((a/(a b))*c) a)) %>% 
    mutate(e = ifelse(a   b  == 0, 0, ((b/(a b))*c) b)) %>% 
    select(d,e) }
  •  Tags:  
  • r
  • Related