Not able to find my segmentation fault in this code of t-test

I have written this program for t-test. I'll add other functions as well, but first, I need to find my error. Here's my code

# include <stdio.h>
# include <math.h>

float mean(float x[], int size)
{
  float sum = 0.0;
  for (int i=0; i<size;i++)
    sum += x[i];
  return sum/size;
}

float sumsq(float x[], int size)
{
  float sum = 0.0;
  for (int i=0; i<size;i++)
    sum += pow(x[i]-mean(x,size),2);
  return sum;
}

int input(n)
{
  
  float x[n];
  printf("Enter the values one by one");
  for (int i = 0; i<n;i++)
    scanf("%f", &x[i]);
  return x;
}

void t_check(float x)// Make sure to write this function before each of the t-tests. That is because it is of void type. If the t-test is done before the checking function is declared, then it assumes it's datatype to be "int", and we get an error. So either write the t-check function before those functions, or just define it at the beginning of the program
{
  float t_tab;
  printf("Enter the tabulated value of t");
  scanf("%f",&t_tab);
  if (x<t_tab)
    printf("We do not have enough evidence to reject the null hypothesis");
  else
    printf("Reject the null hypothesis");
}


float t_diff_of_means()
{
  float x=0.0,y=0.0,s1=0.0,s2=0.0,S=0.0,t=0.0,tcal;
  int n,m,a,b;
  printf("Enter the number of variables in population 1");
  scanf("%d", &n);
  a = input(n);
  printf("Enter the number of variables in population 2");
  scanf("%d", &m);
  b = input(m);
  x = mean(a,n);
  y = mean(b,m);
  s1 = sumsq(a, n);
  s2 = sumsq(b, m);
  S = sqrt((s1+s2)/(n+m-2));
  t = (x-y)/(S*sqrt(1.0/n+1.0/m));
  t_check(t);
}


int main(void) 
{
  t_diff_of_means();
  return 0;
}

It gives a segmentation fault. I can't see where my code uses memory that is not allocated to it.

CodePudding user response：

The main issue is that you expect input() to read an array of floats, but it is declared to return an int. You should also declare the type of the argument n. You cannot return the address of a local variable, because the variable no longer exists once input() returns; that truncated, dangling address is then dereferenced in mean(), which is the likely cause of the segmentation fault. The easiest option is to declare the array in the caller (t_diff_of_means() in the code below), then pass it to input() to populate (pun). Not fixed in the code below: check the return value of scanf(), otherwise the variable you expect to be initialized may not be.
t_diff_of_means() is declared to return a float but nothing is returned. Not sure what you want to return, so I changed the return type to void.
Tweaked various prompts to make them more readable.

#include <stdio.h>
#include <math.h>

float mean(float x[], int size)
{
    float sum = 0.0;
    for (int i=0; i<size;i++)
        sum += x[i];
    return sum/size;
}

float sumsq(float x[], int size)
{
    float sum = 0.0;
    for (int i=0; i<size;i++)
        sum += pow(x[i]-mean(x,size),2);
    return sum;
}

void input(size_t n, float a[n])
{
    printf("Enter the values one by one: ");
    for (int i = 0; i<n;i++)
        scanf("%f", a + i);
}

void t_check(float x)
{
    float t_tab;
    printf("Enter the tabulated value of t: ");
    scanf("%f",&t_tab);
    if (x<t_tab)
        printf("We do not have enough evidence to reject the null hypothesis\n");
    else
        printf("Reject the null hypothesis\n");
}


void t_diff_of_means()
{
    float x=0.0,y=0.0,s1=0.0,s2=0.0,S=0.0,t=0.0;
    int n,m;
    printf("Enter the number of variables in population 1: ");
    scanf("%d", &n);
    float a[n];
    input(n, a);
    printf("Enter the number of variables in population 2: ");
    scanf("%d", &m);
    float b[m];
    input(m, b);
    x = mean(a,n);
    y = mean(b,m);
    s1 = sumsq(a, n);
    s2 = sumsq(b, m);
    S = sqrt((s1+s2)/(n+m-2));
    t = (x-y)/(S*sqrt(1.0/n+1.0/m));
    t_check(t);
}


int main(void)
{
    t_diff_of_means();
    return 0;
}

And an example run:

Enter the number of variables in population 1: 2 
Enter the values one by one: 1
2
Enter the number of variables in population 2: 2
Enter the values one by one: 2
3
Enter the tabulated value of t: 0.05
We do not have enough evidence to reject the null hypothesis
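
The scanf() check mentioned above is not part of the listing. A minimal sketch of what it could look like follows; read_floats() is a hypothetical helper (not in the original code) that takes a FILE * so it can be exercised without typing at a terminal, and it relies only on the standard fact that fscanf() returns the number of successful conversions.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the original answer: reads up to n floats
 * from `in` into a[] and returns how many were actually converted. */
size_t read_floats(FILE *in, size_t n, float a[])
{
    size_t i;
    for (i = 0; i < n; i++) {
        /* fscanf() returns the number of successful conversions;
         * anything other than 1 means a[i] was not written. */
        if (fscanf(in, "%f", &a[i]) != 1)
            break;
    }
    return i;
}
```

input() could then call read_floats(stdin, n, a) and treat any result other than n as an error, instead of computing a mean over uninitialized elements.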

Consider eliminating the variables you only use once (x, y, s1, s2, S and t, plus the unused tcal):

    t_check(
        (mean(a, n) - mean(b, m)) / (sqrt((sumsq(a, n) + sumsq(b, m))/(n+m-2))*sqrt(1.0/n+1.0/m))
    );

Then I observed that this only depends on the variables a, n, b and m, so I pushed that calculation into t_check():

void t_check(size_t a_len, float a[a_len], size_t b_len, float b[b_len]) {
    float t = (mean(a, a_len) - mean(b, b_len)) / (sqrt((sumsq(a, a_len) + sumsq(b, b_len))/(a_len+b_len-2))*sqrt(1.0/a_len+1.0/b_len));
    // ...
}
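
For reference, the one-line expression is easier to check against its mathematical form. Reading it off the code, it appears to be the pooled two-sample t statistic; here x and y stand for the two samples (a and b in the code), n and m for their lengths, and the two sums are what sumsq() returns:

```latex
\[
t = \frac{\bar{x} - \bar{y}}{S \sqrt{\dfrac{1}{n} + \dfrac{1}{m}}},
\qquad
S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 + \sum_{j=1}^{m} (y_j - \bar{y})^2}{n + m - 2}}
\]
```

Note that the denominator n + m - 2 is zero when both samples have a single element, so the code would need at least three values in total to produce a finite t.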

Then I changed the length types to size_t and used the clearer variable names in t_diff_of_means():

void t_diff_of_means()
{
    printf("Enter the number of variables in population 1: ");
    size_t a_len;
    scanf("%zu", &a_len);
    float a[a_len];
    input(a_len, a);

    printf("Enter the number of variables in population 2: ");
    size_t b_len;
    scanf("%zu", &b_len);
    float b[b_len];
    input(b_len, b);

    t_check(a_len, a, b_len, b);
}

We could take this a step further by observing that the first two sections of t_diff_of_means() are very similar, so input() could take a prompt and a pointer to an array of floats, and also report the number of elements read. input() would then need to allocate the array of floats dynamically. At that point most of our functions take an array of floats plus a length argument, so let's create a type for that pair and refactor the functions to use it:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

struct array {
    size_t len;
    float *data;
};

float mean(struct array *a)
{
    float sum = 0;
    for (int i=0; i<a->len;i++)
        sum += a->data[i];
    return sum/a->len;
}

float sumsq(struct array *a)
{
    float sum = 0;
    for (int i=0; i<a->len;i++)
        sum += pow(a->data[i] - mean(a), 2);
    return sum;
}

void input(int prompt, struct array *a)
{
    printf("Enter the number of variables in population %d: ", prompt);
    scanf("%zu", &a->len);
    a->data = malloc(a->len * sizeof(a->data[0]));
    //if(!a->data) ...
    printf("Enter the values one by one: ");
    for (int i = 0; i<a->len;i++)
        scanf("%f", &a->data[i]);
}

void t_check(struct array a[2])
{
    float t = (mean(a) - mean(a + 1)) / (
        sqrt(
            (sumsq(a) + sumsq(a + 1)) / (a[0].len + a[1].len-2)
        ) * sqrt(1.0/a[0].len + 1.0/a[1].len)
    );
    );
    printf("Enter the tabulated value of t: ");
    float t_tab;
    scanf("%f",&t_tab);
    if (t<t_tab)
        printf("We do not have enough evidence to reject the null hypothesis\n");
    else
        printf("Reject the null hypothesis\n");
}

int main(void)
{
    struct array a[2];
    input(1, a);
    input(2, a + 1);
    t_check(a);
}

This would be a good base to add additional functions to.
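
The refactored input() leaves the malloc() check as a comment and nothing frees the arrays. One way to fill that in is sketched below. array_read() and array_free() are hypothetical names (not from the code above), the FILE * parameter is an assumption made so the function can be tested without a terminal, and the prompts are left out for brevity.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct array {
    size_t len;
    float *data;
};

/* Hypothetical variant of input(): reads a count followed by that many
 * floats from `in`. Returns 0 on success and -1 on any failure, in which
 * case a->data is NULL and a->len is 0. */
int array_read(FILE *in, struct array *a)
{
    a->len = 0;
    a->data = NULL;

    size_t n;
    if (fscanf(in, "%zu", &n) != 1 || n == 0)
        return -1;
    if (n > SIZE_MAX / sizeof(float))   /* n * sizeof(float) would overflow */
        return -1;

    float *data = malloc(n * sizeof *data);
    if (!data)
        return -1;

    for (size_t i = 0; i < n; i++) {
        if (fscanf(in, "%f", &data[i]) != 1) {
            free(data);                 /* do not leak on a bad value */
            return -1;
        }
    }
    a->len = n;
    a->data = data;
    return 0;
}

/* Releases the storage and resets the struct so it cannot be reused by accident. */
void array_free(struct array *a)
{
    free(a->data);
    a->data = NULL;
    a->len = 0;
}
```

main() would then call array_read(stdin, &a[0]) and array_read(stdin, &a[1]), skip t_check() if either returns -1, and call array_free() on both before returning.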