Loop through text file after filtering out comments in C

Time:08-31

I am trying to filter out comments, denoted by '#', from a text file. I am having trouble looping through the entire file and printing the output to the terminal. The code removes the first line of text and the second line's comment, as it should, but does not continue past line 2 (it prints 4, 2). Any help would be appreciated. I'm definitely missing something, as I have had to learn two semesters of C in a weekend and don't totally have a grasp on all of its usage.

The file being read

# this line is a full comment that might be pseudo-code or whatever
4, 2 # 4, 3  
1  
# 9  
7  
endNode  
endNet

The program

#include <stdio.h>
#include <string.h>
#define BUFF_SIZE 1024
#define COMMENT_MARKER '#'
int main()
{

FILE *fp;
char buffer[BUFF_SIZE];

if ((fp = fopen("F:\\PythonProjects\\C\\text.txt", "r")) == NULL)
{
    perror("Error opening file");
    exit(1);
}

while (fgets(buffer, BUFF_SIZE, fp) != NULL)
{
    char *comment = strchr(buffer, COMMENT_MARKER);
    if (comment != NULL)
    {
        size_t len = strlen(comment);
        memset(comment, '\0', len);
        printf("%s", buffer);
        
    }
}
fclose(fp);
}

CodePudding user response:

Your current code only prints out a line if a # is found in it. It skips printing lines without any comment. And because you set everything from the first # to the end of the string to nul bytes, it won't print a newline after each line, meaning the results all run together.

You can fix these issues by moving the output after the comment removal block, and always printing out a newline. This means that in lines without comments, you have to do something about the newline at the end (if any; it could be missing because of a long line or the input file lacking one after the last line) so you don't get two newlines after each non-comment line.

Luckily, there are ways in standard C to find the first occurrence of any one of a set of characters, not just a single character. You can look for either the comment character or a newline in a single pass through the line, and replace it with a single nul byte; there is no need to memset() everything after it to 0s. Example:

#include <stdio.h>
#include <string.h>
#include <stdlib.h> // Needed for exit()

#define BUFF_SIZE 1024
#define COMMENT_MARKER '#'

int main()
{
  FILE *fp;
  char buffer[BUFF_SIZE];

  if ((fp = fopen("text.txt", "r")) == NULL)
    {
      perror("Error opening file");
      exit(1);
    }

  char tokens[3] = { COMMENT_MARKER, '\n', '\0' };
  
  while (fgets(buffer, BUFF_SIZE, fp) != NULL)
    {
      // Look for the first # or newline in the string
      char *comment_or_nl = strpbrk(buffer, tokens);
      if (comment_or_nl)
        {
          // and if found, replace it with a nul byte
          *comment_or_nl = '\0';
        }
      // Then print out the possibly-truncated string (puts() adds a newline)
      puts(buffer);
    }
  
  fclose(fp);
  return 0;
}
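
Another standard function that fits the same job is strcspn(), which returns the length of the leading part of a string that contains none of the listed characters. The following is only a minimal sketch: strip_comment is a hypothetical helper name, not something from the original code, and it assumes the line is a writable, nul-terminated buffer such as the one fgets() fills.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: cut `line` at the first '#' or '\n', in place.
   Returns the length of what remains. */
size_t strip_comment(char *line)
{
    assert(line != NULL);
    /* strcspn() counts the leading characters that are neither '#' nor '\n'.
       If neither occurs, it returns strlen(line), so the write below simply
       overwrites the existing terminator. */
    size_t keep = strcspn(line, "#\n");
    line[keep] = '\0';
    return keep;
}
```

Inside the fgets() loop, calling strip_comment(buffer) and then puts(buffer) should behave like the strpbrk() version above, without needing the tokens array or the if test.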

CodePudding user response:

Here is your code, minimally adapted to achieve the objective.

#include <stdio.h>
#include <string.h>
#include <stdlib.h> // needed for exit()
#define BUFF_SIZE 1024
#define COMMENT_MARKER '#'

int main()
{
    FILE *fp;
    if ((fp = fopen("F:\\PythonProjects\\C\\text.txt", "r")) == NULL)
    {
        perror("Error opening file");
        exit(1);
    }

    char buffer[BUFF_SIZE]; // declare variables proximate to use
    while (fgets(buffer, BUFF_SIZE, fp) != NULL)
    {
        char *comment = strchr(buffer, COMMENT_MARKER);
        if (comment != NULL)
        {
            strcpy( comment, "\n" ); // Just clobber the comment section
        }
        printf("%s", buffer); // Always print something
    }
    fclose(fp);
}
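
This version still prints an empty line wherever the input had a comment-only line. If those blank lines are unwanted (an assumption; the question does not say either way), one option is to check whether anything other than whitespace is left before printing. A sketch, with is_blank as a hypothetical helper name:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if `s` holds only whitespace (or nothing). */
bool is_blank(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s)
    {
        /* Cast to unsigned char: passing a negative char to isspace()
           is undefined behavior. */
        if (!isspace((unsigned char)*s))
            return false;
    }
    return true;
}
```

In the loop, the unconditional printf("%s", buffer) would then become if (!is_blank(buffer)) printf("%s", buffer);. Note that this also drops lines that were already empty in the input, which may or may not be what you want.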