I'm trying to read a .txt file and save all sentences that end with ., ! or ? into an array. I use getline() and strtok() to do this. When I save the sentences, it seems to work. But when I try to retrieve the data later by index, the first sentence is missing.
The input is in a file input.txt with the content below:
The wandering earth! In 2058, the aging Sun? is about to turn into a red .giant and threatens to engulf the Earth's orbit!
Below is my code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main() {
    FILE *fp = fopen("input.txt", "r ");
    char *line = NULL;
    size_t len = 0;
    char *sentences[100];
    if (fp == NULL) {
        perror("Cannot open file!");
        exit(1);
    }
    char delimit[] = ".!?";
    int i = 0;
    while (getline(&line, &len, fp) != -1) {
        char *p = strtok(line, delimit);
        while (p != NULL) {
            sentences[i] = p;
            printf("sentences [%d]=%s\n", i, sentences[i]);
            i++;
            p = strtok(NULL, delimit);
        }
    }
    for (int k = 0; k < i; k++) {
        printf("sentence is ----%s\n", sentences[k]);
    }
    return 0;
}
The output is:
sentences [0]=The wandering earth
sentences [1]= In 2058, the aging Sun
sentences [2]= is about to turn into a red
sentences [3]=giant and threatens to engulf the Earth's orbit
sentence is ----
sentence is ---- In 2058, the aging Sun
sentence is ---- is about to turn into a red
sentence is ----giant and threatens to engulf the Earth's orbit
When I use strtok() to split a string directly, it works fine.
CodePudding user response:
- Changed the mode from "r " to "r".
- Changed the list of delimiters from a variable to a constant, DELIMITERS, and added '\n'. You may or may not want that '\n' in there, but I would need to see the expected output now that you have supplied the input. vim, at least, ends the last line with a '\n', which would generate at least one '\n' token at the end. The other option is to remove leading and trailing whitespace, and if you end up with an empty string, don't add it as a sentence.
- Introduced a constant for the number of sentences, and ignored additional sentences beyond what we have space for.
- Combined the two strtok() calls (DRY).
- Eliminated the two memory leaks.
- If your input contains multiple lines, the contents of line will be overwritten. This means the pointers in sentences no longer make sense. The easiest fix is to strdup() each string. Another approach would be to retain an array of line pointers (for a subsequent free()) and have getline() allocate a new line each time by resetting line = NULL and len = 0.
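The second approach from the last bullet (keeping every line buffer alive rather than copying each token) might look like the sketch below. The sentence_list struct, read_sentences(), free_sentences() and MAX_LINES are hypothetical names introduced for illustration, and the limits of 100 lines and 100 sentences are assumptions.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DELIMITERS ".!?\n"
#define MAX_LINES 100       /* assumed limit, for illustration only */
#define MAX_SENTENCES 100

/* Hypothetical container: owns the line buffers; the sentences
 * are pointers into those buffers, not separate allocations. */
struct sentence_list {
    char *lines[MAX_LINES];
    size_t line_count;
    char *sentences[MAX_SENTENCES];
    size_t sentence_count;
};

static void read_sentences(FILE *fp, struct sentence_list *out) {
    out->line_count = 0;
    out->sentence_count = 0;
    while (out->line_count < MAX_LINES) {
        char *line = NULL;  /* reset so getline() allocates a fresh buffer */
        size_t len = 0;
        if (getline(&line, &len, fp) == -1) {
            free(line);     /* getline() may allocate even when it fails */
            break;
        }
        out->lines[out->line_count++] = line;
        /* Tokens beyond MAX_SENTENCES are ignored. */
        for (char *p = strtok(line, DELIMITERS);
             p != NULL && out->sentence_count < MAX_SENTENCES;
             p = strtok(NULL, DELIMITERS)) {
            out->sentences[out->sentence_count++] = p;
        }
    }
}

static void free_sentences(struct sentence_list *out) {
    for (size_t i = 0; i < out->line_count; i++)
        free(out->lines[i]);
    out->line_count = 0;
    out->sentence_count = 0;
}
```

The sentences[] entries point into the lines[] buffers, so they stay valid until free_sentences() is called. The full program below uses the simpler strdup() approach instead.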
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define DELIMITERS ".!?\n"
#define SENTENCES_LEN 100
int main() {
    FILE *fp = fopen("input.txt", "r");
    if (!fp) {
        perror("Cannot open file!");
        return 1;
    }
    char *line = NULL;
    size_t len = 0;
    char *sentences[SENTENCES_LEN];
    int i = 0;
    while (getline(&line, &len, fp) != -1) {
        /* First strtok() call on a line gets the buffer, later calls get NULL. */
        char *s = line;
        for (; i < SENTENCES_LEN; i++) {
            char *sentence = strtok(s, DELIMITERS);
            if (!sentence)
                break;
            /* Copy the token: getline() reuses line on the next call. */
            sentences[i] = strdup(sentence);
            printf("sentences [%d]=%s\n", i, sentences[i]);
            s = NULL;
        }
    }
    for (int k = 0; k < i; k++) {
        printf("sentence is ----%s\n", sentences[k]);
        free(sentences[k]);
    }
    free(line);
    fclose(fp);
}
Using the supplied input file, the matching output is:
sentences [0]=The wandering earth
sentences [1]= In 2058, the aging Sun
sentences [2]= is about to turn into a red
sentences [3]=giant and threatens to engulf the Earth's orbit
sentence is ----The wandering earth
sentence is ---- In 2058, the aging Sun
sentence is ---- is about to turn into a red
sentence is ----giant and threatens to engulf the Earth's orbit
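The leading spaces in the output above come from the text that follows each delimiter. If they are unwanted, the whitespace-trimming option mentioned earlier could be sketched as follows. trim_in_place() is a hypothetical helper name, not part of the program above, and it assumes the buffer is writable (which holds for strtok() tokens).

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: strips leading and trailing whitespace from s by
 * modifying the buffer, and returns a pointer to the first kept character.
 * The returned pointer may differ from s, so do not pass it to free(). */
static char *trim_in_place(char *s) {
    while (isspace((unsigned char)*s))
        s++;
    size_t n = strlen(s);
    while (n > 0 && isspace((unsigned char)s[n - 1]))
        n--;
    s[n] = '\0';
    return s;
}
```

One way to use it would be to call sentence = trim_in_place(sentence) on each token before strdup(), and to skip the token when sentence[0] == '\0'. Trimming before the copy matters because the returned pointer is not necessarily the start of an allocation.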