I am trying to parse a text file that's formatted like this:
XB0136;4310136;28;10
XB0136;4310136;29;C
XB0139;4310188;30;5
XB0145;4254875;31;20
As you can see, there's a pattern: every line holds values related to a serial number (the first field; fields are separated by ";").
I want to search for a given serial number and collect the corresponding data. A serial number can appear more than once in the file (the first two lines share one, but their data differs), and in that case I want the data from every matching line.
My attempt was to open the file, read everything into an array, then tokenize the array using "\n" as the first delimiter and ";" as the second delimiter.
int main()
{
char matricola[50];
printf("insert serial number: \n");
scanf("%s", matricola);
FILE *fp=fopen("prova.txt","r");
if (!fp){
printf("file doesnt exist\n");
return -1;
}
fseek(fp, 0, SEEK_END);
unsigned int size=(ftell(fp));
rewind(fp);
if(size==-1){
printf("file is empty\n");
return -1;
}
if(size!=0) // if file not empty
{
printf("file exists and it is %u bytes\n", size);
char *delim = "\n", *delim2 = ";";
char buffer[size];
int rows = 25; // approx
int lines = ((size*sizeof(char)/rows) + 100); // approx
char matrice[lines][rows];
fread(buffer,sizeof(buffer),1,fp);
fclose(fp);
char *svptr1, *svptr2;
char *token = strtok_r(buffer, delim, &svptr1);
int k=0;
while (token!=NULL)
{
strcpy(matrice[k],token);
token = strtok_r(NULL, delim, &svptr1);
k++;
}
}
return 1;
}
Here I managed to have an array of arrays where every index is a line of my txt file. But from here I really don't know what to do, I tried using strtok again but I'm getting strange behaviour. I want to check every line, see if the serial number is the one I'm searching for, and if yes save the corresponding data elsewhere. Then go to the next line.
CodePudding user response:
fgets can be used to read each line of the file.
Use strncmp to compare the first characters of the line to the serial number. strncmp returns 0 for a match.
Upon a match, sscanf can parse the fields from the line. The scanset %19[^;]; will scan up to 19 characters that are not a semicolon, then scan the semicolon.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main ( void)
{
char matricola[50] = "";
char line[100] = "";
printf("insert serial number: \n");
fgets ( matricola, sizeof matricola, stdin);
size_t length = strcspn ( matricola, "\n");
matricola[length] = 0; // remove newline
FILE *fp=fopen("prova.txt","r");
if (!fp){
printf("file doesnt exist\n");
return -1;
}
char (*matrice)[4][20] = NULL;
size_t rows = 0;
while ( fgets ( line, sizeof line, fp)) {
if ( ! strncmp ( line, matricola, length)) {
char (*temp)[4][20] = NULL;
if ( NULL == ( temp = realloc ( matrice, sizeof *matrice * ( rows + 1)))) {
fprintf ( stderr, "realloc problem\n");
free ( matrice);
return 1;
}
matrice = temp;
if ( 4 == sscanf ( line, "%19[^;];%19[^;];%19[^;];%19[^\n]"
, matrice[rows][0]
, matrice[rows][1]
, matrice[rows][2]
, matrice[rows][3])) {
++rows;
}
}
}
fclose ( fp); // done reading the file
for ( size_t each = 0; each < rows; ++each) {
printf ( "%s\n", matrice[each][0]);
printf ( "%s\n", matrice[each][1]);
printf ( "%s\n", matrice[each][2]);
printf ( "%s\n\n", matrice[each][3]);
}
free ( matrice);
return 0;
}
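One caveat with the strncmp test above: it only compares as many characters as were typed, so entering XB013 would also match the XB0136 lines, and an empty input would match every line. A minimal sketch of a stricter check, assuming the serial number is always the first field and is always followed by ";". The helper name serial_matches is made up for illustration and is not part of the code above:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true only when the first field of `line`
   is exactly `serial`, not merely a prefix of a longer serial. */
static bool serial_matches(const char *line, const char *serial)
{
    size_t length = strlen(serial);
    if (length == 0) {
        return false;   /* an empty input should match nothing */
    }
    /* the prefix must match AND be followed by the field delimiter */
    return strncmp(line, serial, length) == 0 && line[length] == ';';
}
```

In the loop above, the condition `! strncmp ( line, matricola, length)` could then be swapped for `serial_matches ( line, matricola)`.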
CodePudding user response:
https://en.cppreference.com/w/c/string/byte/strtok:
This function is destructive: it writes the '\0' characters in the elements of the string str. In particular, a string literal cannot be used as the first argument of strtok.
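If the destructive behaviour is what is causing trouble, one option is to pull fields out without modifying the line at all. A rough sketch using only strcspn and memcpy from the standard library; the helper name get_field and its signature are invented for this example, and it assumes fields are separated by ";" and the line ends at "\n" or at the end of the string:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy field number `index` (0-based) of a
   ';'-separated line into `out`. Returns 0 on success, -1 if the
   line has too few fields or the field does not fit in `out`.
   Unlike strtok, the input line is never written to. */
static int get_field(const char *line, size_t index, char *out, size_t out_size)
{
    const char *start = line;
    for (size_t i = 0; i < index; ++i) {
        start += strcspn(start, ";\n");   /* jump to the end of this field */
        if (*start != ';') {
            return -1;                    /* fewer fields than requested */
        }
        ++start;                          /* step over the ';' */
    }
    size_t length = strcspn(start, ";\n");
    if (length >= out_size) {
        return -1;                        /* field does not fit in out */
    }
    memcpy(out, start, length);
    out[length] = '\0';
    return 0;
}
```

Because the line is only read, this also works on a string literal, which strtok cannot handle.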
I'm pretty sure you'll be happier just reading lines using fscanf.
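Something along these lines might do, as a sketch rather than a finished solution. It assumes every record has exactly four fields of at most 19 characters each, and the function name print_matches is made up for this example:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print the data fields of every record whose
   serial number equals `serial`; returns the number of matches.
   Reading stops at the first record that does not fit the format. */
static size_t print_matches(FILE *fp, const char *serial)
{
    char field[4][20];
    size_t found = 0;
    /* the leading space skips the newline left over from the previous record */
    while (fscanf(fp, " %19[^;];%19[^;];%19[^;];%19[^\n]",
                  field[0], field[1], field[2], field[3]) == 4) {
        if (strcmp(field[0], serial) == 0) {
            printf("%s %s %s\n", field[1], field[2], field[3]);
            ++found;
        }
    }
    return found;
}
```

Call it with the FILE pointer returned by fopen and the serial number typed by the user. Since the whole first field is compared with strcmp, a partial serial number does not match.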
On another note, this is a semicolon-separated values file, and there are many libraries that read such files reliably. Don't do this to yourself: C is really not a very good (or safe to use) language for string processing, and the built-in utilities like strtok are really not that great (they're also 50 years old!). Many people automatically switch to languages other than C for text-processing-heavy tasks because, in all honesty, C is not that well equipped for this particular set of tasks compared to other languages.