I'm trying to input a line at the end of a file that has the following shape "1 :1 :1 :1" , so at some point the file may have a new line character at the end of it, and in order to execute the operation I have to deal with that, so I came up with the following solution : go to the end of the file and go backward by 1 characters (the length of the new line character in Linux OS as I guess), read that character and if it wasn't a new line character insert a one and then insert the whole line else go and insert the line, and this is the translation of that solution on C :
int insert_element(char filename[]){
elements *elem;
FILE *p,*test;
size_t size = 0;
char *buff=NULL;
char c='\n';
if((p = fopen(filename,"a"))!=NULL){
if(test = fopen(filename,"a")){
fseek(test,-1,SEEK_END );
c= getc(test);
if(c!='\n'){
fprintf(test,"\n");
}
}
fclose(test);
p = fopen(filename,"a");
fseek(p,0,SEEK_END);
elem=(elements *)malloc(sizeof(elements));
fflush(stdin);
printf("\ninput the ID\n");
scanf("%d",&elem->id);
printf("input the adress \n");
scanf("%s",elem->adr);
printf("innput the type \n");
scanf("%s",elem->type);
printf("intput the mark \n");
scanf("%s",elem->mark);
fprintf(p,"%d :%s :%s :%s",elem->id,elem->adr,elem->type,elem->mark);
free(elem);
fflush(stdin);
fclose(p);
return 1;
}else{
printf("\nRrror while opening the file !\n");
return 0;
}
}
as you may notice that the whole program depends on the length of the new line character (1 character "\n") so I wonder if there is an optimal way, in another word works on all OS's
CodePudding user response:
It seems you already understand the basics of appending to a file, so we just have to figure out whether the file already ends with a newline.
In a perfect world, you'd jump to the end of the file, back up one character, read that character, and see if it matches '\n'
. Something like this:
FILE *f = fopen(filename, "r");
fseek(f, -1, SEEK_END); /* this is a problem */
int c = fgetc(f);
fclose(f);
if (c != '\n') {
/* we need to append a newline before the new content */
}
Though this will likely work on Posix systems, it won't work on many others. The problem is rooted in the many different ways systems separate and/or terminate lines in text files. In C and C , '\n'
is a special value that tells the text mode output routines to do whatever needs to be done to insert a line break. Likewise, the text mode input routines will translate each line break to '\n'
as it returns the data read.
On Posix systems (e.g., Linux), a line break is indicated by a line feed character (LF) which occupies a single byte in UTF-8 encoded text. So the compiler just defines '\n'
to be a line feed character, and then the input and output routines don't have to do anything special in text mode.
On some older systems (like old MacOS and Amiga) a line break might be a represented by a carriage return character (CR). Many IBM mainframes used different character encodings called EBCDIC that don't have a direct mappings for LF or CR, but they do have a special control character called next line (NL). There were even systems (like VMS, IIRC) that didn't use a stream model for text files but instead used variable length records to represent each line, so the line breaks themselves were implicit rather than marked by a specific control character.
Most of those are challenges you won't face on modern systems. Unicode added more line break conventions, but very little software supports them in a general way.
The remaining major line break convention is the combination CR LF. What makes CR LF challenging is that it's two control characters, but the C i/o functions have to make them appear to the programmer as though they are the single character '\n'
. That's not a big deal with streaming text in or out. But it makes seeking within a file hard to define. And that brings us back to the problematic line:
fseek(f, -1, SEEK_END);
What does it mean to back up "one character" from the end on a system where line breaks are indicated by a two character sequence like LF CR? Do we really want the i/o system to have to possibly scan the entire file in order for fseek
(and ftell
) to figure out how to make sense of the offset?
The C standards people punted. In text mode, the offset argument for fseek
can only be 0
or a value returned by a previous call to ftell
. So the problematic call, with a negative offset, isn't valid. (On Posix systems, the invalid call to fseek
will likely work, but the standard doesn't require it to.)
Also note that Posix defines LF as a line terminator rather than a separator, so a non-empty text file that doesn't end with a '\n'
should be uncommon (though it does happen).
For a more portable solution, we have two choices:
Read the entire file in text mode, remembering whether the most recent character you read was
'\n'
.This option is hugely inefficient, so unless you're going to do this only occasionally or only with short files, we can rule that out.
Open the file in binary mode, seek backwards a few bytes from the end, and then read to the end, remembering whether the last thing you read was a valid line break sequence.
This might be a problem if our
fseek
doesn't support theSEEK_END
origin when the file is opened in binary mode. Yep, the C standard says supporting that is optional. However, most implementations do support it, so we'll keep this option open.Since the file will be read in binary mode, the input routines aren't going to convert the platform's line break sequence to
'\n'
. We'll need a state machine to detect line break sequences that are more than one byte long.Let's make the simplifying assumption that a line break is either LF or CR LF. In the latter case, we don't care about the CR, so we can simply back up one byte from the end and test whether it's LF.
Oh, and we have to figure out what to do with an empty file.
bool NeedsLineBreak(const char *filename) {
const int LINE_FEED = '\x0A';
FILE *f = fopen(filename, "rb"); /* binary mode */
if (f == NULL) return false;
const bool empty_file = fseek(f, 0, SEEK_END) == 0 && ftell(f) == 0;
const bool result = !empty_file ||
(fseek(f, -1, SEEK_END) == 0 && fgetc(f) == LINE_FEED);
fclose(f);
return result;
}