I've created a start condition (for strings) in Flex, and everything works fine. However, when I parse the same string twice, the elements matched under the start condition vanish. How can I solve this?
Flex file:
%option stack noyywrap
%{
extern int lineNumber; // defined in prog.y, used by our code for \n
#include "h5parse.hpp"
#include <iostream>
#include <fstream>
using namespace std;
extern string initialdata;
extern string pdata;
extern bool loop;
string val;
string compile(string content);
string compilefile(string path);
void runwithargs(int argc ,char ** argv);
int saveoutput(string compileddata ,string outputpath="");
%}
%x strenv
i_command @include
e_command @extends
l_command @layout
f_command @field
command {i_command}|{e_command}|{l_command}|{f_command}
%%
"\"" { val.clear(); BEGIN(strenv); }
<strenv>"\"" { BEGIN(INITIAL);sprintf(yylval.str,"%s",val.c_str());return(STRING); }
<strenv><<EOF>> { BEGIN(INITIAL); sprintf(yylval.str,"%s",val.c_str());return(STRING); }
<strenv>. { val+=yytext[0]; }
{command} {sprintf(yylval.str,"%s",yytext);return (COMMAND);}
"(" { return LPAREN; }
")" { return RPAREN; }
"{" { return LBRACE; }
"}" { return RBRACE; }
.|\n {yylval.c=yytext[0];return TXT; }
%%
//our main function
int main(int argc,char ** argv)
{
if(argc>1)runwithargs(argc,argv);// if there are arguments run with them
system("pause");//don't quit the app at the end of assembly
return(0);
}
//run h5A by using arguments
void runwithargs(int argc ,char ** argv)
{
if(argc == 2)
saveoutput(compilefile(argv[1]));
}
//assemble a string
string compile(string content)
{
do
{
loop=false;
pdata.clear();
YY_BUFFER_STATE b =yy_scan_string(content.c_str());
yyparse();
content=pdata;
}while(loop==true);
return content;
}
//assemble file
string compilefile(string path)
{
string data;
ifstream inputfile(path,ios::in|ios::binary|ios::ate);
int length = inputfile.tellg();
inputfile.seekg(0, std::ios::beg);
char * buffer = new char[length];// allocate memory for a buffer of appropriate dimension
inputfile.read(buffer, length);// read the whole file into the buffer
inputfile.close();
cout<<"start assembly : "<<path<<endl;
return compile(string(buffer));
}
//save assembled file to a specified path
int saveoutput(string compileddata ,string outputpath)
{
outputpath=(outputpath=="")?"output":outputpath;
ofstream outputfile ("output");
//show the compiled data in the console if we're in debug
outputfile<<compileddata;
cout<<compileddata<<endl;
cout<<"operation terminated successfuly , output at :"
<<outputpath<<endl;
return 0;
}
Bison file:
%{
#include <stdio.h>
#include <iostream>
#include<fstream>
#include<map>
using namespace std;
typedef void* yyscan_t;
int lineNumber; // our line counter
map <string,string> clayouts;
void yyerror ( char const *msg);
typedef union YYSTYPE YYSTYPE;
void yyerror ( char const *msg);
int yylex();
bool loop;
string pdata="";
%}
/* token definition */
%token STRING
%token COMMAND
%token LPAREN RPAREN LBRACE RBRACE
%token TXT
%union { char c; char str [0Xfff]; double real; int integer; }
%type<c> TXT;
%type<str> STRING COMMAND;
%start program
%%
program:value | command_call |txt | program program ;
value: STRING {pdata+='\"'+$1+'\"'; };
command_call : COMMAND LPAREN STRING RPAREN {
if(string($1)=="@field")
{
cout<<"define field :"<<$3;
}
else if(string($1)=="@include")
{
ifstream t;
int length;
char * buffer;
t.open($3);
t.seekg(0, std::ios::end);
length = t.tellg();
t.seekg(0, std::ios::beg);
buffer = new char[length];
t.read(buffer, length);
t.close();
pdata+=buffer;
}
else if (string($1)=="@layout")
{
cout<<"define layout for field "<<$3;
}
else if (string($1)=="@repeat")
{
cout<<"reapeat instruction"<<$3;
}
else
{
cout<<"extend with : "<<$3;
ifstream t;
int length;
char * buffer;
t.open($3);
t.seekg(0, std::ios::end);
length = t.tellg();
t.seekg(0, std::ios::beg);
buffer = new char[length];
t.read(buffer, length);
t.close();
}
loop=true;
};//LPAREN RPAREN ;
txt: TXT {pdata+=$1;};
%%
void yyerror (const char *msg)
{
cout<<msg;
}
This is the output.
Please help me understand why the strings disappear. The full code is in my repository. Thanks in advance.
CodePudding user response:
Nothing here is disappearing and you're not parsing the same string twice.
The second parse is on a new string which you yourself created, consisting of data copied during the first parse. So they're different strings, and neither Flex nor Bison know about any relationship between them.
The reason that the second string does not contain the same data as the first string is simple: you didn't copy all of the data. Anything you don't copy "disappears".
In particular, your scanner only sends the data between double quotes to the parser. The parser attempts to add the double quotes, but it doesn't manage because the line:
pdata += '\"' + $1 + '\"';
means
pdata += ('\"' + $1 + '\"');
Since character literals are integers and $1 is an array of characters, which decays to a character pointer, that is the same as:
pdata += &$1[68]; // '\"' is 34
which is really undefined behaviour unless $1 has at least 67 characters, but in practice will be an empty string because Bison zero initializes stack values. (You shouldn't depend on that, though.)
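To see the difference outside Bison, here is a minimal C++ sketch. The function names `quote_buggy` and `quote_fixed` are hypothetical, and `text` stands in for `$1`; the sketch assumes the buffer is zero-filled and at least 69 bytes long, so the offset pointer stays inside the array.

```cpp
#include <string>

// Hypothetical stand-in for the buggy action; `text` plays the role of $1.
// Assumption: `text` points into a zero-filled buffer of at least 69 bytes,
// so text + 68 is still inside the array.
std::string quote_buggy(const char* text) {
    std::string out;
    // char + pointer + char is pointer arithmetic: 34 + text + 34 == text + 68.
    out += '"' + text + '"';
    return out;
}

// Appending piece by piece makes every step a std::string append.
std::string quote_fixed(const char* text) {
    std::string out;
    out += '"';
    out += text;
    out += '"';
    return out;
}
```

With `char buf[0xfff] = "hello";`, `quote_buggy(buf)` yields an empty string, while `quote_fixed(buf)` yields `"hello"` including the quotes.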
In short, the second time you parse, the double quoted strings are not present, something you could easily have noted by debugging your parser actions.
Honestly, I don't think this is an appropriate architecture for a macro preprocessor. In general, you should let Flex handle reading from a file; it's good at doing that. Also, the Flex manual illustrates a couple of ways to handle "include files", and macro expansions can be incorporated using a similar technique.
Moreover, using a semantic value which occupies 4 KB is not a good way of managing memory. It can easily result in blowing up the parser stack. And constantly converting back and forth between std::string and C-style null-terminated arrays is also extremely inefficient.
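To put a rough number on the memory cost, the sketch below compares the union from the question with a hypothetical pointer-based variant. `BigValue`, `SmallValue` and `stack_bytes` are names invented here, and the sketch assumes that every slot of the parser's value stack holds one such union.

```cpp
#include <cstddef>
#include <string>

// Same members as the %union in the question.
union BigValue {
    char c;
    char str[0xfff];
    double real;
    int integer;
};

// Hypothetical alternative: the stack slot stores only a pointer, and the
// text lives in a heap-allocated std::string managed by the parser actions.
union SmallValue {
    char c;
    std::string* str;
    double real;
    int integer;
};

// Bytes needed for a value stack with `slots` entries of type T.
template <typename T>
constexpr std::size_t stack_bytes(std::size_t slots) {
    return slots * sizeof(T);
}

static_assert(sizeof(BigValue) >= 0xfff, "each slot carries the whole array");
static_assert(sizeof(SmallValue) <= 16, "each slot is roughly pointer-sized");
```

For a stack of 200 slots (an illustrative figure only), `stack_bytes<BigValue>(200)` is at least 800 KB, whereas `stack_bytes<SmallValue>(200)` is a few kilobytes at most.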
But those are different questions.