Home > Software design >  Regex to generate dynamic sql
Regex to generate dynamic sql

Time:02-25

I want to generate dynamic sql on Notepad based on some rules. These rules include everything, so no sql knowledge is needed, and are the following:

  1. Dynamic sql must have each single quote escaped by another single quote ( 'hello' becomes ''hello'')
  2. Each line should begin with " @lin"
  3. If a line has only whitespace, nothing should be following the " @lin", despite following rules
  4. Replace each \t directly following " @lin" with " @tab"
  5. Add " ' " after the @lin/@tab sequence
  6. Add a single quote at the end of line

So, as an example, this input:

select 1,'hello'
from        --two tabs exist after from

    table1

should become:

 @lin 'select 1,''hello'''
 @lin 'from     --two tabs exist after from'
 @lin
 @lin @tab 'table1'

What I have for now is the following 4 steps:

  • Replace single quote with double quotes to cover rule 1
  • Replace ^(\t*)(.*)$ with \ @lin\1\ '\2' to cover rules 2,5,6
  • Replace \t with \ @tab to cover rule 4
  • Replace (\ @tab)*\ ''$ with nothing to cover rule 3

Notice that this mostly works, except for the third replacement, which replaces all tabs, and not only the ones at the beginning. I tried (?<=^\t*)\t with no success- it matches nothing.

I'm looking for a solution which satisfies the rules in as few replacement steps as possible.

CodePudding user response:

After replacing single quotes with 2 quotes, you can do the rest in a single step:

Not very elegant for processing multiple TABs, but it works.

  • Ctrl H
  • Find what: ^(?:(\t)(\t)?(\t)?(\t)?(\t)?(\S.*)|\h*|(. ))$
  • Replace with: @lin(?1 @tab (?2@tab )(?3@tab )(?4@tab )(?5@tab )'$6')(?7 '$7')
  • CHECK Match case
  • CHECK Wrap around
  • CHECK Regular expression
  • UNCHECK . matches newline
  • Replace all

Explanation:

^       # beginning of line
    (?:     # non capture group
        (\t)    # group 1, tabulation
        (\t)?   # group 2, tabulation, optional
        (\t)?   # group 3, tabulation, optional
        (\t)?   # group 4, tabulation, optional
        (\t)?   # group 5, tabulation, optional
        (\S.*)  # group 6, a non-space character followed by 0 or more any character but newline
      |       # OR
        \h*     # 0 or more horizontal spaces
      |       # OR
        (. )    # group 7, 1 or more any character but newline
    )       # end group
$       # end of line

Replacement:

 @lin       # literally
(?1         # if group 1 exists
     @tab       # add this
    (?2@tab )   # if group 2 exists, add a second @tab 
    (?3@tab )   #   id
    (?4@tab )   #   id
    (?5@tab )   #   id
    '$6'        # content of group 6 with single quotes
)           # endif
(?7         # if group 7 exists
                # plus sign
    '$7'        # content of group 3 with single quotes
)           # endif

Screenshot (before):

enter image description here

Screenshot (after):

enter image description here

CodePudding user response:

You can use three substitutions here, it is not quite possible (without additional assumptions) to reduce the number of steps here since you need to replace at the same positions.

Step 1: Replace single quotes with double - ' with ''. No regex so far, but you can have the regex checkbox on.

Step 2: Add @lin at the start of the line and only wrap its contents with ' if there is any non-whitespace char on the line (while keeping all TABs before the first '):

Find What: ^(\t* )(\h*\S)? (.*)
Replace With: @lin $1(?2'$2$3':)

Details:

  • ^ - start of a line
  • (\t* ) - Group 1 ($1): zero or more TABs
  • (\h*\S)? - Group 2 ($2): an optional sequence of any zero or more horizontal whitespace chars and then a non-whitespace char
  • (.*) - Group 3 ($3): the rest of the line
  • @lin $1(?2'$2$3':) - replaces the match with @lin Group 1 value (i.e. tabs found), and then - only if Group 2 matches - ' Group 2 Group 3 values '

Step 3: Replace each TAB after @lin with @tab :

Find What: (\G(?!^)|^\ @lin\ )\t
Replace With: $1@tab

Details:

  • (\G(?!^)|^\ @lin\ ) - Group 1: either
    • \G(?!^) - end of the previous match
    • | - or
    • ^\ @lin\ - start of a line and @lin string
  • \t - a TAB char.

The replacement is the concatenation of Group 1 value and @tab string.

See this regex online demo.

  • Related