Home > OS >  Evaluating a condition with AND and OR statements combined
Evaluating a condition with AND and OR statements combined

Time:10-26

I'm parsing files which all have following structure:

A=B      #A==test

This means that variable A will be set to B, only if A equals 'test' Conditions can get more complex as well

C=D      #C==test1,D==test2

This means that C will be set to D if C equals 'test1' AND D equals 'test2' So far so good, I parse the individual conditions one by one, and stop as soon as one evaluates to false.

Conditions can also have OR statements:

E=F     #E==test3,(F==test4|F==test5),username!=abc

This means that E will be set to F if E equals 'test3' AND (F equals 'test4' OR F equals 'test5') AND username does not equal 'abc'

I'm stuck on evaluating this last condition. What is a good approach to evaluate such an expression?

Currently I'm parsing each condition into 3 arrays, being 'lefthands', 'operators, 'righthands'. The condition #C==test1,D==test2 gets parsed into:

$lefthands=(C, D)
$righthands=(test1,test2)
$operands=(==, ==)

Afterwards I loop over the arrays and do the evaluation based on the operand that is currently being used.

if $operands[$i] -eq "==" ( return ($lefthands[$i] -eq $righthands[$i] }

As soon as one of the conditions returns false, I stop evaluating the complete condition. This works fine if there are only AND statements, but with an OR statement present this does no longer work.

CodePudding user response:

Looks like you've got a fairly straightforward infix expression grammar that goes something like this:

variable = [ a-z | A-Z ]
string   = [ a-z | A-Z | 0-9 ]

atom     = variable [ "==" | "!=" ] string

expr     = [ atom | "(" expr ")" | expr [ "," | "|" ] expr ]

Assuming you can nest expressions to an arbitrary depth and you want to handle operator precedence properly (e.g. does w,x|y,z mean ((w,x)|y),z) or (w,x)|(y,z) or even w,(x|y),z, all of which give different results) you're going to struggle to evaluate the expression with basic string manipulation and you might need use a more complicated approach:

  • Lex the string into tokens
  • Change the order of the tokens (infix -> postfix) to make it easier to calculate results
  • Evaluate the expression given some values for variables

Lexing

This basically breaks the expression string down into a list of logical chunks - for example A==test might become identifier "A", operator "==", identifier "test".

It's called "infix" notation because the operator sits in-between the operands - e.g. A==test.

function ConvertTo-InfixTokens
{
    param( [string] $Source )

    $chars = $Source.ToCharArray();
    $index = 0;

    while( $index -lt $chars.Length )
    {

        $token = [pscustomobject] [ordered] @{
            "Type"  = $null
            "Value" = [string]::Empty
        };

        $char = $chars[$index];

        switch -regex ( $char )
        {

            # var or str
            "[a-zA-z0-9]" {
                $token.Type = "identifier";
                $token.Value  = $char;
                $index  = 1;
                while( ($index -lt $chars.Length) -and ($chars[$index] -match "[a-zA-z0-9]") )
                {
                    $token.Value  = $chars[$index];
                    $index  = 1;
                }
            }

            # eq
            "=" {
                if( $chars[$index 1] -eq "=" )
                {
                    $token.Type  = "eq";
                    $token.Value = $chars[$index]   $chars[$index 1];
                    $index  = 2;
                }
                else
                {
                    throw "LEX01 - unhandled char '$($chars[$index 1])'";
                }
            }

            # ne
            "!" {
                if( $chars[$index 1] -eq "=" )
                {
                    $token.Type  = "ne";
                    $token.Value = $chars[$index]   $chars[$index 1];
                    $index  = 2;
                }
                else
                {
                    throw "LEX02 - unhandled char '$($chars[$index 1])'";
                }
            }

            "," {
                $token.Type  = "and";
                $token.Value = $char;
                $index  = 1;
            }

            "\|" {
                $token.Type  = "or";
                $token.Value = $char;
                $index  = 1;
            }

            "\(" {
                $token.Type  = "lb";
                $token.Value = $char;
                $index  = 1;
            }

            "\)" {
                $token.Type  = "rb";
                $token.Value = $char;
                $index  = 1;
            }

            default {
                throw "LEX03 - unhandled char '$char'";
            }

        }

        write-output $token;

    }

}

Examples:

PS> ConvertTo-InfixTokens -Source "A==test"

Type       Value
----       -----
identifier A
eq         ==
identifier test

PS> ConvertTo-InfixTokens -Source "E==test3,(F==test4|F==test5),username!=abc"

Type       Value
----       -----
identifier E
eq         ==
identifier test3
and        ,
lb         (
identifier F
eq         ==
identifier test4
or         |
identifier F
eq         ==
identifier test5
rb         )
and        ,
identifier username
ne         !=
identifier abc

Convert to Postfix

Infix notation looks more natural because it's similar to normal maths, but it relies on you handling the precedence rules for operations when you evaluate expressions - e.g. is 1 1 * 2 2 equal to (((1 1) * 2) 2) => 6 or 1 (1 * 2) 2 => 5 or (1 1) * (2 2) => 8? - which means you might need to look ahead to see what other operators are coming in case they need to be processed first.

It's easier to evaluate the expression if we convert it to postfix notation (or Reverse Polish Notation) - see this answer for some additional advantages. This basically puts the operands first and then follows with the operator - e.g. 1 1 * 2 2 becomes 1 1 2 * 2 , which will be logically processed as: ((1 (1 2 *) ) 2 ) => ((1 2 ) 2 ) => (3 2 ) => 5

The following code will apply this conversion:

# based on https://www.tutorialspoint.com/Convert-Infix-to-Postfix-Expression#:~:text=To convert infix expression to,maintaining the precedence of them.
function Convert-InfixToPostfix
{
    param( [pscustomobject[]] $Tokens )

    $precedence = @{
        "or"  = 1
        "and" = 2
        "eq"  = 9
        "ne"  = 9
    };

    $stack = new-object System.Collections.Generic.Stack[pscustomobject];

    for( $i = 0; $i -lt $Tokens.Length; $i   )
    {

        $token = $Tokens[$i];

        switch( $token.Type )
        {

            "identifier" {
                write-output $token;
            }

            "lb" {
                $stack.Push($token);
            }

            "rb" {
                while( $stack.Peek().Type -ne "lb" )
                {
                    write-output $stack.Pop();
                }
                $null = $stack.Pop();
            }

            default {
                # must be a known operator
                if( -not $precedence.ContainsKey($token.Type) )
                {
                    throw "PF01 - unknown operator '$($token | ConvertTo-Json -Depth 99 -Compress)' at index $i";
                }
                while( ($stack.Count -gt 0) -and ($precedence[$token.Type] -le $precedence[$stack.Peek().Type]) )
                {
                    write-output $stack.Pop();
                }
                $stack.Push($token);
            }

        }
   
    }

    while( $stack.Count -gt 0 )
    {
        write-output $stack.Pop();
    }

}

Examples:

PS> $infix   = ConvertTo-InfixTokens -Source "A==test"
PS> $postfix = Convert-InfixToPostfix -Tokens $infix
PS> $postfix

Type       Value
----       -----
identifier A
identifier test
eq         ==

PS> $infix = ConvertTo-InfixTokens -Source "E==test3,(F==test4|F==test5),username!=abc"
PS> $postfix = Convert-InfixToPostfix -Tokens $infix
PS> $postfix

Type       Value
----       -----
identifier E
identifier test3
eq         ==
identifier F
identifier test4
eq         ==
identifier F
identifier test5
eq         ==
or         |
and        ,
identifier username
identifier abc
ne         !=
and        ,

Evaluation

The last step is to take the postfix tokens and evaluate them. "All" we need to do is read the tokens from left to right - when we get an operand we put it on a stack, and when we get an operator we read some operands off the stack, apply the operator and push the result back onto the stack...

function Invoke-PostfixEval
{
    param( [pscustomobject[]] $Tokens, [hashtable] $Variables )

    $stack = new-object System.Collections.Generic.Stack[pscustomobject];

    for( $i = 0; $i -lt $Tokens.Length; $i   )
    {

        $token = $Tokens[$i];

        if( $token.Type -eq "identifier" )
        {
            $stack.Push($token);
        }
        elseif( $token.Type -in @( "eq", "ne" ) )
        {

            $str = $stack.Pop().Value;
            $var = $stack.Pop();
            if( -not $Variables.ContainsKey($var.Value) )
            {
                throw "undefined variable '$($var.Value)' in token '$($token | ConvertTo-Json -Depth 99 -Compress)' at index $i";
            }
            $val = $Variables[$var.Value];
            $result = switch( $token.Type ) {
                "eq"    { $val -eq $str }
                "ne"    { $val -ne $str }
                default { throw "unhandled operator '$($token.Type)' in token '$($token | ConvertTo-Json -Depth 99 -Compress)' at index $i"; }
            }
            #write-host "'$val' $($token.Type) '$str' => $result";
            $stack.Push([pscustomobject] [ordered] @{ "Type" = "bool"; Value = $result });

        }
        elseif( $token.Type -in @( "and", "or" ) )
        {

            $left  = $stack.Pop().Value;
            $right = $stack.Pop().Value;
            $result = switch( $token.Type ) {
                "and"   { $left -and $right }
                "or"    { $left -or  $right }
                default { throw "unhandled operator '$($token.Type)' in token '$($token | ConvertTo-Json -Depth 99 -Compress)' at index $i"; }
            }
            #write-host "'$left' $($token.Type) '$right' => $result";
            $stack.Push([pscustomobject] [ordered] @{ "Type" = "bool"; Value = $result });

        }
        else
        {
            throw "EVAL01 - unhandled token '$($token | ConvertTo-Json -Depth 99 -Compress)' at index $i";
        }
    }

    return $stack.Peek().Value;

}

Examples:

PS> $variables = [ordered] @{
    "A" = "test"; 
    "C" = "test1";
    "D" = "test2";
    "E" = "test3";
    "F" = "test4";
    "username" = "abc"
}

PS> $infix   = ConvertTo-InfixTokens -Source "A==test"
PS> $postfix = Convert-InfixToPostfix -Tokens $infix
PS> $value   = Invoke-PostfixEval -Tokens $postfix -Variables $variables
PS> $value
True

PS> $infix   = ConvertTo-InfixTokens -Source "E==test3,(F==test4|F==test5),username!=abc"
PS> $postfix = Convert-InfixToPostfix -Tokens $infix
PS> $value   = Invoke-PostfixEval -Tokens $postfix -Variables $variables
PS> $value
False # fails because "username!=abc" is false

PS> $infix   = ConvertTo-InfixTokens -Source "E==test3,F==test4|F==test5,username!=abc"
PS> $postfix = Convert-InfixToPostfix -Tokens $infix
PS> $value   = Invoke-PostfixEval -Tokens $postfix -Variables $variables
PS> $value
True # operator precedence means it's evaluated as "(E==test3,F==test4)|(F==test5,username!=abc)" and "(E==test3,F==test4)" is true

You'll probably want to add more error handling for production code, and write some Pester tests to make sure it works for a larger variety of inputs, but it worked for the limited tests I did here. If you find any examples that don't work feel free to post them in comments...

CodePudding user response:

For the logical-or expression you can stop the evaluation at the first returned true, because the evaluation of the whole expression will clearly be true as well. This is an optimization of the evaluation that is performed in mathematics and by many compilers and interpreters.

  • Related