Home > Back-end >  How to calculate the average of a sequence in Z3(Python)
How to calculate the average of a sequence in Z3(Python)

Time:09-22

I am trying to specify a concrete formula in Z3-Python.

Exists i::Integer s.t. (0<=i<|arr|) & (avg(arr) t<arr[i])

This means: whether there is a position i::0<i<|arr| in the array whose value a[i] is greater than the average of the array avg(arr) plus a given threshold t.

I posted at Theory of arrays in Z3: (1) model is difficult to understand, (2) do not know how to implement functions and (3) difference with sequences how I approached this using arrays. Among others, I need to define a function for getting the average of the array, and another one for its length; however, approaching this problem using Z3's arrays does not seem to be a good idea.

Thus, I 'emulated' it using sequences:

t = Int('t')
avg_seq = Int('avg_seq')
i = Int('i')

seq = Const('seq', SeqSort(IntSort()))

phi = Exists(i, And([(And(2<t, t<10)), (And(0 <= i, i< Length(seq))), ((t avg_seq<seq[i]))]))

s = Solver()
s.add(phi)
print(s.check())
print(s.model())

Where one model I got is the following:

sat
[avg_seq = 7715,
 t = 3,
 seq = Unit(7719),
 seq.nth_u = [else -> 4]]

As you can see, I solved the problem of how to calculate Length, since it is a primitive in Z3.

However, how can I create an average function? Is there any primitives for it? I guess there is not, so I planned to create a recursive version of it. The pseudo-code of my idea is as follows:

Avg(Seq) = Sum(Seq)/Length(Seq)
Sum(Seq) = Head(Seq)   Sum(Tail(Seq))

Is this idea right? How can I implement this? I was recommended to read the example of RecAddDefinition at Converting `appendo` relation from smt2 to python, but I feel pretty lost, specially when I see constructions like Const.

I also have one extra doubt: I understood that when defininf a Exists(i, ...), the top-level declaration of i is unused; however, if I do not declare it, I receive the error: name 'i' is not defined.

CodePudding user response:

This should get you started:

from z3 import *

IntSeqSort = SeqSort(IntSort())

sumArray    = RecFunction('sumArray', IntSeqSort, IntSort())
sumArrayArg = FreshConst(IntSeqSort)

RecAddDefinition( sumArray
                , [sumArrayArg]
                , If(Length(sumArrayArg) == 0
                    , 0
                    , sumArrayArg[0]   sumArray(SubSeq(sumArrayArg, 1, Length(sumArrayArg) - 1))
                    )
                )

def avgArray(arr):
    return ToReal(sumArray(arr)) / ToReal(Length(arr))

In the above, we have defined the recursive function to sum an array, and then avgArray which returns the average. Here's a use case. Let's ask it to find a sequence of length 4, whose average is 3:

arr = Const('arr', IntSeqSort)

s = Solver()
s.add(Length(arr) == 4)
s.add(avgArray(arr) == 3)

if s.check() == sat:
    print(s.model()[arr])
else:
    print("Not sat")

When I run this, I get:

Concat(Unit(-10451),
       Concat(Unit(31365),
              Concat(Unit(-10451), Unit(-10451))))

In more familiar notation, z3 found us:

[-10451, 31365, -10451, -10451]

You can easily see that it satisfies the constraints, it has length 4 and the average of the numbers is 3.

You can take this and convert it to model your original formula in the usual way.

Regarding the existential: This is a z3 hack: To use Exists and Forall you have to define the bound-variables at the top-level, but otherwise what you defined at top and what is used locally is unrelated. Think of it as a way of telling z3 that this variable exists just for internal bookkeeping purposes, but otherwise is irrelevant. If it helps, you can give it a really unique name (like localExistsVariable) so it doesn't confuse you in any other way. (And don't use it anywhere else!)

  • Related