Skip to main content

Range Checker

Miden VM relies very heavily on 16-bit range-checks (checking if a value of a field element is between 00 and 2162^{16}). For example, most of the u32 operations, need to perform between two and four 16-bit range-checks per operation. Similarly, operations involving memory (e.g. load and store) require two 16-bit range-check per operation.

Thus, it is very important for the VM to be able to perform a large number 16-bit range checks very efficiently. In this note we describe how this can be achieved using permutation checks.

8-bit range checks

First, let's define a construction for the simplest possible 8-bit range-check. This can be done with a single column as illustrated below.

rc_8_bit_range_check

For this to work as a range-check we need to enforce a few constraints on this column:

  • Value in the first row must be 00.
  • Value in the last row must be 255255.
  • As we move from one row to the next, we can either keep the value the same, or increment it by 11.

Denoting vv as the value of column vv in the current row, and vv' as the value of column vv in the next row, we can enforce the last condition as follows:

(vv)(vv1)=0(v' - v) \cdot (v' - v - 1) = 0

Together, these constraints guarantee that all values in column vv are between 00 and 255255 (inclusive).

We can then add another column p0p_0, which will keep a running product of values in vv offset by random value α0\alpha_0 (provided by the verifier). Transition constraints for column p0p_0 would look like so:

p0p0(α0+v)=0p'_0 - p_0 \cdot (\alpha_0 + v) = 0

Using these two columns we can check if some other column in the execution trace is a permutation of values in vv. Let's call this other column xx. We can compute the running product of xx in the same way as we compute the running product for vv. Then, we can check that the last value in p0p_0 is the same as the final value for the running product of xx (this is the permutation check).

While this approach works, it has a couple of limitations:

  • First, column vv must contain all values between 00 and 255255. Thus, if column xx does not contain one of these values, we need to artificially add this value to xx somehow (i.e., we need to pad xx with extra values).
  • Second, assuming nn is the length of execution trace, we can range-check at most nn values. Thus, if we wanted to range-check more than nn values, we'd need to introduce another column similar to vv.

To get rid of the padding requirement, we can add a selector column, which would contain 11 for values we want to include in the running product, and 00 for the ones we don't. But we can address both issues with a single solution.

A better construction

Let's add two selector column to our table s0s_0 and s1s_1 as illustrated below.

rc_better_construction

The purpose of these columns is as follows:

  • When s0=0s_0 = 0 and s1=0s_1 = 0, we won't include the value into the running product.
  • When s0=1s_0 = 1 and s1=0s_1 = 0, we will include the value into the running product.
  • When s0=0s_0 = 0 and s1=1s_1 = 1, we will include two copies of the value into the running product.
  • When s0=1s_0 = 1 and s1=1s_1 = 1, we will include four copies of the value into the running product.

Thus, for the table pictured below, the running product will include: a single 00, a single 11, no 22's, and five 33's etc.

To keep the description of constraints simple, we'll first define the four flag values as follows:

f0=(1s0)(1s1)f_0 = (1 - s_0) \cdot (1 - s_1)
f1=s0(1s1)f_1 = s_0 \cdot (1 - s_1)
f2=(1s0)s1f_2 = (1 - s_0) \cdot s_1
f3=s0s1f_3 = s_0 \cdot s_1

Thus, for example, when s0=1s_0 = 1 and s1=1s_1 = 1, f3=1f_3 = 1 and f0=f1=f2=0f_0 = f_1 = f_2 = 0.

Then, we'll update transition constraints for p0p_0 like so:

p0p0((α0+v)4f3+(α0+v)2f2+(α0+v)f1+f0)=0p'_0 - p_0 \cdot \left((\alpha_0 + v)^4 \cdot f_3 + (\alpha_0 + v)^2 \cdot f_2 + (\alpha_0 + v) \cdot f_1 + f_0\right) = 0

The above ensures that when f0=1f_0 = 1, p0p_0 remains the same, when f1=1f_1 = 1, p0p_0 is multiplied by (α0+v)(\alpha_0 + v), when f2=1f_2 = 1, p0p_0 is multiplied by (α0+v)2(\alpha_0 + v)^2, and when f3=1f_3 = 1, p0p_0 is multiplied by (α0+v)4(\alpha_0 + v)^4.

We also need to ensure that values in columns s0s_0 and s1s_1 are binary (either 00 or 11). This can be done with the following constraints:

s02s0=0s_0^2 - s_0 = 0
s12s1=0s_1^2 - s_1 = 0

And lastly, for completeness, we still need to impose a transition constraint that we had in the naive approach:

(vv)(vv1)=0(v' - v) \cdot (v' - v - 1) = 0

This 3-column table addresses the limitations we had as follows:

  1. We no longer need to pad the column we want to range-check with extra values because we can skip the values we don't care about.
  2. We can support almost 4n4n range checks (when nn is relatively large). Though, for short traces (when n<256n < 256), we can range-check at most nn unique values.

The one downside of this approach is that the degree of our constraints is now 66 (vs. 22 in the naive approach). But in the context of Miden VM this doesn't matter as maximum constraint degree for the VM is 88 anyway.

16-bit range checks

To support 16-bit range checks, let's try to extend the idea of the 8-bit table. Our 16-bit table would look like so (the only difference is that column uu now has to end with value 6553565535):

rc_16_bit_range_check

While this works, it is rather wasteful. In the worst case, we'd need to enumerate over 65K values, most of which we may not actually need. It would be nice if we could "skip over" the values that we don't want. We can do this by relying on 8-bit range checks. Specifically, instead of enforcing constraint:

(uu)(uu1)=0(u' - u) \cdot (u' - u - 1) = 0

We would enforce:

p1p1(α0+uu)=0p'_1 - p_1 \cdot (\alpha_0 + u' - u) = 0

Where p1p_1 is another running product column. At the end of the execution trace, we would check that p0=p1p_0 = p_1. This would ensure that as we move from one row to another, values in column uu increase by at most 255255 (we are basically performing an 8-bit range check on increments of column uu). Now, our table can look like this:

rc_table_post_8_bit_range_check

We still may need to include some unneeded rows because we can not "jump" by more than 255255 values, but at least we are guaranteed that the number of such unneeded rows will never be greater than 256256.

We also need to add another running product column p2p_2 to support permutation checks against column uu. The constraint that we'll need to impose against this column is identical to the constraint we imposed against column p0p_0 and will look like so:

p2p2((α0+u)4f3+(α0+u)2f2+(α0+u)f1+f0)=0p'_2 - p_2 \cdot \left((\alpha_0 + u)^4 \cdot f_3 + (\alpha_0 + u)^2 \cdot f_2 + (\alpha_0 + u) \cdot f_1 + f_0\right) = 0

Overall, with this construction we have the following:

  • We need two table of three columns each (66 columns total), and we need 33 running product columns.
  • This gives us the ability to do the following:
    • For long traces (when n>216n > 2^{16}) we can support almost 4n4n arbitrary 16-bit range-checks.
    • For short traces, we can range-check at most nn unique values, but if there are duplicates, we can support up to 4n4n total range-checks.

But we can do better.

Optimizations

First, we can just stack the tables on top of each other. We'll need to add a column to partition the table between the sections used for 8-bit range checks and sections used for 16-bit range checks. Let's call this column tt. When t=0t = 0, we'll apply constraints for the 8-bit table, and when t=1t = 1, we'll apply constraints for the 16-bit table.

rc_optimisation.png

Second, we can merge running product columns p0p_0 and p1p_1 into a single column. We'll do it like so:

  • When t=0t = 0, we'll multiply the current value of the column by the 8-bit value (offset by random α0\alpha_0) that we want to add into the running product.
  • When t=1t = 1, we'll divide the current value of the column by the 8-bit value (offset by random α0\alpha_0) which we'd like to remove from the running product.

In the end, if we added and then removed all the same values, the value in this column (let's call it p0p_0) should equal to 11, and we can check this condition via a boundary constraint.

The only downside of this construction is again higher constraint degree. Specifically, some of the transition constraints described above have degree 88. However, this doesn't matter in the context of Miden VM, since the max constraint degree of the VM is 99.

Miden approach

This final optimized construction is implemented in Miden with the following requirements, capabilities, and constraints.

Requirements

  • 4 columns of the main trace: t,s0,s1,vt, s_0, s_1, v.
  • 1 bus brangeb_{range} to ensure that the range checks performed in the range checker match those requested by other VM components (the stack and the memory chiplet).
  • 1 virtual table, tracked in running product column p0p_0, which enables skipping values by up to 255255 in the 16-bit section of the trace and enforces consistency between the range checks in the 8-bit section of the trace and the value increments in the 16-bit section of the trace.

Capabilities

The construction gives us the following capabilities:

  • For long traces (when n>216n > 2^{16}), we can do over 3n3n arbitrary 16-bit range-checks.
  • For short traces (210<n2162^{10} < n \le 2^{16}), we can range-check at slightly fewer than nn unique values, but if there are duplicates, we may be able to range-check up to 3n3n total values.

Execution trace

The range checker's execution trace looks like the table described in the optimizations section above.

rc_optimisation.png

As previously described, the columns have the following meanings:

  • tt contains a binary value which differentiates between the 8-bit (t=0t = 0) and 16-bit (t=1t = 1) sections of the range checker's trace.
    • The 8-bit section of the trace contains the range checks required to ensure internal consistency of the Range Checker.
    • The 16-bit section of the trace contains the range checks required by other components (e.g. the stack and the memory chiplet).
  • s0s_0 and s1s_1 are selector columns that are combined into flags to indicate the number of times the value in that row should be range checked (included into the running product). With these flags, values can be included 0, 1, 2, or 4 times per row in the execution trace. (Values can be included more times by having multiple trace rows with the same value).
  • vv contains the values to be range checked.
    • During the 8-bit section of the trace (when t=0t = 0), these values go from 00 to 255255 and must either stay the same or increase by one at each step.
    • During the 16-bit section of the trace (when t=1t = 1), these values go from 00 to 6553565535. Values must either stay the same or increase by less than 256256 at each step.
    • The final 2 rows of the 16-bit section of the trace must both equal 6553565535. The extra value of 6553565535 is required in order to pad the trace so the brangeb_{range} running product bus column can be computed correctly.

Execution trace constraints

First, we'll need to make sure that all selector flags are binary. This can be done with the following constraints:

t2t=0 | degree=2t^2 - t = 0 \text{ | degree} = 2
s02s0=0 | degree=2s_0^2 - s_0 = 0 \text{ | degree} = 2
s12s1=0 | degree=2s_1^2 - s_1 = 0 \text{ | degree} = 2

Next, we need to constrain the row transitions in the 8-bit section of the table so that as we move from one row to the next the value either stays the same or increases by 1.

(1t)(vv)(vv1)=0 | degree=3(1 - t') \cdot (v' - v) \cdot (v' - v - 1) = 0 \text{ | degree} = 3

Next, we need to make sure that values in column tt can "flip" from 00 to 11 only once. The following constraint enforces this:

t(1t)=0 | degree=2t \cdot (1 - t') = 0 \text{ | degree} = 2

Finally, we need to make sure that when column tt "flips" from 00 to 11 (we are moving from the 8-bit section of the table to the 16-bit section), the current value in column vv is equal to 255255, and the next value is reset to 00. This can be done with the following constraints:

(1t)t(v255)=0 | degree=3(1 - t) \cdot t' \cdot (v - 255) = 0 \text{ | degree} = 3
(1t)tv=0 | degree=3(1 - t) \cdot t' \cdot v' = 0 \text{ | degree} = 3

In addition to the transition constraints described above, we also need to enforce the following boundary constraints:

  • Value of vv in the first row is 00.
  • Value of vv in the last row is 6553565535.

8-bit range checks table

The 8-bit range checks virtual table is used to enforce the internal correctness of the 16-bit section of the Range Checker (where range checks for user operations and other components are executed).

This table can be thought of as a virtual table that contains all 8-bit range checks required to ensure correctness of the 16-bit section:

  • When a value is range-checked in the 8-bit section of the trace (i.e., the flag indicates that it should be included 1, 2, or 4 times), it is added to the virtual table. The value must be added to the virtual table as many times as that row's flag indicates it should be range-checked.
  • In the 16-bit section of the trace, a single 8-bit range check on vvv' - v is required at each step. Therefore, a single 8-bit range check of vvv' - v is removed from the virtual table at each step.

The running product column p0p_0 is used to keep track of the state of the table.

To simplify the notation, we'll first define variable zz, which represents how a row in the execution trace is reduced to a single value.

z=(α0+v)4f3+(α0+v)2f2+(α0+v)f1+f0z = (\alpha_0 + v)^4 \cdot f_3 + (\alpha_0 + v)^2 \cdot f_2 + (\alpha_0 + v) \cdot f_1 + f_0

In the 8-bit section (t=0t = 0), one, two, or four 8-bit range checks are added to the virtual table at each step and p0p_0 is updated as follows:

p0=p0zp'_0 = p_0 \cdot z

Note that if f0f_0 is true then the value of p0p_0 does not change, so no 8-bit range checks are added to the virtual table.

In the 16-bit section (t=1t = 1), one 8-bit range check is removed from the virtual table at each step and p0p_0 is updated as follows:

p0(α0+vv)=p0p'_0 \cdot (\alpha_0 + v' - v) = p_0

The above actually enforces that p0=p0/(α0+vv)p'_0 = p_0 / (\alpha_0 + v' - v).

These two updates, which first build up the product and then reduce it, can be combined into a single constraint:

p0((α0+vv)tt+1)=p0(zzt+t) | degree=8p'_0 \cdot ((\alpha_0 + v' - v) \cdot t - t + 1) = p_0 \cdot (z - z \cdot t + t) \text{ | degree} = 8

Thus, if the prover arranged the 8-bit and the 16-bit sections of the table correctly and p0p_0 was initialized to 11, we should end up with p0=1p_0 = 1 at the end of the trace.

In addition to the transition constraints described above, we also need to enforce the following boundary constraint:

  • The value of p0p_0 in the first and last rows is 11.

Communication bus

brangeb_{range} is the bus that connects components which require 16-bit range checks to the range-checked values in the 16-bit section of the range checker. The bus constraints are defined by the components that use it to communicate.

Requests are sent to the range checker bus by the following components:

  • The Stack sends requests for 16-bit range checks during some u32 operations.
  • The Memory chiplet sends requests for 16-bit range checks against the values in the d0d_0 and d1d_1 trace columns to enforce internal consistency.

Responses are provided by the range checker as follows.

Once again, we'll make use of variable zz, which represents how a row in the execution trace is reduced to a single value.

z=(α0+v)4f3+(α0+v)2f2+(α0+v)f1+f0z = (\alpha_0 + v)^4 \cdot f_3 + (\alpha_0 + v)^2 \cdot f_2 + (\alpha_0 + v) \cdot f_1 + f_0

Only the 16-bit section of the trace should be included in the brangeb_{range} bus column. Transition constraints for this are fairly straightforward:

brange=brange(ztt+1) | degree=8b'_{range} = b_{range} \cdot (z \cdot t - t + 1) \text{ | degree} = 8

Thus, when t=0t = 0, the value in brangeb_{range} does not change, but when t=1t = 1, the next value in brangeb_{range} is computed by multiplying the current value by zz.

If brangeb_{range} is initialized to 11 and the values sent to the bus by other VM components match those that are range-checked in the 16-bit section of the trace, then at the end of the trace we should end up with brange=1b_{range} = 1.

In addition to the transition constraint described above, we also need to enforce the following boundary constraint:

  • The value of brangeb_{range} in the first and last rows is 11.