The Arithmetic State Machine is a secondary state machine that also has an executor (the Arithmetic SM Executor) and an internal Arithmetic program (a set of verification rules written in the PIL language). The Arithmetic SM Executor is available in two languages: Javascript and C/C++.
It is one of the six secondary state machines receiving instructions from the Main SM Executor. The main purpose of the Arithmetic SM is to carry out elliptic curve arithmetic operations, such as Point Addition and Point Doubling as well as performing 256-bit operations like addition, product or division.
Consider an elliptic curve E defined by y2=x3+ax+b over the finite field F=Zp, where p is the prime,
p=2256−232−29−28−27−26−24−1. Set the coefficients a=0 and b=7, so that E reduces to
y2=x3+7. Given two points, P=(x1,y1) and Q=(x2,y2), on the curve E with x1=x2, the point P+Q=(x3,y3) is computed as follows,
x3=s2−x1−x2,y3=s(x1−x3)−y1 Given a point P=(x1,y1) on the curve E such that P=O, the point P+P=2P=(x3,y3) is computed as follows,
x3=s2−2x1,y3=s(x1−x3)−y1, where,
s=2y13x12. Several 256-bit operations can be expressed in the following form:
A⋅B+C=D⋅2256+E(Eqn A) where A,B,C,D and E are 256-bit integers.
For instance, if C=0, then Eqn A states that the result of multiplying A and B is E with a carry of D. That is, D is the chunk that exceeds 256 bits.
Or, if B=1, Eqn A states that the result of adding A and C is the same as before: E with a carry of D. Similarly, division and modular reductions can also be expressed as derivatives of Eqn A.
These operations are performed in the Arithmetic State Machine, with registers satisfying the following PIL relation,
EQ0:x1⋅y1+x2−y2⋅2256−y3=0 Since the above Elliptic Curve operations are implemented in the PIL language, it is more convenient to express them in terms of the constraints they must satisfy. These constraints are:
EQ0:EQ1:EQ2:EQ3:EQ4:x1⋅y1+x2−y2⋅2256−y3=0,s⋅x2−s⋅x1−y2+y1+q0⋅p=0,2⋅s⋅y1−3⋅x1⋅x1+q0⋅p=0,s⋅s−x1−x2−x3+q1⋅p=0,s⋅x1−s⋅x3−y1−y3+q2⋅p=0, where q0,q1,q2∈Z, implying that these equations hold true over the integers.
This approach is taken because of the need to compute divisions by p. Note that only three possible computation scenarios can arise:
- EQ0 is activated while the rest are deactivated,
- EQ1, EQ3 and EQ4 are activated but EQ0 and EQ2 are deactivated,
- EQ2, EQ3 and EQ4 are activated and EQ0 and EQ1 are deactivated.
Since at most, one of EQ1 and EQ2 are activated in any scenario, we can afford "sharing'' the same q0 for both.
Motivated by the implemented operations, the Arithmetic SM is composed of 6 registers, each of which is composed of 16 sub-registers of 16-bit (2 byte) capacity, adding up to a total of 256 bits per register.
x1, y1, x2, y2, x3, y3. There is also a need to provide s and q0,q1,q2, which are also 256-bit field elements.
Compute the previous operations at 2-byte level. For instance, if one is performing the multiplication of x1 and y1, at the first clock x1[0]⋅y1[0] is computed.
Then, (x1[0]⋅y1[1])+(x1[1]⋅y1[0]) is computed in the second clock, followed by (x1[0]⋅y1[2])+(x1[1]⋅y1[1])+(x1[2]⋅y1[0]) in the third, and so on.
As depicted in the below figure, this process is completely analogous to the schoolbook multiplication. However, it is performed at 2-byte level, instead of decimal level.

Use the following notation:
eq eq′=x1[0]⋅y1[0]=x1[0]⋅y1[1]+x1[1]⋅y1[0] But then, the carry generated by eq has to be taken into account by eq′.
Going back to our equations EQ0,EQ1,EQ2,EQ4; let's see how the operation is performed in EQ0.
Compute eq0=(x1[0]⋅y1[0])+x2[0]−y3[0]
Compute eq1=(x1[0]⋅y1[1])+(x1[1]⋅y1[0])+x2[1]−y3[1]
eq2=(x1[0]⋅y1[2])+(x1[1]⋅y1[1])+(x1[2]⋅y1[0])+x2[2]−y3[2]
This is continued until one reaches the computation, eq15=(x1[0]⋅y1[15])+(x1[1]⋅y1[14])+⋯+x2[15]−y3[15].
At this stage y2 comes into place.
Since the first 256 bits of the result of the operation have been filled (and the result can be made of more than 256-bits), a new register is needed to store the rest of the output. We change the addition of x2[i]−y3[i] by −y2[i].
Therefore, we obtain that:
eq16eq17=(x1[1]⋅y1[15])+(x1[2]⋅y1[14])+⋯−y2[0],=(x1[2]⋅y1[15])+(x1[3]⋅y1[14])+⋯−y2[1] and so on.
Continuing until the last two:
eq30eq31=(x1[15]⋅y1[15])−y2[14],=−y2[15]. Now, notice that the carry generated by the eqi's is not important for them. That means, if eqi=10, then what we really want the result to be is 0 and save 1 as a carry for the next operation. To express this fact as a constraint, we say that the following has to be satisfied:
eq+carry=carry′⋅216, where carry represents the carry taken into account in the actual clock, and carry′ represents the carry generated by the actual operation.
A technicality is that carry is subdivided into two other carryL and carryH such that:
carry=carryL+carryH⋅218. The Polygon zkEVM repository is available on GitHub.
Arithmetic SM Executor: sm_arith folder
Arithmetic SM PIL: arith.pil