Memory Align State Machine
The Memory Align State Machine is a secondary state machine that includes an executor (the Memory Align SM Executor) and an internal Memory Align PIL (program) that is a set of verification rules written in the PIL language.
The Memory Align SM Executor is written in two languages: Javascript and C/C++.
Overview
It checks memory reads/writes using a 32-byte word access, while the EVM can read and write 32-byte words with offsets at a byte level. The table below shows a sample of possible byte-addressed and 32-byte-addressed memory layouts for the same content (three words).
The relationship between the 32-byte word addressable layout and the byte addressable layout is called memory alignment and the Memory Align SM is the state machine that checks the correctness of this relationship.
In more detail, we have to check the following memory operations:
- : It receives an offset and returns the 32 bytes in memory starting at that offset.
- : It receives an offset and saves 32 bytes from the offset address of the memory.
- : It receives an offset and saves one byte on that address of the memory.
In general cases, requires reading bytes of two different words.
Given that the memory's contents are represented in the table above and that the EVM is addressed at the byte level, the value we should get if we check a read from the EVM of a word starting at the address is as follows:
We denote the content of the words affected by an EVM memory read as and .
In our example, these words are the following:
We define a read block as the string concatenating the content of the words affected by the read: .
The below figure shows the affected read words and that form the affected read block and the read value for a read from the EVM at address in our example memory.
Let us now introduce the flow at the time of validating a read.
Suppose that we want to validate that if we perform an operation at the address , we get the previous value . At this point, the Main State Machine will perform several operations.
First of all, it will have to query for the values and . Henceforth, it must call the Memory SM in order to validate the previous queries.
Observe that it is easy to extract the memory positions to query from the address . In fact, if is the memory position of the operation, then is always stored at the memory position and is stored at the memory position . In our example, . Hence, is stored at the position and is stored at the position .
Secondly, we should extract the correct . The represents an index between and indicating the number of bytes we should offset from the starting of to correctly place in the block. In our case, the is . Similarly as before, it is easy to obtain the offset from . In fact, the it is equal to .
Now, the Main SM will check via a Plookup to the Memory Align State Machine that \val is a correct read given the affected words and and the . That is, we should check that the value can be correctly split into and using the provided .
Similarly, instruction requires writing bytes in two words.
The idea is very similar, but we are provided with a value \val that we want to write into a specific location of the memory. We will denote by and the words that arise from and after the corresponding write.
Following our previous example, suppose that we want to write:
in the address of the byte-addressed Ethereum memory. We are using the same and (and since we are writing into the same address as before) and they will transition into:
The Main State Machine will need to perform several operations. We will be given an address , an offset value and a value to be written . Identically as before, the Main SM will be in charge of reading the zkEVM memory to find and from the given address and offset. Of course, the validity of this query should be performed with a specific Plookup into the Memory SM, just as before.
Now, the Main SM can compute and from all the previous values in a uniquely way. The way of validating that we are providing the correct and is to perform a Plookup into the Memory Align SM. That is, we will check that the provided values and are correctly constructed from the provided , , and values.
Finally, the last opcode works similarly, but it only affects one word . Moreover, we can only write one byte and hence, only the less significant byte of will be considered into the write. Observe that, in this opcode, and are unconstrained.
Source Code
The Polygon zkEVM repository is available on GitHub.
Memory Align SM Executor: sm_mem_align.js
Memory Align SM PIL: mem_align.pil