
Finding the hash of a program in Fuel VM: hashing bytecode or use MAST root? #794

Closed
@partylikeits1983

I am curious how the hash of a Fuel predicate is computed. As I understand it, the predicate hash is computed by the root_from_code function:

    pub fn root_from_code<B>(bytes: B) -> Bytes32
    where
        B: AsRef<[u8]>,
    {
        let mut tree = BinaryMerkleTree::new();
        bytes.as_ref().chunks(LEAF_SIZE).for_each(|leaf| {
            // If the bytecode is not a multiple of LEAF_SIZE, the final leaf
            // should be zero-padded rounding up to the nearest multiple of 8
            // bytes.
            let len = leaf.len();
            if len == LEAF_SIZE || len % MULTIPLE == 0 {
                tree.push(leaf);
            } else {
                let padding_size = len.next_multiple_of(MULTIPLE);
                let mut padded_leaf = [PADDING_BYTE; LEAF_SIZE];
                padded_leaf[0..len].clone_from_slice(leaf);
                tree.push(padded_leaf[..padding_size].as_ref());
            }
        });

        tree.root().into()
    }

Looking at this implementation, however, I wonder why a Merkle tree is used to compute the hash of the predicate. The compiled bytecode of the program is passed in and split into chunks of LEAF_SIZE bytes, each chunk is pushed into the Merkle tree as a leaf, and the root of the tree is returned.
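
To make the chunking and padding step concrete, here is a minimal sketch of just that part, without the tree. The helper name split_into_leaves and its parameters are hypothetical and not part of fuel-tx; the real LEAF_SIZE, MULTIPLE and PADDING_BYTE constants are defined elsewhere in the crate and are not shown above, so the sketch takes the sizes as arguments and assumes a zero padding byte, as the code comment suggests.

```rust
/// Hypothetical helper (not part of fuel-tx): returns the leaves that would be
/// pushed into the Merkle tree for `bytes`.
/// Assumption: the padding byte is zero, as the comment in root_from_code says.
fn split_into_leaves(bytes: &[u8], leaf_size: usize, multiple: usize) -> Vec<Vec<u8>> {
    bytes
        .chunks(leaf_size)
        .map(|leaf| {
            let len = leaf.len();
            if len == leaf_size || len % multiple == 0 {
                // Full leaf, or a short leaf that is already aligned: use as-is.
                leaf.to_vec()
            } else {
                // Short final leaf: zero-pad up to the next multiple of `multiple`.
                let padded_len = len.next_multiple_of(multiple);
                let mut padded = leaf.to_vec();
                padded.resize(padded_len, 0u8);
                padded
            }
        })
        .collect()
}
```

For example, 21 bytes of input with a leaf size of 16 and a multiple of 8 give two leaves: one of 16 bytes, and one holding the remaining 5 bytes followed by 3 zero bytes. The leaf boundaries depend only on byte offsets, not on the structure of the program.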

Using a Merkle tree to find the hash of a program is very similar to the concept of a merkleized abstract syntax tree (MAST). A MAST root, however, is the hash of an entire program in which the leaves of the tree are the "subprograms" of that program. MAST was proposed for Bitcoin in BIP 114: https://github.com/bitcoin/bips/blob/master/bip-0114.mediawiki

My question is: why is a Merkle tree used to compute the hash of a Fuel program? It seems that root_from_code does not need a Merkle tree at all, since all it does is divide the bytecode into chunks, insert them into a Merkle tree, and return the root. Because the chunks are cut at fixed byte offsets, the bytecode in a single leaf could come from different logic flows in the program.

Why not just use a standard hash function such as keccak256 to compute the hash of a program? Why use a Merkle tree in this case?
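
To illustrate the two options being compared, here is a toy sketch that computes both a flat hash and a Merkle-style root over the same bytes. Everything in it is an assumption for illustration: toy_hash uses Rust's DefaultHasher, which is not cryptographic and stands in for a real digest only so the example needs no external crates, and the tree shape (pairing nodes and promoting an unpaired node) is a simplification that does not claim to match Fuel's BinaryMerkleTree.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Stand-in digest. NOT cryptographic; a real implementation would use a
// cryptographic hash function here.
fn toy_hash(prefix: u8, data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(&[prefix]);
    h.write(data);
    h.finish()
}

/// Option 1: hash the whole bytecode in one pass.
fn flat_hash(bytes: &[u8]) -> u64 {
    toy_hash(0x00, bytes)
}

/// Option 2: hash each fixed-size chunk as a leaf, then combine nodes
/// pairwise until a single root remains.
fn toy_merkle_root(bytes: &[u8], leaf_size: usize) -> u64 {
    let mut level: Vec<u64> = bytes
        .chunks(leaf_size)
        .map(|chunk| toy_hash(0x00, chunk))
        .collect();
    if level.is_empty() {
        return toy_hash(0x00, &[]);
    }
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| match pair {
                [left, right] => {
                    // Interior node: hash of the two child digests.
                    let mut buf = [0u8; 16];
                    buf[..8].copy_from_slice(&left.to_be_bytes());
                    buf[8..].copy_from_slice(&right.to_be_bytes());
                    toy_hash(0x01, &buf)
                }
                // An unpaired node is carried up unchanged (simplification).
                [single] => *single,
                _ => unreachable!("chunks(2) never yields an empty slice"),
            })
            .collect();
    }
    level[0]
}
```

Both functions take the full bytecode as input and produce a single value that changes if any byte changes. In the sketch, the only structural difference is that the Merkle-style version also produces intermediate digests for fixed-size byte ranges, which are unrelated to the program's logic flows.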
