Reverse Engineering Meets Smart Contracts: Exposing the Vulnerabilities Lurking Beneath

 2025-04-24

Have you ever wondered how reverse engineering can reveal critical vulnerabilities lurking in your smart contracts?

On July 30, 2023, a vulnerability in the re-entrancy guard of Vyper compiler v0.2.15 enabled a re-entrancy attack on Curve Finance pools, and $69M was stolen; after all refunds, the total loss was about $20M. I find this exploit interesting because it is undetectable at the pre-compiled/source code level (and this is only the tip of the iceberg). The high-level problem was that the Vyper compiler “promised” developers that it would handle the re-entrancy guard CORRECTLY, but it failed to do so. In this blog post, I reverse engineer vulnerable contracts to demonstrate how the vulnerability can be detected automatically.

1. The Timeline

The timeline leading to the successful exploit is interesting. The earliest vulnerable version of Vyper, v0.2.15, was released on July 23, 2021. This was followed by versions v0.2.16 (September 2, 2021) and v0.3.0 (November 4, 2021), both of which remained vulnerable. However, Vyper v0.3.1, released on December 1, 2021, included updates that inadvertently removed the vulnerability, though it remained undetected.

Therefore everything “went well” until a contract compiled with Vyper v0.2.15 was successfully exploited approximately 2 years later. There was plenty of time and there were plenty of occasions to spot the vulnerability and take action. But of course, like any 0-day, it was only detected when **it hit the fan. So here is the announcement in the CVE database: https://nvd.nist.gov/vuln/detail/cve-2023-39363.

2. What Caused The Security Bug

The GitHub Security Advisory - GHSA-5824-cm3x-3c38 - briefly stated, and I quote: “named re-entrancy locks are allocated incorrectly. Each function using a named re-entrancy lock gets a unique lock regardless of the key, allowing cross-function re-entrancy in contracts compiled with the susceptible versions.”

You can find the exploited contract here. The exploit relates to two functions, add_liquidity and remove_liquidity. Both functions use the @nonreentrant decorator with the key 'lock', and because the re-entrancy guard was assumed to be applied correctly, they didn’t follow the checks-effects-interactions pattern.

Therefore, the security bug in the Vyper compiler can be considered as the root cause.

3. Detection Methodologies

3.1 Approach

My experience is that it is simplest to work at the level where the problem occurs, so I chose to work at the bytecode level. I also believe that this level is the most reliable, as it exposes which instructions are actually executed. Therefore, this approach can be used to discover the “underwater portion of the iceberg” and to find stealthier security bugs than is possible at the source code level.

3.2 Detecting The Root Cause

3.2.1 Case Study

To study how to detect multiple storage blocks used for a single named re-entrancy lock, I used the contract below:

@external
@nonreentrant("foo")
def foo():
    pass
 
@external
@nonreentrant("foo")
def bar():
    pass

The contract is minimized to only two dummy functions, which use the decorator @nonreentrant with the same key "foo".

3.2.2 EVM Bytecode

I compiled the contract with the vulnerable version (v0.2.15) and the non-vulnerable version (v0.3.1) of Vyper. Having both versions of the compiled code allows me to perform an in-depth vulnerability assessment.

Version 0.2.15 bytecode:

0x600436101561000d5761005b565b600035601c52600051346100615763c298557881141561003a576000546100615760016000556000600055005b63febb0f7e811415610059576001546100615760016001556000600155005b505b60006000fd5b600080fd

Version 0.3.1 bytecode:

0x600436101561000d5761005a565b60046000601c37600051346100605763c2985578811861003a576000546100605760016000556000600055005b63febb0f7e8118610058576000546100605760016000556000600055005b505b60006000fd5b600080fd

3.2.3 EVM Assembly

EVM assembly is just a different representation of EVM bytecode, and the relation between the two representations is one-to-one. Therefore, lifting the bytecode to assembly keeps the benefit of working with the lowest-level code possible while providing a human-readable form. I used an upgraded version of the IDA-EVM processor module to disassemble the bytecode.
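To illustrate the one-to-one lifting without IDA, here is a minimal Python sketch of a linear disassembler. It is not the IDA-EVM module mentioned above: the opcode table is deliberately partial (only the opcodes that occur in the two case-study bytecodes), and the names `OPCODES`, `V0_2_15` and `disassemble` are chosen here purely for illustration.

```python
# Mnemonics for the opcodes that appear in the case-study bytecode.
# The table is partial on purpose; anything else prints as UNKNOWN_0x??.
OPCODES = {
    0x00: "STOP", 0x10: "LT", 0x14: "EQ", 0x15: "ISZERO", 0x18: "XOR",
    0x34: "CALLVALUE", 0x35: "CALLDATALOAD", 0x36: "CALLDATASIZE",
    0x37: "CALLDATACOPY", 0x50: "POP", 0x51: "MLOAD", 0x52: "MSTORE",
    0x54: "SLOAD", 0x55: "SSTORE", 0x56: "JUMP", 0x57: "JUMPI",
    0x5B: "JUMPDEST", 0xFD: "REVERT",
}

# Bytecode of the case-study contract compiled with Vyper v0.2.15 (section 3.2.2).
V0_2_15 = (
    "0x600436101561000d5761005b565b600035601c52600051346100615763c2985578"
    "81141561003a576000546100615760016000556000600055005b63febb0f7e811415"
    "610059576001546100615760016001556000600155005b505b60006000fd5b600080fd"
)


def disassemble(bytecode_hex):
    """Return a list of (offset, mnemonic, operand_hex_or_None) tuples."""
    text = bytecode_hex[2:] if bytecode_hex.startswith("0x") else bytecode_hex
    code = bytes.fromhex(text)
    instructions, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 are followed by 1..32 immediate operand bytes.
            width = op - 0x5F
            operand = code[pc + 1:pc + 1 + width].hex()
            instructions.append((pc, f"PUSH{width}", operand))
            pc += 1 + width
            continue
        if 0x80 <= op <= 0x8F:
            name = f"DUP{op - 0x7F}"
        elif 0x90 <= op <= 0x9F:
            name = f"SWAP{op - 0x8F}"
        else:
            name = OPCODES.get(op, f"UNKNOWN_0x{op:02x}")
        instructions.append((pc, name, None))
        pc += 1
    return instructions


if __name__ == "__main__":
    for offset, name, operand in disassemble(V0_2_15):
        print(f"{offset:04x}  {name}" + (f" 0x{operand}" if operand else ""))
```

Because only PUSH instructions carry immediate bytes, a single forward pass is enough to recover every instruction boundary, which is what makes the bytecode-to-assembly mapping unambiguous for this contract.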

Putting the two versions of compiled code side by side reveals what has been updated:

This is powerful enough to spot even minor changes in the behavior of the Vyper compiler. For example, version 0.3.1 uses XOR to compare function hashes, while version 0.2.15 used a combination of EQ and ISZERO. That is one instruction instead of two - an improvement in terms of execution cost!

Let’s zoom in on the most important basic blocks where re-entrancy locks are implemented:

In both versions, the blocks labeled with A belong to function bar(), and the blocks labeled with B belong to function foo(). Block C is shared between the two functions and is used to revert the transaction if re-entrancy is detected.

The comparisons against the function hashes (first 4 bytes of the Keccak-256 hash of the function signature) are used to determine the called method. More specifically, 0xc2985578 is foo() and 0xfebb0f7e is bar().
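The function hashes can be recomputed from the signatures. The sketch below is a compact pure-Python Keccak-256, written here so that the example has no third-party dependency (Python's `hashlib.sha3_256` is the standardized SHA-3, which pads differently and gives different digests). The helper names `keccak256` and `selector` are illustrative, and the code is tuned for clarity rather than speed.

```python
MASK64 = (1 << 64) - 1


def _rol64(value, shift):
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 lane matrix."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column parity into the neighbouring columns
        col = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
               for x in range(5)]
        d = [col[(x + 4) % 5] ^ _rol64(col[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate lanes and move them to new positions
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step applied row by row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: round constant generated by an 8-bit LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def _permute(state):
    """Run Keccak-f[1600] on a 200-byte state (little-endian lanes)."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lanes = _keccak_f1600(lanes)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data):
    """Keccak-256 as used by Ethereum: rate 136 bytes, padding suffix 0x01."""
    rate, state, offset, block = 136, bytearray(200), 0, 0
    while offset < len(data):
        block = min(len(data) - offset, rate)
        for i in range(block):
            state[i] ^= data[offset + i]
        offset += block
        if block == rate:
            state, block = _permute(state), 0
    state[block] ^= 0x01
    state[rate - 1] ^= 0x80
    return bytes(_permute(state)[:32])


def selector(signature):
    """First 4 bytes of the Keccak-256 hash of a function signature, as hex."""
    return keccak256(signature.encode("ascii"))[:4].hex()


if __name__ == "__main__":
    for sig in ("foo()", "bar()"):
        print(sig, "0x" + selector(sig))
```

In practice a maintained library would normally be used for the hash; the point here is only that the 4-byte constants seen in the dispatcher are derived mechanically from the source-level signatures.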

The problem occurs in blocks A1, A2 and B1, B2 of version 0.2.15, which uses two different storage slots (slot 0 for foo() and slot 1 for bar()) to implement a single re-entrancy lock key. This bug disappeared in version 0.3.1, where both functions use slot 0.

The pattern Vyper used to implement the re-entrancy lock is as follows:

Load and check the value of a pre-determined storage slot.
If the value at the storage slot is 1, the entrance is locked and the transaction is reverted.
Otherwise, it stores the value 1 to the pre-determined storage slot, executes the function, and restores the storage slot’s value to 0 at the end.
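The three steps above can be modeled in a few lines of Python to show why giving each function its own slot defeats the lock. This is a toy model under stated assumptions: `storage` is a plain dict standing in for contract storage, `Reentered` stands in for the revert, the rollback of state on revert is not modeled, and all names are hypothetical.

```python
class Reentered(Exception):
    """Raised where the compiled code would jump to the revert block."""


def guarded_call(storage, slot, body):
    # Step 1: load and check the lock slot.
    if storage.get(slot, 0) != 0:
        raise Reentered(f"slot {slot} is locked")
    # Steps 2 and 3: lock, run the function body, then unlock.
    storage[slot] = 1
    try:
        body()
    finally:
        storage[slot] = 0


def cross_function_reentry_possible(foo_slot, bar_slot):
    """Call foo(), whose body re-enters bar(), and report whether bar() ran."""
    storage = {}

    def bar():
        guarded_call(storage, bar_slot, lambda: None)

    try:
        # foo's body performs a call that ends up invoking bar().
        guarded_call(storage, foo_slot, bar)
    except Reentered:
        return False
    return True


if __name__ == "__main__":
    print("shared slot  :", cross_function_reentry_possible(0, 0))
    print("separate slot:", cross_function_reentry_possible(0, 1))
```

With a shared slot the nested call is rejected; with separate slots, as in the v0.2.15 output, the nested call goes through even though both functions were declared with the same key.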

4. Automated Detection

4.1 Detect Single Re-entrancy Locks

Based on the discovered pattern, a re-entrancy lock can be easily detected by matching from the beginning of a function:

PUSH1 <STORAGE SLOT>
SLOAD
...
...
PUSH1 <LOCKED VALUE>
PUSH1 <STORAGE SLOT>
SSTORE
...
<Execute function>
...
PUSH1 <UNLOCKED VALUE>
PUSH1 <STORAGE SLOT>
SSTORE

If that pattern is matched, the <STORAGE SLOT> (a number) will represent the “ID” of the re-entrancy lock used by the function.

Performing the pattern matching on all available functions will provide the set of mappings: function <–> re-entrancy lock ID at the bytecode level.
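As a rough illustration of this matching step, the sketch below recovers the selector-to-slot mapping for the two case-study bytecodes. It assumes the simple layout seen in section 3.2.2 (a `PUSH4 <selector>` dispatcher entry that ends with `JUMPI`, immediately followed by a straight-line function body with constant `PUSH1` operands); real contracts will need a more robust matcher. The names `decode` and `lock_slots` are hypothetical.

```python
PUSH1, PUSH4, SLOAD, SSTORE, JUMPI = 0x60, 0x63, 0x54, 0x55, 0x57
TERMINATORS = {0x00, 0xF3, 0xFD}  # STOP, RETURN, REVERT end a straight-line body

V0_2_15 = (
    "0x600436101561000d5761005b565b600035601c52600051346100615763c2985578"
    "81141561003a576000546100615760016000556000600055005b63febb0f7e811415"
    "610059576001546100615760016001556000600155005b505b60006000fd5b600080fd"
)
V0_3_1 = (
    "0x600436101561000d5761005a565b60046000601c37600051346100605763c2985578"
    "811861003a576000546100605760016000556000600055005b63febb0f7e8118"
    "610058576000546100605760016000556000600055005b505b60006000fd5b600080fd"
)


def decode(bytecode_hex):
    """Split bytecode into (opcode, operand_or_None) pairs."""
    text = bytecode_hex[2:] if bytecode_hex.startswith("0x") else bytecode_hex
    code = bytes.fromhex(text)
    out, pc = [], 0
    while pc < len(code):
        op = code[pc]
        width = op - 0x5F if 0x60 <= op <= 0x7F else 0  # PUSH1..PUSH32 operands
        operand = int.from_bytes(code[pc + 1:pc + 1 + width], "big") if width else None
        out.append((op, operand))
        pc += 1 + width
    return out


def lock_slots(bytecode_hex):
    """Map each 4-byte selector (hex) to the storage slot its lock uses."""
    ins = decode(bytecode_hex)
    found = {}
    for i, (op, operand) in enumerate(ins):
        if op != PUSH4:
            continue
        # Assumption: the selector comparison ends with the next JUMPI and the
        # function body starts right after it.
        j = i + 1
        while j < len(ins) and ins[j][0] != JUMPI:
            j += 1
        body = []
        for item in ins[j + 1:]:
            body.append(item)
            if item[0] in TERMINATORS:
                break
        # Guard prologue: PUSH1 <slot>, SLOAD.
        if len(body) < 2 or body[0][0] != PUSH1 or body[1][0] != SLOAD:
            continue
        slot = body[0][1]
        # Constants written to that slot: PUSH1 <value>, PUSH1 <slot>, SSTORE.
        writes = [
            body[k][1]
            for k in range(len(body) - 2)
            if body[k][0] == PUSH1
            and body[k + 1] == (PUSH1, slot)
            and body[k + 2][0] == SSTORE
        ]
        # A lock stores a non-zero value first and resets the slot to zero last.
        if len(writes) >= 2 and writes[0] != 0 and writes[-1] == 0:
            found[f"{operand:08x}"] = slot
    return found


if __name__ == "__main__":
    print("v0.2.15:", lock_slots(V0_2_15))
    print("v0.3.1 :", lock_slots(V0_3_1))
```

For the v0.2.15 bytecode the two selectors map to different slots, while for v0.3.1 they map to the same one, which is exactly the difference a scanner needs to surface.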

4.2 Detect Cross-function Re-entrancy Locks

Now let’s revisit the problem we are trying to solve. The security issue arises from a mismatch between the assumption at the source code level - represented by the decorator @nonreentrant(<key>) - and the actual implementation in the compiled code. Specifically, different storage slots were used to implement re-entrancy locks even though the same key was used. This assumption wasn’t tested, hence the vulnerability went undetected.

Based on my in-depth analysis, I propose the following measures to automatically identify this vulnerability:

At the source code level, extract the mappings between function signature and re-entrancy lock key.
Compile the source code using the selected compiler version.
At the bytecode level, extract the mappings between function ID and re-entrancy lock ID.
Since the mapping between function signature and function ID is 1:1, if multiple functions at the source code level use the same lock key, they must be mapped to a single lock ID at the bytecode level.

By following these steps, a scanner can detect the inconsistencies between the source code and the compiled code, thereby detecting the vulnerability.
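A minimal sketch of the comparison step might look like the following. It makes several assumptions: the source-level keys are pulled out with a simple regular expression (a real tool would use the compiler's AST), the name-to-selector mapping is supplied by the caller, and the slot mappings are the ones read from the case-study bytecode in section 3.2.2. All function and variable names are hypothetical.

```python
import re
from collections import defaultdict

NONREENTRANT = re.compile(
    r"@nonreentrant\(\s*['\"](?P<key>[^'\"]+)['\"]\s*\)"  # decorator and its key
    r"\s*(?:@\w+(?:\([^)]*\))?\s*)*"                      # any further decorators
    r"def\s+(?P<name>\w+)"                                # the decorated function
)

SOURCE = '''
@external
@nonreentrant("foo")
def foo():
    pass

@external
@nonreentrant("foo")
def bar():
    pass
'''

# Selectors quoted in section 3.2.3 and lock slots read from the bytecode.
SELECTORS = {"foo": "c2985578", "bar": "febb0f7e"}
SLOTS_V0_2_15 = {"c2985578": 0, "febb0f7e": 1}
SLOTS_V0_3_1 = {"c2985578": 0, "febb0f7e": 0}


def source_lock_keys(source):
    """Step 1: map each function name to its @nonreentrant key."""
    return {m.group("name"): m.group("key") for m in NONREENTRANT.finditer(source)}


def split_locks(lock_keys, selectors, slots_by_selector):
    """Step 4: return every key that is backed by more than one storage slot."""
    slots_per_key = defaultdict(set)
    for name, key in lock_keys.items():
        slots_per_key[key].add(slots_by_selector[selectors[name]])
    return {key: sorted(slots) for key, slots in slots_per_key.items() if len(slots) > 1}


if __name__ == "__main__":
    keys = source_lock_keys(SOURCE)
    print("v0.2.15:", split_locks(keys, SELECTORS, SLOTS_V0_2_15))
    print("v0.3.1 :", split_locks(keys, SELECTORS, SLOTS_V0_3_1))
```

A non-empty result means the compiled code does not honor the source-level promise for that key, so the check can be run for any compiler version without knowing in advance which versions are affected.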

4.3 Applying It To The Vulnerable Contract

It’s time to apply the proposed method to the vulnerable contract. Below are the two related methods in the exploited contract:

@payable
@external
@nonreentrant('lock')
def add_liquidity(
    _amounts: uint256[N_COINS],
    _min_mint_amount: uint256,
    _receiver: address = msg.sender
) -> uint256:

add_liquidity(uint256[2],uint256) ==> Method ID: 0x0b4c7e4d

@external
@nonreentrant('lock')
def remove_liquidity(
    _burn_amount: uint256,
    _min_amounts: uint256[N_COINS],
    _receiver: address = msg.sender
) -> uint256[N_COINS]:

remove_liquidity(uint256,uint256[2]) ==> Method ID: 0x5b36389c

The image below illustrates how the proposed method can be used to detect the vulnerability at the assembly/bytecode level:

It is clear that two different storage slots, STORAGE[0] and STORAGE[2], were used to implement a single lock key. This demonstrates the effectiveness of the proposed method.

I really enjoyed diving into the research and writing this piece - hope you found it insightful too. I’d love to hear your thoughts:

Have you encountered compiler-level bugs during smart contract audits or deployments?
What techniques do you rely on to uncover issues that source-level analysis might miss?

Feel free to reach out - let’s connect!

Tien D. Phan's Blog