OpenAI has launched EVMbench, a new benchmark designed to assess how well AI agents can identify, exploit, and patch security vulnerabilities in crypto tokens and smart contracts.
Developed with crypto venture capital firm Paradigm, EVMbench establishes standardized testing for flaws in code running on blockchains compatible with the Ethereum Virtual Machine (EVM). The benchmark measures AI performance on three tasks: recognizing weaknesses, demonstrating exploit methods, and implementing fixes.
In conjunction with this release, OpenAI is expanding its Aardvark security research agent to a private beta and dedicating $10 million in API credits to its Cybersecurity Grant Program, supporting defensive research for open-source and critical infrastructure projects.