LLM and Bug Finding: Insights from a $2M Winning Team in the White House's AIxCC

>>garlic+(OP)
I'm part of the team, and we used LLM agents extensively for smart bug finding and patching. I'm happy to discuss some insights, and to share all of our approaches after the grand final :)

>>hqzhao+Mb
Hey, congrats on getting to the finals of AIxCC!

Have you tested your CRS on weekend CTFs? I’m curious how well it would perform compared to other teams.

>>adrago+3k
Thanks!

We haven't tested it yet. Regarding CTFs, I have some experience. I'm a member of the Tea Deliverers CTF team, and I participated in the DARPA CGC CTF back in 2016 with team b1o0p.

There are a few issues that make it challenging to directly apply our AIxCC approaches to CTF challenges:

1. *Format Compatibility:* This year’s DEFCON CTF finals didn’t follow a uniform format. The challenges were complex and involved formats like a Lua VM running on a custom Verilog simulator. Our system, however, is designed for source code repositories like Git repos.

2. *Binary vs. Source Code:* CTFs are heavily binary-oriented, whereas AIxCC is focused on source code. In CTFs, reverse engineering binaries is often required, but our system isn’t equipped to handle that yet. We are, however, interested in supporting binary analysis in the future!
