Automated Generation and Verification of External Library Function Specifications via LLM and angr Collaboration

Presenter: Wangyue Lu

Abstract: The absence of formal specifications for external library functions is a fundamental bottleneck in automated program verification. Existing LLM-based specification synthesis approaches rely heavily on function source code and fail when confronted with third-party libraries delivered solely as binaries. Alternative approaches such as dynamic invariant inference and decompilation-based verification suffer from incomplete contract coverage or an inability to produce formal specifications directly usable by verification backends. This talk presents an automated approach for generating and verifying ACSL specifications of external library functions through the collaboration of large language models and the binary symbolic execution framework angr. Operating exclusively on compiled binaries without access to source code, our method employs angr to perform symbolic execution and extract path constraints at each termination state. These constraints, combined with disassembled pseudocode and assembly-level runtime error (RTE) assertions, form a rich semantic context that guides the LLM in producing an initial specification. The candidate specification is then validated through Frama-C syntax checking, followed by LLM-generated angr verification scripts that re-encode the specification’s pre- and post-conditions as symbolic constraints. Verification failures are fed back as counterexample evidence to drive iterative LLM-based refinement, forming a closed-loop automation pipeline. Experiments on a benchmark comprising signed integer, unsigned integer, and floating-point arithmetic functions drawn from CASP, frama-c-problems, sokol, and maratis demonstrate that our approach can effectively generate and verify formal function specifications, offering a viable path toward fully automated verification in source-unavailable settings.