OpenAI's new benchmark reveals current AI models struggle with complex, multi-stage genomics problems, setting a clear performance ceiling for future development.