Manifest Tests

Assessment of LLM-generated Kubernetes manifests across three prompting levels: basic, production, and hardened. Each model’s output was tested for both deployability (does it actually work?) and security compliance against Pod Security Standards.

Original date: 2026-03-09 | Claude Opus 4.6 added: 2026-03-25 | MiniMax M2.7 added: 2026-03-28 | Claude Opus 4.7 added: 2026-04-20 | Qwen 3.6 Plus added: 2026-04-20 | DeepSeek V4 Pro added: 2026-04-24 | DeepSeek V4 Flash added: 2026-04-24

Models: Claude Opus 4.7, Claude Opus 4.6, Claude Sonnet 4.6, GPT 5.4, Gemini 3 Flash, Qwen 3.6 Plus, MiniMax M2.5, MiniMax M2.7, DeepSeek V3.2, DeepSeek V4 Pro, DeepSeek V4 Flash

Manifest Generation Assessment

Table of contents