Quiz Report Card: Pod Security Standards Levels
Date: 2026-03-09 | Qwen 3.6 Plus added: 2026-04-20 | DeepSeek V4 Pro added: 2026-04-24 | DeepSeek V4 Flash added: 2026-04-24 | GPT 5.5 added: 2026-04-25 | Kimi K2.6 added: 2026-04-26 | Qwen3.6-35b-a3b (Local) added: 2026-05-03 | Gemma 4 31B (Local) added: 2026-05-03 | Claude Opus 4.8 added: 2026-05-31
Question: Within Kubernetes Pod Security Standards, there are three defined levels, what are they?
Reference Answer
The three Kubernetes Pod Security Standards levels are:
| Level | Description | Use Case |
|---|---|---|
| Privileged | Unrestricted policy, widest possible permissions. Allows known privilege escalations. | System/infrastructure workloads managed by privileged, trusted users |
| Baseline | Minimally restrictive, prevents known privilege escalations while allowing common container defaults. Blocks hostNetwork, hostPID, privileged containers, hostPath volumes, etc. | General-purpose application workloads |
| Restricted | Heavily restricted, follows current pod hardening best practices. Requires non-root, seccomp profiles, drops capabilities, etc. | Security-critical applications |
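To make the Baseline row concrete, here is a minimal sketch of a check for the fields named above. The function name `baseline_violations` and the plain-dict pod spec are assumptions for illustration. It covers only a subset of the Baseline controls and skips init and ephemeral containers, so it is not a substitute for the admission controller.

```python
def baseline_violations(pod_spec):
    """Return Baseline-style violations found in a pod spec dict.

    Hypothetical helper: checks only host namespaces, hostPath volumes,
    and privileged containers, which is a subset of the Baseline profile.
    """
    violations = []

    # Sharing host namespaces is disallowed at Baseline.
    for field in ("hostNetwork", "hostPID", "hostIPC"):
        if pod_spec.get(field):
            violations.append(f"{field} must not be true")

    # hostPath volumes are disallowed at Baseline.
    for volume in pod_spec.get("volumes", []):
        if "hostPath" in volume:
            violations.append(f"volume {volume.get('name')!r} uses hostPath")

    # Privileged containers are disallowed at Baseline.
    for container in pod_spec.get("containers", []):
        security_context = container.get("securityContext") or {}
        if security_context.get("privileged"):
            violations.append(f"container {container.get('name')!r} is privileged")

    return violations


example_pod = {
    "hostNetwork": True,
    "volumes": [{"name": "logs", "hostPath": {"path": "/var/log"}}],
    "containers": [{"name": "app", "securityContext": {"privileged": True}}],
}
for violation in baseline_violations(example_pod):
    print(violation)
```

A pod spec with none of these fields set yields an empty list, which matches the idea that Baseline allows common container defaults.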
Additional context that strengthens an answer:
- Enforcement is via the Pod Security Admission (PSA) controller
- Three enforcement modes: enforce (reject), audit (log), warn (user warning)
- Applied at namespace level via labels
- PSA replaced Pod Security Policies (PSP), which were removed in v1.25
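The namespace labels mentioned above follow the pattern `pod-security.kubernetes.io/<mode>: <level>`, with an optional `pod-security.kubernetes.io/<mode>-version` pin. As a rough sketch, the helper below builds that label set from a mode-to-level mapping. The name `psa_labels` and its parameters are hypothetical and not part of any Kubernetes client library.

```python
PSA_LABEL_PREFIX = "pod-security.kubernetes.io/"
LEVELS = ("privileged", "baseline", "restricted")
MODES = ("enforce", "audit", "warn")


def psa_labels(modes, version=None):
    """Build Pod Security Admission namespace labels.

    Hypothetical helper. `modes` maps a mode name to a level name,
    for example {"enforce": "baseline"}. `version` is an optional pin
    applied to every mode, such as "latest".
    """
    labels = {}
    for mode, level in modes.items():
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level!r}")
        labels[PSA_LABEL_PREFIX + mode] = level
        if version is not None:
            labels[f"{PSA_LABEL_PREFIX}{mode}-version"] = version
    return labels


# Enforce Baseline while only warning about Restricted violations.
for key, value in psa_labels({"enforce": "baseline", "warn": "restricted"}).items():
    print(f"{key}: {value}")
```

Combining a lenient `enforce` level with a stricter `warn` or `audit` level is a common way to preview the impact of tightening a namespace before rejecting pods.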
Scoring Criteria
- Correct three levels: Privileged, Baseline, Restricted — must name all three
- Accurate descriptions: Each level’s purpose and restrictions described correctly
- Additional context: Enforcement modes, PSA, namespace labels, PSP replacement
- No errors: Inaccurate claims are penalised
This is described in the scoring notes as “a very basic question that all models should get right.”
Results Summary
| Model | Score | All 3 Levels | Descriptions | Enforcement Modes | Additional Context | Errors |
|---|---|---|---|---|---|---|
| anthropic/claude-opus-4.7 | 9/10 | Yes | Good | Yes | PSA, PSP replacement | None |
| google/gemini-3-flash-preview | 9/10 | Yes | Detailed | Yes | PSA, namespace labels | None |
| anthropic/claude-sonnet-4.6 | 9/10 | Yes | Detailed | Yes | Namespace labels | None |
| deepseek/deepseek-v3.2 | 8/10 | Yes | Good | No | PSA, PSP replacement | None |
| openai/gpt-5.4 | 7/10 | Yes | Brief | No | None | None |
| minimax/minimax-m2.5 | 7/10 | Yes | Brief | No | PSA | PCI-DSS claim |
| minimax/minimax-m2.7 | 9/10 | Yes | Yes | Yes | Yes | None |
| qwen/qwen3.6-plus | 9/10 | Yes | Good | No | PSA mentioned | None |
| deepseek/deepseek-v4-pro | 10/10 | Yes | Detailed | Yes | PSA, enforcement modes, PSP replacement | None |
| deepseek/deepseek-v4-flash | 9/10 | Yes | Good | Yes | PSA mentioned | readOnlyRootFilesystem error |
| moonshotai/kimi-k2.6 | 9/10 | Yes | Good | Yes | PSA history | None |
| openai/gpt-5.5 | 10/10 | Yes | Accurate | No | None | None |
| qwen/qwen3.6-35b-a3b (LOCAL) | 10/10 | Yes | Yes | Yes | PSP replacement, namespace labels | None |
| anthropic/claude-opus-4.8 | 10/10 | Yes | Accurate | Yes | PSA, PSP replacement | None |
| google/gemma-4-31b (LOCAL) | 9/10 | Yes | Yes | Yes | PSP replacement mentioned | None |
Detailed Analysis
anthropic/claude-opus-4.7 — 9/10
Strengths:
- All 3 levels correctly named with accurate descriptions
- Explicitly mentions all 3 enforcement modes (enforce, audit, warn) — the key scoring criterion
- Correctly notes PSA replaced PodSecurityPolicy
- Does not incorrectly claim readOnlyRootFilesystem is part of Restricted (avoids trap)
Weaknesses: Minor — could detail specific controls at each level.
Comparison vs Opus 4.6 (8): Improvement. Coverage of enforcement modes is the differentiator.
Notable: Joins the 9/10 club on this question alongside Sonnet, Gemini 3 Flash, and MiniMax M2.7. The enforcement modes coverage is the key improvement over Opus 4.6.
google/gemini-3-flash-preview — 9/10
Strengths:
- All three levels correctly identified with detailed descriptions
- Each level has Purpose, Security Posture, and Use Case — the most structured breakdown
- Enforcement modes (enforce/audit/warn) correctly explained
- Mentions Pod Security Admission and namespace labels
- Good detail on what Restricted requires (non-root, seccomp, minimal privileges)
- Correctly notes Restricted is more difficult to implement
Weaknesses:
- None significant — comprehensive and accurate
Notable: The most thorough answer. The structured format with Purpose/Security Posture/Use Case for each level makes this the most useful as a reference.
anthropic/claude-sonnet-4.6 — 9/10
Strengths:
- All three levels correctly identified with clear descriptions
- Enforcement modes (enforce/audit/warn) correctly explained
- Mentions namespace-level application
- Good practical details: Baseline blocks hostNetwork, hostPID, privileged containers
- Restricted requirements: non-root, dropping capabilities, seccomp profiles
Weaknesses:
- None significant — clean and accurate
Notable: Clear, well-structured answer with the right level of detail. The enforcement modes section is a useful addition that shows understanding beyond just the three levels.
deepseek/deepseek-v3.2 — 8/10
Strengths:
- All three levels correctly identified with good descriptions
- Correctly notes PSA replaced PSP as of v1.25 — useful historical context
- Descriptions are accurate and well-ordered (least to most restrictive)
- Mentions Pod Security Admission
Weaknesses:
- Does not mention enforcement modes (enforce/audit/warn)
- Slightly less detail than Gemini 3 Flash or Claude on what each level restricts
Notable: The PSP replacement mention (v1.25) is a useful piece of context that several other answers omit. Solid, accurate answer.
openai/gpt-5.4 — 7/10
Strengths:
- All three levels correctly identified
- Descriptions are accurate, matching official documentation language
- No errors
Weaknesses:
- Very brief — three one-line descriptions with no elaboration
- No mention of enforcement modes, PSA, or namespace labels
- No mention of what each level specifically restricts
- Offers to “explain what each level allows or blocks” but doesn’t actually do it
Notable negative: The briefest answer across all models. While accurate, a question about the three PSS levels warrants at least some detail about what each level enforces. Offering to elaborate without doing so doesn’t add value.
minimax/minimax-m2.5 — 7/10
Strengths:
- All three levels correctly identified
- Mentions Pod Security Admission as the enforcement mechanism
- Concise descriptions
Weaknesses:
- Claims Restricted “complies with most industry standards (like PCI-DSS)” — the Pod Security Standards documentation makes no such claim. PCI-DSS compliance involves far more than pod security settings.
- No mention of enforcement modes (enforce/audit/warn)
- Brief descriptions
Notable negative: The PCI-DSS compliance claim is unsupported and misleading. While Restricted aligns with security best practices, claiming compliance with specific regulatory frameworks overstates what PSS provides.
minimax/minimax-m2.7 — 9/10
Strengths:
- Correctly lists all three levels (Privileged, Baseline, Restricted)
- Clear concise explanation of each level
- Practical guidance on when to use each
- Clean table format
Weaknesses:
- None significant
Notable: Near-perfect answer on this straightforward question. Matches Sonnet and Gemini 3 Flash at 9/10.
qwen/qwen3.6-plus — 9/10
Strengths:
- All three levels correctly named: Privileged, Baseline, Restricted
- Accurate and concise descriptions for each level
- Correctly notes enforcement via PodSecurity admission controller
- Mentions namespace-level application via labels
- Good contextual detail: Baseline “prevents known privilege escalations”, Restricted “enforces current pod hardening best practices”
Weaknesses:
- Does not mention the three enforcement modes (enforce, audit, warn)
- Brief — could have elaborated on what each level specifically blocks or requires
- No mention of PSP replacement history
Notable: A clean, accurate answer that covers the essentials well. The conciseness is appropriate for this straightforward question, though the missing enforcement modes prevent a perfect score.
deepseek/deepseek-v4-pro — 10/10
Strengths:
- Perfect response correctly identifying all three levels (Privileged, Baseline, Restricted) with accurate descriptions
- Covers all three enforcement modes (enforce, audit, warn)
- Correctly notes PSA integration and PSP replacement history
- Detailed and accurate descriptions of what each level restricts
Weaknesses:
- None
Notable: The first model to score a perfect 10/10 on this question. While this is considered a “basic” question, V4 Pro’s response goes beyond the minimum by covering enforcement modes, PSA integration, and PSP replacement — all the additional context that the scoring criteria reward. Represents a major improvement over DeepSeek V3.2 (8/10).
deepseek/deepseek-v4-flash — 9/10
Strengths:
- Correctly names all three levels: Privileged, Baseline, Restricted
- Good descriptions of each level’s purpose and restrictions
- Mentions enforcement modes (enforce, audit, warn)
- PSA controller mentioned as enforcement mechanism
Weaknesses:
- Minor error: Claims the Restricted level enforces readOnlyRootFilesystem, which is not part of the Restricted profile in the Pod Security Standards. The Restricted level focuses on runAsNonRoot, seccomp, capabilities, and volume types, but does not mandate readOnlyRootFilesystem.
Notable: A strong score of 9/10, matching Opus 4.7, Sonnet, Gemini 3 Flash, Qwen 3.6 Plus, and MiniMax M2.7. The readOnlyRootFilesystem error is a minor inaccuracy that prevents a perfect score. Close to V4 Pro’s 10/10 on this question.
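To illustrate why the readOnlyRootFilesystem claim counts as an error, the sketch below checks a container securityContext against a subset of the Restricted controls. The function name `restricted_container_violations` is hypothetical, and the check is simplified: it looks only at container-level fields, whereas real admission also accepts some of these settings at pod level and checks volume types.

```python
def restricted_container_violations(security_context):
    """Return Restricted-style violations for one container securityContext.

    Hypothetical helper covering a subset of the Restricted profile.
    readOnlyRootFilesystem is deliberately not inspected, because the
    Restricted profile does not require it.
    """
    sc = security_context or {}
    violations = []

    if sc.get("runAsNonRoot") is not True:
        violations.append("runAsNonRoot must be true")

    if sc.get("allowPrivilegeEscalation") is not False:
        violations.append("allowPrivilegeEscalation must be false")

    dropped = (sc.get("capabilities") or {}).get("drop") or []
    if "ALL" not in dropped:
        violations.append("capabilities.drop must include ALL")

    seccomp_type = (sc.get("seccompProfile") or {}).get("type")
    if seccomp_type not in ("RuntimeDefault", "Localhost"):
        violations.append("seccompProfile.type must be RuntimeDefault or Localhost")

    return violations


hardened = {
    "runAsNonRoot": True,
    "allowPrivilegeEscalation": False,
    "capabilities": {"drop": ["ALL"]},
    "seccompProfile": {"type": "RuntimeDefault"},
    "readOnlyRootFilesystem": False,  # writable root filesystem, still no violation
}
print(restricted_container_violations(hardened))
```

Setting readOnlyRootFilesystem is still a reasonable hardening step, but it sits outside what the Restricted level checks.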
openai/gpt-5.5 — 10/10
Strengths:
- Correctly identifies all three levels: Privileged, Baseline, Restricted
- Accurate one-sentence descriptions matching official documentation language
- Clean, concise answer with no errors or fabrications
- Each level’s description is precise: “Unrestricted policy, allowing known privilege escalations” (Privileged), “prevents known privilege escalations while allowing common workloads” (Baseline), “Heavily restricted policy following current Pod hardening best practices” (Restricted)
Weaknesses:
- Very brief — no mention of enforcement modes (enforce/audit/warn), PSA controller, or namespace labels
- No additional context about PSP replacement or practical usage
Notable: A perfect score despite being the most concise answer. The descriptions are accurate and match the official Kubernetes documentation closely. Joins DeepSeek V4 Pro as the only models to score 10/10 on this question, though V4 Pro achieved it with much more detail. GPT 5.5’s brevity works here because the question asks “what are they” — naming and describing the three levels correctly is sufficient.
moonshotai/kimi-k2.6 — 9/10
Strengths:
- All three levels correctly named: Privileged, Baseline, Restricted
- Good descriptions of each level’s purpose and restrictions
- Good PSA history context — covers the transition from PSP to PSA
- Enforcement modes mentioned
Weaknesses:
- No specific field-level details for what each level enforces (e.g., which securityContext fields are checked at Restricted level)
Notable: Joins the 9/10 group alongside Opus 4.7, Sonnet, Gemini 3 Flash, MiniMax M2.7, Qwen 3.6 Plus, and DeepSeek V4 Flash. The PSA history context adds value beyond the basic three-level answer.
qwen/qwen3.6-35b-a3b (LOCAL) — 10/10
Strengths:
- All three levels correctly named: Privileged, Baseline, Restricted
- Accurate descriptions of each level’s purpose and restrictions
- Notes PSP replacement context — correctly identifies PSA as the successor to PodSecurityPolicy
- Mentions label-based enforcement at the namespace level
- No factual errors
Weaknesses: None significant.
Notable: A perfect score — joins GPT 5.5 and DeepSeek V4 Pro as the third model to achieve 10/10 on this question. This is a well-defined conceptual question where the answer is thoroughly documented in Kubernetes documentation, which plays to the local model’s strengths. The PSP replacement context and label-based enforcement mention show understanding beyond just naming the three levels.
google/gemma-4-31b (LOCAL) — 9/10
Strengths:
- All three levels correctly named: Privileged, Baseline, Restricted
- Accurate descriptions of each level’s purpose and what each level restricts
- Mentions all three enforcement modes (enforce, audit, warn)
- Notes PSP replacement context — correctly identifies PSA as the successor to PodSecurityPolicy
- No factual errors
Weaknesses:
- Could have provided more specific field-level details for what each level enforces
Notable: A strong score, joining the large 9/10 group alongside Opus 4.7, Sonnet, Gemini 3 Flash, MiniMax M2.7, Qwen 3.6 Plus, DeepSeek V4 Flash, and Kimi K2.6. This is Gemma 4 31B’s highest score across all quiz questions and demonstrates that well-documented conceptual knowledge plays to the model’s strengths. The mention of enforcement modes and PSP replacement context adds value beyond just naming the three levels.
anthropic/claude-opus-4.8 — 10/10
Strengths:
- All three levels correctly named: Privileged, Baseline, Restricted
- Accurate descriptions of each level matching official documentation
- Good PSA context — covers enforcement mechanism and PSP replacement
- No factual errors
Weaknesses: None significant.
Comparison vs Opus 4.7 (9): Improvement. Achieves a perfect score with more complete additional context.
Notable: Joins GPT 5.5, DeepSeek V4 Pro, and Qwen-35b as the fourth model to achieve 10/10 on this question. The Anthropic family’s PSS knowledge improves with each generation: Opus 4.6 (8), Opus 4.7 (9), Opus 4.8 (10).
Key Findings
- All models answered correctly: As the scoring notes predicted, this is a basic question, and every model correctly identified Privileged, Baseline, and Restricted.
- Differentiation is in depth and additional context: The gap between models comes from enforcement modes, PSA details, and the specifics of what each level restricts, not from getting the core answer wrong.
- Gemini 3 Flash and Claude provided the most complete answers: Both included enforcement modes and namespace-level application, giving the reader a more complete picture of how PSS works in practice.
- GPT 5.4 was unnecessarily brief: Offering to elaborate without doing so is a weaker approach than just providing a thorough answer upfront.
- MiniMax M2.5’s PCI-DSS claim is a fabrication: The official PSS documentation makes no reference to PCI-DSS compliance. Together with DeepSeek V4 Flash’s readOnlyRootFilesystem claim, it is one of only two factual errors recorded across all responses.