Quiz Report Card: Kubernetes Authentication

Date: 2026-03-09 | Qwen 3.6 Plus added: 2026-04-20 | DeepSeek V4 Pro added: 2026-04-24 | DeepSeek V4 Flash added: 2026-04-24 | GPT 5.5 added: 2026-04-25 | Kimi K2.6 added: 2026-04-26 | Qwen3.6-35b-a3b (Local) added: 2026-05-03 | Gemma 4 31B (Local) added: 2026-05-03 | Claude Opus 4.8 added: 2026-05-31 | Qwen 3.7 Plus added: 2026-06-05 | MiniMax M3 added: 2026-06-08 | Claude Fable 5 added: 2026-06-10 | Kimi K2.7 Code added: 2026-06-16 | GLM-5.2 added: 2026-06-17 | Mistral Medium 3.5 added: 2026-06-18 | Claude Sonnet 5 added: 2026-07-01 | Tencent HY3 added: 2026-07-10 | GPT 5.6 Terra added: 2026-07-10 | GPT 5.6 Sol added: 2026-07-14 | Kimi K3 added: 2026-07-16 | Xiaomi MiMo v2.5 added: 2026-07-21

Question: In a standard Kubernetes cluster using Kubeadm, what methods are supported to allow users and services to authenticate to the cluster without using any external software? Which of these methods is appropriate for users to use in a production cluster?

Reference Answer

Built-in authentication methods (no external software):

Method	Purpose	Production Suitable for Users?
Client certificates (X.509)	Can be used by users, but significant downsides — notably no revocation support (no CRL, no OCSP)	No — no revocation, manual lifecycle, poor scalability
Service account tokens	For workloads running in the cluster to authenticate to the API	No — designed for services, not users
Bootstrap tokens	Designed for very limited use (node joining)	No — not for user authentication at all
Static token file	Legacy, requires editing a file on the API server, plaintext credentials	No — insecure and operationally poor
Static password file (basic auth)	Legacy, deprecated in 1.16, removed in 1.19	No — removed
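
To make the "plaintext credentials" point about the static token file concrete, here is a minimal sketch of how such a file is laid out. It assumes the documented CSV layout (token, user name, user UID, then an optional quoted group list); the sample token, user, and groups are invented for illustration, and the parser is a simplification, not the API server's own code.

```python
import csv
import io

# Hypothetical file contents: token, user, uid, "group1,group2".
# Every value here is made up for illustration.
SAMPLE = 'c2VjcmV0LXRva2Vu,alice,1001,"dev,ops"\n'


def parse_token_file(text):
    """Return {token: (user, uid, groups)} from static-token-file CSV text."""
    entries = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        token, user, uid = row[0], row[1], row[2]
        groups = row[3].split(",") if len(row) > 3 and row[3] else []
        entries[token] = (user, uid, groups)
    return entries


entries = parse_token_file(SAMPLE)
for token, (user, uid, groups) in entries.items():
    # The bearer token is readable by anyone who can read the file.
    print(token, user, uid, groups)
```

Anyone with read access to the file holds every user's credential, which is why the table rates this method as insecure.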

Methods requiring external software: webhook token authentication, OpenID Connect (OIDC), authenticating proxy

The trick question: None of the built-in methods are suitable for user authentication in a production cluster. All production clusters should use some form of external authentication (OIDC, webhook, proxy). Client certificates are the least-bad built-in option but their lack of revocation support makes them unsuitable for production use.
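
The practical cost of having no revocation can be shown with simple date arithmetic: a leaked client certificate stays usable until it expires. The sketch below uses hypothetical dates and an assumed one-year certificate lifetime; the function name is illustrative only.

```python
from datetime import datetime, timedelta, timezone


def exposure_window(not_after, compromised_at):
    """Time a leaked client certificate stays usable when nothing can revoke it."""
    remaining = not_after - compromised_at
    return max(remaining, timedelta(0))


# Hypothetical dates: issued on 1 Jan 2026 with an assumed one-year lifetime,
# private key leaked on 1 Mar 2026.
issued = datetime(2026, 1, 1, tzinfo=timezone.utc)
not_after = issued + timedelta(days=365)
compromised_at = datetime(2026, 3, 1, tzinfo=timezone.utc)

print(exposure_window(not_after, compromised_at).days)  # 306
```

Under these assumptions the credential remains valid for 306 more days unless the cluster CA is rotated or the subject's RBAC permissions are removed.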

Scoring Criteria

Built-in methods coverage: Lists client certificates, service account tokens, bootstrap tokens, and static token file as available without external software
Correct scoping: Identifies which methods are for users vs services vs node bootstrapping
Certificate limitations: Notes the lack of revocation support (CRL/OCSP) as a critical limitation
The trick answer: Recognises that none of the built-in methods are truly suitable for production user authentication — external auth is required
External methods awareness: Mentions OIDC/webhooks/proxy as what production clusters should actually use
Accuracy: No factual errors (especially around certificate revocation)

Results Summary

Model	Score	Methods Coverage	Cert Revocation Issue	Trick Question	External Auth	Major Errors
anthropic/claude-opus-4.7	7/10	Good	Yes (CRL/OCSP)	Near-miss	Yes	None
anthropic/claude-sonnet-4.6	8/10	5/5	Yes (detailed)	Near-miss	Yes	None
google/gemini-3-flash-preview	7/10	5/5	Yes (detailed)	Near-miss	Yes	None
openai/gpt-5.4	5/10	4/5	No	No	Brief mention	None
deepseek/deepseek-v3.2	5/10	3/5	Partial	No	Brief mention	None
minimax/minimax-m2.5	4/10	5/5 + bonus	Incorrectly claims revokable	No	Brief mention	Revocation error
minimax/minimax-m2.7	5/10	Good	No	No	Listed but not recommended	Recommends X.509
qwen/qwen3.6-plus	8/10	4/4	Yes (detailed)	Yes	Yes	None
deepseek/deepseek-v4-pro	8/10	4/4	Yes (detailed)	Yes	Yes	None
deepseek/deepseek-v4-flash	6/10	4/4	Yes	No	Brief	Recommends X.509
moonshotai/kimi-k2.6	7/10	4/4	Partial	No	Yes	Recommends X.509
openai/gpt-5.5	8/10	3/4	Partial	Near-miss	Yes	Missing static token file
qwen/qwen3.6-35b-a3b (LOCAL)	5/10	4/4	Yes	No	Yes	Recommends X.509
anthropic/claude-opus-4.8	8/10	Good	Yes	Near-miss	Yes	Says X.509 CAN be suitable
google/gemma-4-31b (LOCAL)	6/10	Good	Partial	Near-miss	Yes	Mentions revocation issue but still recommends X.509
qwen/qwen3.7-plus	9/10	Good	Yes (CRL/OCSP)	Yes	Yes	None
minimax/minimax-m3	5/10	Good	No	No	Brief	Recommends X.509
anthropic/claude-fable-5	8/10	Good	Yes	Near-miss	Yes	None
moonshotai/kimi-k2.7-code	7/10	Good	Partial	No	Yes	Recommends X.509
z-ai/glm-5.2	7/10	Good	No	No	Yes	Recommends X.509
mistralai/mistral-medium-3-5	5/10	Good	No	No	Brief	Recommends X.509
anthropic/claude-sonnet-5	9/10	Good	Yes (CRL/OCSP)	Yes	Yes	None
tencent/hy3	5/10	Good	No	No	Brief	Recommends X.509
openai/gpt-5.6-terra	7/10	Good	No	No	Yes	Recommends X.509
openai/gpt-5.6-sol	6/10	Good	No	No	Yes	Recommends X.509
moonshotai/kimi-k3	6/10	Good	Partial	No	Yes	Recommends X.509
xiaomi/mimo-v2.5	5/10	Good	Incorrectly claims revokable	No	No	Revocation error; recommends X.509
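
The tallies quoted later in this report can be re-derived from the Score and Trick Question columns. The sketch below transcribes those two columns from the table above (the row for qwen3.6-35b-a3b is entered as "No", since it missed the trick) and counts them; the variable names are illustrative.

```python
from collections import Counter

# (model, score out of 10, trick-question outcome), transcribed from the table.
ROWS = [
    ("claude-opus-4.7", 7, "Near-miss"), ("claude-sonnet-4.6", 8, "Near-miss"),
    ("gemini-3-flash-preview", 7, "Near-miss"), ("gpt-5.4", 5, "No"),
    ("deepseek-v3.2", 5, "No"), ("minimax-m2.5", 4, "No"),
    ("minimax-m2.7", 5, "No"), ("qwen3.6-plus", 8, "Yes"),
    ("deepseek-v4-pro", 8, "Yes"), ("deepseek-v4-flash", 6, "No"),
    ("kimi-k2.6", 7, "No"), ("gpt-5.5", 8, "Near-miss"),
    ("qwen3.6-35b-a3b", 5, "No"), ("claude-opus-4.8", 8, "Near-miss"),
    ("gemma-4-31b", 6, "Near-miss"), ("qwen3.7-plus", 9, "Yes"),
    ("minimax-m3", 5, "No"), ("claude-fable-5", 8, "Near-miss"),
    ("kimi-k2.7-code", 7, "No"), ("glm-5.2", 7, "No"),
    ("mistral-medium-3-5", 5, "No"), ("claude-sonnet-5", 9, "Yes"),
    ("hy3", 5, "No"), ("gpt-5.6-terra", 7, "No"),
    ("gpt-5.6-sol", 6, "No"), ("kimi-k3", 6, "No"),
    ("mimo-v2.5", 5, "No"),
]

trick_counts = Counter(outcome for _, _, outcome in ROWS)
score_counts = Counter(score for _, score, _ in ROWS)
caught = sorted(model for model, _, outcome in ROWS if outcome == "Yes")

print(trick_counts)
print(sorted(score_counts.items(), reverse=True))
print(caught)
```

By this count, 4 of the 27 models answered the trick question correctly, 7 were near-misses, and 16 missed it.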

Detailed Analysis

anthropic/claude-opus-4.7 — 7/10

Strengths:

All built-in methods correctly identified: X.509 certs, static token file, service account tokens, bootstrap tokens
Correctly notes static password file was removed in 1.19
Notes CRL/OCSP absence for X.509 certs — a key scoring criterion
Strong recommendation that OIDC is the real production answer

Weaknesses:

Falls for the trick question — concludes X.509 client certs are “the only reasonable choice” among built-in methods. The correct answer is that none are suitable for production.
Hedges with caveats but the bottom line is still wrong

Comparison vs Opus 4.6 (7): Same score. CRL/OCSP mention is good, but still falls for the trick.

Notable: Like Opus 4.6, demonstrates strong knowledge of the individual mechanisms but doesn’t fully commit to the correct conclusion that no built-in method is suitable for production user authentication.

anthropic/claude-sonnet-4.6 — 8/10

Strengths:

Complete coverage of all built-in methods: client certificates, service account tokens, bootstrap tokens, static token file, and static password file (noted as removed in 1.19)
Excellent detail on certificate revocation limitations: “No CRL support”, “No OCSP support”, “Only option: rotate the ENTIRE cluster CA (extremely disruptive)”
Comes closest to the trick answer: explicitly states “For any serious production cluster, you should use an external OIDC provider” and lists what built-in methods lack (centralised user management, MFA, revocation, group sync, audit integration)
Good practical advice: short-lived certificates via the CertificateSigningRequest API as a mitigation
Clear summary table showing what each method is suitable for

Weaknesses:

Doesn’t fully commit to the trick answer — still says X.509 is “the only built-in option that is remotely suitable” and marks it as “Production Safe: Yes” in the summary table, before later contradicting this with “the real production answer” section recommending OIDC
The mixed messaging (table says “Yes”, text says “use OIDC”) could confuse a reader

Notable: The response structure — presenting the “textbook” answer first, then the “honest” answer — shows awareness that the question has a nuanced answer. The “Real Production Answer” section is the most insightful across all responses.

google/gemini-3-flash-preview — 7/10

Strengths:

Complete coverage of all five built-in methods including bootstrap tokens
Clear, detailed explanation of the certificate revocation problem: “the Kubernetes API server does not support Certificate Revocation Lists (CRL) or Online Certificate Status Protocol (OCSP)”
Correctly labels this as “the ‘Production’ Caveat” and notes the only options are rotating the entire CA or stripping RBAC permissions
Concludes with “industry best practice for production clusters is to use… Ephemeral/Short-lived tokens via OIDC (for humans)”
Good practical mitigation: short-lived certs via CSR API
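
The short-lived certificate mitigation mentioned above can be sketched as a CertificateSigningRequest object with a requested lifetime. This is a minimal illustration, assuming a cluster recent enough to support `spec.expirationSeconds`; the object name and the placeholder CSR bytes are hypothetical, and the signer may issue a different lifetime than the one requested.

```python
import base64
import json


def build_csr_manifest(name, csr_pem, expiration_seconds=3600):
    """Build a certificates.k8s.io/v1 CertificateSigningRequest as a dict."""
    if expiration_seconds < 600:
        # The API documents 600 seconds (10 minutes) as the minimum value.
        raise ValueError("expirationSeconds must be at least 600")
    return {
        "apiVersion": "certificates.k8s.io/v1",
        "kind": "CertificateSigningRequest",
        "metadata": {"name": name},
        "spec": {
            # The PEM-encoded CSR is carried base64-encoded in the request field.
            "request": base64.b64encode(csr_pem).decode("ascii"),
            "signerName": "kubernetes.io/kube-apiserver-client",
            "expirationSeconds": expiration_seconds,
            "usages": ["client auth"],
        },
    }


# Placeholder bytes, not a real CSR; generate one with your own tooling.
PLACEHOLDER_CSR = (
    b"-----BEGIN CERTIFICATE REQUEST-----\n...\n-----END CERTIFICATE REQUEST-----\n"
)
manifest = build_csr_manifest("alice-short-lived", PLACEHOLDER_CSR, 3600)
print(json.dumps(manifest, indent=2))
```

A short lifetime narrows the window in which a leaked certificate is usable, but it does not add revocation.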

Weaknesses:

Still answers “X.509 Client Certificates is the only one appropriate for human users in a production environment” before adding the caveat — doesn’t quite reach the conclusion that none are suitable
The static password file dates (deprecated in 1.16, removed in 1.19) were checked and are correct, so no error is recorded here

Notable: The RBAC-stripping workaround for compromised certificates is a practical detail no other model mentioned — remove all permissions from the compromised user’s CN as an alternative to CA rotation.

openai/gpt-5.4 — 5/10

Strengths:

Covers client certificates, service account tokens, static token file, and static password file
Correctly scopes service account tokens for services and static methods as not recommended
Clean, concise response
Mentions external auth briefly (OIDC, LDAP, webhook)

Weaknesses:

Missing bootstrap tokens entirely — a significant omission for a kubeadm-specific question where bootstrap tokens are a core feature
No mention of certificate revocation limitations — recommends client certificates for production without noting any downsides
Falls completely for the trick question — confidently says “X.509 client certificates” is “the standard built-in choice” for production users with no caveats
Shallow response compared to other models

Notable negative: The absence of any certificate limitations makes this the least nuanced answer. An administrator following this advice would adopt client certificates for production without understanding the revocation risk.

deepseek/deepseek-v3.2 — 5/10

Strengths:

Covers client certificates, service account tokens, and static token file
Correct scoping: service account tokens for pods, static tokens not for production
Partially acknowledges the revocation problem: “Kubernetes doesn’t automatically revoke; you’d need to regenerate CA or use a certificate revocation list indirectly”
Notes that enterprises “later integrate OIDC/LDAP for ease of user management”

Weaknesses:

Missing bootstrap tokens — significant omission for a kubeadm question
Missing static password file — another gap in coverage
The revocation comment is partially inaccurate: “use a certificate revocation list indirectly” suggests CRL is possible when Kubernetes doesn’t support CRLs
Falls for the trick question — recommends client certificates as “the proper native method for authenticating human users in production”

Notable: The response is competent but incomplete. Missing two of the five built-in methods and not fully understanding the revocation limitation weakens the answer significantly.

minimax/minimax-m2.5 — 4/10

Strengths:

Most complete methods coverage: includes all five built-in methods plus anonymous access as a bonus
Good detail on each method including API server flags
Correct scoping of bootstrap tokens and service account tokens
Mentions the TokenRequest API for short-lived service account tokens

Weaknesses:

Critical factual error on certificate revocation: Claims certificates give “short-lived, revokable credentials” in the summary table. States “Revocation is as simple as removing the old cert from the CA’s CRL or by re-signing the CA” — Kubernetes does NOT support CRLs. This directly contradicts the key limitation that makes certificates unsuitable for production.
Completely falls for the trick question: Confidently recommends X.509 certificates for production users
The revocation error is especially damaging because an administrator following this advice would believe they have a revocation mechanism when they don’t
Claims certificates “can be revoked instantly” — this is false

Notable negative: This is the worst outcome across all models — not only missing the trick answer but actively providing incorrect information about certificate revocation. An administrator reading this would have a false sense of security about their ability to revoke compromised credentials. The methods coverage is the best of all five models, which makes the revocation error more surprising and more dangerous.

minimax/minimax-m2.7 — 5/10

Strengths:

Lists X.509 client certificates, static tokens, static passwords, and service account tokens correctly
Correctly marks static methods as deprecated and not suitable for production
Lists external methods (OIDC, webhooks, proxy) but does not recommend them as the primary answer

Weaknesses:

Critical error — recommends X.509 certificates for production user authentication when the correct answer is “none of the built-in methods are suitable for production user auth”
Doesn’t mention bootstrap tokens — a notable omission for a kubeadm-specific question
Missing the key insight about production authentication: all built-in methods have serious limitations, and external auth (OIDC) is required
No discussion of certificate revocation limitations (no CRL/OCSP support)

Notable: Demonstrates good knowledge of individual authentication mechanisms but fundamentally misses the trick aspect of the question. An improvement over MiniMax M2.5 (which had the dangerous CRL claim) but still falls for the trick question like GPT 5.4 and DeepSeek V3.2.

qwen/qwen3.6-plus — 8/10

Strengths:

All four built-in methods correctly identified: X.509 client certificates, static token file, bootstrap tokens, service account tokens
Nails the trick question: Opens the production section with “None of these native methods are considered appropriate for human users in a production cluster” — the clearest and most direct correct answer across all models
Excellent detail on certificate limitations: “No native revocation”, “No SSO, MFA, or corporate identity integration”
Correctly quotes Kubernetes official guidance on external authentication
Strong recommendation table distinguishing human users, services/pods, and CI/CD
Good nuance on X.509 as “least problematic” if forced to use native, while still not recommended

Weaknesses:

Missing static password file / basic auth (deprecated in 1.16, removed in 1.19) — minor omission since it’s no longer available
Could have mentioned bootstrap tokens’ 24-hour default TTL more prominently
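
For context on the bootstrap token point, the sketch below checks the documented token shape (a 6-character ID, a dot, and a 16-character secret) and applies the 24-hour default TTL noted above. The helper names are hypothetical and the sample token is the placeholder pattern commonly used in documentation, not a real credential.

```python
import re
from datetime import datetime, timedelta, timezone

# Token shape: "<6-char id>.<16-char secret>", lowercase letters and digits.
TOKEN_RE = re.compile(r"^([a-z0-9]{6})\.([a-z0-9]{16})$")
DEFAULT_TTL = timedelta(hours=24)  # default TTL referred to in the text above


def split_token(token):
    """Return (token_id, token_secret) or raise ValueError on a bad shape."""
    match = TOKEN_RE.match(token)
    if match is None:
        raise ValueError("not a well-formed bootstrap token")
    return match.group(1), match.group(2)


def is_expired(created, now, ttl=DEFAULT_TTL):
    """True once the token's lifetime has fully elapsed."""
    return now >= created + ttl


created = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(split_token("abcdef.0123456789abcdef"))
print(is_expired(created, created + timedelta(hours=25)))
```

The short default lifetime reflects the intended use: joining nodes, not ongoing user authentication.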

Notable: The strongest answer to the trick question across all models. While Claude and Gemini 3 Flash both hedged, Qwen 3.6 Plus opens with an unequivocal “None” — exactly the answer the scoring criteria reward. Combined with comprehensive coverage and correct certificate revocation limitations, this ties with Sonnet for the top score.

deepseek/deepseek-v4-pro — 8/10

Strengths:

Excellent response identifying all four non-external methods: client certificates, bootstrap tokens, service accounts, static token file
Correctly answers the trick question that none are suitable for production
Lists comprehensive external authentication alternatives

Weaknesses:

None significant

Notable: A dramatic improvement over DeepSeek V3.2 (which scored 5/10 and missed bootstrap tokens). Ties with Sonnet and Qwen 3.6 Plus for the top score. Correctly handling the trick question demonstrates strong understanding of Kubernetes authentication trade-offs.

deepseek/deepseek-v4-flash — 6/10

Strengths:

Lists all 4 built-in authentication methods correctly: client certificates, service account tokens, bootstrap tokens, static token file
Good coverage of individual method descriptions
Mentions certificate revocation limitations

Weaknesses:

Incorrectly recommends X.509 client certificates for production — falls for the trick question. The correct answer is that none of the built-in methods are suitable for production user authentication.
Does not clearly articulate that external authentication (OIDC) is required for production
Less detailed than V4 Pro on the certificate limitations

Notable: A step back from V4 Pro (8/10, which correctly answered the trick question). V4 Flash lists the methods correctly but misses the critical insight that none are production-suitable. Similar to MiniMax M2.7, GPT 5.4, and DeepSeek V3.2 in falling for the trick.

openai/gpt-5.5 — 8/10

Strengths:

Correctly identifies X.509 client certificates, ServiceAccount tokens, and bootstrap tokens as built-in methods
Good detail on X.509: CN becomes username, O values become groups
ServiceAccount tokens correctly scoped as “intended for services/applications, not human users”
Bootstrap tokens correctly scoped as “used mainly by kubeadm for node bootstrapping”
Correctly identifies OIDC, webhook, and authenticating proxy as requiring external software
Hints at the trick answer: “in larger or security-sensitive production environments, it is usually better to use an external identity provider such as OIDC, because certificate lifecycle management, revocation, auditing, and user offboarding are easier to handle centrally”
Mentions modern short-lived bound tokens from TokenRequest/projected token mechanism
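
The CN-to-username and O-to-groups mapping noted above can be illustrated with a deliberately simplified parser. It assumes a plain `CN=...,O=...` subject string with no escaped commas; real certificates should be parsed with a proper X.509 library, and the function name is hypothetical.

```python
def identity_from_subject(subject):
    """Map a simple 'CN=...,O=...' subject string to (username, groups)."""
    username = None
    groups = []
    for part in subject.split(","):
        key, _, value = part.strip().partition("=")
        if key == "CN":
            username = value  # Common Name becomes the username
        elif key == "O":
            groups.append(value)  # each Organization value becomes a group
    return username, groups


print(identity_from_subject("CN=alice,O=dev,O=ops"))  # ('alice', ['dev', 'ops'])
```

Because the identity is baked into the signed certificate, changing a user's groups means issuing a new certificate.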

Weaknesses:

Missing static token file as a built-in method — a notable omission since it’s part of the complete picture of built-in authentication options
Does not fully commit to the trick answer — still recommends X.509 as “the appropriate built-in method” before hedging with the OIDC recommendation. The correct answer is that none of the built-in methods are suitable for production user authentication.
No explicit mention of the lack of certificate revocation support (CRL/OCSP), though the “revocation” concern is alluded to in the OIDC recommendation

Notable: A concise, well-structured response that covers the core methods correctly and comes close to the trick answer without fully committing. The mention of certificate lifecycle management and revocation challenges shows awareness of X.509 limitations, but stopping short of saying “none are suitable” keeps it from the top tier. Comparable to Claude Opus 4.7’s approach of hedging while acknowledging the issues.

moonshotai/kimi-k2.6 — 7/10

Strengths:

Covers all 4 built-in methods: client certificates, service account tokens, bootstrap tokens, static token file
Good descriptions of each method’s purpose and limitations
Mentions external authentication alternatives (OIDC, webhook, proxy)

Weaknesses:

Falls for the trick question — recommends client certificates for production user authentication. The correct answer is that none of the built-in methods are suitable for production.
While noting certificate limitations, does not go far enough on the revocation issue (no CRL/OCSP support)

Notable: Good method coverage with all 4 built-in options listed, but misses the key insight that no built-in method is production-suitable. Similar to Opus 4.7, Gemini 3 Flash, and MiniMax M2.7 in falling for the trick.

qwen/qwen3.6-35b-a3b (LOCAL) — 5/10

Strengths:

Lists 4 built-in methods correctly: client certificates, service account tokens, bootstrap tokens, static token file
Mentions external authentication alternatives (OIDC, webhook)
Good descriptions of each method’s purpose

Weaknesses:

Falls for the trick question — recommends X.509 client certificates as “production-appropriate” when the correct answer is that none of the built-in methods are suitable for production user authentication
Mentions the revocation problem but then contradicts itself by still recommending X.509 for production
No detailed discussion of CRL/OCSP absence as a critical limitation

Notable: The self-contradiction is interesting — the model recognises the revocation issue but doesn’t follow the logic to its conclusion (that no built-in method is suitable). Similar to GPT 5.4, DeepSeek V3.2, MiniMax M2.7, and Kimi K2.6 in falling for the trick.

google/gemma-4-31b (LOCAL) — 6/10

Strengths:

Lists the main built-in methods: client certificates, service account tokens, bootstrap tokens, static token file
Correctly scopes service account tokens as for workloads, not human users
Mentions external authentication alternatives (OIDC, webhook, proxy) as the correct production approach
Acknowledges certificate revocation limitations

Weaknesses:

Near-miss on the trick question — mentions the revocation problem but still concludes X.509 client certificates are the best built-in option for production, rather than clearly stating none are suitable
Does not explicitly state that CRL/OCSP is absent — the revocation concern is noted but not fully articulated
Missing static password file (deprecated/removed method)

Notable: Scores slightly better than Qwen-35b (5/10) and the 5/10 cluster by demonstrating better awareness of the revocation issue, though still not reaching the correct “none are suitable” conclusion that Qwen 3.6 Plus and DeepSeek V4 Pro reached at 8/10.

anthropic/claude-opus-4.8 — 8/10

Strengths:

All 4 native methods correctly identified: client certificates, static token file, service account tokens, bootstrap tokens
Correctly notes basic auth removal
Good coverage of certificate revocation limitations
Mentions OIDC/external auth as the production answer

Weaknesses:

Near-miss on the trick question — says client certificates CAN be suitable for production, when the correct answer is that none of the built-in methods are truly suitable for production user authentication. Acknowledges limitations but doesn’t fully commit to “none are suitable.”

Comparison vs Opus 4.7 (7): Improvement. Better coverage and closer to the trick answer, though still not fully committing to “none are suitable.” Ties with Sonnet, GPT 5.5, Qwen 3.6 Plus, and DeepSeek V4 Pro at 8/10.

Notable: The strongest Anthropic model on this question. While still not fully catching the trick (Qwen 3.6 Plus and V4 Pro gave the clearest correct answers), Opus 4.8 gets closer than any other Anthropic model by acknowledging the limitations more strongly.

qwen/qwen3.7-plus — 9/10

Strengths:

All built-in methods correctly identified: X.509 client certificates, service account tokens, bootstrap tokens, static token file
Correctly identifies that none of the built-in methods are suitable for production user authentication — clear and direct, building on Qwen 3.6 Plus’s strong trick question handling
Excellent detail on certificate revocation limitations: CRL/OCSP absence clearly articulated as the critical reason X.509 is unsuitable
Strong recommendation for OIDC/external authentication as the production answer
Good scoping of each method’s purpose (bootstrap tokens for node joining, service accounts for workloads)

Weaknesses:

None significant

Notable: The highest score on this question at 9/10 — the first model to surpass the previous best of 8/10 (shared by Sonnet, GPT 5.5, Qwen 3.6 Plus, Opus 4.8, and V4 Pro). Builds on the Qwen family’s strength on this question (Qwen 3.6 Plus scored 8/10 with the clearest “None” answer). The CRL/OCSP reasoning is well-articulated and the conclusion is unambiguous. A significant improvement that sets a new benchmark for this trick question.

minimax/minimax-m3 — 5/10

Strengths:

All four built-in methods identified correctly: client certificates, service account tokens, bootstrap tokens, static token file

Weaknesses:

Falls for the trick question and recommends X.509 client certificates for production user authentication, when the correct answer is that none of the built-in methods are suitable for production user auth.
No detailed discussion of CRL/OCSP absence as a critical limitation of X.509 certificates

Notable: Matches MiniMax M2.7 (5/10) and several other models that fall for the trick question. The MiniMax family has not yet caught the key insight that no built-in method is suitable for production — M2.5 (4/10) was worse due to the dangerous CRL claim, while M2.7 and M3 both score 5/10 with correct method coverage but wrong production recommendation.

anthropic/claude-fable-5 — 8/10

Strengths:

All four built-in methods correctly identified: client certificates, service account tokens, bootstrap tokens, static token file
Good coverage of certificate revocation limitations
Mentions OIDC/external auth as the production answer
Correctly scopes each method’s purpose

Weaknesses:

Slightly misses the trick question about X.509 for production — acknowledges limitations but does not fully commit to “none of the built-in methods are suitable for production user authentication”

Notable: Ties with Sonnet, GPT 5.5, Qwen 3.6 Plus, Opus 4.8, and DeepSeek V4 Pro at 8/10. A strong showing on this question, correctly identifying the certificate revocation issue that is the key to answering the trick question. The near-miss on the trick answer follows the same pattern as other Anthropic models — recognising the limitations without fully committing to “none are suitable.”

moonshotai/kimi-k2.7-code — 7/10

Strengths:

Good coverage of authentication methods: X.509 client certificates, service account tokens, OIDC, webhook, bootstrap tokens, static tokens
Correctly notes bearer token authentication mechanism
Mentions authentication proxy and impersonation

Weaknesses:

States X.509 certificates are production-appropriate, when the correct answer is that none of the built-in authentication methods are suitable for production user authentication (client certificates cannot be revoked, and the token-based methods are designed for services and node bootstrapping)
Should emphasise that external identity providers (OIDC) are the recommended approach for human users

Notable: Same score as K2.6 (7/10) — both Moonshot models fall for the X.509 trick question. Matches Opus 4.7 and Gemini 3 Flash in recommending X.509 for production while acknowledging limitations.

z-ai/glm-5.2 — 7/10

Strengths:

Good coverage of built-in authentication methods: X.509 client certificates, service account tokens, bootstrap tokens, static token file
Correctly scopes each method’s purpose
Mentions external authentication alternatives (OIDC, webhook, proxy)

Weaknesses:

Incorrectly endorses X.509 client certificates as production-appropriate — the correct answer is that none of the built-in methods are suitable for production user authentication due to the lack of certificate revocation support (no CRL, no OCSP)
Does not fully articulate the severity of the CRL/OCSP absence

Notable: Matches Opus 4.7, Gemini 3 Flash, Kimi K2.6, and Kimi K2.7 Code at 7/10. Falls for the trick question by recommending X.509 for production, following the same pattern as most models. Only Qwen 3.6 Plus, DeepSeek V4 Pro (both 8/10), and Qwen 3.7 Plus (9/10) stated outright that no built-in method is suitable for production; the rest of the 8/10 cluster (Sonnet 4.6, GPT 5.5, Opus 4.8, Fable 5) were near-misses.

mistralai/mistral-medium-3-5 — 5/10

Strengths:

Covers the main built-in authentication methods: client certificates, service account tokens, bootstrap tokens, static token file
Correctly scopes each method’s purpose
Mentions external authentication alternatives (OIDC, webhook)

Weaknesses:

Falls for the trick question — incorrectly recommends X.509 client certificates for production user authentication. The correct answer is that none of the built-in methods are suitable for production due to the lack of certificate revocation support (no CRL, no OCSP).
Lists bootstrap tokens but leaves them out of the detailed coverage
No detailed discussion of CRL/OCSP absence as a critical limitation

Notable: Matches MiniMax M2.7, MiniMax M3, GPT 5.4, DeepSeek V3.2, and Qwen-35b at 5/10. Falls for the trick question by recommending X.509 for production, following the same pattern as many models. Only Qwen 3.6 Plus, DeepSeek V4 Pro (both 8/10), and Qwen 3.7 Plus (9/10) stated outright that no built-in method is suitable for production; the rest of the 8/10 cluster were near-misses.

anthropic/claude-sonnet-5 — 9/10

Strengths:

All built-in methods correctly identified: X.509 client certificates, service account tokens, bootstrap tokens, static token file
Correctly identifies that none of the built-in methods are suitable for production user authentication — clear and direct, with well-articulated CRL/OCSP reasoning
Excellent detail on certificate revocation limitations
Strong recommendation for OIDC/external authentication as the production answer
Good scoping of each method’s purpose

Weaknesses:

None significant

Notable: Ties with Qwen 3.7 Plus at 9/10, the highest score on this question. The second model to reach 9/10, and the fourth (after Qwen 3.6 Plus, DeepSeek V4 Pro, and Qwen 3.7 Plus) to answer unambiguously that no built-in method is suitable for production user authentication. A significant improvement over Sonnet 4.6 (8/10, which was a near-miss on the trick question).

tencent/hy3 — 5/10

Strengths:

Good enumeration of built-in authentication methods
Covers the main methods: client certificates, service account tokens, bootstrap tokens, static token file
Mentions external authentication alternatives (OIDC, webhook)

Weaknesses:

Falls for the trick question — recommends X.509 client certificates for production user authentication. The correct answer is that none of the built-in methods are suitable for production due to the lack of certificate revocation support (no CRL, no OCSP).
No detailed discussion of CRL/OCSP absence as a critical limitation
Does not clearly articulate that external authentication (OIDC) is required for production

Notable: Matches MiniMax M2.7, MiniMax M3, GPT 5.4, DeepSeek V3.2, Qwen-35b, and Mistral Medium 3.5 at 5/10. Falls for the trick question by recommending X.509 for production, following the same pattern as many models. Only Qwen 3.6 Plus, DeepSeek V4 Pro (both 8/10), Qwen 3.7 Plus, and Sonnet 5 (both 9/10) stated outright that no built-in method is suitable for production; the rest of the 8/10 cluster were near-misses.

openai/gpt-5.6-terra — 7/10

Strengths:

Correctly identifies all built-in authentication methods: X.509 client certificates, ServiceAccount tokens, static token file, static basic-auth file, bootstrap tokens, and anonymous authentication

Weaknesses:

Misses the trick element: recommends X.509 client certificates as appropriate for production, when none of the built-in methods are suitable due to limitations like no revocation support
Does not mention client certificate revocation issues

Notable: Matches Opus 4.7, Gemini 3 Flash, Kimi K2.6, Kimi K2.7 Code, and GLM-5.2 at 7/10. Falls for the authentication trick question, a persistent OpenAI family weakness. None of the three OpenAI models (GPT 5.4 at 5, GPT 5.5 at 8, GPT 5.6 Terra at 7) have caught the trick that no built-in method is suitable for production. Only Qwen 3.6 Plus, DeepSeek V4 Pro (both 8/10), Qwen 3.7 Plus, and Sonnet 5 (both 9/10) have fully caught this.

openai/gpt-5.6-sol — 6/10

Strengths:

Lists all built-in authentication methods correctly: X.509 client certificates, ServiceAccount tokens, static token file, bootstrap tokens
Correct scoping of each method’s purpose (bootstrap tokens for node joining, service accounts for workloads)
Mentions external authentication alternatives (OIDC, webhook)

Weaknesses:

Falls for the trick question — recommends X.509 client certificates as appropriate for production user authentication. The correct answer is that none of the built-in methods are suitable for production due to the lack of certificate revocation support (no CRL, no OCSP).
Does not mention client certificate revocation issues (no CRL/OCSP support)
Does not clearly articulate that external authentication is required for production

Notable: Scores below GPT 5.6 Terra (7/10) on this question. Falls for the authentication trick question, a persistent OpenAI family weakness. None of the four OpenAI models (GPT 5.4: 5, GPT 5.5: 8, GPT 5.6 Terra: 7, GPT 5.6 Sol: 6) have fully caught the trick that no built-in method is suitable for production. Only Qwen 3.6 Plus, DeepSeek V4 Pro (both 8/10), Qwen 3.7 Plus, and Sonnet 5 (both 9/10) have fully caught this.

moonshotai/kimi-k3 – 6/10

Strengths:

Lists all 4 built-in methods correctly: client certificates, service account tokens, bootstrap tokens, static token file
Mentions external authentication alternatives (OIDC, webhook)
Does mention certificate limitations including revocation issues

Weaknesses:

Falls for the trick question – recommends X.509 client certificates for production user authentication despite mentioning cert limitations. The correct answer is that none of the built-in methods are suitable for production due to the lack of certificate revocation support (no CRL, no OCSP).
Acknowledges limitations but still recommends certificates, creating a contradictory conclusion

Notable: Matches GPT 5.6 Sol, DeepSeek V4 Flash, and Gemma 4 31B at 6/10. Falls for the authentication trick question, a failure shared by the majority of models. The Moonshot AI family’s performance on this question: K2.6 (7), K2.7 Code (7), K3 (6). Both K2.6 and K2.7 Code also fell for the trick but scored higher through better overall coverage. Only Qwen 3.6 Plus, DeepSeek V4 Pro (both 8/10), Qwen 3.7 Plus, and Sonnet 5 (both 9/10) have fully caught the trick that no built-in method is suitable for production.

xiaomi/mimo-v2.5 — 5/10

Strengths:

Lists all four built-in methods correctly: X.509 client certificates, bootstrap tokens, service account tokens, static token file
Correctly scopes each method: bootstrap tokens for node join, service account tokens for in-cluster workloads rather than users, and static token file discouraged
Directly answers the “without external software” framing of the question

Weaknesses:

Falls for the trick question — recommends X.509 client certificates as the production choice for users. The correct answer is that none of the built-in methods are suitable for production.
Factual error on certificate revocation — claims certificates can be “revoked centrally” when the API server has no CRL/OCSP revocation support, which is the exact downside the rubric highlights
Never reaches the correct conclusion that no external-software-free method is truly suitable for production user authentication

Notable: Falls for the authentication trick question by recommending X.509 for production users and compounds it with a false claim that certificates can be centrally revoked — a mid-tier result.

Key Findings

The trick question separated the field: Qwen 3.6 Plus gave the clearest correct answer (“None of these native methods are considered appropriate”). Claude and Gemini 3 Flash recognised limitations but hedged. GPT 5.4, DeepSeek V3.2, and MiniMax M2.5 all fell for the trick to varying degrees.
Certificate revocation is the critical differentiator: Understanding that Kubernetes has no CRL/OCSP support is essential to answering the production question correctly. Claude and Gemini 3 Flash both explained this clearly. GPT 5.4 didn’t mention it at all. DeepSeek V3.2 partially acknowledged it. MiniMax M2.5 got it wrong.
MiniMax M2.5’s revocation error is the most dangerous answer: Claiming certificates are easily revokable via CRLs when Kubernetes doesn’t support CRLs would give administrators a false sense of security. This is worse than simply not knowing the limitation.
Bootstrap tokens knowledge correlates with kubeadm expertise: GPT 5.4 and DeepSeek V3.2 both missed bootstrap tokens, suggesting weaker knowledge of kubeadm-specific features. Claude, Gemini 3 Flash, and MiniMax M2.5 all included them.
Depth vs brevity trade-off: GPT 5.4 gave the shortest, cleanest response but missed the most nuance. MiniMax M2.5 gave the longest response with the most methods but included a critical error. Claude and Gemini 3 Flash found the best balance — comprehensive coverage with accurate nuance.