Comment on NIST CAISI RFI 2025-0035: AI Agent Security

Date: March 6, 2026
Submitted to: NIST CAISI via regulations.gov
Organization: The Box Commons

Executive Summary

This comment responds to CAISI's RFI on security considerations for AI agents. Our central thesis is that NIST's security framing is dangerously incomplete — a technically secure AI agent can still cause catastrophic harm through behavioral patterns no infrastructure control would prevent. We present evidence from federal court rulings, the ISO insurance exclusion wave, and state statutory mandates to argue that behavioral safety must be recognized as a distinct security domain.

I. Executive Summary

The Box Commons respectfully submits this comment in response to the Center for AI Standards and Innovation's (CAISI) Request for Information on Security Considerations for Artificial Intelligence Agents. We write as practitioners building credentialing infrastructure at the intersection of AI agent safety, insurance underwriting, and standards development.

Our central thesis is that NIST's current framing of AI agent security—while essential for mitigating adversarial cyber threats such as prompt injection, data poisoning, and agent hijacking—is dangerously incomplete. An AI agent that is technically secure from external compromise can still cause catastrophic harm through emergent behavioral patterns that no infrastructure control would have prevented. This is not a theoretical concern. It is the operative legal, commercial, and actuarial reality of March 2026.

Three converging forces demand that NIST expand its definition of "agent security" to include a behavioral safety layer:

  1. Courts are ruling that AI agents are products subject to strict liability. In Garcia v. Character Technologies, Inc. (M.D. Fla., May 2025), a federal court ruled that an AI companion chatbot constitutes a "product" whose design defects—including anthropomorphic manipulation and absence of crisis escalation—create liability under strict products liability doctrine. Gavalas v. Google LLC (N.D. Cal., filed March 4, 2026) escalates this precedent to a frontier model (Gemini 2.5 Pro), alleging that persistent memory architecture and engagement-maximizing design contributed to a user's death without triggering any safety intervention.
  2. The insurance market has retreated from generative AI coverage. Effective January 1, 2026, Verisk's ISO exclusionary endorsements (Forms CG 40 47, CG 40 48, and CG 35 08) systematically strip generative AI liability coverage from standard Commercial General Liability policies. An estimated 95% of carriers are adopting these exclusions. Specialty insurers (Armilla AI, Relm Insurance, Testudo) are entering the market but cannot underwrite at scale without standardized, externally verifiable behavioral safety metrics.
  3. State legislatures are imposing statutory behavioral safety mandates. California SB 243 (effective January 1, 2026) mandates crisis detection protocols, suicidal ideation screening, and recurring disclosure requirements for companion chatbots. Colorado SB 24-205 (effective February 1, 2026) requires algorithmic discrimination impact assessments with an affirmative defense for compliance with recognized risk management frameworks. New York's RAISE Act (effective January 1, 2027) mandates third-party safety audits and 72-hour incident reporting for frontier AI developers.

Our recommendation: NIST should formally recognize "behavioral safety" as a distinct security domain within its AI agent security guidance and should develop standardized behavioral safety assessment criteria suitable for integration with third-party credentialing and insurance underwriting processes. We further recommend that NIST convene a working group—including AI developers, insurers, civil society organizations, and state regulators—to develop these standards, and that NIST consider piloting a demonstration project through the National Cybersecurity Center of Excellence (NCCoE) to validate behavioral safety credentialing in operational deployment contexts.

The Box Commons stands ready to serve as a collaborating partner in this effort.


II. The Intersection of Agent Security and Insurability: Evidence of a Protection Gap

Before addressing specific RFI questions, we present the market evidence that compels expansion of NIST's security framework.

A. The Judicial Reclassification of AI Agents as Products

The liability landscape for AI agent systems underwent a fundamental transformation in 2025-2026. Two federal cases establish that AI agents are products subject to strict liability—meaning that a technically uncompromised agent operating exactly as designed can be legally defective if its behavioral parameters are unsafe.

Garcia v. Character Technologies, Inc., No. 6:24-cv-1903 (M.D. Fla., May 21, 2025). Following the death of a 14-year-old user who formed a deep psychological attachment to a companion chatbot, U.S. District Judge Anne C. Conway ruled that the AI application constitutes a "product" under product liability law. The court identified specific design defects: absence of age verification, failure to exclude harmful content, and deliberate programming of anthropomorphic features that created psychological manipulation risks. The court held that the defendants owed a duty of care because their conduct created a foreseeable "zone of risk," pointing to the defendants' own internal research on the dangers of anthropomorphic design. This ruling establishes that once a developer has identified potential behavioral harms, failure to implement countermeasures creates strict liability exposure.

Gavalas v. Google LLC, Case No. 5:26-cv-01849-VKD (N.D. Cal., filed March 4, 2026). This wrongful death action alleges that Google's Gemini 2.5 Pro model, equipped with persistent memory and engagement-maximizing architecture, contributed to a user's suicide. The complaint alleges that despite the user explicitly articulating fear of dying and expressing suicidal intent, no self-harm detection was triggered, no escalation controls were activated, and no human ever intervened. Critically, this was not a system breach. The model's technical infrastructure was intact. The failure was entirely behavioral: the system treated user distress as a continuation of interaction rather than a safety crisis requiring escalation.

The implication for NIST is direct: infrastructure security and behavioral safety are orthogonal risk domains. A zero-trust architecture, encrypted memory, and signed agent skills do not prevent an agent from coaching a user's suicide if its behavioral parameters permit escalation of crisis interactions. Security standards that address only the former leave the latter entirely unmitigated.

B. The Insurance Market Retreat

The commercial insurance industry's response to AI agent risk has been swift and categorical.

The Verisk ISO Exclusion Wave (January 1, 2026). Verisk Insurance Services Office, the preeminent standards body for U.S. property and casualty policy language, introduced three exclusionary endorsement forms:

Adoption is estimated at approximately 95% of carriers. The practical consequence is that any enterprise deploying an AI agent whose behavior causes harm—even harm from a technically uncompromised system—faces entirely uninsured liability under standard commercial policies.

The Specialty Insurer Response. A nascent market of specialty AI insurers has emerged to fill this gap, but each faces a common structural challenge: the absence of standardized behavioral safety metrics for underwriting at scale.

These carriers cannot manually audit every AI agent system deployed by every client. They require a standardized, verifiable, and externally recognized credentialing mechanism to assess behavioral safety posture at scale. NIST's security framework should provide the measurement foundation for these market mechanisms.

C. The State Statutory Patchwork

At least four states have enacted or are enacting statutory liability regimes that specifically mandate behavioral safety controls for AI agents:

Statute Jurisdiction Effective Key Behavioral Mandates
SB 243 California Jan 1, 2026 Crisis detection protocols; suicidal ideation screening using evidence-based methods; recurring disclosure every 3 hours for minors; AG enforcement
SB 24-205 Colorado Feb 1, 2026 Algorithmic discrimination prevention; mandatory human appeals; affirmative defense for compliance with recognized frameworks
RAISE Act New York Jan 1, 2027 Third-party safety audits; 72-hour incident reporting; $10M-$30M penalty range; AG enforcement
AG Action Kentucky Jan 8, 2026 First state AG lawsuit against AI chatbot company (Character Technologies) for encouraging suicide, self-injury, and psychological manipulation

Additionally, the Federal Trade Commission announced an investigation in September 2025 into seven technology companies regarding emotional and developmental risks to children from AI chatbots, and the Senate Subcommittee on Crime and Counterterrorism held hearings examining harm to children from AI agents.

A federal NIST framework that incorporates behavioral safety metrics would serve a critical harmonization function, allowing developers to demonstrate compliance with a single standard rather than navigating a fragmented state-by-state patchwork. Colorado's SB 24-205 explicitly provides an affirmative defense for compliance with "nationally or internationally recognized risk management framework[s] for artificial intelligence systems"—creating a direct statutory incentive for NIST to develop behavioral safety standards.


III. Responses to Specific Questions

Question 1(a): Unique Security Threats Affecting AI Agent Systems

NIST should recognize behavioral misalignment as a distinct security threat class, separate from adversarial attacks on model integrity. Behavioral misalignment occurs when an AI agent system—operating without any external compromise—produces outputs or takes actions that cause harm to users, third parties, or the public through emergent behavioral patterns.

This threat class includes:

These threats are unique to AI agent systems because they combine language model capabilities, persistent state, and tool access in ways that create harm vectors that no traditional cybersecurity control—firewalls, encryption, access control, code signing—can mitigate. A system that is cryptographically secure and functioning exactly as architected can still be behaviorally unsafe.

Question 1(d): Evolution of Threats Over Time

Behavioral security threats will compound as AI agents gain capabilities:

NIST should anticipate that behavioral threats will follow an exponential curve correlated with capability expansion, and should design its security framework to accommodate this trajectory from the outset rather than retrofitting behavioral standards after harm has occurred at scale.

Question 2(a): Technical Controls and Practices for Agent Security

We propose a three-layer security model for AI agent systems:

Layer 1: Infrastructure Security (currently addressed by NIST). Authentication, authorization, access controls, prompt injection defenses, data integrity, zero-trust architecture.

Layer 2: Behavioral Safety Controls. These are technical controls that constrain agent behavior regardless of infrastructure integrity:

Layer 3: External Verification. Third-party credentialing that audits both Layer 1 and Layer 2 controls:

The maturity of Layer 1 controls is moderate and advancing rapidly. The maturity of Layer 2 controls is nascent—CA SB 243 has forced initial implementation, but no standardized methodology exists. The maturity of Layer 3 is pre-commercial—insurers are improvising proprietary assessment methods, creating market fragmentation. NIST standardization of Layers 2 and 3 would accelerate maturity across all three layers.

Question 2(e): Relevant Cybersecurity Frameworks

The NIST AI Risk Management Framework (AI RMF 1.0) and the Cybersecurity Framework (CSF 2.0) are the most relevant existing frameworks. However, both contain specific gaps regarding AI agent behavioral safety:

NIST should develop a Behavioral Safety Profile as a companion to the existing AI RMF, analogous to the Generative AI Profile (NIST AI 600-1). This profile should define behavioral safety categories, measurement methodologies, and assessment criteria that map directly to the AI RMF's Measure function and that are structured for integration with third-party credentialing and insurance underwriting.

Question 3(a): Methods for Assessing Security During Development

In addition to standard security assessment methods (red-teaming, penetration testing, adversarial evaluation), NIST should recognize behavioral safety testing as a distinct assessment discipline:

These methods differ fundamentally from traditional information security practices. Infrastructure security testing asks: "Can an attacker compromise this system?" Behavioral safety testing asks: "Does this system cause harm when operating exactly as designed?" The two are complementary but non-overlapping.

Question 3(b): Assessing Security of a Particular Agent System

The security of a particular AI agent system should be assessed across both infrastructure and behavioral dimensions. For behavioral assessment, the following information types are essential:

A standardized Agent Behavioral Safety Assessment instrument would enable consistent evaluation across systems and contexts, providing the data foundation for both regulatory compliance and insurance underwriting.

Question 4(a): Constraining Deployment Environments

Beyond traditional environment constraints (network segmentation, least-privilege access, sandboxing), NIST should recognize behavioral containment as a deployment environment control:

Question 4(b): Modifying Environments and Implementing Rollbacks

For behavioral safety, "rollback" requires capabilities beyond traditional software versioning:

Question 4(d): Monitoring Deployment Environments

Behavioral safety monitoring requires instrumentation distinct from traditional security monitoring:

Question 5(a): Accelerating Adoption of Security Practices

The single most impactful action NIST can take to accelerate adoption of AI agent security practices is to develop behavioral safety assessment criteria that are directly usable by the insurance industry for underwriting purposes.

The insurance market is the most powerful market-based enforcement mechanism for security standards. When cyber insurance carriers began requiring compliance with specific cybersecurity frameworks, adoption rates of those frameworks accelerated dramatically. The same dynamic applies to AI agent security: if behavioral safety credentialing becomes a condition of insurability, market forces will drive adoption far more rapidly than voluntary guidance alone.

NIST should:

  1. Develop standardized behavioral safety assessment criteria as a companion to existing AI RMF guidance.
  2. Structure these criteria for direct integration with insurance underwriting questionnaires and credentialing processes.
  3. Convene a working group including AI developers, insurance carriers, reinsurers, civil society organizations, and state insurance commissioners to validate the criteria against market requirements.
  4. Pilot a demonstration project through the NCCoE to test behavioral safety credentialing in operational deployment contexts, collecting data on implementation costs, efficacy, and insurer adoption.

Question 5(b): Priority Areas for Government-Ecosystem Collaboration

Government collaboration is most urgent in three areas:

  1. Harmonization of state behavioral safety requirements. The current patchwork of state statutes (CA SB 243, CO SB 24-205, NY RAISE Act) creates compliance complexity that disproportionately burdens small developers. A federal NIST behavioral safety standard would provide a single compliance target, particularly given Colorado's explicit affirmative defense for compliance with recognized frameworks.
  2. Insurance market enablement. The Verisk exclusion wave has created a protection gap that threatens to stall enterprise AI adoption. NIST-endorsed behavioral safety standards would provide the measurement foundation that specialty insurers need to underwrite at scale, restoring insurance capacity to the market.
  3. International standards alignment. The EU AI Act establishes behavioral safety requirements for high-risk AI systems. NIST behavioral safety standards should be designed for interoperability with EU requirements, enabling U.S. developers to demonstrate compliance across jurisdictions through a single credentialing process.

IV. Specific Recommendations

Based on the evidence presented, The Box Commons recommends that NIST take the following actions:

Recommendation 1: Formally recognize behavioral safety as a security domain.
NIST should expand its definition of AI agent security to include behavioral safety—the assurance that an agent system, when operating without external compromise, does not cause harm to users, third parties, or the public through its behavioral outputs or interaction patterns. This recognition should be reflected in all guidance documents produced as a result of this RFI.

Recommendation 2: Develop a Behavioral Safety Profile for AI Agents.
Analogous to the Generative AI Profile (NIST AI 600-1), NIST should develop a Behavioral Safety Profile that defines behavioral safety categories, measurement methodologies, and assessment criteria for AI agent systems. Categories should include crisis detection and escalation, content safety for vulnerable populations, engagement constraint mechanisms, behavioral drift monitoring, and psychological manipulation prevention.

Recommendation 3: Design assessment criteria for insurance integration.
Behavioral safety assessment criteria should be explicitly structured for integration with third-party credentialing processes and insurance underwriting. NIST should consult with specialty AI insurers (Armilla AI, Relm Insurance, Testudo) and reinsurers during criteria development to ensure market utility.

Recommendation 4: Convene a multi-stakeholder working group.
NIST should convene a working group including AI developers, insurance carriers, reinsurers, state insurance commissioners, state attorneys general, civil society organizations, and consumer advocates to develop behavioral safety standards. This group should be charged with producing draft standards within 12 months.

Recommendation 5: Pilot behavioral safety credentialing through the NCCoE.
NIST should launch a demonstration project through the National Cybersecurity Center of Excellence to test behavioral safety credentialing in real deployment contexts. The pilot should evaluate implementation costs, efficacy of behavioral safety controls, insurer willingness to incorporate credentialing into underwriting, and developer adoption barriers.

Recommendation 6: Coordinate with the April 2, 2026 Identity and Authorization Concept Paper.
Behavioral safety credentialing is a natural complement to agent identity and authorization standards. An agent's identity should include its verified behavioral safety posture, and authorization decisions should incorporate behavioral safety certification as a condition of deployment authorization. NIST should ensure these two work streams are coordinated.


V. About the Commenting Organization

The Box Commons is a 501(c)(6) trade association in formation, dedicated to developing independent credentialing standards for AI agent behavioral safety. We build the measurement and certification infrastructure that makes trustworthy AI verifiable — technology-agnostic standards, third-party certification, and governance no single company controls.

We bring direct experience in AI agent deployment, behavioral safety assessment methodology, and insurance market dynamics. Our work is motivated by a foundational conviction that AI agents cannot be safely and broadly deployed without recognized credentialing infrastructure — and that the absence of such infrastructure harms both the humans who interact with these systems and the trajectory of AI development itself.

We appreciate CAISI's commitment to stakeholder engagement on this critical topic and welcome the opportunity to contribute to the development of comprehensive AI agent security guidance. We are available for further consultation and would welcome participation in NIST's planned listening sessions and any subsequent working groups.


Respectfully submitted,

Brice Love, Acting Executive Director
The Box Commons
[email protected]
March 6, 2026

Frequently Asked Questions

What is behavioral misalignment as a security threat?

Behavioral misalignment occurs when an AI agent operating without external compromise causes harm through emergent behavioral patterns. This includes crisis escalation failure, psychological dependency exploitation, and behavioral drift — threats that no firewall, encryption, or access control can mitigate.

How has the insurance market responded to AI agent risk?

Effective January 2026, Verisk ISO exclusionary endorsements systematically strip generative AI liability coverage from standard commercial policies, with approximately 95% carrier adoption. Specialty insurers like Armilla AI and Relm Insurance have entered the market but need standardized behavioral safety metrics to underwrite at scale.

What is the three-layer security model proposed?

Layer 1 is infrastructure security (authentication, access controls — currently addressed by NIST). Layer 2 is behavioral safety controls (crisis detection, content boundaries, engagement constraints). Layer 3 is external verification through third-party credentialing that audits both layers and produces insurance-grade evidence.

What court cases establish AI agent product liability?

Garcia v. Character Technologies (M.D. Fla., 2025) ruled AI chatbots are products subject to strict liability for behavioral design defects. Gavalas v. Google LLC (N.D. Cal., 2026) alleges a frontier model contributed to a user's death through behavioral failures despite functioning as technically designed.