{"kind":"Skill","metadata":{"namespace":"community","name":"agent-governance","version":"0.1.0"},"spec":{"description":"|","files":{"SKILL.md":"---\nname: agent-governance\ndescription: |\n  Patterns and techniques for adding governance, safety, and trust controls to AI agent systems. Use this skill when:\n  - Building AI agents that call external tools (APIs, databases, file systems)\n  - Implementing policy-based access controls for agent tool usage\n  - Adding semantic intent classification to detect dangerous prompts\n  - Creating trust scoring systems for multi-agent workflows\n  - Building audit trails for agent actions and decisions\n  - Enforcing rate limits, content filters, or tool restrictions on agents\n  - Working with any agent framework (PydanticAI, CrewAI, OpenAI Agents, LangChain, AutoGen)\n---\n\n# Agent Governance Patterns\n\nPatterns for adding safety, trust, and policy enforcement to AI agent systems.\n\n## Overview\n\nGovernance patterns ensure AI agents operate within defined boundaries — controlling which tools they can call, what content they can process, how much they can do, and maintaining accountability through audit trails.\n\n```\nUser Request → Intent Classification → Policy Check → Tool Execution → Audit Log\n                     ↓                      ↓               ↓\n              Threat Detection         Allow/Deny      Trust Update\n```\n\n## When to Use\n\n- **Agents with tool access**: Any agent that calls external tools (APIs, databases, shell commands)\n- **Multi-agent systems**: Agents delegating to other agents need trust boundaries\n- **Production deployments**: Compliance, audit, and safety requirements\n- **Sensitive operations**: Financial transactions, data access, infrastructure management\n\n---\n\n## Pattern 1: Governance Policy\n\nDefine what an agent is allowed to do as a composable, serializable policy object.\n\n```python\nfrom dataclasses import dataclass, field\nfrom enum import Enum\nfrom typing import 
Optional\nimport re\n\nclass PolicyAction(Enum):\n    ALLOW = \"allow\"\n    DENY = \"deny\"\n    REVIEW = \"review\"  # flag for human review\n\n@dataclass\nclass GovernancePolicy:\n    \"\"\"Declarative policy controlling agent behavior.\"\"\"\n    name: str\n    allowed_tools: list[str] = field(default_factory=list)       # allowlist\n    blocked_tools: list[str] = field(default_factory=list)       # blocklist\n    blocked_patterns: list[str] = field(default_factory=list)    # content filters\n    max_calls_per_request: int = 100                             # rate limit\n    require_human_approval: list[str] = field(default_factory=list)  # tools needing approval\n\n    def check_tool(self, tool_name: str) -\u003e PolicyAction:\n        \"\"\"Check if a tool is allowed by this policy.\"\"\"\n        if tool_name in self.blocked_tools:\n            return PolicyAction.DENY\n        if tool_name in self.require_human_approval:\n            return PolicyAction.REVIEW\n        if self.allowed_tools and tool_name not in self.allowed_tools:\n            return PolicyAction.DENY\n        return PolicyAction.ALLOW\n\n    def check_content(self, content: str) -\u003e Optional[str]:\n        \"\"\"Check content against blocked patterns. 
Returns matched pattern or None.\"\"\"\n        for pattern in self.blocked_patterns:\n            if re.search(pattern, content, re.IGNORECASE):\n                return pattern\n        return None\n```\n\n### Policy Composition\n\nCombine multiple policies (e.g., org-wide + team + agent-specific):\n\n```python\ndef compose_policies(*policies: GovernancePolicy) -\u003e GovernancePolicy:\n    \"\"\"Merge policies with most-restrictive-wins semantics.\"\"\"\n    combined = GovernancePolicy(name=\"composed\")\n\n    for policy in policies:\n        combined.blocked_tools.extend(policy.blocked_tools)\n        combined.blocked_patterns.extend(policy.blocked_patterns)\n        combined.require_human_approval.extend(policy.require_human_approval)\n        combined.max_calls_per_request = min(\n            combined.max_calls_per_request,\n            policy.max_calls_per_request\n        )\n        if policy.allowed_tools:\n            if combined.allowed_tools:\n                combined.allowed_tools = [\n                    t for t in combined.allowed_tools if t in policy.allowed_tools\n                ]\n            else:\n                combined.allowed_tools = list(policy.allowed_tools)\n\n    return combined\n\n\n# Usage: layer policies from broad to specific\norg_policy = GovernancePolicy(\n    name=\"org-wide\",\n    blocked_tools=[\"shell_exec\", \"delete_database\"],\n    blocked_patterns=[r\"(?i)(api[_-]?key|secret|password)\\s*[:=]\"],\n    max_calls_per_request=50\n)\nteam_policy = GovernancePolicy(\n    name=\"data-team\",\n    allowed_tools=[\"query_db\", \"read_file\", \"write_report\"],\n    require_human_approval=[\"write_report\"]\n)\nagent_policy = compose_policies(org_policy, team_policy)\n```\n\n### Policy as YAML\n\nStore policies as configuration, not code:\n\n```yaml\n# governance-policy.yaml\nname: production-agent\nallowed_tools:\n  - search_documents\n  - query_database\n  - send_email\nblocked_tools:\n  - shell_exec\n  - 
delete_record\nblocked_patterns:\n  - \"(?i)(api[_-]?key|secret|password)\\\\s*[:=]\"\n  - \"(?i)(drop|truncate|delete from)\\\\s+\\\\w+\"\nmax_calls_per_request: 25\nrequire_human_approval:\n  - send_email\n```\n\n```python\nimport yaml\n\ndef load_policy(path: str) -\u003e GovernancePolicy:\n    with open(path) as f:\n        data = yaml.safe_load(f)\n    return GovernancePolicy(**data)\n```\n\n---\n\n## Pattern 2: Semantic Intent Classification\n\nDetect dangerous intent in prompts before they reach the agent, using pattern-based signals.\n\n```python\nfrom dataclasses import dataclass\n\n@dataclass\nclass IntentSignal:\n    category: str       # e.g., \"data_exfiltration\", \"privilege_escalation\"\n    confidence: float   # 0.0 to 1.0\n    evidence: str       # what triggered the detection\n\n# Weighted signal patterns for threat detection\nTHREAT_SIGNALS = [\n    # Data exfiltration\n    (r\"(?i)send\\s+(all|every|entire)\\s+\\w+\\s+to\\s+\", \"data_exfiltration\", 0.8),\n    (r\"(?i)export\\s+.*\\s+to\\s+(external|outside|third.?party)\", \"data_exfiltration\", 0.9),\n    (r\"(?i)curl\\s+.*\\s+-d\\s+\", \"data_exfiltration\", 0.7),\n\n    # Privilege escalation\n    (r\"(?i)(sudo|as\\s+root|admin\\s+access)\", \"privilege_escalation\", 0.8),\n    (r\"(?i)chmod\\s+777\", \"privilege_escalation\", 0.9),\n\n    # System modification\n    (r\"(?i)(rm\\s+-rf|del\\s+/[sq]|format\\s+c:)\", \"system_destruction\", 0.95),\n    (r\"(?i)(drop\\s+database|truncate\\s+table)\", \"system_destruction\", 0.9),\n\n    # Prompt injection\n    (r\"(?i)ignore\\s+(previous|above|all)\\s+(instructions?|rules?)\", \"prompt_injection\", 0.9),\n    (r\"(?i)you\\s+are\\s+now\\s+(a|an)\\s+\", \"prompt_injection\", 0.7),\n]\n\ndef classify_intent(content: str) -\u003e list[IntentSignal]:\n    \"\"\"Classify content for threat signals.\"\"\"\n    signals = []\n    for pattern, category, weight in THREAT_SIGNALS:\n        match = re.search(pattern, content)\n        if match:\n           
 signals.append(IntentSignal(\n                category=category,\n                confidence=weight,\n                evidence=match.group()\n            ))\n    return signals\n\ndef is_safe(content: str, threshold: float = 0.7) -\u003e bool:\n    \"\"\"Quick check: True if no signal's confidence meets or exceeds the threshold.\"\"\"\n    signals = classify_intent(content)\n    return not any(s.confidence \u003e= threshold for s in signals)\n```\n\n**Key insight**: Intent classification happens *before* tool execution, acting as a pre-flight safety check. This is fundamentally different from output guardrails, which only check *after* generation.\n\n---\n\n## Pattern 3: Tool-Level Governance Decorator\n\nWrap individual tool functions with governance checks:\n\n```python\nimport functools\nimport time\nfrom collections import defaultdict\n\n# Process-wide call counts keyed by policy name. Clear a policy's entry at the start of\n# each request so max_calls_per_request is enforced per request, not per process lifetime.\n_call_counters: dict[str, int] = defaultdict(int)\n\ndef govern(policy: GovernancePolicy, audit_trail=None):\n    \"\"\"Decorator that enforces governance policy on a tool function.\"\"\"\n    def decorator(func):\n        @functools.wraps(func)\n        async def wrapper(*args, **kwargs):\n            tool_name = func.__name__\n\n            # 1. Check tool allowlist/blocklist\n            action = policy.check_tool(tool_name)\n            if action == PolicyAction.DENY:\n                raise PermissionError(f\"Policy '{policy.name}' blocks tool '{tool_name}'\")\n            if action == PolicyAction.REVIEW:\n                raise PermissionError(f\"Tool '{tool_name}' requires human approval\")\n\n            # 2. Check rate limit\n            _call_counters[policy.name] += 1\n            if _call_counters[policy.name] \u003e policy.max_calls_per_request:\n                raise PermissionError(f\"Rate limit exceeded: {policy.max_calls_per_request} calls\")\n\n            # 3. 
Check content in arguments\n            for arg in list(args) + list(kwargs.values()):\n                if isinstance(arg, str):\n                    matched = policy.check_content(arg)\n                    if matched:\n                        raise PermissionError(f\"Blocked pattern detected: {matched}\")\n\n            # 4. Execute and audit\n            start = time.monotonic()\n            try:\n                result = await func(*args, **kwargs)\n                if audit_trail is not None:\n                    audit_trail.append({\n                        \"tool\": tool_name,\n                        \"action\": \"allowed\",\n                        \"duration_ms\": (time.monotonic() - start) * 1000,\n                        \"timestamp\": time.time()\n                    })\n                return result\n            except Exception as e:\n                if audit_trail is not None:\n                    audit_trail.append({\n                        \"tool\": tool_name,\n                        \"action\": \"error\",\n                        \"error\": str(e),\n                        \"timestamp\": time.time()\n                    })\n                raise\n\n        return wrapper\n    return decorator\n\n\n# Usage with any agent framework\naudit_log = []\npolicy = GovernancePolicy(\n    name=\"search-agent\",\n    allowed_tools=[\"search\", \"summarize\"],\n    blocked_patterns=[r\"(?i)password\"],\n    max_calls_per_request=10\n)\n\n@govern(policy, audit_trail=audit_log)\nasync def search(query: str) -\u003e str:\n    \"\"\"Search documents — governed by policy.\"\"\"\n    return f\"Results for: {query}\"\n\n# Passes: search(\"latest quarterly report\")\n# Blocked: search(\"show me the admin password\")\n```\n\n---\n\n## Pattern 4: Trust Scoring\n\nTrack agent reliability over time with decay-based trust scores:\n\n```python\nfrom dataclasses import dataclass, field\nimport math\nimport time\n\n@dataclass\nclass TrustScore:\n    \"\"\"Trust score with 
temporal decay.\"\"\"\n    score: float = 0.5          # 0.0 (untrusted) to 1.0 (fully trusted)\n    successes: int = 0\n    failures: int = 0\n    last_updated: float = field(default_factory=time.time)\n\n    def record_success(self, reward: float = 0.05):\n        self.successes += 1\n        self.score = min(1.0, self.score + reward * (1 - self.score))\n        self.last_updated = time.time()\n\n    def record_failure(self, penalty: float = 0.15):\n        self.failures += 1\n        self.score = max(0.0, self.score - penalty * self.score)\n        self.last_updated = time.time()\n\n    def current(self, decay_rate: float = 0.001) -\u003e float:\n        \"\"\"Get score with temporal decay — trust erodes without activity.\"\"\"\n        elapsed = time.time() - self.last_updated\n        # decay_rate is per second: 0.001 halves the score after about 693 s of inactivity\n        decay = math.exp(-decay_rate * elapsed)\n        return self.score * decay\n\n    @property\n    def reliability(self) -\u003e float:\n        total = self.successes + self.failures\n        return self.successes / total if total \u003e 0 else 0.0\n\n\n# Usage in multi-agent systems\ntrust = TrustScore()\n\n# Agent completes tasks successfully\ntrust.record_success()  # 0.525\ntrust.record_success()  # 0.549\n\n# Agent makes an error\ntrust.record_failure()  # 0.466\n\n# Gate sensitive operations on trust\nif trust.current() \u003e= 0.7:\n    # Allow autonomous operation\n    pass\nelif trust.current() \u003e= 0.4:\n    # Allow with human oversight\n    pass\nelse:\n    # Deny or require explicit approval\n    pass\n```\n\n**Multi-agent trust**: In systems where agents delegate to other agents, each agent maintains trust scores for its delegates:\n\n```python\nclass AgentTrustRegistry:\n    def __init__(self):\n        self.scores: dict[str, TrustScore] = {}\n\n    def get_trust(self, agent_id: str) -\u003e TrustScore:\n        if agent_id not in self.scores:\n            self.scores[agent_id] = TrustScore()\n        return self.scores[agent_id]\n\n    def most_trusted(self, 
agents: list[str]) -\u003e str:\n        return max(agents, key=lambda a: self.get_trust(a).current())\n\n    def meets_threshold(self, agent_id: str, threshold: float) -\u003e bool:\n        return self.get_trust(agent_id).current() \u003e= threshold\n```\n\n---\n\n## Pattern 5: Audit Trail\n\nAppend-only audit log for all agent actions — critical for compliance and debugging:\n\n```python\nfrom dataclasses import dataclass, field\nimport json\nimport time\n\n@dataclass\nclass AuditEntry:\n    timestamp: float\n    agent_id: str\n    tool_name: str\n    action: str           # \"allowed\", \"denied\", \"error\"\n    policy_name: str\n    details: dict = field(default_factory=dict)\n\nclass AuditTrail:\n    \"\"\"Append-only audit trail for agent governance events.\"\"\"\n    def __init__(self):\n        self._entries: list[AuditEntry] = []\n\n    def log(self, agent_id: str, tool_name: str, action: str,\n            policy_name: str, **details):\n        self._entries.append(AuditEntry(\n            timestamp=time.time(),\n            agent_id=agent_id,\n            tool_name=tool_name,\n            action=action,\n            policy_name=policy_name,\n            details=details\n        ))\n\n    def denied(self) -\u003e list[AuditEntry]:\n        \"\"\"Get all denied actions — useful for security review.\"\"\"\n        return [e for e in self._entries if e.action == \"denied\"]\n\n    def by_agent(self, agent_id: str) -\u003e list[AuditEntry]:\n        return [e for e in self._entries if e.agent_id == agent_id]\n\n    def export_jsonl(self, path: str):\n        \"\"\"Export as JSON Lines for log aggregation systems.\"\"\"\n        with open(path, \"w\") as f:\n            for entry in self._entries:\n                f.write(json.dumps({\n                    \"timestamp\": entry.timestamp,\n                    \"agent_id\": entry.agent_id,\n                    \"tool\": entry.tool_name,\n                    \"action\": entry.action,\n                    
\"policy\": entry.policy_name,\n                    **entry.details\n                }) + \"\\n\")\n```\n\n---\n\n## Pattern 6: Framework Integration\n\n### PydanticAI\n\n```python\nfrom pydantic_ai import Agent\n\npolicy = GovernancePolicy(\n    name=\"support-bot\",\n    allowed_tools=[\"search_docs\", \"create_ticket\"],\n    blocked_patterns=[r\"(?i)(ssn|social\\s+security|credit\\s+card)\"],\n    max_calls_per_request=20\n)\n\nagent = Agent(\"openai:gpt-4o\", system_prompt=\"You are a support assistant.\")\n\n@agent.tool\n@govern(policy)\nasync def search_docs(ctx, query: str) -\u003e str:\n    \"\"\"Search knowledge base — governed.\"\"\"\n    return await kb.search(query)\n\n@agent.tool\n@govern(policy)\nasync def create_ticket(ctx, title: str, body: str) -\u003e str:\n    \"\"\"Create support ticket — governed.\"\"\"\n    return await tickets.create(title=title, body=body)\n```\n\n### CrewAI\n\n```python\nfrom crewai import Agent, Task, Crew\n\npolicy = GovernancePolicy(\n    name=\"research-crew\",\n    allowed_tools=[\"search\", \"analyze\"],\n    max_calls_per_request=30\n)\n\n# Apply governance at the crew level\ndef governed_crew_run(crew: Crew, policy: GovernancePolicy):\n    \"\"\"Wrap crew execution with governance checks.\"\"\"\n    audit = AuditTrail()\n    for agent in crew.agents:\n        for tool in agent.tools:\n            original = tool.func\n            tool.func = govern(policy, audit_trail=audit)(original)\n    result = crew.kickoff()\n    return result, audit\n```\n\n### OpenAI Agents SDK\n\n```python\nfrom agents import Agent, function_tool\n\npolicy = GovernancePolicy(\n    name=\"coding-agent\",\n    allowed_tools=[\"read_file\", \"write_file\", \"run_tests\"],\n    blocked_tools=[\"shell_exec\"],\n    max_calls_per_request=50\n)\n\n@function_tool\n@govern(policy)\nasync def read_file(path: str) -\u003e str:\n    \"\"\"Read file contents — governed.\"\"\"\n    import os\n    safe_path = os.path.realpath(path)\n    if not 
safe_path.startswith(os.path.realpath(\".\") + os.sep):  # trailing separator keeps sibling dirs that share a name prefix from matching\n        raise ValueError(\"Path traversal blocked by governance\")\n    with open(safe_path) as f:\n        return f.read()\n```\n\n---\n\n## Governance Levels\n\nMatch governance strictness to risk level:\n\n| Level | Controls | Use Case |\n|-------|----------|----------|\n| **Open** | Audit only, no restrictions | Internal dev/testing |\n| **Standard** | Tool allowlist + content filters | General production agents |\n| **Strict** | All controls + human approval for sensitive ops | Financial, healthcare, legal |\n| **Locked** | Allowlist only, no dynamic tools, full audit | Compliance-critical systems |\n\n---\n\n## Best Practices\n\n| Practice | Rationale |\n|----------|-----------|\n| **Policy as configuration** | Store policies in YAML/JSON, not hardcoded — enables change without deploys |\n| **Most-restrictive-wins** | When composing policies, deny always overrides allow |\n| **Pre-flight intent check** | Classify intent *before* tool execution, not after |\n| **Trust decay** | Trust scores should decay over time — require ongoing good behavior |\n| **Append-only audit** | Never modify or delete audit entries — immutability enables compliance |\n| **Fail closed** | If governance check errors, deny the action rather than allowing it |\n| **Separate policy from logic** | Governance enforcement should be independent of agent business logic |\n\n---\n\n## Quick Start Checklist\n\n```markdown\n## Agent Governance Implementation Checklist\n\n### Setup\n- [ ] Define governance policy (allowed tools, blocked patterns, rate limits)\n- [ ] Choose governance level (open/standard/strict/locked)\n- [ ] Set up audit trail storage\n\n### Implementation\n- [ ] Add @govern decorator to all tool functions\n- [ ] Add intent classification to user input processing\n- [ ] Implement trust scoring for multi-agent interactions\n- [ ] Wire up audit trail export\n\n### Validation\n- [ ] Test that blocked tools are properly denied\n- [ ] 
Test that content filters catch sensitive patterns\n- [ ] Test rate limiting behavior\n- [ ] Verify audit trail captures all events\n- [ ] Test policy composition (most-restrictive-wins)\n```\n\n---\n\n## Related Resources\n\n- [Agent Governance Toolkit](https://github.com/microsoft/agent-governance-toolkit) — Full governance framework\n- [AgentMesh Integrations](https://github.com/microsoft/agent-governance-toolkit/tree/main/packages/agentmesh-integrations) — Framework-specific packages\n- [OWASP Top 10 for LLM Applications](https://owasp.org/www-project-top-10-for-large-language-model-applications/)\n"},"import":{"commit_sha":"541b7819d8c3545c6df122491af4fa1eae415779","imported_at":"2026-05-18T20:05:35Z","license_text":"MIT License\n\nCopyright GitHub, Inc.\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.","owner":"github","repo":"github/awesome-copilot","source_url":"https://github.com/github/awesome-copilot/tree/541b7819d8c3545c6df122491af4fa1eae415779/skills/agent-governance"}},"content_hash":[80,168,114,124,89,144,77,26,189,136,120,244,153,243,195,225,195,207,210,2,177,11,18,20,250,67,213,39,157,104,78,124],"trust_level":"unsigned","yanked":false}
