{"kind":"Skill","metadata":{"namespace":"community","name":"thinking-margin-of-safety","version":"0.1.0"},"spec":{"description":"Build in buffers for unknown unknowns and don't optimize to the edge. Use for capacity planning, deadline estimation, architecture design, and risk management.","files":{"SKILL.md":"---\nname: thinking-margin-of-safety\ndescription: Build in buffers for unknown unknowns and don't optimize to the edge. Use for capacity planning, deadline estimation, architecture design, and risk management.\n---\n\n# Margin of Safety\n\n## Overview\n\nMargin of Safety, borrowed from Benjamin Graham's investment philosophy and structural engineering, is the practice of building in buffers to account for unknown unknowns. In a world of uncertainty, systems optimized to the edge are brittle. Robust systems have slack, reserves, and room for error.\n\n**Core Principle:** Build in buffers. The world is uncertain. Systems without margin fail when stressed.\n\n## When to Use\n\n- Capacity planning\n- Deadline and timeline estimation\n- Architecture design\n- Resource allocation\n- Risk management\n- SLA commitments\n- Infrastructure provisioning\n- Any commitment under uncertainty\n\nDecision flow:\n\n```\nMaking a commitment or design?\n  → Is there uncertainty? → yes → BUILD IN MARGIN\n  → Are you optimizing tightly? → yes → ADD SLACK\n  → What if your estimates are wrong? 
→ Consider margin\n```\n\n## The Margin of Safety Framework\n\n### Step 1: Identify Your Estimate\n\nWhat's your best guess for the requirement?\n\n```\nEstimate: Need 100 requests/second capacity\nEstimate: Project will take 6 weeks\nEstimate: Need 500GB storage for year 1\n```\n\n### Step 2: Quantify Your Uncertainty\n\nHow confident are you, and what could you be missing?\n\n```markdown\n## Uncertainty Analysis\n\n| Factor | Your Estimate | Uncertainty | Possible Range |\n|--------|---------------|-------------|----------------|\n| Traffic | 100 RPS | ±50% | 50-150 RPS |\n| Spike multiplier | 3x | -50%/+100% | 1.5x-6x |\n| Growth rate | 20%/year | ±50% | 10-30%/year |\n| Unknown unknowns | - | +50-100% | - |\n```\n\n### Step 3: Calculate Required Margin\n\nDifferent contexts need different margins:\n\n| Context | Typical Margin | Rationale |\n|---------|---------------|-----------|\n| Capacity planning | 2-3x | Traffic spikes unpredictable |\n| Time estimation | 1.5-2x | Everything takes longer |\n| Infrastructure | 2x headroom | Scaling takes time |\n| SLA commitment | 1.5x buffer | Reputation at stake |\n| New/unknown domain | 2-3x | High uncertainty |\n| Well-understood domain | 1.3-1.5x | Lower uncertainty |\n\n### Step 4: Apply Margin\n\n```\nBase estimate: 100 RPS\nMargin: 2x (moderate uncertainty, spikes possible)\nProvision: 200 RPS capacity\n\nBase estimate: 6 weeks\nMargin: 1.5x (experienced team, some unknowns)\nCommit: 9 weeks\n```\n\n### Step 5: Monitor and Adjust\n\nTrack actuals against estimates to calibrate future margins:\n\n```markdown\n## Calibration Log\n\n| Estimate | Margin Applied | Actual | Margin Accuracy |\n|----------|----------------|--------|-----------------|\n| 100 RPS | 2x (200) | 180 | Adequate |\n| 6 weeks | 1.5x (9) | 10 weeks | Insufficient |\n| 500 GB | 2x (1TB) | 400 GB | Excessive |\n\nInsight: Time estimates need higher margin; storage was overprovisioned\n```\n\n## Margin Patterns\n\n### Capacity Margin\n\n```markdown\n## 
Capacity Planning with Margin\n\nBase load: 1,000 RPS\nPeak multiplier: 3x (historical)\nMargin for unknowns: 1.5x\nMargin for growth: 1.3x (6 months runway)\n\nRequired capacity: 1,000 × 3 × 1.5 × 1.3 = 5,850 RPS\nRound up: 6,000 RPS\n\nRationale: Can handle 6x normal load, or 2x historical peak, or growth + peak with ~1.5x to spare\n```\n\n### Time Margin\n\n```markdown\n## Project Estimation with Margin\n\nTask estimates:\n- Feature A: 2 weeks\n- Feature B: 3 weeks\n- Integration: 1 week\n- Testing: 1 week\nBase total: 7 weeks\n\nAdjustments:\n- Optimistic bias: +30%\n- Unknowns: +20%\n- Dependencies: +15%\nMargin total: 1.65x\n\nCommitment: 7 × 1.65 = 11.55 → 12 weeks\n\nRule of thumb: Hofstadter's Law - \"It always takes longer than you expect,\n               even when you take into account Hofstadter's Law.\"\n```\n\n### Financial Margin\n\n```markdown\n## Budget with Margin\n\nInfrastructure estimate:\n- Compute: $5,000/month\n- Storage: $2,000/month\n- Network: $1,000/month\nBase: $8,000/month\n\nMargin considerations:\n- Traffic growth: +25%\n- Unplanned incidents: +15%\n- New features: +20%\n\nBudget request: $8,000 × 1.6 = $12,800/month\nActual budget: $13,000/month (round up)\n```\n\n### Design Margin\n\n```markdown\n## Architectural Margin\n\nConnection pool:\n- Normal usage: 50 connections\n- Peak: 100 connections\n- Margin: 2x peak\n- Configure: 200 connections\n\nQueue depth:\n- Normal processing: 1,000 messages\n- Burst: 10,000 messages\n- Margin: 2x burst\n- Configure: 20,000 max depth\n\nTimeout:\n- P99 latency: 500ms\n- Margin: 2x\n- Set timeout: 1000ms\n```\n\n## When to Use Different Margins\n\n### High Margin (2-3x)\n\n- New domain or technology\n- Critical system (failure is very costly)\n- External dependencies (unpredictable)\n- Customer-facing SLAs\n- Irreversible commitments\n\n### Moderate Margin (1.5-2x)\n\n- Familiar domain with some unknowns\n- Internal systems (can recover from issues)\n- Controlled dependencies\n- Reversible decisions\n\n### Low Margin 
(1.2-1.5x)\n\n- Well-understood domain\n- Historical data available\n- Low consequence of being wrong\n- Short time horizons\n- Easy to adjust\n\n### No Margin (Optimize to Edge)\n\nAlmost never appropriate for:\n- Public commitments\n- Production systems\n- External dependencies\n\nAcceptable for:\n- Internal experiments\n- Temporary systems\n- Cost optimization after proving stable\n\n## The Cost of Margin\n\nMargin isn't free. Balance:\n\n```markdown\n## Margin Cost-Benefit\n\nHigh margin:\n+ Handles unexpected loads\n+ Reduces stress/heroics\n+ Enables growth without emergency scaling\n- Higher infrastructure cost\n- Potentially wasted resources\n\nLow margin:\n+ Lower cost\n+ Efficient resource use\n- Risk of outages\n- Constant firefighting\n- Technical debt from quick fixes\n\nSweet spot: Margin where cost of buffer \u003c cost of a margin breach × probability of breach\n```\n\n## Margin of Safety Template\n\n```markdown\n# Margin of Safety Analysis: [Context]\n\n## Base Estimate\nWhat: [What you're estimating]\nEstimate: [Your point estimate]\nConfidence: [How confident you are]\n\n## Uncertainty Factors\n| Factor | Impact | Probability | Adjustment |\n|--------|--------|-------------|------------|\n| [Factor 1] | +X% | Medium | |\n| [Factor 2] | +Y% | Low | |\n| Unknown unknowns | +Z% | - | |\n\n## Margin Calculation\nBase: [X]\nUncertainty multiplier: [1.X]\nContext multiplier: [1.Y] (high/medium/low stakes)\nTotal margin: [X × all multipliers]\n\n## Final Commitment/Design\nWith margin: [Final number]\nRationale: [Why this margin]\n\n## Monitoring Plan\nHow will you know if margin is adequate/excessive?\n- [Metric to track]\n- [Threshold for concern]\n- [Review cadence]\n```\n\n## Verification Checklist\n\n- [ ] Identified base estimate\n- [ ] Quantified uncertainty factors\n- [ ] Selected appropriate margin for context\n- [ ] Applied margin to commitment/design\n- [ ] Considered cost of margin vs. 
cost of breach\n- [ ] Have monitoring to validate margin adequacy\n- [ ] Calibrating based on actual outcomes\n\n## Key Questions\n\n- \"What happens if my estimate is wrong by 2x?\"\n- \"How much margin does this uncertainty warrant?\"\n- \"Am I building for best case or realistic case?\"\n- \"What's the cost of being wrong vs. cost of margin?\"\n- \"Have I accounted for unknown unknowns?\"\n- \"Am I optimizing to the edge when I shouldn't be?\"\n\n## Graham's Wisdom\n\n\"The margin of safety is always dependent on the price paid.\"\n\nIn engineering: The margin needed depends on the cost of failure. Critical systems need more margin. Experiments can run leaner.\n\n\"Confronted with the challenge to distill the secret of sound investment into three words, we venture the motto, Margin of Safety.\"\n\nIn systems: When in doubt, build in margin. The cost of over-provisioning is usually much less than the cost of under-provisioning when things go wrong.\n\n\"The function of the margin of safety is, in essence, that of rendering unnecessary an accurate estimate of the future.\"\n\nYou don't need to predict perfectly if you have adequate margin. 
Margin is insurance against your own estimation errors.\n"},"import":{"commit_sha":"a31e22d4445ad8fef7cd771d32af537aebb68c49","imported_at":"2026-05-22T21:14:39Z","license_text":"MIT License\n\nCopyright (c) 2025 TJ Boudreaux\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n","owner":"tjboudreaux","repo":"tjboudreaux/cc-thinking-skills","source_url":"https://github.com/tjboudreaux/cc-thinking-skills/tree/a31e22d4445ad8fef7cd771d32af537aebb68c49/skills/thinking-margin-of-safety"}},"content_hash":[252,67,228,183,127,31,96,166,144,105,111,154,73,104,111,9,254,8,39,5,220,241,207,132,126,113,49,110,193,80,26,231],"trust_level":"unsigned","yanked":false}
