{"kind":"Skill","metadata":{"namespace":"community","name":"thinking-fermi-estimation","version":"0.1.0"},"spec":{"description":"Make order-of-magnitude estimates for unknown quantities by decomposing into known or estimable factors. Use for capacity planning, cost estimation, market sizing, and technical feasibility assessment.","files":{"SKILL.md":"---\nname: thinking-fermi-estimation\ndescription: Make order-of-magnitude estimates for unknown quantities by decomposing into known or estimable factors. Use for capacity planning, cost estimation, market sizing, and technical feasibility assessment.\n---\n\n# Fermi Estimation\n\n## Overview\n\nFermi estimation, named after physicist Enrico Fermi, is the art of making reasonable estimates for quantities that seem impossible to know without direct measurement. By decomposing a question into factors you can estimate, then multiplying, you often get surprisingly accurate order-of-magnitude results.\n\n**Core Principle:** Break the unknown into known (or estimable) pieces. Even rough estimates combine to reasonable accuracy due to errors canceling out.\n\n## When to Use\n\n- Capacity planning (\"How much storage will we need?\")\n- Cost estimation (\"What will this infrastructure cost?\")\n- Market sizing (\"How many potential users exist?\")\n- Feasibility assessment (\"Is this even plausible?\")\n- Sanity checking (\"Does this number make sense?\")\n- Interview questions (\"How many piano tuners in Chicago?\")\n- Quick prioritization (\"Is this worth pursuing?\")\n\nDecision flow:\n\n```\nNeed a number you don't have? → yes → Can you measure it directly? 
→ no → FERMI ESTIMATE\n                                                                   ↘ yes → Measure\n                              ↘ no → You might not need it\n```\n\n## The Fermi Process\n\n### Step 1: Clarify What You're Estimating\n\nBe precise about the quantity:\n\n```\nVague: \"How big is the market?\"\nPrecise: \"How many SaaS companies with 50-500 employees in the US\n         would pay $1000/month for our product?\"\n```\n\n### Step 2: Decompose into Estimable Factors\n\nBreak into pieces you can estimate:\n\n```\nStorage needs for user data:\n= (Number of users)\n  × (Data per user per day)\n  × (Days of retention)\n  × (Overhead factor)\n```\n\n**Decomposition strategies:**\n\n| Strategy | Example |\n|----------|---------|\n| By component | Total = Sum of parts |\n| By rate × time | Total = Rate × Duration |\n| By population × fraction | Target = Base × Percentage |\n| By analogy × adjustment | New ≈ Similar × Ratio |\n\n### Step 3: Estimate Each Factor\n\nFor each factor, estimate based on:\n\n| Source | Example |\n|--------|---------|\n| Known data | \"We have 10,000 DAU\" |\n| Industry benchmarks | \"Average SaaS churn is 5%\" |\n| Physical constraints | \"A human can make ~50 decisions/day\" |\n| Logical bounds | \"At least 1, at most 1 million\" |\n| Personal experience | \"I've seen systems handle 1000 req/s\" |\n\n**When estimating:**\n- Use ranges, not point estimates: \"10,000 to 50,000\"\n- Prefer geometric mean for order-of-magnitude: √(10,000 × 50,000) = 22,360\n- Round to one significant figure: ~20,000\n\n### Step 4: Combine Factors\n\nMultiply (or add) factors together:\n\n```\nStorage = 50,000 users × 10 KB/user/day × 365 days × 1.5 overhead\n        = 50,000 × 10,000 × 365 × 1.5 bytes\n        = 274 billion bytes\n        ≈ 270 GB/year\n```\n\n### Step 5: Sanity Check\n\nVerify reasonableness:\n- Does the order of magnitude make sense?\n- Is it physically possible?\n- Does it match any known data points?\n- Would a 10x error 
change the decision?\n\n### Step 6: State Confidence and Implications\n\n```\nEstimate: ~270 GB/year\nConfidence: Within ~5x (roughly 50-1,400 GB)\nImplication: Standard database tier sufficient; no special infrastructure needed\n```\n\n## Fermi Estimation Template\n\n```markdown\n# Fermi Estimate: [Question]\n\n## Question (Precise)\n[Exactly what we're estimating]\n\n## Decomposition\n[Quantity] = [Factor 1] × [Factor 2] × ... × [Factor N]\n\n## Factor Estimates\n\n### Factor 1: [Name]\n- Estimate: [Value]\n- Source/Reasoning: [Why this number]\n- Confidence: High / Medium / Low\n\n### Factor 2: [Name]\n- Estimate: [Value]\n- Source/Reasoning: [Why this number]\n- Confidence: High / Medium / Low\n\n[Continue for all factors...]\n\n## Calculation\n[Show the math]\n\n## Result\n- Point estimate: [Value]\n- Range: [Low] to [High] (representing Xx uncertainty)\n\n## Sanity Check\n- Physical plausibility: [Check]\n- Comparison to known data: [Check]\n- Order of magnitude reasonable: [Check]\n\n## Implications\n[What does this estimate mean for the decision?]\n```\n\n## Example 1: Data Storage Needs\n\n**Question:** How much storage will our new feature need in Year 1?\n\n```markdown\n## Decomposition\nStorage = Users × Events/User/Day × Event Size × Days × Replication\n\n## Factor Estimates\n\n### Users (DAU)\n- Estimate: 100,000 (current) growing to 200,000 (end of year)\n- Average over year: ~150,000\n- Confidence: High (we have current data)\n\n### Events per User per Day\n- Estimate: 50 events (based on current feature usage patterns)\n- Confidence: Medium (new feature might differ)\n\n### Event Size\n- Estimate: 500 bytes (JSON with typical payload)\n- Confidence: High (we can measure similar events)\n\n### Days in Year\n- Estimate: 365\n- Confidence: Certain\n\n### Replication Factor\n- Estimate: 3x (standard for durability)\n- Confidence: High (architectural requirement)\n\n## Calculation\nStorage = 150,000 × 50 × 500 × 365 × 3\n        = 1.37 × 10^12 × 
3\n        = 4.1 × 10^12 bytes\n        = 4.1 TB\n\n## Result\n- Point estimate: ~4 TB\n- Range: 1 TB (low-usage assumptions) to 15 TB (growth beats expectations)\n\n## Sanity Check\n- 4 TB for 150K users = ~27 MB/user/year = reasonable\n- Similar feature at other company uses \"several TB\" = consistent\n- Standard database can handle 4 TB = feasible\n\n## Implications\n- Standard managed database tier sufficient\n- No need for sharding or special storage architecture in Year 1\n- Budget ~$500/month for storage costs\n```\n\n## Example 2: API Rate Capacity\n\n**Question:** Can our API handle Black Friday traffic?\n\n```markdown\n## Decomposition\nRequired RPS = Peak Daily Users × Requests/Session × Sessions/Day × Peak Multiplier / Seconds in Peak Window\n\n## Factor Estimates\n\n### Peak Daily Users\n- Estimate: 500,000 (3x normal 170K)\n- Source: Last year's Black Friday\n- Confidence: Medium\n\n### Requests per Session\n- Estimate: 30 API calls (measured)\n- Confidence: High\n\n### Sessions per Day\n- Estimate: 2 (mobile + desktop)\n- Confidence: Medium\n\n### Peak Multiplier\n- Estimate: 5x (spikes within the 4-hour peak window)\n- Confidence: Medium\n\n### Seconds in Peak Window\n- Estimate: 14,400 (4 hours × 3,600)\n- Confidence: Medium (assumes nearly all daily traffic lands in the window)\n\n## Calculation\nRequired RPS = (500,000 × 30 × 2 × 5) / 14,400\n             = 150,000,000 / 14,400\n             = 10,417 RPS\n             ≈ 10,000 RPS peak\n\n## Result\n- Point estimate: 10,000 RPS\n- Range: 4,000 to 25,000 RPS\n\n## Sanity Check\n- Current capacity: 10,000 RPS\n- Gap: estimate equals current capacity, so there is no headroom\n- Similar scale companies report 20-50K RPS on peak days = same order of magnitude, near the top of our range\n\n## Implications\n- Current capacity leaves no safety margin; add headroom before Black Friday\n- Auto-scaling must handle 10K+ RPS\n- Load test to 15K RPS (1.5x safety margin)\n```\n\n## Example 3: Market Size\n\n**Question:** How many potential customers for our developer tool?\n\n```markdown\n## Decomposition\nTAM = Software Companies × Avg Developers × Adoption Rate × Price 
Tolerance\n\n## Factor Estimates\n\n### Software Companies (US)\n- Estimate: ~500,000 (SBA data: tech companies)\n- Confidence: Medium\n\n### With 10+ Developers (our target)\n- Estimate: 10% = 50,000 companies\n- Confidence: Low (rough estimate)\n\n### Developers per Target Company\n- Estimate: 30 average\n- Confidence: Medium\n\n### Adoption Rate (would consider)\n- Estimate: 20% (dev tools are crowded)\n- Confidence: Low\n\n### Price Point\n- Estimate: $50/developer/month\n- Confidence: Medium (based on similar tools)\n\n## Calculation\nAddressable Users = 50,000 × 30 × 20% = 300,000 developers\nRevenue = 300,000 × $50 × 12 = $180M/year TAM\n\n## Result\n- TAM: ~$180M/year\n- Realistic serviceable market: 5-10% = $10-20M/year\n\n## Sanity Check\n- Similar dev tools (Datadog, etc.) have $100M+ revenue = plausible ceiling\n- 300K potential users in a niche = reasonable\n\n## Implications\n- Market size justifies investment if we can capture 5%+\n- Need differentiation in crowded space\n```\n\n## Common Decomposition Patterns\n\n### Capacity Planning\n```\nNeeded = Users × Usage/User × Factor/Usage × Growth × Safety\n```\n\n### Cost Estimation\n```\nCost = Resources × Unit Cost × Duration × Overhead\n```\n\n### Time Estimation\n```\nTime = Tasks × Time/Task × (1 + Risk Factor)\n```\n\n### Market Sizing\n```\nMarket = Population × Segment% × Adoption% × Price × Frequency\n```\n\n## Tips for Better Estimates\n\n### Use Multiple Approaches\n\nEstimate the same thing different ways:\n\n```\nWebsite traffic estimate:\nMethod 1: Bottom-up from user base\nMethod 2: Top-down from market share\nMethod 3: Analogy to similar company\n\nIf methods agree within 3x, confidence increases\nIf they diverge wildly, investigate assumptions\n```\n\n### Bound First\n\nStart with upper and lower bounds:\n\n```\n\"Definitely more than 1,000, definitely less than 10 million\"\n\"So somewhere in 10,000-1,000,000 range\"\n\"Let me narrow from there...\"\n```\n\n### Watch for Correlated 
Errors\n\nIf factors are correlated, their errors compound instead of canceling:\n\n```\nBAD: Users × Revenue/User (both depend on the same growth assumption)\nBETTER: Estimate revenue directly, or use independent factors\n```\n\n### One Significant Figure\n\nAvoid false precision:\n\n```\nCalculation: 47,832,519 bytes\nReport: ~50 MB (not \"47.8 MB\")\n```\n\n## Verification Checklist\n\n- [ ] Question stated precisely\n- [ ] Decomposed into 3-6 estimable factors\n- [ ] Each factor has reasoning/source\n- [ ] Factors are relatively independent\n- [ ] Calculation shown and checked\n- [ ] Result sanity-checked against reality\n- [ ] Uncertainty range stated\n- [ ] Implications for decision clarified\n\n## Key Questions\n\n- \"Can I break this into smaller, estimable pieces?\"\n- \"What do I already know that constrains this?\"\n- \"What's the upper bound? Lower bound?\"\n- \"Does this number pass the smell test?\"\n- \"Would being off by 10x change my decision?\"\n- \"Can I estimate this a different way to cross-check?\"\n\n## Fermi's Wisdom\n\nWhen asked how many piano tuners were in Chicago, Fermi didn't look it up; he estimated from population, households, pianos, tuning frequency, and tuner capacity. His estimate was reportedly within 20% of the actual number.\n\nThe lesson: You know more than you think. Decompose, estimate, combine. 
The errors often cancel, and you get surprisingly close to truth.\n\n\"Never make a calculation until you know the answer.\" — John Wheeler (Fermi's colleague)\n\nMeaning: Estimate first to know what answer to expect, then calculate to verify.\n"},"import":{"commit_sha":"a31e22d4445ad8fef7cd771d32af537aebb68c49","imported_at":"2026-05-22T21:14:39Z","license_text":"MIT License\n\nCopyright (c) 2025 TJ Boudreaux\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n","owner":"tjboudreaux","repo":"tjboudreaux/cc-thinking-skills","source_url":"https://github.com/tjboudreaux/cc-thinking-skills/tree/a31e22d4445ad8fef7cd771d32af537aebb68c49/skills/thinking-fermi-estimation"}},"content_hash":[120,95,10,179,93,141,58,167,101,171,245,98,206,4,42,194,2,213,252,42,95,200,25,140,62,85,92,152,207,62,133,244],"trust_level":"unsigned","yanked":false}
