References & Bibliography
Complete citations for the Herding Cats in the AI Age research series, organized by source type.
Academic Research
Cemri, M., Pan, M.Z., et al. “Why Do Multi-Agent LLM Systems Fail?” UC Berkeley Sky Computing Lab. NeurIPS 2025 Datasets and Benchmarks Track, Spotlight. arXiv:2503.13657. March 17, 2025. → https://arxiv.org/abs/2503.13657 → Project: https://sky.cs.berkeley.edu/project/mast/ → GitHub: https://github.com/multi-agent-systems-failure-taxonomy/MAST
Kim, Y., et al. “Towards a Science of Scaling Agent Systems.” Google Research, Google DeepMind, MIT. arXiv:2512.08296. December 9, 2025. DOI: https://doi.org/10.48550/arXiv.2512.08296 → https://arxiv.org/abs/2512.08296
Survey on text generation accuracy in AI image platforms. IEEE Transactions on Visualization and Computer Graphics. March 2024. Referenced in Paper 4 (text rendering accuracy below 45% across major platforms).
Multi-Agent Coordination, Cognitive Architecture, and AI Governance
Laird, John E. “Introduction to the Soar Cognitive Architecture.” arXiv preprint, 2022. arXiv:2205.03854. → https://arxiv.org/abs/2205.03854 SOAR integrates procedural memory, semantic/declarative memory, and episodic memory into a unified cognitive architecture — validates the multi-knowledge-type pattern (knowledge wells + skills + hooks) used in this series’ vault architecture. Referenced in Papers 1 and 3.
Matarić, Maja J. “Integration of Representation Into Goal-Driven Behavior-Based Robots.” IEEE Transactions on Robotics and Automation, Volume 8, Issue 3, pp. 304–312. 1992. → https://ieeexplore.ieee.org/document/143349/ Demonstrates that simple, modular behaviors composed through organizational structure produce complex coordinated behavior — empirical proof that structure enables capability, the core thesis of this series. Referenced in Papers 1 and 2.
Gomes, Carla P., van Hoeve, Willem-Jan, Selman, Bart, & Lombardi, Michele. “Optimal Multi-Agent Scheduling with Constraint Programming.” AAAI Conference on Artificial Intelligence, pp. 1813–1818. 2007. → https://cdn.aaai.org/AAAI/2007/AAAI07-291.pdf Constraint-based scheduling across multiple agents — validates that structural constraints matter more than individual agent intelligence for coordination. Referenced in Paper 3 (Tetris task primitives pattern).
Grosof, Benjamin N., Wan, Hui, & Kifer, Michael. “Defeasibility in Answer Set Programs with Defaults and Argumentation Rules.” Semantic Web, Volume 6, Issue 1, pp. 81–98. 2015. DOI: 10.3233/SW-140140. → https://doi.org/10.3233/SW-140140 Rule systems with defeasibility (exception handling) for governing agent behavior through argumentation-based reasoning — validates the hook/rule/skill architecture where rules can override defaults. Referenced in Papers 1 and 3.
Bender, Emily M., Gebru, Timnit, McMillan-Major, Angelina, & Shmitchell, Shmargaret. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” FAccT ‘21 (Conference on Fairness, Accountability, and Transparency), March 2021. DOI: 10.1145/3442188.3445922. → https://doi.org/10.1145/3442188.3445922 Argues that structural governance mechanisms are required before deployment, not empirical safety promises after — validates the gate enforcement and structural oversight patterns. Referenced in Papers 1 and 2.
Dwork, Cynthia. “Differential Privacy.” ICALP 2006 (International Colloquium on Automata, Languages and Programming). DOI: 10.1007/11787006_1. → https://doi.org/10.1007/11787006_1 Foundational work establishing that mathematical guarantees are more reliable than empirical promises — validates the structural deny hook pattern where formal blocking is preferred over advisory guidance. Referenced in Paper 3.
Karpathy, Andrej. “LLM OS.” X (Twitter) thread, September 28, 2023. → https://x.com/karpathy/status/1707437820045062561 Positions LLMs as operating system kernels with layered I/O, code execution, memory (embeddings), and context window as working RAM — validates the three-layer vault architecture (knowledge/operations/presentation). Referenced in Paper 1.
Shannon, Claude E. “A Mathematical Theory of Communication.” Bell System Technical Journal, 27(3): 379–423, 1948. Foundational information theory: entropy H(X) = −Σ p(xᵢ) log₂ p(xᵢ), channel capacity C = B log₂(1 + S/N). Template governance reduces action entropy; structured interfaces increase channel capacity. Referenced in Innovation Inventory and Glossary.
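Shannon’s entropy claim can be made concrete with a few lines of arithmetic. A minimal sketch — the two action distributions below are invented for illustration, not measured from any agent — showing that constraining choices to a template’s sanctioned actions lowers H(X):

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum p_i * log2(p_i), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical action distributions: an unconstrained agent choosing
# among 8 equally likely actions, versus a template-governed agent
# concentrated on 2 sanctioned actions.
unconstrained = [1 / 8] * 8
governed = [0.5, 0.5, 0, 0, 0, 0, 0, 0]

print(entropy(unconstrained))  # 3.0 bits
print(entropy(governed))       # 1.0 bit
```

The governed distribution carries two fewer bits of action entropy — the quantitative sense in which template governance narrows the behavior space.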
Ng, Andrew Y., Harada, Daishi, & Russell, Stuart. “Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping.” ICML 1999, pp. 278–287. Formal basis for reward shaping in RL — modifying the reward function to accelerate policy convergence without changing optimal policy. Template-driven governance is applied reward shaping: the template shapes the agent’s reward landscape toward compliant behavior. Referenced in Innovation Inventory and Glossary.
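Ng et al.’s policy-invariance result rests on a telescoping identity: along any trajectory, the discounted potential-based shaping terms F(s, s′) = γΦ(s′) − Φ(s) sum to a quantity that depends only on the trajectory’s endpoints, so shaping cannot change which policy is optimal. A small numeric sketch of that identity — the potential Φ and the integer states are arbitrary illustrations, not anything from the cited paper:

```python
import random

def shaped_bonus(traj, phi, gamma):
    """Discounted sum of potential-based shaping terms
    F(s, s') = gamma * phi(s') - phi(s) along a state trajectory."""
    return sum(
        gamma**t * (gamma * phi(s_next) - phi(s))
        for t, (s, s_next) in enumerate(zip(traj, traj[1:]))
    )

# Telescoping check: the shaping terms collapse to
# gamma^T * phi(s_T) - phi(s_0), a policy-independent constant.
random.seed(0)
phi = lambda s: s * s          # arbitrary potential over integer states
traj = [random.randint(0, 9) for _ in range(6)]
gamma = 0.9
T = len(traj) - 1
lhs = shaped_bonus(traj, phi, gamma)
rhs = gamma**T * phi(traj[-1]) - phi(traj[0])
print(abs(lhs - rhs) < 1e-9)  # True
```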
Simon, Herbert A. The Sciences of the Artificial. 3rd ed. MIT Press, 1996. Bounded rationality: agents optimize within cognitive and informational constraints. Explains why structural enforcement (hooks) is necessary — agents satisfice within local context rather than global optimization. Referenced in Glossary.
Sutton, Richard S. & Barto, Andrew G. Reinforcement Learning: An Introduction. 2nd ed. MIT Press, 2018. → http://incompleteideas.net/book/the-book-2nd.html Comprehensive treatment of RL, policy gradient methods, and reward shaping. Formal foundation for CPI loop analysis and template improvement mechanics. Referenced in Glossary.
Thaler, Richard H. & Sunstein, Cass R. Nudge: Improving Decisions About Health, Wealth, and Happiness. Yale University Press, 2008. Choice architecture: default options and environmental design determine outcomes more reliably than education or incentives. The Toboggan Doctrine’s channel-based enforcement is direct application: compliant path = default path. Referenced in Innovation Inventory, Paper 3.
Bai, Yuntao et al. “Constitutional AI: Harmlessness from AI Feedback.” arXiv:2212.08073, 2022. → https://arxiv.org/abs/2212.08073 Alignment via constitutional principles and AI-generated feedback — the alignment-layer approach complemented by the toboggan’s structural-enforcement layer. Referenced in Innovation Inventory.
Ouyang, Long et al. “Training language models to follow instructions with human feedback.” arXiv:2203.02155, 2022. → https://arxiv.org/abs/2203.02155 InstructGPT: RLHF for instruction-following. Establishes baseline for alignment-based governance. Toboggan doctrine operates orthogonally: structural enforcement where alignment-based approaches leave residual variance. Referenced in Innovation Inventory.
Lowe, Ryan et al. “Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments.” arXiv:1706.02275, 2017. → https://arxiv.org/abs/1706.02275 MADDPG: multi-agent RL with centralized training and decentralized execution. Formal parallel to the vault’s orchestrator-worker pattern — central supervisor (training/coordination), independent agent execution. Referenced in Papers 5–6.
Dietterich, Thomas G. “Ensemble Methods in Machine Learning.” Multiple Classifier Systems, LNCS 1857, pp. 1–15. Springer, 2000. Ensemble theory: diversity among classifiers reduces error variance. Validates PAT (parallel orthogonal review) and CMDP (independent generation before synthesis). Referenced in Papers 5–6.
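Dietterich’s variance-reduction argument has a simple quantitative form: for independent classifiers each with error rate p < 0.5, the probability that a majority vote is wrong is a binomial tail that shrinks as the ensemble grows. A sketch under the idealized independence assumption (real parallel reviewers are only approximately independent):

```python
from math import comb

def majority_vote_error(n, p):
    """Probability that a majority of n independent classifiers,
    each with error rate p, votes incorrectly (n odd)."""
    k = n // 2 + 1  # wrong votes needed for a wrong majority
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# With 0.3 individual error, ensemble error falls as n grows:
for n in (1, 5, 11, 21):
    print(n, round(majority_vote_error(n, 0.3), 4))
```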
Shazeer, Noam et al. “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer.” arXiv:1701.06538, 2017. → https://arxiv.org/abs/1701.06538 MoE: gating function routes inputs to specialized expert sub-networks. Structural model for multi-agent staff roles — orchestrator as gating function, specialist agents as experts. Referenced in Glossary.
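The gating mechanism Shazeer et al. describe can be sketched in a few lines: score every expert, keep only the top-k, renormalize. The relevance scores below are hypothetical stand-ins for an orchestrator’s routing decision; a real MoE layer learns the gate jointly with the experts:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(scores, k=2):
    """Sparse top-k gating: keep the k highest-weight experts,
    renormalize their weights, and drop the rest."""
    weights = softmax(scores)
    top = sorted(range(len(weights)), key=weights.__getitem__, reverse=True)[:k]
    kept = sum(weights[i] for i in top)
    return {i: weights[i] / kept for i in top}

# Hypothetical scores an orchestrator might assign to four specialist
# agents; only the top two are dispatched.
print(route([2.0, 0.5, 1.5, -1.0], k=2))
```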
Jiang, Dongfu, Ren, Xiang & Lin, Bill Yuchen. “LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion.” arXiv:2306.02561, 2023. → https://arxiv.org/abs/2306.02561 LLM ensembling via pairwise ranking (PairRanker) and generative fusion (GenFuser). Validates multi-model synthesis patterns used in CMDP and PAT reviews. Referenced in Papers 5–6.
Radford, Alec et al. “Learning Transferable Visual Models From Natural Language Supervision.” arXiv:2103.00020, 2021. → https://arxiv.org/abs/2103.00020 CLIP: joint vision-language embedding space. Referenced in Paper 4 (creative AI analysis, image-text alignment limitations).
Ho, Jonathan et al. “Denoising Diffusion Probabilistic Models.” arXiv:2006.11239, 2020. → https://arxiv.org/abs/2006.11239 DDPM: foundational diffusion model paper. Referenced in Paper 4 (Firefly/generative AI technical background).
U.S. Government Publications and Doctrine
Department of War. “Artificial Intelligence Strategy for the Department of War.” January 12, 2026.
FM 5-0. Army Planning and Orders Production. 2022. MDMP codified in Chapter 12.
ADP 5-0. The Operations Process. 2019. METT-TC defined.
ADRP 5-0. The Operations Process. 2012. Mission Analysis and commander’s intent two levels up.
TC 25-20. A Leader’s Guide to After-Action Reviews. 1993. Four AAR questions.
ATP 5-19. Risk Management. 2014. Four-step risk management process.
FM 6-0. Commander and Staff Organization and Operations. 2022. Battle rhythm concept.
AR 702-12. Quality Assurance Specialist (Ammunition Surveillance). 2020. QASAS program.
AR 385-10. The Army Safety Program. 2023.
U.S. Army. “Army establishes new AI/machine learning career path for officers.” December 2025. MOS 49B, first VTIP window January 5 – February 6, 2026.
SOCOM. Request for Information: Agentic AI Demonstrations, Task Experiment 26-2. SAM.gov. December 2025. Event: April 13–17, 2026, Avon Park Air Force Range, Florida.
Industry Research and Analysis
Gartner, Inc. “Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027.” Press Release, June 25, 2025. → https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027
Gartner, Inc. Poll of 3,412 respondents on agentic AI investments. January 2025. Referenced in Paper 1, Section 8.5.
Gartner, Inc. “Predicts 2026: AI and Lean Six Sigma Convergence.” 2025. Quality 4.0 projection: 50%+ of LSS orgs will incorporate AI by 2026.
Think Tank and Strategic Analysis
Jensen, Benjamin & Strohmeyer, Matthew. “Agentic Warfare and the Future of Military Operations: Rethinking the Napoleonic Staff.” CSIS Futures Lab. July 2025. → Center for Strategic and International Studies: https://www.csis.org
Jensen, Benjamin, Tadross, Dan & Strohmeyer, Matthew. “Agentic Warfare Is Here. Will America Be the First Mover?” War on the Rocks. October 2025. → https://warontherocks.com
Weist, LTC (Ret.) Thad, Kepley, Maj. Skyler & Musgrove, Maj. Braxton. “AI-Augmented MDMP: Lessons from the CGSC Wargaming Experiment.” Small Wars Journal. February 2026. → https://smallwarsjournal.com
Technology Company Publications
Anthropic. “Building Effective Agents.” December 2024. → https://www.anthropic.com/research/building-effective-agents
Anthropic. “How We Built Our Multi-Agent Research System.” Anthropic Engineering Blog. June 13, 2025. → https://www.anthropic.com/engineering/multi-agent-research-system
Anthropic. “Donating the Model Context Protocol and Establishing the Agentic AI Foundation.” December 9, 2025. → https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation
Anthropic. “Introducing Claude Opus 4.6.” February 5, 2026. → https://www.anthropic.com/news/claude-opus-4-6
Cursor. “Scaling Agents.” cursor.com/blog/scaling-agents. October 2025. → https://cursor.com/blog/scaling-agents Production results: ~1,000 commits/hour; web browser from scratch (1M+ lines, 1 week); React migration (266K additions / 193K deletions); video rendering 25x optimization.
Google. “Announcing the Agent2Agent Protocol (A2A).” Google Developers Blog. April 9, 2025. → https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/ → Specification: https://a2a-protocol.org/latest/specification/
Linux Foundation. “Linux Foundation Announces the Formation of the Agentic AI Foundation.” December 9, 2025. Platinum members: AWS, Anthropic, Block, Bloomberg, Cloudflare, Google, Microsoft, OpenAI.
OpenAI. Agents SDK Documentation. → https://openai.github.io/openai-agents-python/
Adobe-Specific Sources (Paper 4)
Adobe. “Partner models in Adobe products.” helpx.adobe.com. Updated February 2026. → https://helpx.adobe.com (search: partner models Firefly)
Adobe. “Known limitations in Firefly.” helpx.adobe.com. December 2025. Direct quote used: “Text and symbol generation in images still needs support in the Text to Image feature.”
Adobe. “Firefly partner models expand your creative power.” adobe.com/products/firefly/partner-models.html.
Adobe / Google Cloud. Joint press release: “Expand Strategic Partnership to Advance the Future of Creative AI.” October 28, 2025. Announced at Adobe MAX.
Benzinga. “Adobe (ADBE) Stock Price Prediction 2026, 2027 & 2030.” February 2026. 43% YoY decline, 23% YTD decline reported.
Fortune. “Adobe deepens Google Cloud partnership to advance AI.” October 30, 2025. $5B+ AI-influenced ARR, 700M MAU confirmed by Adobe EVP Dan Durn.
TipRanks. Adobe analyst rating downgrades compilation. February 2026. HSBC ($388→$302), Piper Sandler (Overweight→Neutral), Oppenheimer (Outperform→Perform), Baird ($410→$350).
Wu, Fei. “Adobe Firefly Partner Models: The 2026 Field Guide.” feisworld.com. February 2026. → https://www.feisworld.com
Practitioner Sources and Commentary
Jones, Nate B. AI News & Strategy Daily. YouTube / Substack. January 2026. → https://www.natebjones.com/substack Key synthesis of Kim et al., Cursor production data, and Gastown framework.
Yegge, Steve. “Welcome to Gas Town.” Medium. January 1, 2026. → https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04 → GitHub (Gastown): https://github.com/steveyegge/gastown (9,900+ stars)
Yegge, Steve. “Introducing Beads: A coding agent memory system.” Medium. January 2026.
Yegge, Steve. “The Future of Coding Agents.” Medium. January 2026.
Abrahms, Justin. “Wrapping My Head Around Gas Town.” justin.abrah.ms. January 5, 2026.
News Sources
Axios. “Anthropic’s Claude helped capture Venezuelan dictator Maduro.” February 13, 2026. Referenced in Paper 1, Section 8.6.
Axios. “Pentagon threatens to label Anthropic’s AI a ‘supply chain risk.’” February 16, 2026.
CNBC. “Anthropic is clashing with the Pentagon over AI use.” February 18, 2026.
Small Wars Journal. “AI-Enabled Decapitation Strike.” February 17, 2026.
Books and Foundational Sources
Deming, W. Edwards. Out of the Crisis. MIT Press, 1986. Plan-Do-Check-Act (PDCA) cycle; foundation of continuous process improvement. Formal basis for CPI loop in the case study and the gravity-fed pipeline. First-pass yield and process capability analysis throughout the series.
Senge, Peter M. The Fifth Discipline: The Art and Practice of the Learning Organization. Doubleday, 1990. Systems thinking, mental models, and organizational learning. The vault’s gravity-fed pipeline is a structural implementation of Senge’s learning organization: systems that improve through normal operation, not dedicated improvement programs.
Boyd, John R. “A Discourse on Winning and Losing.” Unpublished briefing series, USAF, 1987. OODA Loop (Observe-Orient-Decide-Act): the decision cycle underlying all session orientation protocols. Boyd’s key insight: faster OODA cycling outperforms raw capability. Applied throughout the series as the cognitive model for agent decision-making.
Ohno, Taiichi. Toyota Production System. Productivity Press, 1988. Source for Lean waste framework.
Womack, James P. & Jones, Daniel T. Lean Thinking. Free Press, 2003. DOWNTIME waste framework.
Hirano, H. 5S for Operators. Productivity Press, 1996. 5S methodology.
Pyzdek, Thomas. The Six Sigma Handbook. 4th ed. McGraw-Hill, 2014. Sigma cost escalation, satisficing threshold.
All web sources accessed February 2026. URLs current as of publication date. Source availability may change.