by sickn33
An AI skill for Claude, Codex, and Claude Code
The First Truly Autonomous Multi-Agent Startup System
PRD → Deployed Product in Zero Human Intervention
Loki Mode transforms a Product Requirements Document into a fully built, tested, deployed, and revenue-generating product while you sleep. No manual steps. No intervention. Just results.
| System | Pass@1 | Details |
|---|---|---|
| Loki Mode (Multi-Agent) | 98.78% | 162/164 problems, RARV cycle recovered 2 |
| Direct Claude | 98.17% | 161/164 problems (baseline) |
| MetaGPT | 85.9-87.7% | Published benchmark |
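Loki Mode scores 11-13 percentage points above MetaGPT's published Pass@1. Against the direct Claude baseline the margin is one problem (162 vs. 161 of 164), with the RARV (Reason-Act-Reflect-Verify) cycle recovering 2 problems.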
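The percentages in the table above follow directly from the solved/total counts. A minimal sketch of the arithmetic (the helper name `pass_at_1` is illustrative and not part of Loki Mode):

```python
def pass_at_1(solved: int, total: int) -> float:
    """Return Pass@1 as a percentage, rounded to two decimals."""
    return round(100 * solved / total, 2)


# Counts taken from the table above.
loki = pass_at_1(162, 164)
direct = pass_at_1(161, 164)

# MetaGPT's published range, as quoted in the table.
metagpt_low, metagpt_high = 85.9, 87.7

# Margins in percentage points.
margin_vs_metagpt = (round(loki - metagpt_high, 2), round(loki - metagpt_low, 2))
margin_vs_direct = round(loki - direct, 2)

print(loki, direct, margin_vs_metagpt, margin_vs_direct)
```

The margin over MetaGPT is measured in percentage points, while the margin over the single-agent baseline corresponds to a single HumanEval problem.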
| Benchmark | Score | Details |
|---|---|---|
| Loki Mode HumanEval | 98.78% Pass@1 | 162/164 (multi-agent with RARV) |
| Direct Claude HumanEval | 98.17% Pass@1 | 161/164 (single agent baseline) |
| Direct Claude SWE-bench | 99.67% patch gen | 299/300 problems |
| Loki Mode SWE-bench | 99.67% patch gen | 299/300 problems |

Model: Claude Opus 4.5

Key Finding: Multi-agent RARV matches single-agent performance on both benchmarks after timeout optimization. The 4-agent pipeline (Architect->Engineer->QA->Reviewer) achieves the same 99.67% patch generation as direct Claude.
See benchmarks/results/ for full methodology and solutions.
Loki Mode is a Claude Code skill that orchestrates 37 specialized AI agent types across 6 swarms to autonomously build, test, deploy, and scale complete startups. It dynamically spawns only the agents you need—5-10 for simple projects, 100+ for complex startups—working in parallel with continuous self-verification.
PRD → Research → Architecture → Development → Testing → Deployment → Marketing → Revenue
Just say "Loki Mode" and point to a PRD. Walk away. Come back to a deployed product.
| What Others Do | What Loki Mode Does |
|---|---|
| Single agent writes code linearly | 100+ agents work in parallel across engineering, ops, business, data, product, and growth |
| Manual deployment required | Autonomous deployment to AWS, GCP, Azure, Vercel, Railway with blue-green and canary strategies |
| No testing or basic unit tests | 14 automated quality gates: security scans, load tests, accessibility audits, code reviews |
| Code only - you handle the rest | Full business operations: marketing, sales, legal, HR, finance, investor relations |
| Stops on errors | Self-healing: circuit breakers, dead letter queues, exponential backoff, automatic recovery |
| No visibility into progress | Real-time dashboard with agent monitoring, task queues, and live status updates |
| "Done" when code is written | Never "done": continuous optimization, A/B testing, customer feedback loops, perpetual improvement |
Monitor your autonomous startup being built in real time through the Loki Mode dashboard. It tracks all active agents and shows tasks in a four-column kanban view.
# Watch status updates in terminal
watch -n 2 cat .loki/STATUS.txt
╔════════════════════════════════════════════════════════════════╗
║ LOKI MODE STATUS ║
╚════════════════════════════════════════════════════════════════╝
Phase: DEVELOPMENT
Active Agents: 47
├─ Engineering: 18
├─ Operations: 12
├─ QA: 8
└─ Business: 9
Tasks:
├─ Pending: 10
├─ In Progress: 47
├─ Completed: 203
└─ Failed: 0
Last Updated: 2026-01-04 20:45:32
Access the dashboard:
# Automatically opens when running autonomously
./autonomy/run.sh ./docs/requirements.md
# Or open manually
open .loki/dashboard/index.html
Auto-refreshes every 3 seconds. Works with any modern browser.
Loki Mode doesn't just write code. It reasons, acts, reflects, and verifies (the RARV cycle):
1. REASON
└─ Read .loki/CONTINUITY.md including "Mistakes & Learnings"
└─ Check .loki/state/ and .loki/queue/
└─ Identify next task or improvement
2. ACT
└─ Execute task, write code
└─ Commit changes atomically (git checkpoint)
3. REFLECT
└─ Update .loki/CONTINUITY.md with progress
└─ Update state files
└─ Identify NEXT improvement
4. VERIFY
└─ Run automated tests (unit, integration, E2E)
└─ Check compilation/build
└─ Verify against spec
IF VERIFICATION FAILS:
├─ Capture error details (stack trace, logs)
├─ Analyze root cause
├─ UPDATE "Mistakes & Learnings" in CONTINUITY.md
├─ Rollback to last good git checkpoint if needed
└─ Apply learning and RETRY from REASON
Result: 2-3x quality improvement through continuous self-verification.
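The cycle above can be sketched as a small loop. This is an illustration under stated assumptions, not Loki Mode's implementation: the `Continuity` class stands in for `.loki/CONTINUITY.md`, the `act` and `verify` callables are hypothetical, and git checkpointing and rollback are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Continuity:
    """In-memory stand-in for .loki/CONTINUITY.md (hypothetical structure)."""
    progress: List[str] = field(default_factory=list)
    learnings: List[str] = field(default_factory=list)


def rarv(task: str,
         act: Callable[[str, List[str]], str],
         verify: Callable[[str], Optional[str]],
         memory: Continuity,
         max_attempts: int = 3) -> Optional[str]:
    """Run Reason-Act-Reflect-Verify until verification passes or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        hints = list(memory.learnings)                         # REASON: read past mistakes
        result = act(task, hints)                              # ACT: produce a candidate
        memory.progress.append(f"attempt {attempt}: {task}")   # REFLECT: record progress
        error = verify(result)                                 # VERIFY: None means success
        if error is None:
            return result
        memory.learnings.append(error)                         # learn, then retry from REASON
    return None


# Toy usage: the first attempt fails, the second succeeds because a learning exists.
def toy_act(task: str, hints: List[str]) -> str:
    return "v2" if hints else "v1"


def toy_verify(result: str) -> Optional[str]:
    return None if result == "v2" else "v1 failed the unit tests"


memory = Continuity()
outcome = rarv("build login form", toy_act, toy_verify, memory)
print(outcome, memory.learnings)
```

The point of the sketch is that a failed verification feeds the next REASON step through the recorded learnings instead of being discarded.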
There is NEVER a "finished" state. After completing the PRD, Loki Mode moves on to continuous optimization, A/B testing, and feedback loops, and keeps going until you stop it.
Rate limits? Exponential backoff and automatic resume. Errors? Circuit breakers, dead letter queues, retry logic. Interruptions? State checkpoints every 5 seconds—just restart.
# Start autonomous mode
./autonomy/run.sh ./docs/requirements.md
# Hit rate limit? Script automatically:
# ├─ Saves state checkpoint
# ├─ Waits with exponential backoff (60s → 120s → 240s...)
# ├─ Resumes from exact point
# └─ Continues until completion or max retries (default: 50)
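The wait times in the comments above (60s → 120s → 240s...) are consistent with a doubling schedule. A minimal sketch, assuming the wait doubles on each retry and is capped at the LOKI_MAX_WAIT default of 3600 seconds; the real script may differ, for example by adding jitter:

```python
from typing import List


def backoff_schedule(base_wait: int = 60,
                     max_wait: int = 3600,
                     max_retries: int = 50) -> List[int]:
    """Return the wait (in seconds) before each retry, doubling up to a cap."""
    return [min(base_wait * 2 ** attempt, max_wait) for attempt in range(max_retries)]


schedule = backoff_schedule()
print(schedule[:8])
```

With the defaults, the cap is reached on the seventh retry, so every later retry waits one hour.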
# Clone to your Claude Code skills directory
git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
See INSTALLATION.md for other installation methods (Web, API Console, minimal curl install).
# Product: AI-Powered Todo App
## Overview
Build a todo app with AI-powered task suggestions and deadline predictions.
## Features
- User authentication (email/password)
- Create, read, update, delete todos
- AI suggests next tasks based on patterns
- Smart deadline predictions
- Mobile-responsive design
## Tech Stack
- Next.js 14 with TypeScript
- PostgreSQL database
- OpenAI API for suggestions
- Deploy to Vercel
Save as my-prd.md.
# Autonomous mode (recommended)
./autonomy/run.sh ./my-prd.md
# Or manual mode
claude --dangerously-skip-permissions
> Loki Mode with PRD at ./my-prd.md
Open the dashboard in your browser (auto-opens) or check status:
watch -n 2 cat .loki/STATUS.txt
Seriously. Go get coffee. It'll be deployed when you get back.
That's it. No configuration. No manual steps. No intervention.
Loki Mode has 37 predefined agent types organized into 6 specialized swarms. The orchestrator spawns only what you need—simple projects use 5-10 agents, complex startups spawn 100+.
<img width="5309" height="979" alt="Agent Swarms Visualization" src="https://github.com/user-attachments/assets/7d18635d-a606-401f-8d9f-430e6e4ee689" />

| Group | Agent types |
|---|---|
| Engineering | eng-frontend, eng-backend, eng-database, eng-mobile, eng-api, eng-qa, eng-perf, eng-infra |
| Operations | ops-devops, ops-sre, ops-security, ops-monitor, ops-incident, ops-release, ops-cost, ops-compliance |
| Business | biz-marketing, biz-sales, biz-finance, biz-legal, biz-support, biz-hr, biz-investor, biz-partnerships |
| Data | data-ml, data-eng, data-analytics |
| Product | prod-pm, prod-design, prod-techwriter |
| Growth | growth-hacker, growth-community, growth-success, growth-lifecycle |
| Review | review-code, review-business, review-security |

See references/agents.md for complete agent type definitions.
| Phase | Description |
|---|---|
| 0. Bootstrap | Create .loki/ directory structure, initialize state |
| 1. Discovery | Parse PRD, competitive research via web search |
| 2. Architecture | Tech stack selection with self-reflection |
| 3. Infrastructure | Provision cloud, CI/CD, monitoring |
| 4. Development | Implement with TDD, parallel code review |
| 5. QA | 14 quality gates, security audit, load testing |
| 6. Deployment | Blue-green deploy, auto-rollback on errors |
| 7. Business | Marketing, sales, legal, support setup |
| 8. Growth | Continuous optimization, A/B testing, feedback loops |
Every code change goes through 3 specialized reviewers simultaneously:
IMPLEMENT → REVIEW (parallel) → AGGREGATE → FIX → RE-REVIEW → COMPLETE
│
├─ code-reviewer (Opus) - Code quality, patterns, best practices
├─ business-logic-reviewer (Opus) - Requirements, edge cases, UX
└─ security-reviewer (Opus) - Vulnerabilities, OWASP Top 10
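The IMPLEMENT → REVIEW → AGGREGATE → FIX → RE-REVIEW flow above can be sketched with a thread pool. This is only an illustration: the three reviewer functions are hypothetical rule-based stubs standing in for the model-backed reviewers, and `review_until_clean` and its `fix` callback are assumed names, not Loki Mode APIs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Findings = Dict[str, List[str]]


# Hypothetical stand-ins for the three reviewers.
def code_reviewer(diff: str) -> List[str]:
    return ["avoid bare except"] if "except:" in diff else []


def business_logic_reviewer(diff: str) -> List[str]:
    return ["unresolved TODO left in change"] if "TODO" in diff else []


def security_reviewer(diff: str) -> List[str]:
    return ["possible hard-coded secret"] if "password =" in diff else []


REVIEWERS: Dict[str, Callable[[str], List[str]]] = {
    "code-reviewer": code_reviewer,
    "business-logic-reviewer": business_logic_reviewer,
    "security-reviewer": security_reviewer,
}


def review(diff: str) -> Findings:
    """Run all reviewers in parallel and aggregate their findings by reviewer name."""
    with ThreadPoolExecutor(max_workers=len(REVIEWERS)) as pool:
        futures = {name: pool.submit(fn, diff) for name, fn in REVIEWERS.items()}
        return {name: future.result() for name, future in futures.items()}


def review_until_clean(diff: str,
                       fix: Callable[[str, Findings], str],
                       max_rounds: int = 3) -> str:
    """Review, fix, and re-review until no reviewer reports a finding."""
    for _ in range(max_rounds):
        findings = review(diff)
        if not any(findings.values()):
            return diff
        diff = fix(diff, findings)
    raise RuntimeError("review did not converge")


sample = 'password = "hunter2"\ntry:\n    run()\nexcept:\n    pass\n'
first_pass = review(sample)
print(first_pass)
```

Each reviewer sees the same change independently, so one reviewer's verdict cannot influence another's before aggregation.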
Severity-based issue handling:
- // TODO(review): ... comment, continue.
- // FIXME(nitpick): ... comment, continue.

.loki/
├── state/ # Orchestrator and agent states
├── queue/ # Task queue (pending, in-progress, completed, dead-letter)
├── memory/ # Episodic, semantic, and procedural memory
├── metrics/ # Efficiency tracking and reward signals
├── messages/ # Inter-agent communication
├── logs/ # Audit logs
├── config/ # Configuration files
├── prompts/ # Agent role prompts
├── artifacts/ # Releases, reports, backups
├── dashboard/ # Real-time monitoring dashboard
└── scripts/ # Helper scripts
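The Bootstrap phase creates this layout. A minimal sketch of that step, assuming only the top-level directories listed above; the function name `bootstrap` is illustrative, and the real skill may create additional files and subdirectories:

```python
import tempfile
from pathlib import Path
from typing import List

# Top-level directories from the tree above.
LOKI_DIRS: List[str] = [
    "state", "queue", "memory", "metrics", "messages", "logs",
    "config", "prompts", "artifacts", "dashboard", "scripts",
]


def bootstrap(project_root: Path) -> Path:
    """Create the .loki/ directory skeleton under project_root and return its path."""
    loki = Path(project_root) / ".loki"
    for name in LOKI_DIRS:
        # exist_ok makes the step safe to repeat after a restart.
        (loki / name).mkdir(parents=True, exist_ok=True)
    return loki


# Demonstrate in a throwaway directory so nothing is written to a real project.
with tempfile.TemporaryDirectory() as tmp:
    created = sorted(p.name for p in bootstrap(Path(tmp)).iterdir())

print(created)
```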
Test Loki Mode with these pre-built PRDs in the examples/ directory:
| PRD | Complexity | Est. Time | Description |
|---|---|---|---|
| simple-todo-app.md | Low | ~10 min | Basic todo app - tests core functionality |
| api-only.md | Low | ~10 min | REST API only - tests backend agents |
| static-landing-page.md | Low | ~5 min | HTML/CSS only - tests frontend/marketing |
| full-stack-demo.md | Medium | ~30-60 min | Complete bookmark manager - full test |
# Example: Run with simple todo app
./autonomy/run.sh examples/simple-todo-app.md
Customize the autonomous runner with environment variables:
LOKI_MAX_RETRIES=100 \
LOKI_BASE_WAIT=120 \
LOKI_MAX_WAIT=7200 \
./autonomy/run.sh ./docs/requirements.md
| Variable | Default | Description |
|---|---|---|
| LOKI_MAX_RETRIES | 50 | Maximum retry attempts before giving up |
| LOKI_BASE_WAIT | 60 | Base wait time in seconds |
| LOKI_MAX_WAIT | 3600 | Maximum wait time in seconds (1 hour) |
| LOKI_SKIP_PREREQS | false | Skip prerequisite checks |
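How these variables fall back to their defaults can be sketched as follows. The runner itself is a shell script, so this Python is only an illustration of the table's semantics; the `RunnerConfig` class, the `load_config` function, and the rule that only the string "true" enables LOKI_SKIP_PREREQS are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RunnerConfig:
    """Defaults mirror the table above."""
    max_retries: int = 50
    base_wait: int = 60
    max_wait: int = 3600
    skip_prereqs: bool = False


def load_config(env: Mapping[str, str] = os.environ) -> RunnerConfig:
    """Read LOKI_* variables from env, falling back to the documented defaults."""
    return RunnerConfig(
        max_retries=int(env.get("LOKI_MAX_RETRIES", 50)),
        base_wait=int(env.get("LOKI_BASE_WAIT", 60)),
        max_wait=int(env.get("LOKI_MAX_WAIT", 3600)),
        # Assumption: any value other than "true" (case-insensitive) means false.
        skip_prereqs=env.get("LOKI_SKIP_PREREQS", "false").lower() == "true",
    )


overrides = {"LOKI_MAX_RETRIES": "100", "LOKI_BASE_WAIT": "120", "LOKI_MAX_WAIT": "7200"}
print(load_config(overrides))
```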
# .loki/config/circuit-breakers.yaml
defaults:
  failureThreshold: 5
  cooldownSeconds: 300
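To show what the two settings control, here is a minimal circuit-breaker sketch. It assumes the common pattern: the breaker opens after `failureThreshold` consecutive failures and allows a trial call once `cooldownSeconds` have elapsed. The class and method names are illustrative, and Loki Mode's actual breaker may behave differently.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    def __init__(self,
                 failure_threshold: int = 5,
                 cooldown_seconds: float = 300,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock                      # injectable so tests need not sleep
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def allow(self) -> bool:
        """Return True if a call may proceed."""
        if self.opened_at is None:
            return True
        # Open circuit: block until the cooldown has elapsed, then allow a trial call.
        return self.clock() - self.opened_at >= self.cooldown_seconds

    def record_success(self) -> None:
        """Close the circuit and reset the failure count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open (or re-open) the circuit at the threshold."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


# Usage with a fake clock, to avoid waiting five minutes.
now = [0.0]
breaker = CircuitBreaker(clock=lambda: now[0])
for _ in range(5):
    breaker.record_failure()
print(breaker.allow())   # circuit is open
now[0] = 300.0
print(breaker.allow())   # cooldown elapsed, trial call allowed
```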
# .loki/config/alerting.yaml
channels:
  slack:
    webhook_url: "${SLACK_WEBHOOK_URL}"
    severity: [critical, high]
  pagerduty:
    integration_key: "${PAGERDUTY_KEY}"
    severity: [critical]
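One plausible reading of the alerting config above is that each channel's `severity` list names the severities it receives. A small sketch of that routing, with the config reproduced as a Python dict (credentials omitted); the function `channels_for` is hypothetical:

```python
from typing import Dict, List

# Severity lists copied from the alerting config above.
ALERTING: Dict[str, Dict[str, List[str]]] = {
    "slack": {"severity": ["critical", "high"]},
    "pagerduty": {"severity": ["critical"]},
}


def channels_for(severity: str,
                 config: Dict[str, Dict[str, List[str]]] = ALERTING) -> List[str]:
    """Return the channels whose severity list includes the given severity."""
    return [name for name, channel in config.items() if severity in channel["severity"]]


print(channels_for("critical"))
print(channels_for("high"))
```

Under this reading, a critical alert goes to both Slack and PagerDuty, while a high alert goes to Slack only.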
Requirements: Claude Code run with the --dangerously-skip-permissions flag.
Integrate with Vibe Kanban for a visual kanban board:
# Install Vibe Kanban
npx vibe-kanban
# Export Loki tasks to Vibe Kanban
./scripts/export-to-vibe-kanban.sh
See integrations/vibe-kanban.md for full setup guide.
Run the comprehensive test suite:
# Run all tests
./tests/run-all-tests.sh
# Or run individual test suites
./tests/test-bootstrap.sh # Directory structure, state init
./tests/test-task-queue.sh # Queue operations, priorities
./tests/test-circuit-breaker.sh # Failure handling, recovery
./tests/test-agent-timeout.sh # Timeout, stuck process handling
./tests/test-state-recovery.sh # Checkpoints, recovery
Contributions welcome!
MIT License - see LICENSE for details.
Loki Mode incorporates research and patterns from leading AI labs and practitioners:
| Source | Key Contribution |
|---|---|
| Anthropic: Building Effective Agents | Evaluator-optimizer pattern, parallelization |
| Anthropic: Constitutional AI | Self-critique against principles |
| DeepMind: Scalable Oversight via Debate | Debate-based verification |
| DeepMind: SIMA 2 | Self-improvement loop |
| OpenAI: Agents SDK | Guardrails, tripwires, tracing |
| NVIDIA ToolOrchestra | Efficiency metrics, reward signals |
| CONSENSAGENT (ACL 2025) | Anti-sycophancy, blind review |
| GoalAct | Hierarchical planning |
Full Acknowledgements - Complete list of 50+ research papers, articles, and resources
Built for the Claude Code ecosystem, powered by Anthropic's Claude models (Sonnet, Haiku, Opus).
Ready to build a startup while you sleep?
git clone https://github.com/asklokesh/loki-mode.git ~/.claude/skills/loki-mode
./autonomy/run.sh your-prd.md
Keywords: claude-code, claude-skills, ai-agents, autonomous-development, multi-agent-system, sdlc-automation, startup-automation, devops, mlops, deployment-automation, self-healing, perpetual-improvement