Ezra Archon: Implement SEED Architecture #5

Open
opened 2026-04-02 19:46:35 +00:00 by allegro · 0 comments
Collaborator

Overview

Implement the SEED Architecture for Ezra Archon as specified in ARCHITECTURE-SEED-EPIC.md.

Ezra will become a pure dispatch layer (Fit Layer) with all intelligence flowing through the Claw Code harness to Gemma 4.

References

  • Epic: ARCHITECTURE-SEED-EPIC.md
  • Gemma 4 Profile: ~/.hermes/profiles/gemma4/
  • Architecture Stack: Hermes Agent → Claw Code Harness → Gemma 4

Scope for Ezra Archon

Phase 1: Foundation (Week 1-2)

  • Deploy Gemma 4 server locally (26B MoE primary)
  • Configure llama.cpp or vLLM backend
  • Download and verify GGUF models (26B MoE, 31B, 4B)
  • Benchmark inference speed (>20 tok/s target)
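The benchmark step can be reduced to a simple throughput check against the >20 tok/s target. A minimal sketch, assuming the token count and elapsed time come from timing a real llama.cpp or vLLM completion (the fixed numbers here are illustrative only):

```python
# Minimal throughput check against the >20 tok/s target.
# `generated_tokens` and `elapsed_s` would come from timing a real
# llama.cpp or vLLM completion; fixed numbers are used for illustration.

TARGET_TOK_S = 20.0

def tokens_per_second(generated_tokens: int, elapsed_s: float) -> float:
    """Raw decode throughput, excluding prompt processing."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return generated_tokens / elapsed_s

def meets_target(rate: float, target: float = TARGET_TOK_S) -> bool:
    return rate > target

rate = tokens_per_second(512, 21.4)  # e.g. 512 tokens generated in 21.4 s
print(f"{rate:.1f} tok/s, target met: {meets_target(rate)}")
```

Measuring decode throughput separately from prompt processing keeps the number comparable across backends.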

Phase 2: Hermes Agent Fit Layer (Week 2-3)

  • Strip local intelligence from Ezra Hermes Agent
  • Configure dispatch-only mode
  • Set up Claw Code harness connection
  • Implement error handling and retry logic
  • Audit: Verify NO local reasoning remains
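The dispatch-only mode plus retry logic above can be sketched as a thin forwarding function. This is a minimal illustration, not the actual Hermes Agent code; `transport` is a hypothetical callable standing in for the real Claw Code harness connection:

```python
import time

# Sketch of the dispatch-only Fit Layer: no local reasoning, just
# forwarding to the harness with bounded exponential-backoff retries.

class DispatchError(Exception):
    pass

def dispatch(prompt: str, transport, retries: int = 3, base_delay: float = 0.5):
    """Forward a prompt through the harness; retry transient failures."""
    last_exc = None
    for attempt in range(retries):
        try:
            return transport(prompt)
        except ConnectionError as exc:  # transient harness failure
            last_exc = exc
            time.sleep(base_delay * (2 ** attempt))
    raise DispatchError(f"harness unreachable after {retries} attempts") from last_exc
```

Keeping all logic inside `dispatch` to transport-and-retry makes the "NO local reasoning" audit a matter of inspecting one small function.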

Phase 3: Claw Code Harness Integration (Week 3-4)

  • Configure harness routing for Ezra
  • Set up tool registry access
  • Implement Gemma 4 function calling
  • Configure context window management (8192 tokens)
  • Add automatic summarization for long contexts
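Context window management with summarization for long histories could look like the sketch below: keep the newest turns that fit under the 8192-token budget and replace dropped turns with a summary placeholder. Token counting is approximated at ~4 characters per token; a real tokenizer would replace `estimate_tokens`, and the summary line would come from an actual summarization call:

```python
# Sketch of 8192-token context management: drop the oldest turns until
# the estimated token count fits, then mark what was elided so a
# summarization pass can fill it in.

CONTEXT_LIMIT = 8192

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token.
    return max(1, len(text) // 4)

def fit_context(system: str, turns: list[str], limit: int = CONTEXT_LIMIT):
    budget = limit - estimate_tokens(system)
    kept, used = [], 0
    for turn in reversed(turns):          # newest turns first
        cost = estimate_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()
    dropped = len(turns) - len(kept)
    if dropped:
        kept.insert(0, f"[summary of {dropped} earlier turns elided]")
    return kept
```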

Phase 4: Testing & Hardening (Week 4-5)

  • Tool use test suite passes
  • Multi-turn conversation handling
  • Fallback chain implementation
  • Network audit: No cloud AI calls
  • End-to-end integration test
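The fallback chain item above can be sketched as an ordered list of model endpoints tried in sequence (e.g. 26B MoE, then 31B, then 4B). The `(name, callable)` pairs are hypothetical stand-ins for real model clients:

```python
# Sketch of a fallback chain: return the first model's answer that
# succeeds, collecting errors so a total failure is diagnosable.

def answer_with_fallback(prompt: str, chain):
    errors = []
    for name, model in chain:
        try:
            return name, model(prompt)
        except Exception as exc:
            errors.append((name, str(exc)))
    raise RuntimeError(f"all models failed: {errors}")
```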

Phase 5: Deployment (Week 5-6)

  • Gitea webhook automation
  • Telegram bot integration
  • Nostr bridge configuration
  • Backblaze B2 backup setup
  • Monitoring and alerting
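For the Gitea webhook automation, the receiver should verify the payload before routing an issue to Ezra. A minimal sketch, assuming the webhook is configured with a shared secret and Gitea's HMAC-SHA256 signature header (`X-Gitea-Signature`); verify against Gitea's own webhook documentation before relying on the header name:

```python
import hmac
import hashlib

# Sketch of webhook payload verification: recompute the HMAC-SHA256
# hex digest of the raw request body with the shared secret and compare
# it to the signature header in constant time.

def verify_gitea_signature(secret: str, body: bytes, signature: str) -> bool:
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Constant-time comparison (`hmac.compare_digest`) avoids leaking digest prefixes through timing.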

Acceptance Criteria

| ID | Criteria | Status |
|----|----------|--------|
| A1 | Gemma 4 26B MoE serves locally at >20 tok/s | ⬜ |
| A2 | Hermes Agent has NO local intelligence | ⬜ |
| A3 | All queries route through Claw Code harness | ⬜ |
| A4 | Tool use works via Gemma 4 function calling | ⬜ |
| A5 | Ezra has independent fit layer config | ⬜ |
| A6 | Gitea issues auto-route to Ezra | ⬜ |
| A7 | Telegram bot responds via Gemma 4 | ⬜ |
| A8 | No cloud AI calls in packet log | ⬜ |

Risk Mitigation

| Risk | Mitigation | Owner |
|------|------------|-------|
| Gemma 4 too slow | Use 4B variant for speed-critical tasks | @ezra |
| Memory constraints | Q4_K_M quantization, GPU offloading | @ezra |
| Tool use failures | Extensive prompt engineering | @ezra |
| Context limits | Auto-summarization, RAG | @ezra |

Resources Required

  • GPU with 24GB+ VRAM (for 26B MoE)
  • Local storage: ~60GB for all model variants
  • Network: Local-only (no cloud dependency)
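The VRAM and storage figures can be sanity-checked with a rough quantized-size estimate. Q4_K_M averages roughly 4.85 bits per weight; the numbers below are ballpark illustrations, not measured sizes of the actual GGUF files:

```python
# Rough GGUF size estimate for Q4_K_M quantization (~4.85 bits/weight,
# excluding KV cache and runtime overhead). Illustrative only.

def quantized_size_gb(params_billion: float, bits_per_weight: float = 4.85) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for params in (26, 31, 4):
    print(f"{params}B @ Q4_K_M ~ {quantized_size_gb(params):.1f} GB")
```

Summing the three variants lands near 37 GB of weights, which is consistent with the ~60GB storage budget once overhead and alternates are included.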

Definition of Done

  • All acceptance criteria pass
  • Documentation complete in archons/ezra/
  • PR merged to main
  • Ezra operational as Fit Layer

House: Allegro
Priority: P0 — Foundation
Estimated Duration: 6 weeks
Dependencies: Gemma 4 server infrastructure


Reference: timmy/harness#5