HALO Core Repository Overview
1) Purpose
HALO Core is a Streamlit-based, NotebookLM-style workspace that combines:
- Source management ("Quellen")
- Grounded chat over selected sources
- Studio-style artifact generation (reports, infographic, podcast, presentation, table, etc.)
The app uses Agno for agent and team orchestration, with a service-oriented Python backend under services/ and Streamlit UI in app/.
2) Current implementation status
The repository contains a functional MVP with active refactoring toward more modular runtime services.
Implemented capabilities include:
- Source upload and parsing for text, office docs, images, audio, and video
- LanceDB-backed retrieval and indexing
- Streaming chat with tool-call visibility and fallback behavior
- Agent/team configuration persisted in per-agent JSON files
- Preset-driven chat model/tools/member switching
- Studio template generation and artifact persistence
Planned and roadmap work is tracked in:
AGNO_ADOPTION_BACKLOG.mddocs_internal/HALO_CORE_PRD.mddocs_internal/HALO_CHAT_INTEGRATION_PRD.md
3) High-level structure
app/ Streamlit app and pages
services/ Core business logic and orchestration
templates/ Studio template registry (JSON)
data/ Local runtime storage (JSON, LanceDB, generated assets)
tests/ Pytest suite
docs/ Maintainer-facing implementation docs (this folder)
docs_internal/ PRDs and integration plans
adr/ Architecture decision records
.github/workflows/ CI, docs lint, release, dependency audit
4) Key runtime entry points
Primary app entrypoint
streamlit run app/main.py- Main function:
run_app()inapp/main.py
Major service entrypoints
- Chat orchestration:
services/chat_runtime.py::run_chat_turn - Agent abstraction:
services/agents.py - Team building:
services/halo_team.py - Streaming normalization:
services/streaming_adapter.py::stream_agent_response - Storage and persistence:
services/storage.py - Source ingestion:
services/ingestion.py - Retrieval and indexing:
services/retrieval.py
5) Technology stack
Application and orchestration
- Streamlit
- Agno
- OpenAI SDK
Data and parsing
- LanceDB + pyarrow + numpy
- pypdf, python-docx, openpyxl, python-pptx
- moviepy (video to audio extraction)
Validation and config
- Pydantic + pydantic-settings
- dotenv support via dependencies
Quality and CI
- pytest
- ruff
- black
- mypy (present but currently commented in CI workflow)
6) Configuration model
Configuration is split between:
- Environment/settings (
services/settings.py) OPENAI_API_KEYHALO_DATA_DIRHALO_TEMPLATES_DIR-
HALO_AGENT_DB -
Runtime UI configuration (
data/config.json) - connector enablement
- logging toggles
- image model
-
selected preset
-
Agent configuration (
HALO_DATA_DIR/agents/*.json) -
per-agent role/instructions/tools/model/members/coordination controls
-
Studio template registry (
templates/studio_templates.json) -
data-driven template definitions
-
Presets (
presets.json) - quick updates for chat model/tools/members/stream/coordinator behavior
7) Data persistence model
The app primarily uses local JSON and local LanceDB:
- Sources:
data/sources.json - Chat history:
data/chat_history/<session_id>.json - Studio notes:
data/studio_notes.json - Studio outputs:
data/studio_outputs.json - Connector cache:
data/connector_cache.json - Retrieval index:
data/lancedb/
Optional Agno SQLite memory backend is enabled via HALO_AGENT_DB.
8) UI composition model
The default app view is a 3-column layout:
- Sources panel
- Chat panel
- Studio panel
Sidebar includes navigation to multipage entries:
- Dashboard
- Configuration
- Account
- Help
- Agent Config page (reachable from Configuration page)
9) Test coverage map
The tests/ suite covers:
- Agent config validation/merging
- Team routing and coordination behavior
- Chat runtime orchestration and telemetry
- Streaming event handling and deduplication
- Chunking, ingestion, knowledge, presets, chat state
This supports a safe refactoring path from monolithic UI logic to service-level modules.
10) CI/CD overview
GitHub workflows include:
ci.yml: lint + format + tests (Python 3.11 and 3.13)docs.yml: markdown lint on PR markdown changesaudit.yml: scheduled dependency audit viapip-auditrelease.yml: Docker build on version tags
11) Internal reference documents
- Product requirements:
docs_internal/HALO_CORE_PRD.md - Chat integration PRD:
docs_internal/HALO_CHAT_INTEGRATION_PRD.md - Agent config integration plan:
docs_internal/AGENT_CONFIG_INTEGRATION_PLAN.md - Architecture decision:
adr/0001-streamlit-agno-architecture.md - Collaboration/coding rules:
AGENTS.md
12) Where to start (by role)
New maintainer
README.mddocs/ARCHITECTURE_AND_RUNTIME.mddocs/DEVELOPMENT_TESTING_AND_OPERATIONS.md
Chat/agent contributor
docs/AGENT_SYSTEM.mdservices/chat_runtime.pyservices/streaming_adapter.py
Ingestion/RAG contributor
docs/DATA_STORAGE_AND_RETRIEVAL.mdservices/ingestion.pyservices/retrieval.py
UI contributor
docs/UI_PANELS_AND_PAGES.mdapp/main.pyapp/pages/*.py