Architecture

hl-research is layered. Each layer has one responsibility and one direction of dependency.

Module layout

hl_research/
├── api/          typed httpx client for the HL info endpoint
├── cache/        Parquet + DuckDB local cache layer
├── data/         Polars LazyFrame views over the cache
├── analytics/    pure functions: wrapped, behavior, counterfactual, funding, vault
├── backtest/     event loop, fill simulation, metrics, optimizer, walk-forward
├── presentation/ Rich tables, matplotlib + plotly themes, Jinja2 HTML reports
├── cli/          Typer commands for every group
└── tui/          Textual app + reactive screen models

Data flow

HL info endpoint
    │
    ▼
api/client.py    ── retries, rate limit ──▶ Pydantic v2 model (api/types.py)
    │
    ▼
cache/sync.py    ── incremental fetch ──▶ cache/store.py (Parquet partitioned by entity)
    │
    ▼
data/*.py        ── Polars LazyFrames over cached Parquet
    │
    ├──▶ analytics/*.py  ── pure functions, return frozen dataclasses
    │
    ├──▶ backtest/*.py   ── event loop, consumes data frames
    │
    └──▶ presentation/*.py  ── Rich panels, Plotly fragments, Jinja2 templates
              │
              └──▶ cli/*.py and tui/*.py present the result
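The analytics step in the flow above ("pure functions, return frozen dataclasses") can be sketched as follows. This is a minimal illustration, not the package's actual code: `FundingSummary` and `summarize_funding` are hypothetical names, and a plain sequence of floats stands in for the Polars frame that the data layer would presumably supply.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FundingSummary:
    """Hypothetical result type; field names are illustrative only."""

    periods: int
    total_rate: float
    mean_rate: float


def summarize_funding(rates: Sequence[float]) -> FundingSummary:
    """Pure function: the result depends only on `rates`; no I/O, no globals."""
    if not rates:
        return FundingSummary(periods=0, total_rate=0.0, mean_rate=0.0)
    total = sum(rates)
    return FundingSummary(
        periods=len(rates),
        total_rate=total,
        mean_rate=total / len(rates),
    )
```

Because the result is frozen, a caller in presentation, the CLI, or the TUI can pass it around without the risk of another layer mutating it; an attempted assignment such as `summary.periods = 5` raises `dataclasses.FrozenInstanceError`.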

Rules

The data layer never calls presentation.
Presentation never calls the API.
The CLI binds the two: it takes results from the data side and hands them to presentation.
Analytics functions are pure: in → out, no I/O, no side effects.
The cache layer is the only thing that touches disk for data.
Pydantic models are the only runtime type guard. Everything else is plain Python, checked statically by mypy in strict mode.
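The two import rules above (data never calls presentation, presentation never calls the API) could be checked mechanically. The sketch below is one possible approach using the standard-library `ast` module. The `FORBIDDEN` table and the `forbidden_imports` helper are assumptions made for illustration and are not part of hl-research.

```python
import ast

# Assumed encoding of the rules above: layer -> sibling layers it must not import.
FORBIDDEN = {
    "data": {"presentation"},
    "presentation": {"api"},
}


def forbidden_imports(layer: str, source: str, package: str = "hl_research") -> list:
    """Return absolute imports in `source` that `layer` is not allowed to use.

    Relative imports (``from ..api import x``) are ignored in this sketch.
    """
    banned = FORBIDDEN.get(layer, set())
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Join module and imported name so `from hl_research import api` is caught too.
            names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            parts = name.split(".")
            if len(parts) >= 2 and parts[0] == package and parts[1] in banned:
                hits.append(name)
    return hits
```

A test could read every file under `hl_research/data/` and assert that `forbidden_imports("data", text)` is empty, which would turn the written rule into a CI failure.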

Tech choices

Layer	Library	Why
CLI framework	Typer	Type hints become flags
HTTP	httpx (async)	HTTP/2, retries, native async
API types	Pydantic v2	Validation + serialization
Cache data	Parquet via Polars	Columnar, fast, language-portable
Cache metadata	DuckDB	Embedded SQL, reads Parquet natively
DataFrame	Polars	Lazy eval, modern API
Terminal output	Rich	Tables, panels, color
TUI	Textual	Reactive, modern
Static plots	Matplotlib	Notebook integration
Interactive plots	Plotly	HTML embeds
HTML reports	Jinja2 + Plotly	Single self-contained file
Lint	ruff	Fast, replaces multiple tools
Types	mypy strict	Enforced in CI
Tests	pytest + pytest-asyncio	Standard

Cache layout on disk

XDG default: ~/.cache/hl-research/ on Linux/macOS.

~/.cache/hl-research/
├── meta.duckdb                 metadata, sync state, asset and vault tables
└── data/
    ├── candles/
    │   └── asset=BTC/
    │       └── interval=1h/
    │           └── 2024-01.parquet
    ├── funding/
    │   └── asset=BTC/
    │       └── 2024.parquet
    ├── fills/
    │   └── wallet=0xabc.../
    │       └── 2024-Q1.parquet
    └── vaults/
        └── address=0xvault.../
            └── trades.parquet

Partitioning lets Polars and DuckDB skip irrelevant files. Files are immutable once written; new data is written as new files rather than appended to existing ones.
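The layout and the write-once rule can be illustrated with a small standard-library sketch. The helper names (`candle_path`, `partition_keys`, `prune`, `write_once`) are hypothetical, and the code only mimics the file-skipping idea; in practice Polars and DuckDB do their own pruning when reading partitioned Parquet.

```python
from datetime import date
from pathlib import Path
from typing import Dict, Iterable, List

# Default cache root, taken from the layout above.
CACHE_ROOT = Path.home() / ".cache" / "hl-research"


def candle_path(asset: str, interval: str, month: date, root: Path = CACHE_ROOT) -> Path:
    """Build data/candles/asset=<asset>/interval=<interval>/<YYYY-MM>.parquet."""
    return (
        root / "data" / "candles"
        / f"asset={asset}" / f"interval={interval}"
        / f"{month:%Y-%m}.parquet"
    )


def partition_keys(path: Path) -> Dict[str, str]:
    """Read the key=value directory components of a partitioned path."""
    return dict(part.split("=", 1) for part in path.parts if "=" in part)


def prune(paths: Iterable[Path], **wanted: str) -> List[Path]:
    """Keep only files whose partition keys match; the rest are never opened."""
    return [
        p for p in paths
        if all(partition_keys(p).get(k) == v for k, v in wanted.items())
    ]


def write_once(path: Path, payload: bytes) -> None:
    """Create a new file; mode "xb" raises FileExistsError instead of overwriting."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "xb") as fh:
        fh.write(payload)
```

For example, `prune(all_candle_files, asset="BTC", interval="1h")` narrows a scan to one directory using path names alone, and a second `write_once` call on the same path fails rather than modifying an existing file.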