Architecture
hl-research is layered. Each layer has one responsibility and one direction of dependency.
Module layout
hl_research/
├── api/            typed httpx client for the HL info endpoint
├── cache/          Parquet + DuckDB local cache layer
├── data/           Polars LazyFrame views over the cache
├── analytics/      pure functions: wrapped, behavior, counterfactual, funding, vault
├── backtest/       event loop, fill simulation, metrics, optimizer, walk-forward
├── presentation/   Rich tables, matplotlib + plotly themes, Jinja2 HTML reports
├── cli/            Typer commands for every group
└── tui/            Textual app + reactive screen models
Data flow
HL info endpoint
      │
      ▼
api/client.py ── retries, rate limit ──▶ Pydantic v2 model (api/types.py)
      │
      ▼
cache/sync.py ── incremental fetch ──▶ cache/store.py (Parquet partitioned by entity)
      │
      ▼
data/*.py ── Polars LazyFrames over cached Parquet
      │
      ├──▶ analytics/*.py ── pure functions, return frozen dataclasses
      │
      ├──▶ backtest/*.py ── event loop, consumes data frames
      │
      └──▶ presentation/*.py ── Rich panels, Plotly fragments, Jinja2 templates
                │
                └──▶ cli/*.py and tui/*.py present the result
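The analytics step in the flow above ("pure functions, return frozen dataclasses") can be sketched roughly as follows. This is a minimal illustration, not the project's code: `FundingSummary` and `summarize_funding` are hypothetical names, and the input is a plain sequence of floats rather than a Polars frame so that the sketch needs only the standard library.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FundingSummary:
    """Hypothetical result type; the real analytics dataclasses may differ."""

    periods: int
    total_rate: float
    mean_rate: float


def summarize_funding(rates: Sequence[float]) -> FundingSummary:
    """Pure function: the result depends only on `rates`; no I/O, no globals."""
    if not rates:
        return FundingSummary(periods=0, total_rate=0.0, mean_rate=0.0)
    total = sum(rates)
    return FundingSummary(
        periods=len(rates),
        total_rate=total,
        mean_rate=total / len(rates),
    )
```

Because the dataclass is frozen, a caller in the presentation layer cannot mutate the result by accident, and because the function has no side effects it can be tested without a cache or network.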
Rules
- The data layer never calls presentation.
- Presentation never calls the API.
- The CLI binds the two: it reads from the data layer and hands the results to presentation.
- Analytics functions are pure: in → out, no I/O, no side effects.
- The cache layer is the only thing that touches disk for data.
- Pydantic models are the only runtime type guard. Everything else is plain Python checked by mypy in strict mode.
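The two layering rules above lend themselves to an automated check. The sketch below is one possible approach, not something the project is known to ship: it uses imports as a proxy for "calls", inspects absolute imports only, and the names `FORBIDDEN`, `imported_layers`, and `violations` are hypothetical.

```python
import ast

# Forbidden edges taken from the rules above:
# importing layer -> layers it must not import.
FORBIDDEN = {
    "data": {"presentation"},
    "presentation": {"api"},
}


def imported_layers(source: str, package: str = "hl_research") -> set[str]:
    """Return the top-level layers of `package` imported by `source`.

    Only absolute imports are inspected; relative imports are ignored.
    """
    layers: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if node.module == package:
                # "from hl_research import api" names the layer in the alias.
                names = [f"{package}.{alias.name}" for alias in node.names]
            else:
                names = [node.module]
        else:
            continue
        for name in names:
            parts = name.split(".")
            if parts[0] == package and len(parts) > 1:
                layers.add(parts[1])
    return layers


def violations(layer: str, source: str) -> set[str]:
    """Layers that `source`, a module living in `layer`, must not import."""
    return imported_layers(source) & FORBIDDEN.get(layer, set())
```

A test could read every file under a layer's directory and assert that `violations(layer, text)` is empty, which turns the rules into something CI can enforce.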
Tech choices
| Layer | Library | Why |
|---|---|---|
| CLI framework | Typer | Type hints become flags |
| HTTP | httpx (async) | HTTP/2, retries, native async |
| API types | Pydantic v2 | Validation + serialization |
| Cache data | Parquet via Polars | Columnar, fast, language-portable |
| Cache metadata | DuckDB | Embedded SQL, reads Parquet natively |
| DataFrame | Polars | Lazy eval, modern API |
| Terminal output | Rich | Tables, panels, color |
| TUI | Textual | Reactive, modern |
| Static plots | Matplotlib | Notebook integration |
| Interactive plots | Plotly | HTML embeds |
| HTML reports | Jinja2 + Plotly | Single self-contained file |
| Lint | ruff | Fast, replaces multiple tools |
| Types | mypy strict | Enforced in CI |
| Tests | pytest + pytest-asyncio | Standard |
Cache layout on disk
XDG default: ~/.cache/hl-research/ on Linux/macOS.
~/.cache/hl-research/
├── meta.duckdb            metadata, sync state, asset and vault tables
└── data/
    ├── candles/
    │   └── asset=BTC/
    │       └── interval=1h/
    │           └── 2024-01.parquet
    ├── funding/
    │   └── asset=BTC/
    │       └── 2024.parquet
    ├── fills/
    │   └── wallet=0xabc.../
    │       └── 2024-Q1.parquet
    └── vaults/
        └── address=0xvault.../
            └── trades.parquet
Partitioning lets Polars and DuckDB skip irrelevant files. Files are immutable once written; new data is written to new files rather than appended to existing ones.
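The layout and the write-once rule can be illustrated with a few standard-library helpers. This is a sketch under assumptions: the function names are hypothetical, honoring `XDG_CACHE_HOME` is assumed from the "XDG default" wording rather than confirmed, and the payload is arbitrary bytes standing in for real Parquet content.

```python
import os
from pathlib import Path
from typing import Mapping


def cache_root(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve the cache directory; falls back to ~/.cache per the XDG spec."""
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / "hl-research"


def candle_file(root: Path, asset: str, interval: str, year: int, month: int) -> Path:
    """Build the partitioned path for one month of candles."""
    return (
        root / "data" / "candles"
        / f"asset={asset}" / f"interval={interval}"
        / f"{year:04d}-{month:02d}.parquet"
    )


def write_once(path: Path, payload: bytes) -> None:
    """Create a new file and refuse to overwrite an existing one."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "xb") as fh:  # "x" raises FileExistsError if the file exists
        fh.write(payload)


def files_for_asset(root: Path, dataset: str, asset: str) -> list[Path]:
    """Pruning by path: only the requested asset's directory is listed."""
    return sorted((root / "data" / dataset / f"asset={asset}").rglob("*.parquet"))
```

Opening with mode `"xb"` makes the immutability rule fail loudly: a second write to the same partition file raises `FileExistsError` instead of silently replacing data.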