Architecture

hl-research is layered. Each layer has one responsibility and one direction of dependency.

Module layout

hl_research/
├── api/          typed httpx client for the HL info endpoint
├── cache/        Parquet + DuckDB local cache layer
├── data/         Polars LazyFrame views over the cache
├── analytics/    pure functions: wrapped, behavior, counterfactual, funding, vault
├── backtest/     event loop, fill simulation, metrics, optimizer, walk-forward
├── presentation/ Rich tables, matplotlib + plotly themes, Jinja2 HTML reports
├── cli/          Typer commands for every group
└── tui/          Textual app + reactive screen models

Data flow

HL info endpoint
api/client.py    ── retries, rate limit ──▶ Pydantic v2 model (api/types.py)
cache/sync.py    ── incremental fetch ──▶ cache/store.py (Parquet partitioned by entity)
data/*.py        ── Polars LazyFrames over cached Parquet
    ├──▶ analytics/*.py  ── pure functions, return frozen dataclasses
    ├──▶ backtest/*.py   ── event loop, consumes data frames
    └──▶ presentation/*.py  ── Rich panels, Plotly fragments, Jinja2 templates
              └──▶ cli/*.py and tui/*.py present the result
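
The "pure functions, return frozen dataclasses" step in the flow above might look roughly like the following. This is a minimal sketch, not the project's actual code: `FundingSummary` and `summarize_funding` are hypothetical names, and a plain sequence of floats stands in for the Polars frame the real analytics layer would receive.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FundingSummary:
    """Hypothetical result type; the real analytics dataclasses may differ."""

    asset: str
    samples: int
    mean_rate: float
    max_rate: float


def summarize_funding(asset: str, rates: Sequence[float]) -> FundingSummary:
    """Pure function: the result depends only on the arguments, with no I/O."""
    if not rates:
        raise ValueError("rates must not be empty")
    return FundingSummary(
        asset=asset,
        samples=len(rates),
        mean_rate=sum(rates) / len(rates),
        max_rate=max(rates),
    )
```

Because the dataclass is frozen, assigning to a field after construction raises an error, so a result can be passed to the presentation layer without risk of it being modified along the way.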

Rules

  • The data layer never calls presentation.
  • Presentation never calls the API.
  • The CLI binds the two: it wires data to presentation.
  • Analytics functions are pure: in → out, no I/O, no side effects.
  • The cache layer is the only layer that touches disk for data.
  • Pydantic models are the only runtime type guard. Everything else is plain Python checked by mypy in strict mode.
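
The first two rules can be checked mechanically. The sketch below is one possible approach and is not taken from the project: it treats "calls" as "imports", parses a module's source with the standard-library `ast` module, and reports imports that cross a forbidden boundary. The `FORBIDDEN` map and both function names are assumptions made for illustration, and relative imports are ignored.

```python
import ast

# Forbidden edges taken from the rules above: importing layer -> banned layers.
FORBIDDEN = {
    "data": {"presentation"},
    "presentation": {"api"},
}


def imported_layers(source: str, package: str = "hl_research") -> set[str]:
    """Return the top-level layers of `package` that `source` imports."""
    layers: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if node.module == package:
                # `from hl_research import presentation`
                names = [f"{package}.{alias.name}" for alias in node.names]
            else:
                names = [node.module]
        else:
            continue  # not an import, or a relative import (not handled here)
        for name in names:
            parts = name.split(".")
            if parts[0] == package and len(parts) > 1:
                layers.add(parts[1])
    return layers


def violations(layer: str, source: str) -> set[str]:
    """Layers that `source` imports but `layer` must not depend on."""
    return imported_layers(source) & FORBIDDEN.get(layer, set())
```

A test could read every file under `hl_research/data/` and assert that `violations("data", text)` is empty, which would turn the rule into a CI failure instead of a convention.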

Tech choices

Layer              Library                  Why
CLI framework      Typer                    Type hints become flags
HTTP               httpx (async)            HTTP/2, retries, native async
API types          Pydantic v2              Validation + serialization
Cache data         Parquet via Polars       Columnar, fast, language-portable
Cache metadata     DuckDB                   Embedded SQL, reads Parquet natively
DataFrame          Polars                   Lazy eval, modern API
Terminal output    Rich                     Tables, panels, color
TUI                Textual                  Reactive, modern
Static plots       Matplotlib               Notebook integration
Interactive plots  Plotly                   HTML embeds
HTML reports       Jinja2 + Plotly          Single self-contained file
Lint               ruff                     Fast, replaces multiple tools
Types              mypy strict              Enforced in CI
Tests              pytest + pytest-asyncio  Standard

Cache layout on disk

XDG default: ~/.cache/hl-research/ on Linux/macOS.

~/.cache/hl-research/
├── meta.duckdb                 metadata, sync state, asset and vault tables
└── data/
    ├── candles/
    │   └── asset=BTC/
    │       └── interval=1h/
    │           └── 2024-01.parquet
    ├── funding/
    │   └── asset=BTC/
    │       └── 2024.parquet
    ├── fills/
    │   └── wallet=0xabc.../
    │       └── 2024-Q1.parquet
    └── vaults/
        └── address=0xvault.../
            └── trades.parquet

Partitioning lets Polars and DuckDB skip irrelevant files. Files are immutable once written; new data is written to new files rather than appended to existing ones.
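
To make the layout concrete, here is a minimal standard-library sketch of how a candle path could be built, written once, and listed per partition. The function names are hypothetical and raw bytes stand in for a real Parquet file. The actual cache layer presumably writes through Polars and may organize this differently.

```python
from pathlib import Path


def candle_path(root: Path, asset: str, interval: str, year: int, month: int) -> Path:
    """Build the key=value partitioned path for one month of candles."""
    return (
        root / "data" / "candles"
        / f"asset={asset}" / f"interval={interval}"
        / f"{year:04d}-{month:02d}.parquet"
    )


def write_once(path: Path, payload: bytes) -> None:
    """Create `path` with `payload`; refuse to overwrite an existing file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("xb") as fh:  # "x" raises FileExistsError if the file exists
        fh.write(payload)


def candle_files(root: Path, asset: str, interval: str) -> list[Path]:
    """List only the files in one asset/interval partition."""
    part = root / "data" / "candles" / f"asset={asset}" / f"interval={interval}"
    return sorted(part.glob("*.parquet"))
```

Opening with mode `"xb"` enforces the write-once rule at the filesystem level, and `candle_files` only ever looks inside one partition directory, which is the same pruning idea the query engines apply.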