ML signals, institutional flows, and volatility-aware sizing on a private data lake you control.
One ./run.sh, no Python install
Real interactive REPL: no slash command needed, just type. Every number is computed in Python/SQL from your live ClickHouse data, never invented by the LLM.
"What should I buy today? Am I overexposed?" One-click dashboard + plain-English ask. Composite 0–100 scores, premium alerts, and clear BUY / HOLD / AVOID calls. No spreadsheets, no jargon.
Launch the dashboard →
Track institutional flows, reverse-engineer DSP / Nippon / ICICI conviction, map macro themes to positioning, and size with the GARCH Risk Governor. Built by data enthusiasts with deep AMC domain knowledge and quant network backing.
See institutional signals →
Walk-forward LightGBM, GARCH + Isolation Forest, Kelly sizing, raw ClickHouse SQL. Add a signal pillar in one class, backtest it, ship it, all via ./mosaic.sh.
Launch with a single ./run.sh and the full Streamlit hub opens at localhost:8501, with 12 tabs over your live ClickHouse data. No Python, no notebooks, no glue code.
12 tabs: Import · SQL Query · Explorer · Anomaly Detection · Who Is Selling · MF Holdings · ETF Scanner · Market News · Signals · Kite Dashboard · Deep Dive · Intl ETFs
Serious decisions need clean cross-asset data, models that quantify edge and risk, and context that explains why something moved. Retail tools give you charts. Terminals give you feeds. Neither gives you all three in one place on your own machine.
All market data stays in a local ClickHouse instance. No third-party cloud sync, no API leakage: your portfolio intelligence is private.
Every number the agent reports is first computed in Python or SQL. The LLM only narrates, never calculates. A hard architectural rule, enforced everywhere.
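The compute-then-narrate rule can be illustrated with a minimal sketch. The names here (Fact, compute_facts, build_narration_prompt) are hypothetical and not taken from the project; the point is only that figures are finished in Python before any prompt is assembled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A finished, deterministic figure (hypothetical container)."""
    name: str
    value: float
    unit: str = ""


def compute_facts(closes):
    """Compute every figure in plain Python before any LLM call."""
    last, prev = closes[-1], closes[-2]
    return [
        Fact("last_close", round(last, 2)),
        Fact("daily_return", round((last / prev - 1) * 100, 2), "%"),
    ]


def build_narration_prompt(question, facts):
    """Hand the model finished numbers and instruct it to narrate only."""
    lines = [f"- {f.name} = {f.value}{f.unit}" for f in facts]
    return (
        "Answer using ONLY the facts below. "
        "Do not compute or estimate new numbers.\n"
        + "\n".join(lines)
        + f"\nQuestion: {question}"
    )


facts = compute_facts([100.0, 102.0])
prompt = build_narration_prompt("How did it move today?", facts)
```

The model never sees raw price series in this sketch, only named results, which is one plausible way to keep arithmetic out of the LLM.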
Runs fully locally via Ollama (Gemma 4). The orchestrator auto-switches to compact prompts and data injection paths for low-context local models.
Every flagged date gets a full report: GARCH regime + Final-Z, news sentiment correlation, COMEX futures context, COT speculator positioning, and what the ML model predicted that day — all point-in-time, no future leakage.
Reverse-engineer DSP, Nippon, and ICICI AMC conviction from monthly portfolio disclosures. 24+ months of cross-fund ownership = highest-quality single-name signal.
GARCH(1,1) conditional vol feeds inverse-vol + Kelly position sizing. High-stress regimes automatically reduce position weights before you need to think about it.
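As a rough illustration of how conditional volatility could drive sizing, the sketch below runs the standard GARCH(1,1) recursion and blends an inverse-vol weight with a fractional Kelly weight. Everything specific is an assumption: the fixed omega/alpha/beta values (a real pipeline would fit them), the 50/50 blend, the half-Kelly fraction, and the stress threshold are illustrative, not the project's settings.

```python
import math


def garch_variance(returns, omega, alpha, beta):
    """One-step-ahead conditional variance from the GARCH(1,1) recursion:
    var_t = omega + alpha * r_{t-1}^2 + beta * var_{t-1}."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be < 1 for a finite long-run variance")
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for r in returns:
        var = omega + alpha * r * r + beta * var
    return var


def position_weight(returns, exp_daily_return, target_daily_vol=0.01,
                    omega=1e-6, alpha=0.08, beta=0.90,
                    kelly_fraction=0.5, stress_vol=0.02):
    """Blend inverse-vol and fractional-Kelly weights (all defaults assumed)."""
    sigma2 = garch_variance(returns, omega, alpha, beta)
    sigma = math.sqrt(sigma2)
    # Inverse-vol leg: scale exposure so forecast vol matches the target, capped at 1.
    inv_vol = min(1.0, target_daily_vol / sigma)
    # Kelly leg: fractional Kelly mu / sigma^2, clamped to [0, 1].
    kelly = max(0.0, min(1.0, kelly_fraction * exp_daily_return / sigma2))
    weight = 0.5 * inv_vol + 0.5 * kelly  # assumed equal blend
    if sigma > stress_vol:
        weight *= 0.5  # assumed regime override: halve size under stress
    return weight


calm = position_weight([0.002, -0.003, 0.001] * 20, exp_daily_return=0.0005)
stressed = position_weight([0.04, -0.05, 0.045] * 20, exp_daily_return=0.0005)
```

With these inputs the stressed series produces a much smaller weight than the calm one, which is the behavior the paragraph above describes.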
Data lands in a local lake, fans out to four quant engines, converges in a multi-agent orchestrator (backed by a SQLite LLM cache, so repeat questions are free), and surfaces wherever you work.
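A SQLite-backed response cache of the kind mentioned above might look like the following sketch. The table name, the key scheme (SHA-256 of model plus prompt), and the LLMCache class are assumptions for illustration; the project's actual schema is not shown in this text.

```python
import hashlib
import sqlite3


class LLMCache:
    """Hypothetical prompt -> response cache stored in SQLite."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS llm_cache ("
            "key TEXT PRIMARY KEY, response TEXT NOT NULL)"
        )

    @staticmethod
    def _key(model, prompt):
        # Same model + same prompt => same key, so repeats are served locally.
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def get_or_call(self, model, prompt, call_llm):
        key = self._key(model, prompt)
        row = self.db.execute(
            "SELECT response FROM llm_cache WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: no model call
        response = call_llm(prompt)
        self.db.execute(
            "INSERT OR REPLACE INTO llm_cache (key, response) VALUES (?, ?)",
            (key, response),
        )
        self.db.commit()
        return response


calls = []


def fake_llm(prompt):
    """Stand-in for a real model call; records how often it is invoked."""
    calls.append(prompt)
    return f"narration for: {prompt}"


cache = LLMCache()
first = cache.get_or_call("mosaic-gemma4", "today's gold signal", fake_llm)
second = cache.get_or_call("mosaic-gemma4", "today's gold signal", fake_llm)
```

The second call returns the stored text without touching the model, which is what makes a repeated question free.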
6 pillars → 0–100 composite score, run in parallel
LightGBM 5-day return + quantile confidence band
GARCH(1,1) vol → inverse-vol + Kelly sizing
MAD-Z → GARCH(1,1) → Isolation Forest → PELT change-point · 8 regime labels · corporate action suppression
LLM intent router → specialist sub-agent guild (10 agents) · budget + tracing middleware auto-attached
18+ ETFs scored 0–100 daily across six independent pillars run in parallel via ThreadPoolExecutor.
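Running pillars in parallel with ThreadPoolExecutor and averaging them into a 0–100 composite could be sketched as follows. The six pillars are placeholders returning fixed scores, and the equal weighting is an assumption; the source does not state how pillars are combined.

```python
from concurrent.futures import ThreadPoolExecutor


def make_pillar(fixed_score):
    """Build a placeholder pillar; a real one would query data for the symbol."""
    def pillar(symbol):
        return fixed_score
    return pillar


# Six hypothetical pillars with made-up scores, for illustration only.
PILLARS = {
    f"pillar_{i}": make_pillar(score)
    for i, score in enumerate([80.0, 60.0, 70.0, 50.0, 90.0, 40.0], start=1)
}


def composite_score(symbol, pillars=PILLARS):
    """Score every pillar concurrently, then average into one 0-100 number."""
    with ThreadPoolExecutor(max_workers=len(pillars)) as pool:
        futures = {name: pool.submit(fn, symbol) for name, fn in pillars.items()}
        scores = {name: fut.result() for name, fut in futures.items()}
    clipped = [min(100.0, max(0.0, s)) for s in scores.values()]
    return sum(clipped) / len(clipped), scores


score, breakdown = composite_score("GOLDBEES")
```

Threads suit this pattern when each pillar is dominated by database or network waits rather than CPU work.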
Fires on ~8% of days vs. 21% for a naive Random Forest. Four independent methods vote; corporate action ex-dates are automatically suppressed so splits and bonuses never pollute your signal.
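The voting idea can be sketched with one detector implemented in full (a MAD-based robust z-score) and the other three supplied as precomputed boolean flags. The 3.5 cutoff, the 3-of-4 vote threshold, and the sample data are assumptions chosen for illustration.

```python
from statistics import median


def mad_z_scores(values):
    """Robust z-scores: 0.6745 * (x - median) / MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0] * len(values)
    return [0.6745 * (v - med) / mad for v in values]


def flag_anomalies(dates, votes_by_method, ex_dates, min_votes=3):
    """Keep dates where enough methods agree, skipping corporate-action ex-dates."""
    flagged = []
    for i, day in enumerate(dates):
        if day in ex_dates:
            continue  # splits and bonuses are suppressed, not flagged
        votes = sum(1 for method_votes in votes_by_method.values() if method_votes[i])
        if votes >= min_votes:
            flagged.append(day)
    return flagged


dates = [f"2024-01-0{d}" for d in range(1, 9)]
returns = [0.1, -0.2, 0.15, 6.0, -0.1, 0.05, -5.5, 0.2]  # made-up daily returns, in %
votes_by_method = {
    "mad_z": [abs(z) > 3.5 for z in mad_z_scores(returns)],
    # The remaining detectors are stubbed with hand-written flags.
    "garch": [False, False, False, True, False, False, True, False],
    "isolation_forest": [False, False, False, True, False, False, True, False],
    "pelt": [False, False, False, False, False, False, True, False],
}
anomalies = flag_anomalies(dates, votes_by_method, ex_dates={"2024-01-07"})
```

Here 2024-01-07 draws all four votes but is dropped because it is listed as an ex-date, while 2024-01-04 passes with three votes.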
Walk-forward time-series CV with quantile regression for calibrated uncertainty bands.
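A minimal sketch of walk-forward splitting and the pinball (quantile) loss follows, without any ML library. The expanding-window layout, the gap parameter, and the fold sizes are assumptions; the project's actual LightGBM configuration is not shown here.

```python
def walk_forward_splits(n_samples, n_folds, test_size, gap=0):
    """Yield (train, test) index ranges where training data always precedes
    the test window. `gap` leaves room for a multi-day label horizon so
    training targets do not overlap the test period."""
    for fold in range(n_folds):
        test_end = n_samples - (n_folds - 1 - fold) * test_size
        test_start = test_end - test_size
        train_end = test_start - gap
        if train_end <= 0:
            raise ValueError("not enough samples for this many folds")
        yield range(0, train_end), range(test_start, test_end)


def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball loss, the objective minimized by quantile regression."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        total += max(quantile * diff, (quantile - 1.0) * diff)
    return total / len(y_true)


# A 5-row gap mirrors a 5-day forward-return label (assumed horizon).
splits = list(walk_forward_splits(n_samples=100, n_folds=3, test_size=10, gap=5))
under = pinball_loss([1.0], [0.0], quantile=0.9)  # prediction too low
over = pinball_loss([0.0], [1.0], quantile=0.9)   # prediction too high
```

For the 0.9 quantile, under-predicting costs nine times as much as over-predicting, which is what pushes the fitted line toward the upper band.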
Real-time GARCH vol feeds an inverse-vol + Kelly blend. Regime overrides cut size during stress.
One command pulls price, earnings, cashflow, promoter trends, MF cross-ownership, and news in parallel.
Reverse-engineer AMC tactical pivots from monthly portfolio disclosures across 7 multi-asset funds.
Every analysis — anomaly breakdown, ML forecast, signal report, or equity deep-dive — can be exported as a shareable PDF with one command.
output/reports/ (ready to share or archive)
Rather than one LLM with 80 tools, an intent router dispatches to 10 specialist sub-agents, each with a curated tool set, domain system prompt, and hard limits (20 tool calls / 30k tokens / 180s).
GOLDBEES ML pipeline, composite ETF scores, GARCH vol, anomaly explanation.
NSE/BSE stocks: parallel fetch of price, earnings, promoter, MF holdings, news.
COMEX pre-market, FII/DII flows, macro theme scanner, whale tracker.
SEC EDGAR 10-K/10-Q, XBRL financials, exec comp, Workday hiring trends.
MAFANG, Hang Seng, Nasdaq ETFs: scarcity premium vs NAV analysis.
Multi-source sentiment aggregation: NewsAPI + GNews per symbol.
Ad-hoc Python execution and raw ClickHouse SQL queries.
Schema inspection, watermark status, import freshness checks.
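The routing and hard-limit ideas described for the sub-agent guild can be sketched as below. A keyword lookup stands in for the LLM intent router, and the agent names are hypothetical; only the three limits (20 tool calls, 30k tokens, 180s) come from the text above.

```python
import time
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when a sub-agent crosses one of its hard limits."""


@dataclass
class Budget:
    max_tool_calls: int = 20
    max_tokens: int = 30_000
    max_seconds: float = 180.0
    tool_calls: int = 0
    tokens: int = 0
    started: float = field(default_factory=time.monotonic)

    def charge(self, tokens):
        """Record one tool call and its token cost, then enforce all limits."""
        self.tool_calls += 1
        self.tokens += tokens
        if self.tool_calls > self.max_tool_calls:
            raise BudgetExceeded("tool-call limit reached")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded("token limit reached")
        if time.monotonic() - self.started > self.max_seconds:
            raise BudgetExceeded("time limit reached")


# Keyword stand-in for the LLM intent router; agent names are made up.
ROUTES = [
    (("anomal", "goldbees", "etf"), "etf_agent"),
    (("comex", "fii", "macro"), "macro_agent"),
    (("sql", "query"), "data_agent"),
]


def route(question, default="general_agent"):
    text = question.lower()
    for keywords, agent in ROUTES:
        if any(k in text for k in keywords):
            return agent
    return default


budget = Budget()
for _ in range(20):
    budget.charge(tokens=1_000)
try:
    budget.charge(tokens=1_000)  # the 21st call crosses the limit
    tripped = False
except BudgetExceeded:
    tripped = True
```

Keeping the budget in a separate object means the same check can wrap every tool call regardless of which agent the router picked.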
No Python, no virtualenv, no dependency hell. Install Docker Desktop, then ./run.sh for the dashboard and ./mosaic.sh for everything else.
Builds the image, starts ClickHouse + UI, opens your browser.
# macOS / Linux
$ ./run.sh
# Windows
> run.bat
# Dashboard opens at → http://localhost:8501
# Stop it anytime
$ ./stop.sh
Any CLI command or script inside Docker, zero local setup.
# Pre-market commodity signals
$ ./mosaic.sh comex
# Composite ETF signals (0–100)
$ ./mosaic.sh signals --save
# GOLDBEES ML pipeline
$ ./mosaic.sh src/scripts/goldbees_report.py
# Sync fresh data first
$ ./mosaic.sh import --category etfs
Bare ./mosaic.sh opens a REPL: no slash command needed, and it auto-routes to the right agent.
$ ./mosaic.sh
✔ Agent ready mosaic-gemma4 @ ollama
You: explain GOLDBEES anomalies
You: am I overexposed to IT?
You: /signals # or slash commands
# one-shot mode also works
$ ./mosaic.sh ask "today's gold signal"
Developers can run natively without the wrappers.
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip install -r requirements.txt
$ cp .env.example .env # add keys
$ docker compose up clickhouse -d
$ python src/main.py ui
On first ./run.sh a .env is created. Add OPENAI_API_KEY, NEWSAPI_KEY, and GOLD_API_KEY, or point LLM_BASE_URL at Ollama for fully offline operation.