This project implements the INSIGHT-XAI framework, a multi-layered approach to making unsupervised learning models transparent, interpretable, and actionable for financial decision-makers. It applies clustering, energy-based models (EBMs), information-theoretic evaluation, and large language model (LLM)-based narrative generation to over 40 years of stock market data.
| Phase | Scope | Status | Notes |
|---|---|---|---|
| 1 | Foundational Exploration | Completed | Volatility clustering, PCA drift, ticker longevity, entropy, etc. |
| 2 | Energy-Based Modeling (EBMs) | Completed | RBM training, energy surfaces, attractor maps, anomaly zones. |
| 3 | Info-Theoretic Explainability | Completed | I(Y;E), H(Y\|E), multi-scale analysis, actionability scoring. |
| 4 | LLM + Persona Narratives | Partially Done | Ollama-based narratives with templates; early-stage personalization. |
| 5 | Stress Testing & Intervention | Not Started | Placeholder sections created; implementation pending. |
| 6 | Governance, Fairness, Ethics | Not Started | No fairness or policy simulations yet; planning underway. |
## Phase 1: Foundational Exploration
- Rolling 30/60/90-day features for return, volatility, and momentum.
- KMeans clustering for latent regime discovery.
- PCA-based structural drift analysis.
- Ticker longevity analysis and index overlap (e.g., ZION, V, UHG).
- Volatility fingerprinting via UMAP.
- Entropy pre-analysis for regime stability.
Outputs from this phase feed directly into RBM modeling in Phase 2.
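A minimal sketch of the feature-and-clustering step, assuming a `prices` DataFrame of adjusted closes with a `DatetimeIndex` and one column per ticker; the window sizes and `n_clusters=4` are illustrative choices, not the project's exact configuration:

```python
# Minimal sketch: rolling 30/60/90-day features + KMeans regime discovery.
# `prices` is an assumed input (adjusted closes, one column per ticker).
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def build_regime_features(prices: pd.DataFrame, windows=(30, 60, 90)) -> pd.DataFrame:
    rets = prices.pct_change()
    feats = {}
    for w in windows:
        feats[f"ret_{w}d"] = rets.rolling(w).mean().mean(axis=1)   # cross-ticker avg rolling return
        feats[f"vol_{w}d"] = rets.rolling(w).std().mean(axis=1)    # cross-ticker avg rolling volatility
        feats[f"mom_{w}d"] = prices.pct_change(w).mean(axis=1)     # w-day momentum
    return pd.DataFrame(feats).dropna()

features = build_regime_features(prices)
X = StandardScaler().fit_transform(features)
features["regime"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
```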
## Phase 2: Energy-Based Modeling (EBMs)
- RBMs trained on regime-labeled features to capture latent attractors.
- Visualization of energy surfaces and attractor regions using UMAP.
- Anomaly zones identified via high-energy states and reconstruction divergence.
- Metastability and transition zone timelines computed.
Outputs include attractor labels, entropy maps, and anomaly surfaces.
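A minimal sketch of the energy computation, reusing the standardized matrix `X` from the Phase 1 sketch; scikit-learn's `BernoulliRBM`, the [0, 1] rescaling, and the 95th-percentile anomaly cutoff are assumptions here, not the project's exact setup:

```python
# Minimal sketch: RBM free energy as an attractor/anomaly signal.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import MinMaxScaler

X01 = MinMaxScaler().fit_transform(X)      # BernoulliRBM expects inputs in [0, 1]
rbm = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=50, random_state=0).fit(X01)

def free_energy(rbm: BernoulliRBM, v: np.ndarray) -> np.ndarray:
    """F(v) = -v.b_vis - sum_j softplus(v.W_j + b_hid_j); low energy = attractor."""
    wx_b = v @ rbm.components_.T + rbm.intercept_hidden_
    return -(v @ rbm.intercept_visible_) - np.logaddexp(0, wx_b).sum(axis=1)

energies = free_energy(rbm, X01)
anomaly_zone = energies > np.percentile(energies, 95)   # flag high-energy states
```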
## Phase 3: Info-Theoretic Explainability
- Mutual Information (I(Y;E)) to quantify latent regime informativeness.
- Conditional entropy (H(Y|E)) to evaluate residual uncertainty.
- Multi-scale explainability across daily, weekly, and monthly time resolutions.
- Actionability (I(Actions;E)/H(Actions)) computed for simulated decision personas.
- Trade-off plots between completeness and faithfulness.
Implementation aligns tightly with theoretical foundations from the literature review.
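A minimal sketch of the core scores for integer-coded labels: `y` is an outcome label (the binning, e.g. up/flat/down forward returns, is an illustrative assumption) and `e` is the latent regime label from clustering or attractor assignment:

```python
# Minimal sketch: I(Y;E), H(Y|E), and actionability from discrete labels.
import numpy as np
from scipy.stats import entropy
from sklearn.metrics import mutual_info_score

def explainability_scores(y: np.ndarray, e: np.ndarray) -> dict:
    h_y = entropy(np.bincount(y) / len(y), base=2)     # H(Y) in bits
    i_ye = mutual_info_score(y, e) / np.log(2)         # I(Y;E): nats -> bits
    return {"H(Y)": h_y, "I(Y;E)": i_ye, "H(Y|E)": h_y - i_ye}

def actionability(actions: np.ndarray, e: np.ndarray) -> float:
    """I(Actions;E) / H(Actions) for a persona's discrete action labels."""
    h_a = entropy(np.bincount(actions) / len(actions), base=2)
    return (mutual_info_score(actions, e) / np.log(2)) / h_a if h_a > 0 else 0.0
```

Since I(Y;E) ≤ H(Y), the residual uncertainty H(Y|E) = H(Y) − I(Y;E) is always non-negative, and actionability is normalized to [0, 1].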
## Phase 4: LLM + Persona Narratives
- Uses Ollama (LLaMA or Mistral) for on-device narrative generation.
- Prompt templates created for different roles: analyst, investor, executive, regulator, researcher.
- Metrics (volatility, entropy, regime type) mapped into slot-based summaries.
Still pending:
- Narrative drift audits not yet done.
- No faithfulness scoring of generated outputs.
- Simulated persona satisfaction scores pending.
Output quality is promising; further LLM evaluation and personalization logic are needed.
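A minimal sketch of the slot-based narrative step using the `ollama` Python client; the template wording, slot names, and model tag (`"mistral"`) are illustrative assumptions, not the project's actual prompts or personas:

```python
# Minimal sketch: map metrics into a slot-based prompt and generate a
# persona-specific narrative. Template and model tag are hypothetical.
import ollama

TEMPLATE = (
    "You are a {persona}. In three sentences, explain the current market regime "
    "to your audience.\n"
    "Regime: {regime}. 30-day volatility: {vol:.2%}. Regime entropy: {ent:.2f} bits."
)

def narrate(persona: str, regime: str, vol: float, ent: float, model: str = "mistral") -> str:
    prompt = TEMPLATE.format(persona=persona, regime=regime, vol=vol, ent=ent)
    return ollama.generate(model=model, prompt=prompt)["response"]

print(narrate("regulator", "high-volatility transition zone", 0.031, 1.7))
```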
## Phase 5 Plans: Stress Testing & Intervention
- Simulate shocks (e.g., interest rate hike, sector collapse).
- Measure regime responses using energy surface shifts.
- Evaluate how different personas react to perturbed narratives.
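As a sketch of what shock injection could look like, reusing `rbm`, `X01`, and `free_energy` from the Phase 2 sketch; the shocked columns (here the volatility features) and the shock magnitude are hypothetical choices:

```python
# Sketch: perturb selected features and measure the energy-surface shift.
import numpy as np

def inject_shock(X01: np.ndarray, cols, delta: float = 0.2) -> np.ndarray:
    shocked = X01.copy()
    shocked[:, cols] = np.clip(shocked[:, cols] + delta, 0.0, 1.0)  # stay in [0, 1]
    return shocked

vol_cols = [1, 4, 7]   # hypothetical indices of the vol_{30,60,90}d features
energy_shift = free_energy(rbm, inject_shock(X01, vol_cols)) - free_energy(rbm, X01)
print(f"Mean energy shift under shock: {energy_shift.mean():.3f}")
```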
## Phase 6 Plans: Governance, Fairness, Ethics
- Audit fairness of explanations across tickers/sectors.
- Detect regional/sectoral underrepresentation in outputs.
- Propose early warning systems and transparency metrics.
Planning frameworks are in place; implementation to follow.
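A minimal sketch of one such audit, comparing sector frequencies in generated narratives against the full ticker universe; `narrative_sectors` and `universe_sectors` are hypothetical lists of sector labels:

```python
# Sketch: flag sectors underrepresented in narrative outputs relative to the
# universe. The tol=0.5 cutoff is an illustrative assumption.
import pandas as pd

def underrepresented(narrative_sectors, universe_sectors, tol: float = 0.5) -> pd.Series:
    covered = pd.Series(narrative_sectors).value_counts(normalize=True)
    base = pd.Series(universe_sectors).value_counts(normalize=True)
    ratio = (covered.reindex(base.index).fillna(0.0) / base).sort_values()
    return ratio[ratio < tol]   # sectors covered at < tol x their base rate
```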
## Next Steps
- **Automate Narrative Drift Tests**: detect changes in LLM-generated text for fixed inputs.
- **Faithfulness Audits**: compare narratives to quantitative ground truth (e.g., entropy, MI).
- **Persona Feedback Simulation**: use clarity, trust, and utility rubrics for rating.
- **Shock Injection Sandbox**: reuse entropy and energy timelines to test resilience.
- **Causal Pathway Modeling**: extend Granger causality with do-intervention-style simulations.
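A minimal sketch of the drift test, regenerating narratives for fixed inputs and comparing against stored baselines; the `difflib` similarity measure and the 0.85 threshold are illustrative, and `generate` could be the `narrate()` sketch above:

```python
# Sketch: flag LLM narratives that drift from stored baselines.
import difflib

def drift_score(baseline: str, current: str) -> float:
    """1.0 means identical text, 0.0 means entirely different."""
    return difflib.SequenceMatcher(None, baseline, current).ratio()

def audit_drift(fixed_inputs: dict, baselines: dict, generate, threshold: float = 0.85):
    flagged = []
    for key, kwargs in fixed_inputs.items():
        score = drift_score(baselines[key], generate(**kwargs))
        if score < threshold:
            flagged.append((key, score))
    return flagged
```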
- Author: Aditya Saxena
- Affiliation: Toronto Metropolitan University
- Email: [email protected]
- GitHub: https://github.com/profadityasaxena/Labels-to-Latents
"Explainability is not a luxury in AI—it’s the language of trust."