@mhidalgoaraya
Contributor

Add Jaakkola-Jordan Lower Bound Implementation for Sigmoid Node

This PR implements the Jaakkola-Jordan lower bound approximation for the sigmoid function as described in Bishop's "Pattern Recognition and Machine Learning" (PRML).

New Sigmoid Node Implementation:

  • Added a new Sigmoid node with three interfaces: out, in, and ζ (zeta)
  • Implemented the Jaakkola-Jordan lower bound using a variational parameter ζ
  • Added corresponding average energy computation for the sigmoid node

Updated Message Passing Rules:

  • ζ (zeta) rule: Computes the optimal variational parameter as ζ = √(μ² + σ²) where μ and σ are the mean and standard deviation of the input
  • in rule: Updated to handle both Categorical and PointMass output distributions, computing weighted mean and precision using the Jaakkola-Jordan approximation
  • out rule: Computes the output categorical distribution using the logistic function with the variational parameter
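
The ζ and in rules above can be sketched in plain Julia, without any package dependencies. This is an illustrative sketch, not the code in this PR: the helper names `logistic`, `jj_lambda`, `zeta_update`, `in_message`, and `fit` are hypothetical, the output is assumed to be binary with `p` the probability that out = 1, and the Gaussian prior on the input is an assumption added only to close the loop.

```julia
# Logistic function (named `logistic` to avoid clashing with σ as a standard deviation).
logistic(x) = 1 / (1 + exp(-x))

# λ(ζ) = (σ(ζ) - 1/2) / (2ζ); the ζ → 0 limit is 1/8.
jj_lambda(ζ) = abs(ζ) < 1e-8 ? 1 / 8 : (logistic(ζ) - 0.5) / (2ζ)

# ζ rule: ζ = sqrt(E[x^2]) = sqrt(mean^2 + variance) of the input marginal.
zeta_update(mean_in, var_in) = sqrt(mean_in^2 + var_in)

# `in` rule (assumed form): under the bound, log p(out | x) is quadratic in x,
# (p - 1/2) * x - λ(ζ) * x^2 + const, i.e. a Gaussian message with
# weighted mean p - 1/2 and precision 2λ(ζ).
function in_message(p, ζ)
    return (weighted_mean = p - 0.5, precision = 2 * jj_lambda(ζ))
end

# Alternate the two rules against an assumed Gaussian prior N(m0, v0) on the input.
function fit(p, m0, v0; iters = 20)
    ζ = zeta_update(m0, v0)
    m, v = m0, v0
    for _ in 1:iters
        msg = in_message(p, ζ)
        v = 1 / (1 / v0 + msg.precision)        # precisions add
        m = v * (m0 / v0 + msg.weighted_mean)   # weighted means add
        ζ = zeta_update(m, v)
    end
    return (mean = m, var = v, ζ = ζ)
end

q = fit(1.0, 0.0, 1.0)   # observe out = 1 with a standard normal prior
@assert q.mean > 0       # the posterior mean moves toward positive inputs
@assert q.var < 1.0      # the message adds precision, so the variance shrinks
```

Working in weighted-mean and precision form keeps the update additive, which is why the in rule is described in those terms.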

Mathematical Foundation:

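  • The lower bound is σ(x) ≥ σ(ζ) * exp((x-ζ)/2 - λ(ζ)(x²-ζ²)), where σ(·) denotes the logistic function
  • Here λ(ζ) = (σ(ζ) - 0.5)/(2ζ)
  • The optimal ζ is ζ = √(μ² + s²), where μ and s² are the mean and variance of the input (written s² here to avoid confusion with the logistic function); this choice makes the expected bound as tight as possible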
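
As a quick numerical sanity check of the formulas above, the following sketch evaluates the bound on a grid and confirms that it stays below the logistic function and touches it at x = ±ζ. The function names `logistic`, `jj_lambda`, and `jj_bound` are hypothetical and chosen for illustration; they are not taken from this PR.

```julia
# Logistic function.
logistic(x) = 1 / (1 + exp(-x))

# λ(ζ) = (σ(ζ) - 1/2) / (2ζ), with the ζ → 0 limit 1/8 handled explicitly.
jj_lambda(ζ) = abs(ζ) < 1e-8 ? 1 / 8 : (logistic(ζ) - 0.5) / (2ζ)

# Jaakkola-Jordan lower bound on logistic(x) for a given variational parameter ζ.
jj_bound(x, ζ) = logistic(ζ) * exp((x - ζ) / 2 - jj_lambda(ζ) * (x^2 - ζ^2))

ζ = 1.5
xs = -6.0:0.25:6.0

# The bound never exceeds the logistic function (small tolerance for rounding) ...
@assert all(jj_bound(x, ζ) <= logistic(x) + 1e-12 for x in xs)
# ... and it is exact at x = ζ and x = -ζ.
@assert isapprox(jj_bound(ζ, ζ), logistic(ζ))
@assert isapprox(jj_bound(-ζ, ζ), logistic(-ζ))
```

Tightness at -ζ follows from σ(-ζ) = exp(-ζ) σ(ζ), since the quadratic term vanishes whenever x² = ζ².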

@codecov

codecov bot commented Oct 21, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.22%. Comparing base (be9acdc) to head (39d3242).
⚠️ Report is 58 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #529      +/-   ##
==========================================
+ Coverage   75.99%   76.22%   +0.23%     
==========================================
  Files         205      210       +5     
  Lines        6077     6163      +86     
==========================================
+ Hits         4618     4698      +80     
- Misses       1459     1465       +6     


@bvdmitri
Member

I like the addition! @ismailsenoz could you also look at it?

@wouterwln requested a review from @ismailsenoz on October 23, 2025, 08:54
@bvdmitri
Member

@ismailsenoz do you have time to look at it? Otherwise I'll ask someone else.

@ismailsenoz
Contributor

I will take a look!

@ismailsenoz
Contributor

ismailsenoz commented Oct 30, 2025

Thanks for the PR! Before we can merge this, we need to establish that it provides clear advantages over our existing MultinomialPolya node, which handles similar binomial/multinomial problems.
Requested before further review:

  • Benchmarks: Please provide performance comparisons between this Jaakkola-Jordan implementation and our existing Pólya-Gamma approach
  • Validation tests: Add tests comparing against ground truth, following the structure in our multinomial regression tests
  • Use case justification: Describe specific scenarios where this approach outperforms Pólya-Gamma augmentation

Concerns about the Jaakkola-Jordan approach:

  • Approximation vs. exactness: Pólya-Gamma augmentation provides exact posterior sampling, while the Jaakkola-Jordan bound is a variational approximation that may sacrifice accuracy
  • Conjugacy and mixing: Pólya-Gamma maintains conjugate Gaussian conditionals, producing better mixing and scalability with many covariates. The Jaakkola-Jordan approach loses this conjugacy advantage
  • Scalability: The typical argument for Jaakkola-Jordan is computational efficiency on very large datasets when Pólya methods use MCMC. However, our existing Pólya implementation is not MCMC-based, so it already scales well to large datasets without the approximation trade-off

Given these points, we'd need to see concrete evidence of improved performance or functionality to justify adding this alternative implementation. Looking forward to seeing the benchmarks!


struct Sigmoid end

@node Sigmoid Stochastic [out, in, ζ]
Contributor

If it is a logistic node then it needs to be deterministic. Currently it is stochastic but that is wrong.


Development

Successfully merging this pull request may close these issues.

Add new node implementing Jaakkola & Jordan’s lower bound on sigmoid (log-sigmoid) in Bishop PRML

4 participants