[FEA] Concurrent dual simplex and PDLP at the root

Design doc by @chris-maes 

# Unifying the MIP root relaxation solve

# Background

Currently we solve the root relaxation twice in cuOpt

1) Inside of primal heuristics – using PDLP  
2) Inside of Branch and Bound – using Dual Simplex

If either of these solves finishes significantly faster than the other, we are in a situation where either primal heuristics or branch and bound is waiting unnecessarily. This is a particular problem on small and large MIPs. 

* On small MIPs, dual simplex is likely to beat PDLP and primal heuristics are stuck waiting for a low-accuracy PDLP solution.  
* On large MIPs, dual simplex (especially without any presolve) is very slow compared to PDLP, and branch and bound is stuck forever waiting for a basic solution.

Several users of cuOpt’s 25.05 release asked why we supported concurrent LP solves, but not concurrent solves of the LP defined by the root relaxation. We should try to correct this issue in the 25.08 release.

# Purpose

The purpose of this document is to outline a plan to unify the root relaxation solve across primal heuristics and branch and bound. We will discuss some of the technical issues and propose a design. The cuOpt team can then review and comment on this design.

# Technical Issues

There are a few technical issues that any design must consider. In this section we briefly outline the issues

## Concurrency

It is difficult to tell a priori which method PDLP or dual simplex will be faster on a given problem. Thus, it makes sense to perform a concurrent solve of the root relaxation linear program where we race both methods: dual simplex running on the CPU and PDLP running on the GPU.

## Basis Solution Requirements for Branch and Bound 

Branch and bound requires a basic solution of the root relaxation to warm start the nodes in the branch and bound tree (and later to warm start the solution of LPs after cuts have been added). 

Thus, we must run the crossover algorithm on the PDLP solution to get a basis to give to branch and bound.

Primal heuristics like the feasibility pump can use the solution of the PDLP problem (without crossover). Although, they may benefit from the higher accuracy solution that is obtained with crossover.

## Sharing the GPU

My understanding of primal heuristics is that we first run some rounds of no-relaxation heuristics like the feasibility jump before we start the root relaxation solve. These heuristics can generate feasible solutions very quickly (although the quality may not be as good as heuristics that use the relaxation). 

These no-relaxation heuristics run on the GPU. Currently, this is not an issue as dual simplex is used in branch and bound and it runs only on the CPU. So the no relaxation heuristics have 100% access to the GPU.

If we were to start the root-relaxation solve immediately using PDLP on the GPU. Then the no-relaxation heuristics and PDLP would need to share the GPU.

## Feasibility Pump 

The root relaxation solve is used in the feasibility pump. Once the root relaxation is finished the feasibility pump solves a sequence of related linear programs. Theoretically, these LPs differ only in their objective function and so could be warm started. However, 

1. As far as I know we don’t currently take into account any warm start information when solving these LPs.   
2. In practice, we don’t include all integer variables in the objective function. Since the objective function includes an absolute value function, which must be moved into the constraints, this means that currently the constraints change from problem to problem. 

# Design Proposal

We propose the following first design. This design will likely change as cuOpt evolves (e.g. if we add a larger MIP presolve which takes a significant time). We will discuss the reasoning behind this proposal in the next section :

1) Branch and Bound takes over the ownership of the root relaxation solve. This means that the feasibility pump inside primal heuristics will no longer launch PDLP to solve the root relaxation. Instead branch and bound will notify primal heuristics when a solution of the root relaxation is available.  
2) Branch and Bound starts the root solve immediately with PDLP in a separate thread on the CPU and stream on the GPU. This means that no-relaxation heuristics and PDLP will share the GPU.   
3) The feasibility pump will adjust the formulation of the LP objective to include all integer variables. This ensures that only the objective of subsequent LP solves changes (not the constraints). As such the basic information obtained from the root relaxation solve can be used to warm start the primal simplex method. We suggest using the primal simplex method, since if only the objective changes from problem to problem, the previous solution remains primal feasible for the current problem.  
4) We will handle the event that PDLP is able to quickly solve the root relaxation, but crossover takes significantly longer to get to a basic solution, by having branch and bound notify primal heuristics once it has the PDLP solution. Primal heuristics can work with the PDLP solution immediately. And launch subsequent PDLP solves if it has not been notified that a basic solution is available.

# Design Discussion

In this section we discuss some of the decisions made in the design.

First, we chose to share the GPU with streams. We went this way for several reasons:

* It seems easier to manage multiple CPU threads accessing the GPU with streams.  
* On many MIPs (e.g. nearly all of MIPLIB) the LPs that need to be solved are not extremely large. Even if they have 10 million nonzeros, it is unlikely that PDLP will be able to saturate the GPU. This should leave some room for no-relaxation heuristics to work on the GPU.  
* If necessary, we can add some priority to the streams. We can mark the PDLP stream as lower priority to allow the no-relaxation stream to make progress.   
* We should benchmark this baseline approach and understand the tradeoffs before going to a more complicated design. If necessary, there are a few options that can be taken to improve the sharing of the GPU  
  * Do not launch PDLP at all on small problems where we think dual simplex is likely to win  
  * Delay launching PDLP until no-relaxation heuristics have found a solution or some amount of time/work has elapsed.  
* We suggest using primal simplex for the feasibility pump since  warm starting primal simplex from a feasible basis requires only a Phase 2 implementation and usually only requires a small O(100-1K) iterations to get optimal. Such a small amount of iterations can be done efficiently on the CPU. This leaves room for more time to be spent on feasibility jump on the GPU after the root relaxation is complete

# Suggested Implementation Strategy

* Nicolas Blin to start experimenting with launching PDLP concurrently with dual simplex in branch and bound  
* Chris Maes to work on primal simplex for warm starting the feasibility pump   
* Akif Coerduek or Piotr Sielski to change the formulation of the feasibility pump LP to include all integer variables in the objective to allow for warm starting. And change the feasibility pump implementation to call primal simplex and reuse a basic solution from a previous LP solve.

Some coordination will be required in the following areas

* Ensuring the basic solution provided by crossover is the same as a basic solution provided by dual simplex. And so can be used by nodes in the branch and bound tree.   
* Launching a separate CPU thread inside of branch and bound to manage PDLP.  
* Creating a callback or other mechanism to share a solution of the root relaxation from branch and bound with primal heuristics.  
* The calling convention for warm starting primal simplex from a basis.

 

# Appendix: Deriving a feasible basis for FP from the root relaxation

The root relaxation problem (in branch and bound) is given by 

```
Minimize c^T x   
Subject to   A*x == b  
                   l  <= x <= u
```

If we have a basic solution to that problem, we know that every variable x is either:

1. Basic  
2. Nonbasic at its upper bound  
3. Nonbasic at its lower bound

The feasibility pump problem is given by

```
Minimize sum_j  | x_j  - x_j^int |   
Subject to    A*x == b  
                    l<= x <= u
```

Since the objective is piecewise linear, we need to reformulate it to cast it as an LP. The reformulation usually works by creating a new variable `z_j` for each variable `x_j` which must be integer. We want `z_j = | x_j - x_j^int |`. This leads to the reformulation

```
Minimize   sum_j z_j  
Subject to  A*x = b  
                 z_j >= x_j - x_j^int, for all j where x_j must be integer  
                 z_j >= -x_j + x_j^int, for all j where x_j must be integer  
                  l<= x <= u  
                  0 <= z_j <= inf, for all j where x_j must be integer
```
We may now add slack variables `s_j^pos` and `s_j^neg` to the inequality constraints to convert them to equality constraints

```
Minimize  sum_j z_j  
Subject to  A*x + 0 * z + 0 * s = b,  
                  z_j = s_j^pos + x_j - x_j^int, for all j where x_j must be integer  
                  z_j = s_j^neg - x_j + x_j^int, for all j where x_j must be integer  
                  l <= x<=u  
                  0<= z_j<= inf, for all j where x_j must be integer  
                  0<= s_j^pos <= inf, for all j where x_j must be integer  
                  0<= s_j^neg <= inf, for all j where x_j must be integer  
```
Let A have m rows and n columns. And suppose there are p integer variables. We now have m + 2*p constraints and n + 3*p variables. 

We need to come up with a basic/nonbasic assignment for all these variables.  We may keep the basis assignment for the original n variables, so we just need assignments for the additional p variables.

If the optimal solution of the root relaxation x_j^* is equal to x_j^int, the z_j variable will be zero and thus on its lower bound. So we have that   
```
z_j = x_j^* - x_j^int = 0, s_j^pos = 0  
z_j = -x_j^* + x_j^int = 0, s_j^neg = 0
```
This means that z_j, s_j^pos, and s_j^neg are all on their lower bound of 0. We can choose s^j_pos and s_j^neg to be basic. And z_j to be nonbasic at its lower bound. Note this means that the basis will be primal degenerate. Note that since s_j^pos and s_j^neg only appear in one row with a coefficient of 1, this is equivalent to adding columns of the identity to the basis and thus the basis will remain nonsingular. 

If, in optimal solution of the root relaxation, x_j^* is not equal to x_j^int, we have two cases: x_j^* > x_j^int or x_j^* < x_j^int.

For the case where x_j^* > x_j^int, we have that  
```
z_j = x_j^* - x_j^int > 0, s_j^pos = 0  
z_j = s_j^neg - x_j^* + x_j^int, s_j^neg  = 2 ( x_j^* -  x_j^int ) > 0 
```
This means that z_j is basic, s_j^pos is nonbasic at its lower bound, and s^j_neg is basic. Note that since z_j and s^j_neg only appear in one row with a coefficient of 1, this is equivalent to adding columns of the identity to the basis and thus the basis will remain nonsingular.

For the case where x_j^* < x_j^int, we have that   
```
z_j = s_j^pos + x_j^*  - x_j^int, s_j^pos = 2 (-x_j^* + x_j^int) > 0   
z_j = -x_j^* + x_j^int > 0, s_j^neg = 0
```

This means that z_j is basic, s^j_pos is basic, and s^j_neg is nonbasic at its lower bounds. Note that since z_j and s_j^neg only appear in one row with a coefficient of 1, this is equivalent to adding columns of the identity to the basis and thus the basis will remain nonsingular.

 



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[FEA] Concurrent dual simplex and PDLP at the root #143

Unifying the MIP root relaxation solve

Background

Purpose

Technical Issues

Concurrency

Basis Solution Requirements for Branch and Bound

Sharing the GPU

Feasibility Pump

Design Proposal

Design Discussion

Suggested Implementation Strategy

Appendix: Deriving a feasible basis for FP from the root relaxation

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[FEA] Concurrent dual simplex and PDLP at the root #143

Description

Unifying the MIP root relaxation solve

Background

Purpose

Technical Issues

Concurrency

Basis Solution Requirements for Branch and Bound

Sharing the GPU

Feasibility Pump

Design Proposal

Design Discussion

Suggested Implementation Strategy

Appendix: Deriving a feasible basis for FP from the root relaxation

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions