Contributor

@nguidotti nguidotti commented Oct 9, 2025

This PR also fixes the starting bounds for the diving subtrees. Because these nodes are detached from the main B&B tree, the bounds from the parent were lost. Now the bounds are stored along with the nodes when they are inserted into the diving queue.
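
As a rough illustration of the fix described above, the sketch below stores a copy of the variable bounds next to each node placed in a diving queue, so that a detached node can start from its own bounds. All names (`DivingEntry`, `DivingQueue`, `node_id`) are hypothetical and are not taken from the PR; `double` stands in for the solver's floating-point type.

```cpp
#include <cassert>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical entry: a node id plus the bounds that were active
// when the node was detached from the main tree.
struct DivingEntry {
  int node_id;
  std::vector<double> lower;
  std::vector<double> upper;
};

class DivingQueue {
 public:
  // Store the bounds together with the node so they survive the detachment.
  void push(int node_id, std::vector<double> lower, std::vector<double> upper) {
    assert(lower.size() == upper.size());
    entries_.push_back({node_id, std::move(lower), std::move(upper)});
  }

  // Remove and return the oldest entry. The caller must check empty() first.
  DivingEntry pop() {
    assert(!entries_.empty());
    DivingEntry entry = std::move(entries_.front());
    entries_.pop_front();
    return entry;
  }

  bool empty() const { return entries_.empty(); }

 private:
  std::deque<DivingEntry> entries_;
};
```

A diving thread would then initialize its subproblem from `entry.lower` and `entry.upper` rather than from the parent's bounds, which are no longer reachable.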

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

@nguidotti nguidotti requested a review from a team as a code owner October 9, 2025 14:27
@nguidotti nguidotti self-assigned this Oct 9, 2025
@nguidotti nguidotti added this to the 25.10 milestone Oct 9, 2025
@nguidotti nguidotti added the bug (Something isn't working) and non-breaking (Introduces a non-breaking change) labels Oct 9, 2025
@rgsl888prabhu
Collaborator

@nguidotti Is this PR meant for 25.10?

@nguidotti
Contributor Author

nguidotti commented Oct 9, 2025

Yes. This allows B&B to solve the MIPLIB benchmark more consistently. Akif and Alice noticed that we had a regression in some tests, and this PR should help fix it.

@nguidotti nguidotti changed the title from "[BUG] Fixed starting variable bounds for diving + Diving backtracking" to "[BUG] Fixed starting variable bounds for diving" Oct 9, 2025
@nguidotti nguidotti removed the request for review from aliceb-nv October 9, 2025 20:10
// Note that we do not know which thread will execute the
// `exploration_ramp_up` task, so we allow any thread
// to repair the heuristic solution.
repair_heuristic_solutions();
Contributor

The original intent was just to have a single thread attempt to repair the heuristic solution. We don't want multiple threads to try to repair the same heuristic solution. Is there a way to ensure that only a single thread repairs?

Contributor Author

@nguidotti nguidotti Oct 10, 2025

There is a mutex for retrieving the solutions in the repair queue. Once all solutions are copied, the repair queue is cleared and the mutex is released (lines 390 to 396):

  std::vector<std::vector<f_t>> to_repair;
  mutex_repair_.lock();
  if (repair_queue_.size() > 0) {
    to_repair = repair_queue_;
    repair_queue_.clear();
  }
  mutex_repair_.unlock();

The other threads will see the repair queue is empty and then move on.
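
The same hand-off can be sketched with a scoped lock. This is a minimal, self-contained illustration rather than the code from the PR: `RepairQueue`, `add`, and `take_all` are hypothetical names, and `double` stands in for `f_t`. Swapping the queue into a local vector while holding the lock means one caller receives all pending solutions, and any caller that arrives afterwards gets an empty vector until new solutions are added.

```cpp
#include <mutex>
#include <utility>
#include <vector>

class RepairQueue {
 public:
  // Called by the heuristic side to hand over a solution for repair.
  void add(std::vector<double> solution) {
    std::lock_guard<std::mutex> lock(mutex_);
    queue_.push_back(std::move(solution));
  }

  // Atomically take every pending solution and leave the queue empty.
  // The lock is released automatically when `lock` goes out of scope.
  std::vector<std::vector<double>> take_all() {
    std::vector<std::vector<double>> to_repair;
    std::lock_guard<std::mutex> lock(mutex_);
    to_repair.swap(queue_);
    return to_repair;
  }

 private:
  std::mutex mutex_;
  std::vector<std::vector<double>> queue_;
};
```

A thread that gets an empty result from `take_all()` simply skips the repair step and moves on.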

Contributor

Note that the repair will probably be needed much less often now that the tolerances are compatible.

Contributor Author

When the heuristic thread injects a solution into the B&B, wouldn't it be easier for it to attempt the repair immediately instead of adding the solution to a queue? The only caveat is that the heuristic thread can only repair it after the root node solution is available.


// Set the correct bounds for the leaf problem
leaf_problem.lower = original_lp_.lower;
leaf_problem.upper = original_lp_.upper;
Contributor

Don't you need these lines to restore the bounds to those of the original problem, since you may have modified the bounds of the leaf_problem previously?

f_t now = toc(stats_.start_time);
f_t time_since_last_log = stats_.last_log == 0 ? 1.0 : toc(stats_.last_log);

if (omp_get_thread_num() == 0) {
Contributor

Don't we only want one thread to print?

Contributor Author

In the tasking model, one thread spawns a task, the task goes into a pool, and it is then handed to any available thread. In the ramp-up phase there are more threads than tasks, so some threads execute tasks while the others idle. We do not know which thread will end up in each group.

Hence, if we restrict logging to T0 (or any other single thread), the log will only appear when that thread is active, which may only happen near the end.
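
To make the tasking behaviour concrete, here is a small sketch, separate from the PR's code, in which a single thread spawns tasks and the program records which thread ends up running each one. `run_tasks` is a hypothetical helper. The `_OPENMP` guard lets it build without OpenMP, in which case the pragmas are ignored and everything runs on thread 0.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Returns, for each task, the id of the thread that executed it.
std::vector<int> run_tasks(int num_tasks) {
  std::vector<int> executed_by(num_tasks, -1);
#pragma omp parallel
  {
    // Only one thread creates the tasks; the rest of the team waits
    // at the end of the single region and picks tasks up from the pool.
#pragma omp single
    {
      for (int i = 0; i < num_tasks; ++i) {
#pragma omp task firstprivate(i) shared(executed_by)
        {
#ifdef _OPENMP
          executed_by[i] = omp_get_thread_num();
#else
          executed_by[i] = 0;
#endif
        }
      }
    }
  }  // All tasks are complete once the parallel region ends.
  return executed_by;
}
```

When there are fewer tasks than threads, some thread ids never appear in the result, and which ids do appear can change from run to run. That is why tying the log to one fixed thread id is unreliable during ramp-up.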

Contributor Author

When it is time to report, the code atomically checks whether the node is the last one explored. If so, the thread prints the log. Otherwise, it skips the report and lets another thread do it instead.
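
One possible way to express such a check, shown here only as a sketch under assumptions: each thread takes a ticket from a shared atomic counter when it finishes a node, and it reports only if its ticket still equals the counter when the report is due. `NodeCounter`, `finish_node`, and `is_latest` are hypothetical names, and the PR may implement the check differently.

```cpp
#include <atomic>

class NodeCounter {
 public:
  // Called by a thread when it finishes exploring a node.
  // Returns that node's ticket (1 for the first node, 2 for the second, ...).
  long finish_node() { return explored_.fetch_add(1) + 1; }

  // True if no other node has been finished since `ticket` was issued.
  bool is_latest(long ticket) const { return explored_.load() == ticket; }

 private:
  std::atomic<long> explored_{0};
};
```

A thread whose ticket is stale skips the report. The thread holding the newest ticket sees `is_latest` return true and prints, regardless of its thread id.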

return;
}

// Set the correct bounds for the leaf problem
Contributor

I see. You are restoring the bounds here. Is there a reason to move it from the original spot?

Contributor Author

@nguidotti nguidotti Oct 10, 2025

There are two reasons:

  1. The initial bounds for best-first and diving are different. Best-first uses the bounds of the original problem, while diving uses the bounds of the starting node (we need to keep these bounds separate since the node was detached from the main tree).
  2. With "Propagate the bounds from the parent to the child nodes" (#473), we only need to change the bounds after reaching the bottom of the branch.
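
The first point can be sketched as follows. This is an illustration only: `Bounds`, `SearchMode`, and `starting_bounds` are hypothetical names, not identifiers from the PR, and `double` stands in for `f_t`.

```cpp
#include <vector>

struct Bounds {
  std::vector<double> lower;
  std::vector<double> upper;
};

enum class SearchMode { BestFirst, Diving };

// Pick the bounds a leaf problem is reset to before the branching
// decisions of the current node are applied on top of them.
// Best-first restarts from the original problem; diving restarts from
// the bounds saved with the node that started the dive.
const Bounds& starting_bounds(SearchMode mode,
                              const Bounds& original,
                              const Bounds& diving_start) {
  return mode == SearchMode::Diving ? diving_start : original;
}
```

The returned reference refers to one of the two arguments, so both must outlive its use.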

@chris-maes
Contributor

@nguidotti can you walk me through the changes tomorrow before merging?

Contributor

@chris-maes chris-maes left a comment

LGTM. As we discussed, you might want a separate task or thread for printing and repairing solutions in the ramp-up phase. This can happen after the release.

@rgsl888prabhu
Collaborator

/merge

@rapids-bot rapids-bot bot merged commit ecc2566 into NVIDIA:branch-25.10 Oct 10, 2025
172 of 174 checks passed
@nguidotti nguidotti deleted the fix-diving-bounds branch October 11, 2025 10:30