
Conversation

nicklamiller (Contributor)

Contributes to: #6983

  • Adds R-squared to regression metrics as R2Metric
    • Since this metric measures goodness of fit, i.e. higher is better, I made it a subclass of Metric instead of a subclass of RegressionMetric (like the other regression metrics, which measure loss/error), and used a very similar approach to how AveragePrecisionMetric is defined in binary_metric.hpp (a hedged sketch of how the same pattern could look for R² is included after this list):
```cpp
class AveragePrecisionMetric: public Metric {
 public:
  explicit AveragePrecisionMetric(const Config&) {
  }

  virtual ~AveragePrecisionMetric() {
  }

  const std::vector<std::string>& GetName() const override {
    return name_;
  }

  double factor_to_bigger_better() const override {
    return 1.0f;
  }

  void Init(const Metadata& metadata, data_size_t num_data) override {
    name_.emplace_back("average_precision");
    num_data_ = num_data;
    // get label
    label_ = metadata.label();
    // get weights
    weights_ = metadata.weights();
    if (weights_ == nullptr) {
      sum_weights_ = static_cast<double>(num_data_);
    } else {
      sum_weights_ = 0.0f;
      for (data_size_t i = 0; i < num_data; ++i) {
        sum_weights_ += weights_[i];
      }
    }
  }

  std::vector<double> Eval(const double* score, const ObjectiveFunction*) const override {
    // get indices sorted by score, descending order
    std::vector<data_size_t> sorted_idx;
    for (data_size_t i = 0; i < num_data_; ++i) {
      sorted_idx.emplace_back(i);
    }
    Common::ParallelSort(sorted_idx.begin(), sorted_idx.end(), [score](data_size_t a, data_size_t b) { return score[a] > score[b]; });
    // temp sum of positive label
    double cur_actual_pos = 0.0f;
    // total sum of positive label
    double sum_actual_pos = 0.0f;
    // total sum of predicted positive
    double sum_pred_pos = 0.0f;
    // accumulated precision
    double accum_prec = 1.0f;
    // accumulated pr-auc
    double accum = 0.0f;
    // temp sum of negative label
    double cur_neg = 0.0f;
    double threshold = score[sorted_idx[0]];
    if (weights_ == nullptr) {  // no weights
      for (data_size_t i = 0; i < num_data_; ++i) {
        const label_t cur_label = label_[sorted_idx[i]];
        const double cur_score = score[sorted_idx[i]];
        // new threshold
        if (cur_score != threshold) {
          threshold = cur_score;
          // accumulate
          sum_actual_pos += cur_actual_pos;
          sum_pred_pos += cur_actual_pos + cur_neg;
          accum_prec = sum_actual_pos / sum_pred_pos;
          accum += cur_actual_pos * accum_prec;
          // reset
          cur_neg = cur_actual_pos = 0.0f;
        }
        cur_neg += (cur_label <= 0);
        cur_actual_pos += (cur_label > 0);
      }
    } else {  // has weights
      for (data_size_t i = 0; i < num_data_; ++i) {
        const label_t cur_label = label_[sorted_idx[i]];
        const double cur_score = score[sorted_idx[i]];
        const label_t cur_weight = weights_[sorted_idx[i]];
        // new threshold
        if (cur_score != threshold) {
          threshold = cur_score;
          // accumulate
          sum_actual_pos += cur_actual_pos;
          sum_pred_pos += cur_actual_pos + cur_neg;
          accum_prec = sum_actual_pos / sum_pred_pos;
          accum += cur_actual_pos * accum_prec;
          // reset
          cur_neg = cur_actual_pos = 0.0f;
        }
        cur_neg += (cur_label <= 0) * cur_weight;
        cur_actual_pos += (cur_label > 0) * cur_weight;
      }
    }
    sum_actual_pos += cur_actual_pos;
    sum_pred_pos += cur_actual_pos + cur_neg;
    accum_prec = sum_actual_pos / sum_pred_pos;
    accum += cur_actual_pos * accum_prec;
    double ap = 1.0f;
    if (sum_actual_pos > 0.0f && sum_actual_pos != sum_weights_) {
      ap = accum / sum_actual_pos;
    }
    return std::vector<double>(1, ap);
  }

 private:
  /*! \brief Number of data */
  data_size_t num_data_;
  /*! \brief Pointer of label */
  const label_t* label_;
  /*! \brief Pointer of weighs */
  const label_t* weights_;
  /*! \brief Sum weights */
  double sum_weights_;
  /*! \brief Name of test set */
  std::vector<std::string> name_;
};
```

  • For this PR I decided to skip the CUDA implementation, but plan to follow up with that after this gets merged, as part of Built-in R2 (R-squared) metric #6983.
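
For reference, here is a rough sketch of how a higher-is-better R² metric could look when it follows the Metric interface shown above. This is not the code from this PR: the Init/Eval bodies, the "r2" name string, the headers, and the weight handling are illustrative assumptions (the actual implementation may also convert raw scores through the objective and handle edge cases differently); only the names R2Metric, sum_weights_, and total_sum_squares_ come from the discussion in this PR.

```cpp
// Hypothetical sketch only; assumes LightGBM's Metric interface as in the
// AveragePrecisionMetric excerpt above.
#include <LightGBM/metric.h>

#include <string>
#include <vector>

namespace LightGBM {

class R2Metric : public Metric {
 public:
  explicit R2Metric(const Config&) {}
  virtual ~R2Metric() {}

  const std::vector<std::string>& GetName() const override { return name_; }

  // R^2 measures goodness of fit, so higher is better.
  double factor_to_bigger_better() const override { return 1.0f; }

  void Init(const Metadata& metadata, data_size_t num_data) override {
    name_.emplace_back("r2");  // illustrative alias; the PR's alias may differ
    num_data_ = num_data;
    label_ = metadata.label();
    weights_ = metadata.weights();
    // Weighted label mean, then the total (weighted) sum of squares SS_tot.
    double sum_weights = 0.0, sum_label = 0.0;
    for (data_size_t i = 0; i < num_data_; ++i) {
      const double w = (weights_ == nullptr) ? 1.0 : weights_[i];
      sum_weights += w;
      sum_label += w * label_[i];
    }
    sum_weights_ = sum_weights;
    const double label_mean = sum_label / sum_weights_;
    double total_sum_squares = 0.0;
    for (data_size_t i = 0; i < num_data_; ++i) {
      const double w = (weights_ == nullptr) ? 1.0 : weights_[i];
      const double diff = label_[i] - label_mean;
      total_sum_squares += w * diff * diff;
    }
    total_sum_squares_ = total_sum_squares;
  }

  std::vector<double> Eval(const double* score, const ObjectiveFunction*) const override {
    // R^2 = 1 - SS_res / SS_tot
    double residual_sum_squares = 0.0;
    for (data_size_t i = 0; i < num_data_; ++i) {
      const double w = (weights_ == nullptr) ? 1.0 : weights_[i];
      const double diff = label_[i] - score[i];
      residual_sum_squares += w * diff * diff;
    }
    if (total_sum_squares_ <= 0.0) {
      // Constant label: R^2 is undefined; the actual PR may handle this differently.
      return std::vector<double>(1, 0.0);
    }
    return std::vector<double>(1, 1.0 - residual_sum_squares / total_sum_squares_);
  }

 private:
  data_size_t num_data_;
  const label_t* label_;
  const label_t* weights_;
  double sum_weights_;
  double total_sum_squares_;
  std::vector<std::string> name_;
};

}  // namespace LightGBM
```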

@nicklamiller nicklamiller marked this pull request as ready for review August 25, 2025 03:42

```cpp
double local_sum_weights = 0.0f;
#pragma omp parallel for num_threads(OMP_NUM_THREADS()) schedule(static) reduction(+:local_sum_weights, sum_label)
for (data_size_t i = 0; i < num_data_; ++i) {
  local_sum_weights += weights_[i];
```
nicklamiller (Contributor Author)

Just wanted to give a heads-up: I originally updated the sum_weights_ data member directly in this for-loop, but that resulted in CI failures on some of the R-package jobs for several builds on Windows, with the following error (workflow run, line with specific error):

... error C3028: 'LightGBM::R2Metric::sum_weights_': only a variable or static data member can be used in a data-sharing clause ...

So now the local_sum_weights variable is used in the pragma's reduction clause and updated in the loop, and then assigned to sum_weights_ below. I also had to do this for the total_sum_squares_ member by introducing a local_total_sum_squares variable.
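
A minimal sketch of that pattern, for context: the reduction accumulates into function-local variables that are assigned to the data members afterwards, because MSVC's OpenMP rejects non-static data members in data-sharing clauses (error C3028). The variable and member names mirror the diff excerpt above; the loop body beyond local_sum_weights and the surrounding declarations are illustrative assumptions, and this branch assumes weights_ is non-null.

```cpp
// Would fail to compile on MSVC:
//   #pragma omp parallel for reduction(+:sum_weights_)   // error C3028
double local_sum_weights = 0.0f;
double sum_label = 0.0f;
#pragma omp parallel for num_threads(OMP_NUM_THREADS()) schedule(static) reduction(+:local_sum_weights, sum_label)
for (data_size_t i = 0; i < num_data_; ++i) {
  local_sum_weights += weights_[i];
  sum_label += weights_[i] * label_[i];  // illustrative: weighted label sum
}
sum_weights_ = local_sum_weights;  // reduced value copied into the data member
```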

Collaborator

Thanks for the explanation, makes sense to me! I can't think of a better way to do this, I think this small allocation is totally fine.

@jameslamb (Collaborator) left a comment

blegh, I thought I submitted a review yesterday but I guess I forgot to click "submit review"!

Thank you so much for this excellent PR! It's rare that I review a PR this large and have 0 comments... but I have 0 comments. You addressed everything I would have asked for... followed the project's style, added lightweight but also very effective tests, updated the docs and that list in the R package.

This is just an awesome PR, and I'm excited to add this to LightGBM. Thanks for your hard work!

@jameslamb (Collaborator)

I'm not that familiar with C++, so let's see if we can get one other reviewer to look... @borchero could you help us with a review here?

@jameslamb (Collaborator)

/gha run r-valgrind

@jameslamb (Collaborator) left a comment

I'd been hoping for another reviewer, but it looks like no one is available, and I'd been waiting to see valgrind results, but I realize we don't need those because this PR isn't adding R-package tests anyway.

I'm confident enough in my ability to review these changes that I think we should just merge this.

On a re-review I left one more small suggestion. I'm going to just apply that and then merge this if/when CI passes.

Thanks again for the excellent contribution!

@jameslamb (Collaborator)

I think the "Optional checks" workflow is going to fail (like this) until we get a successful run of the valgrind workflow on this branch, because I put up #7008 (comment) but the job wasn't triggered, for the reasons described in #7012.

Sorry @nicklamiller, hopefully we'll be able to get #7035 or something similar merged soon and then re-run that workflow.

@StrikerRUS (Collaborator) left a comment

Thanks a lot for this contribution!
I have only two minor comments below:

@StrikerRUS (Collaborator) left a comment

@nicklamiller Thanks for your latest commit! I don't have any new comments.

@StrikerRUS (Collaborator)

Hmm... the errors in the CI jobs look very strange and unrelated to the PR. However, we don't have these errors in the master branch.

@nicklamiller Could you please take a look?

@StrikerRUS (Collaborator)

@nicklamiller So it looks like y = y.copy() is the cause of strange unrelated CI errors, right?..

@nicklamiller (Contributor Author)

> So it looks like y = y.copy() is the cause of strange unrelated CI errors, right?..

@StrikerRUS Removing y = y.copy() was the issue. I should've posted my response in this main thread, but please see my comment above for more details.

I've added y = y.copy() back in e5b5868, and the seemingly unrelated tests that were failing now pass. The most recent CI run still fails with a new failure, though that appears to be an unrelated, flaky (API-rate-limit-related) failure:

```text
urllib.error.HTTPError: HTTP Error 403: rate limit exceeded
The last reported status from workflow "R valgrind tests" is failure. Commit fixes and rerun the workflow.
```

Since the most recent failure appears to be flaky, I've merged in the master branch and pushed, so hopefully CI passes this time 🤞

@jameslamb (Collaborator)

That optional-checks failure isn't "flaky". I tried to trigger an optional workflow at #7008 (comment) but then saw it fail with the issues described in #7012.

I've put up #7035 attempting to fix that (and making it easier to fix such things in the future).

@jameslamb (Collaborator)

> I've put up #7035 attempting to fix that (and making it easier to fix such things in the future).

@StrikerRUS if the latest changes here look ok to you, I'd be ok with you (temporarily!) changing the branch protection to allow merging this PR while optional-checks is failing. That way this PR doesn't need to be blocked by #7035.

@StrikerRUS (Collaborator) commented Oct 13, 2025

@nicklamiller Oh, sorry, I missed your perfect description in the resolved thread of why the unrelated tests were failing! Thanks a lot for the investigation and the fix!

@jameslamb

> changing the branch protection to allow merging this PR

Sure, totally fine!

@StrikerRUS (Collaborator)

Close-reopen to fix license/cla status.

@StrikerRUS StrikerRUS closed this Oct 13, 2025
@StrikerRUS StrikerRUS reopened this Oct 13, 2025
@StrikerRUS StrikerRUS merged commit 6f0d7cc into microsoft:master Oct 13, 2025
96 of 108 checks passed
@nicklamiller nicklamiller deleted the add-r2-metric branch October 15, 2025 17:49