IDAES · MarcusHolly · Feb 24, 2025 · Feb 24, 2025 · Feb 25, 2025 · Feb 25, 2025
@@ -5,7 +5,7 @@ Price takers are entities that must accept market prices since they lack the mar
 to directly influence the market price. Likewise, it is assumed that a price taker's resource or energy
 system is small enough such that it does not significantly impact the market. When coupled with multi-period modeling,
 the price-taker model is able to synthesize grid-centric modeling with steady-state process-centric modeling, as
-depicted in figure below.
+depicted in the figure below.
 
 .. |pricetaker| image:: images/pricetaker.png
   :width: 1200
@@ -32,15 +32,13 @@ The following equations represent the multi-period price taker model, where :mat
        h(d,u_{s,t},δ_{s,t},u_{s,t+1},δ_{s,t+1}) = 0; ∀_{s} ∈ S, t ∈ T
 
 
-The price taker multi-period modeling workflow involves the integration of multiple software platforms into the IDAES optimization model
-and can be broken down into two distinct functions, as shown in the figure below. In part 1, simulated or historical
-ISO (Independent System Operator) data is used to generate locational marginal price (LMP)
+The price taker multi-period modeling workflow is shown in part 1 of the figure below. The price taker model uses
+simulated or historical ISO (Independent System Operator) data to generate locational marginal price (LMP)
 signals, and production cost models (PCMs) are used to compute and optimize the time-varying dispatch schedules for each
-resource based on their respective bid curves. Advanced data analytics (RAVEN) reinterpret the LMP signals and PCM
+resource based on their respective bid curves. Advanced data analytics reinterpret the LMP signals and PCM
 as stochastic realizations of the LMPs in the form of representative days (or simply the full-year price signals).
-In part 2, PRESCIENT uses a variety of input parameters (design capacity, minimum power output, ramp rate, minimum up/down time, marginal cost, no load cost, and startup profile)
-to generate data for the market surrogates. Meanwhile, IDAES uses the double loop simulation to integrate detailed
-process models (b, ii) into the daily (a, c) and hourly (i, iii) grid operations workflow.
+Part 2 describes how a variety of input parameters (design capacity, minimum power output, ramp rate, minimum up/down time, marginal cost, no load cost, and startup profile)
+are used generate data for market surrogates, which, in conjunction with the price taker model, can be used to optimize hybrid energy systems.
 
 .. |hybrid_energy_system| image:: images/hybrid_energy_system.png
   :width: 1200

@@ -19,6 +19,7 @@
 import matplotlib.pyplot as plt
 import numpy as np
 import pandas as pd
+from scipy.interpolate import splrep, splev
 from pyomo.common.dependencies import attempt_import
 import idaes.logger as idaeslog
 
@@ -73,7 +74,11 @@
 
 
 def cluster_lmp_data(
-    raw_data: list, horizon_length: int, n_clusters: int, seed: int = 42
+    raw_data: list,
+    horizon_length: int,
+    n_clusters: int,
+    seed: int = 42,
+    eps: float = 1e-4,
 ):
     """
     Clusters the given price signal into n_clusters using the k-means clustering
@@ -92,6 +97,9 @@
         seed: int,
             Seed value for initializing random number generator within Kmeans
 
+        eps: float,
+            Centroid values below this threshold are set to 0 to limit noise in the data
+
     Returns:
         lmp_data_clusters: dict
             A dictionary of representative day LMP data, indices are indexed
@@ -116,7 +124,7 @@
     labels = kmeans.labels_
 
     # Set any centroid values that are < 1e-4 to 0 to avoid noise
-    centroids = centroids * (abs(centroids) >= 1e-4)
+    centroids = centroids * (abs(centroids) >= eps)
 
     # Create dicts for lmp data and the weight of each cluster
     # By default, the data is of type numpy.int or numpy.float.
@@ -136,6 +144,7 @@
     kmin: int = 2,
     kmax: int = 30,
     method: str = "silhouette",
+    sensitivity: int = 1,
     generate_elbow_plot: bool = False,
     seed: int = 42,
 ):
@@ -158,6 +167,9 @@
             Method for obtaining the optimal number of clusters.
             Supported methods are elbow and silhouette
 
+        sensitivity : int,
+            difficulty of detecting knees, where 0 detects knees the easiest
+
         generate_elbow_plot : bool,
             If True, generates an elbow plot for inertia as a function of
             number of clusters
@@ -181,28 +193,93 @@
 
     k_values = list(range(kmin, kmax + 1))
     inertia_values = []
-    mean_silhouette = []
 
     for k in k_values:
         kmeans = KMeans(n_clusters=k, random_state=seed).fit(samples)
         inertia_values.append(kmeans.inertia_)
 
-        if method == "silhouette":
-            # Calculate the average silhouette score, if the chosen method
-            # is silhouette
-            mean_silhouette.append(silhouette_score(samples, kmeans.labels_))
-
-    # Identify the optimal number of clusters
-    if method == "elbow":
-        raise NotImplementedError(
-            "elbow method is not supported currently for finding the optimal "
-            "number of clusters."
-        )
+    if method == "silhouette":
+        mean_silhouette = []
+        mean_silhouette.append(silhouette_score(samples, kmeans.labels_))
 
-    elif method == "silhouette":
+        # Identify the optimal number of clusters
         max_index = mean_silhouette.index(max(mean_silhouette))
         n_clusters = k_values[max_index]
 
+    # The implementation below is based on
+    # Ville Satopaa, Jeannie Albrecht, David Irwin, Barath Raghavan
+    # Finding a “Kneedle” in a Haystack:
+    # Detecting Knee Points in System Behavior
+    # https://raghavan.usc.edu/papers/kneedle-simplex11.pdf
+
+    elif method == "elbow":
+        # Invert inertia values such that plot is concave down and increasing
+        inverted_inertia_values = [-y for y in inertia_values]
+        # Use a smoothing spline that retains the data's original shape
+        spline = splrep(k_values, inverted_inertia_values, k=3)
+        k_smooth = np.linspace(kmin, kmax + 1, 100)
+        inertia_smooth = splev(k_smooth, spline)
+
+        k_norm = _normalize_values(k_smooth)
+        inertia_norm = _normalize_values(inertia_smooth)
+
+        # Compute the set of differences (x, y) -> (x, y-x)
+        inertia_diff = []
+        for i in range(len(inertia_norm)):
+            set_of_differences = inertia_norm[i] - k_norm[i]
+            inertia_diff.append(set_of_differences)
+
+        # Identify local maxima
+        local_maxima = [
+            (k_norm[i], inertia_diff[i])
+            for i in range(1, len(inertia_diff) - 1)
+            if inertia_diff[i] > inertia_diff[i - 1]
+            and inertia_diff[i] > inertia_diff[i + 1]
+        ]
+
+        # Calculate optimal # of clusters based on the # of local maxima
+        threshold_values = []
+        threshold_triggered = False
+        n_clusters = 0
+
+        if len(local_maxima) == 0:
+            n_clusters = 0
+            _logger.warning(
+                "The optimal number of clusters cannot be determined for this dataset."
+            )
+        elif len(local_maxima) == 1:
+            n_clusters_norm = local_maxima[0][0]
+            ind = k_norm.index(n_clusters_norm)
+            n_clusters = k_values[ind]
+        else:
+            n = len(local_maxima)
+            summation = 0
+            for i in range(0, n - 1):
+                summation += local_maxima[i + 1][0] - local_maxima[i][0]
+            # For each local maxima, compute the threshold and determine the index of the current and next local maxima
+            for i in range(0, n - 1):
+                threshold = local_maxima[i][1] - (sensitivity * summation / (n - 1))
+                threshold_values.append(threshold)
+                l_max_index = inertia_diff.index(local_maxima[i][1])
+                next_l_max_index = inertia_diff.index(local_maxima[i + 1][1])
+                for j in range(l_max_index + 1, next_l_max_index):
+                    if inertia_diff[j] < threshold:
+                        threshold_triggered = True
+                        normalized_optimal_n_clusters = local_maxima[i][0]
+                        index = k_norm.index(normalized_optimal_n_clusters)
+                        n_clusters = x[index]
+                if threshold_triggered:
+                    break
+            # If optimal # of clusters cannot be identified, use the first local maxima
+            if not threshold_triggered:
+                normalized_optimal_n_clusters = local_maxima[0][0]
+                index = k_norm.index(normalized_optimal_n_clusters)
+                n_clusters = k_values[index]
+                _logger.warning(
+                    "The number of optimal clusters could not be accurately identified. "
+                    "Consider lowering the sensitivity."
+                )
+
     else:
         raise ValueError(
             f"Unrecognized method {method} for optimal number of clusters."
@@ -229,24 +306,13 @@
     return n_clusters
 
 
-def locate_elbow(x: list, y: list):
-    """
-    Identifies the elbow/knee for the input/output data
+def _normalize_values(values):
+    normalized_values = []
 
-    Args:
-        x : list
-            List of independent variables
-        y : list
-            List of dependent variables
-
-    Returns:
-        opt_x : float
-            Optimal x at which curvature changes significantly
-    """
-    # The implementation is based on
-    # Ville Satopaa, Jeannie Albrecht, David Irwin, Barath Raghavan
-    # Finding a “Kneedle” in a Haystack:
-    # Detecting Knee Points in System Behavior
-    # https://raghavan.usc.edu/papers/kneedle-simplex11.pdf
+    for i in values:
+        max_value = max(values)
+        min_value = min(values)
+        normalized_value = (i - min_value) / (max_value - min_value)
+        normalized_values.append(normalized_value)
 
-    return len(np.array(x)) + len(np.array(y))
+    return normalized_values
@@ -173,15 +173,15 @@ def my_operation_model(m, design_blk):
         ConfigValue(
             default=True,
             domain=Bool,
-            doc="Should op_mode, startup, shutdown vars be defined?",
+            doc="Boolean flag to determine if `op_mode`, `startup`, and `shutdown` vars should be defined",
         ),
     )
     CONFIG.declare(
         "declare_lmp_param",
         ConfigValue(
             default=True,
             domain=Bool,
-            doc="Should LMP data automatically be appended to the model?",
+            doc="Boolean flag to determine if LMP data should automatically be appended to the model",
         ),
     )
 

@@ -71,7 +71,7 @@
     "horizon_length",
     ConfigValue(
         domain=PositiveInt,
-        doc="Length of each representative day/period",
+        doc="Length of each time horizon",
     ),
 )
 CONFIG.declare(
@@ -99,14 +99,14 @@
 
 # List of arguments needed for adding startup/shutdown constraints
 CONFIG.declare(
-    "up_time",
+    "minimum_up_time",
     ConfigValue(
         domain=PositiveInt,
         doc="Minimum uptime [in hours]",
     ),
 )
 CONFIG.declare(
-    "down_time",
+    "minimum_down_time",
     ConfigValue(
         domain=PositiveInt,
         doc="Minimum downtime [in hours]",
@@ -136,10 +136,10 @@
     ),
 )
 CONFIG.declare(
-    "annualization_factor",
+    "capital_recovery_factor",
     ConfigValue(
         domain=is_in_range(0, 1),
-        doc="Capital cost annualization factor [fraction]",
+        doc="Capital cost annualization factor [fraction of investment cost/year]",
     ),
 )
 CONFIG.declare(
@@ -620,8 +620,8 @@
         self,
         op_block_name: str,
         des_block_name: Optional[str] = None,
-        up_time: int = 1,
-        down_time: int = 1,
+        minimum_up_time: int = 1,
+        minimum_down_time: int = 1,
     ):
         """
         Adds minimum uptime/downtime constraints for a given unit/process
@@ -635,19 +635,19 @@
                 op_block_name. This argument is specified if the design is
                 being optimized simultaneously, e.g., "ngcc_design"
 
-            up_time: int, default=1,
+            minimum_up_time: int, default=1,
                 Total uptime (must be >= 1), e.g., 4 time periods
                 Uptime must include the minimum uptime and the time required
                 for shutdown.
 
-            down_time: int, default=1,
+            minimum_down_time: int, default=1,
                 Total downtime (must be >= 1), e.g., 4 time periods
                 Downtime must include the minimum downtime and the time
                 required for startup
         """
-        # Check up_time and down_time for validity
-        self.config.up_time = up_time
-        self.config.down_time = down_time
+        # Check minimum_up_time and minimum_down_time for validity
+        self.config.minimum_up_time = minimum_up_time
+        self.config.minimum_down_time = minimum_down_time
 
         op_blocks = self._get_operation_blocks(
             blk_name=op_block_name,
@@ -681,15 +681,15 @@
                 blk=start_shut_blk[d],
                 op_blocks=op_blocks[d],
                 install_unit=install_unit,
-                up_time=up_time,
-                down_time=down_time,
+                minimum_up_time=minimum_up_time,
+                minimum_down_time=minimum_down_time,
                 set_time=self.set_time,
             )
 
         # Save the uptime and downtime data for reference
         self._op_blk_uptime_downtime[op_block_name] = {
-            "up_time": up_time,
-            "down_time": down_time,
+            "minimum_up_time": minimum_up_time,
+            "minimum_down_time": minimum_down_time,
         }
 
         # Logger info for where constraint is located on the model
@@ -803,7 +803,7 @@
         lifetime: int = 30,
         discount_rate: float = 0.08,
         corporate_tax_rate: float = 0.2,
-        annualization_factor: Optional[float] = None,
+        capital_recovery_factor: Optional[float] = None,
         cash_inflow_scale_factor: Optional[float] = 1.0,
     ):
         """
@@ -822,7 +822,7 @@
                 Fractional value of corporate tax used in NPV calculations.
                 Must be between 0 and 1.
 
-            annualization_factor: float, default=None,
+            capital_recovery_factor: float, default=None,
                 Annualization factor
 
             cash_inflow_scale_factor: float, default=1.0,
@@ -841,7 +841,7 @@
         self.config.lifetime = lifetime
         self.config.discount_rate = discount_rate
         self.config.corporate_tax = corporate_tax_rate
-        self.config.annualization_factor = annualization_factor
+        self.config.capital_recovery_factor = capital_recovery_factor
         self.config.cash_inflow_scale_factor = cash_inflow_scale_factor
 
         if not self._has_hourly_cashflows:
@@ -902,17 +902,17 @@
             expr=cf.net_profit == cf.net_cash_inflow - cf.fom - cf.corporate_tax
         )
 
-        if annualization_factor is None:
+        if capital_recovery_factor is None:
             # If the annualization factor is not specified
-            annualization_factor = discount_rate / (
+            capital_recovery_factor = discount_rate / (
                 1 - (1 + discount_rate) ** (-lifetime)
             )
 
         cf.lifetime_npv = Expression(
-            expr=(1 / annualization_factor) * cf.net_profit - cf.capex
+            expr=(1 / capital_recovery_factor) * cf.net_profit - cf.capex
         )
         cf.npv = Expression(
-            expr=cf.net_profit - annualization_factor * cf.capex,
+            expr=cf.net_profit - capital_recovery_factor * cf.capex,
         )
 
         self._has_overall_cashflows = True