
Add a notebook for partitioning #76

Open

SarahAlidoost wants to merge 2 commits into main from add_partitioning_nb

Conversation

@SarahAlidoost (Collaborator) commented Jan 15, 2026

closes #73

In this notebook, I implemented a different approach from the one in pcse notebook / 11 Optimizing partitioning in a PCSE model.ipynb.

I will update the docs after #62.

@SarahAlidoost marked this pull request as ready for review January 15, 2026 15:35
@SCiarella (Collaborator) left a comment

👋 @SarahAlidoost, the notebook looks nice.

A first technical comment is that all the notebooks need to add:

from diffwofost.physical_models.config import ComputeConfig
ComputeConfig.set_device('cpu')

otherwise they fail when a GPU is available, because the GPU becomes the model's default device but not the device of the notebook's variables.
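
As a minimal illustration of the failure mode (made-up tensors, not code from the notebook):

import torch

if torch.cuda.is_available():
    model_weights = torch.ones(3, device="cuda")  # model's default device: GPU
    notebook_input = torch.ones(3)                # notebook variable: CPU
    try:
        _ = model_weights + notebook_input
    except RuntimeError as err:
        print(err)  # "Expected all tensors to be on the same device..."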


In terms of content, I see that we are no longer optimizing a logistic function, but rather optimizing the parameters directly. To me, this seems cleaner (and it is probably faster), but maybe those sigmoid functions have a deeper meaning that we are losing? Anyway, let's wait for someone with more knowledge to comment on this issue.


Finally, I do not fully understand the last figure: is it just the x-y coordinates of the predicted vs. true parameters?

@SarahAlidoost (Collaborator, Author) commented Jan 16, 2026

👋 @SarahAlidoost, the notebook looks nice.

A first technical comment is that all the notebooks need to add:

from diffwofost.physical_models.config import ComputeConfig
ComputeConfig.set_device('cpu')

otherwise they fail when a GPU is available, because the GPU becomes the model's default device but not the device of the notebook's variables.

Good point 👍 I'll fix it.

In terms of content, I see that we are no longer optimizing a logistic function, but rather optimizing the parameters directly. To me, this seems cleaner (and it is probably faster), but maybe those sigmoid functions have a deeper meaning that we are losing? Anyway, let's wait for someone with more knowledge to comment on this issue.

You’re right! In the notebook pcse notebook / 11 Optimizing partitioning in a PCSE model.ipynb, the partitioning variables (FL and FO) are first estimated using a sigmoid-based approximation (see the FLTB and FOTB classes in cell 17). Based on sampled DVS values (e.g. np.arange(0, 2.1, stepsize)), lookup tables are then created. These tables are eventually passed to wofost72. Then, as we know, the partitioning model estimates the values of FL and FO again using an interpolation approach (via AfgenTrait).
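
To sketch the idea (illustrative numbers and function shape, not the actual FLTB/FOTB classes):

import numpy as np

def sigmoid_fraction(dvs, scale, midpoint, steepness):
    # fraction decays from `scale` towards 0 as DVS passes `midpoint`
    return scale / (1.0 + np.exp(steepness * (dvs - midpoint)))

dvs_grid = np.arange(0, 2.1, 0.1)                # sampled DVS values
fl = sigmoid_fraction(dvs_grid, 0.65, 1.0, 5.0)  # rough stand-in for FL
# interleave into an Afgen-style lookup table: [x0, y0, x1, y1, ...]
FLTB = np.stack([dvs_grid, fl], axis=1).ravel().tolist()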

The notebook in this PR uses a softmax instead of a sigmoid. It also enforces some physical constraints: the partitioning values (FL and FO) must be positive, the sum of FLTB, FOTB, and FSTB must equal 1, and the x-values (i.e. DVS) must be strictly increasing. The optimization is based on the outputs of partitioning (FL and FO). Thinking about it now, it would probably make sense to rename the generic x and y variables to dvs and the actual partitioning variables, and to define DVS as np.arange(0, 2.1, stepsize) or use another approximation between the partitioning variables and DVS.
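
A minimal sketch of how these constraints can be enforced from unconstrained parameters (names and shapes are illustrative, not the ones in the notebook):

import torch
import torch.nn.functional as F

n = 8                                                  # number of grid points
raw_fractions = torch.randn(n, 3, requires_grad=True)  # unconstrained, for FL, FS, FO
raw_steps = torch.randn(n, requires_grad=True)         # unconstrained, for the DVS grid

# softmax along the fraction axis gives positive values that sum to 1,
# so FLTB + FSTB + FOTB == 1 holds by construction
fractions = F.softmax(raw_fractions, dim=1)

# softplus makes the increments positive; their cumulative sum is then
# a strictly increasing DVS grid
dvs = torch.cumsum(F.softplus(raw_steps), dim=0)

assert torch.allclose(fractions.sum(dim=1), torch.ones(n))
assert (dvs[1:] > dvs[:-1]).all()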

Finally, I do not fully understand the last figure: is it just the x-y coordinates of the predicted vs. true parameters?

The x axis is DVS and the y axis shows the partitioning values (FL, FS, and FO), actual vs. predicted.

@SarahAlidoost (Collaborator, Author) commented

I fixed the variable naming, the plots, and the approximation of the partitioning variables. What still remains is the logic of the loss function: right now we use the loss between the outputs of the partitioning module (FO, FS, and FL) and the test data. This might not be the right approach; something to be explored more 🤔
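
Roughly, the current loss looks like this (an illustrative sketch, not the notebook code):

import torch

def partitioning_loss(pred, target):
    # pred / target: dicts mapping "FL", "FS", "FO" to tensors over time
    return sum(torch.mean((pred[k] - target[k]) ** 2) for k in ("FL", "FS", "FO"))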

@ronvree left a comment

Looks good to me! I do have some comments/questions/requests though.

I noticed a ComputeConfig was added. I’m a bit concerned about the direction in its current form, mainly because it introduces a second “global source of truth” for dtype/device that can easily get out of sync with normal PyTorch conventions.

  • It keeps a global dtype that, if I understand correctly, should be the default dtype for initializing float tensors when not specified otherwise. PyTorch already has this built in (see torch.set_default_dtype). If the two are not in sync, it will cause unexpected behavior; if nothing is implemented at all, PyTorch already behaves exactly as intended.
  • It checks whether a GPU is available and, if so, uses it as the default; to opt out, a user has to explicitly add ComputeConfig.set_device('cpu') any time a model is used, and gets an error otherwise. I think it would be more intuitive to use the CPU by default, which also requires no extra code.

My impression is that PyTorch Modules generally follow this convention:

import torch
import torch.nn as nn

class A(nn.Module):

    def __init__(self, *args, dtype=None, device=None):
        super().__init__()
        # pass dtype and device to any sub-modules or parameters that are initialized;
        # any tensor for which dtype or device is None is initialized with
        # torch.get_default_dtype() and 'cpu', respectively
        ...

    def forward(self, x):
        # infer dtype and device from x;
        # initialize any tensors to be consistent with the dtype and device inferred from x
        ...

and that by following this convention no other global config would be required.
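
For example, a user could then write (with the hypothetical module A above):

a_default = A()                                     # torch.get_default_dtype(), 'cpu'
a_explicit = A(dtype=torch.float64, device="cuda")  # per-instance, opt-in GPU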

Also, in the notebook I think it would be useful to add more comments to the code to give more insight into the individual steps and why they are necessary. For example, I was a bit confused about the get_tables output structure in OptDiffPartitioning. If I understand correctly, it sort of interleaves the DVS and parameter values? Is this because Wofost expects the table to be parameterized this way? Some comments on these kinds of steps would be super useful.
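
To make the question concrete, by “interleaves” I mean a flat layout like this (values made up):

table = [0.0, 0.65, 0.5, 0.50, 1.0, 0.15, 2.0, 0.0]  # [dvs0, y0, dvs1, y1, ...]
dvs_values = table[0::2]       # [0.0, 0.5, 1.0, 2.0]
fraction_values = table[1::2]  # [0.65, 0.50, 0.15, 0.0]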

@SarahAlidoost (Collaborator, Author) commented

@ronvree Thanks for the review and for taking a close look! We think the ComputeConfig approach is a solid (and cleaner) alternative to using torch.set_default_dtype/device in diffwofost. Relying on PyTorch’s global defaults can get risky for reusable software, since they’re process-wide and affect everything created afterward, including third-party code, tests, or models we don’t control. As a rule of thumb, torch.set_default_* is fine for notebooks or tiny scripts. For modular code we use a config like ComputeConfig, e.g.:

dtype = ComputeConfig.get_dtype()
device = ComputeConfig.get_device()
torch.tensor(..., dtype=dtype, device=device)

That said, I noticed that we’re currently setting device/dtype in two places (ComputeConfig and EngineTestHelper(device=..., dtype=...)). Also, I didn’t set device/dtype correctly when creating tensors in the notebooks. We’ll fix that in #84.

One more thing: model.to("cuda") won’t actually do anything here. Instead, the intended pattern is to set the device/dtype via ComputeConfig before running the forward pass. This approach is consistent with how other scientific simulators handle execution context.
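
Concretely, the intended pattern looks like this (a sketch, assuming only the ComputeConfig calls shown above):

from diffwofost.physical_models.config import ComputeConfig

ComputeConfig.set_device("cuda")  # set the execution context first
# ... then build the model and run the forward pass; tensors created inside
# diffwofost pick up the configured device via ComputeConfig.get_device()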

@SarahAlidoost (Collaborator, Author) commented

Also, in the notebook I think it would be useful to add more comments to the code to give more insight into the individual steps and why they are necessary. For example, I was a bit confused about the get_tables output structure in OptDiffPartitioning. If I understand correctly, it sort of interleaves the DVS and parameter values? Is this because Wofost expects the table to be parameterized this way? Some comments on these kinds of steps would be super useful.

As discussed, I will not add the notebook for now. See this comment. I changed this PR to draft.

@SarahAlidoost marked this pull request as draft January 30, 2026 09:18
@ronvree marked this pull request as ready for review February 1, 2026 19:06
@ronvree commented Feb 3, 2026


Thanks for your reply! I completely agree with the concerns regarding torch.set_default_dtype, and my suggestion was not to use it to sync with ComputeConfig, but rather to not have a globally defined device/dtype for diffwofost at all. I’m mainly proposing to follow the PyTorch conventions, where the torch defaults are used if nothing is specified, but the user can have different instances on different devices if they explicitly specify this.

Development

Successfully merging this pull request may close these issues:

[Task] Check if parameters of Partioning should be optimizable