FNO example #362
joewallwork
left a comment
Thanks for this contribution, @ma595! I think it'd be a great addition to our examples. However, adding it as an example will require setting it up in the CI, too, so I've provided some info on how to do that.
I can't comment much on your FNO implementation but I have a few suggested edits related to how we've generally got things set up in FTorch.
return x

# UNUSED
I guess we can drop it?
See comment above. But yes, this needs resolving before merge.
# grid = self.get_grid(u.shape, u.device)
# x = torch.cat((u, grid), dim=-1)  # Add grid as extra channel
Thanks for pointing this out. I moved the creation of the grid and its concatenation out of this class because I was experimenting with non-uniform grids, just out of curiosity. That code is now in the generate_parametric_sine_data function in fno1d_train.py:
FTorch/examples/9_FNO/fno1d_train.py
Lines 58 to 124 in f70cfa3
But it isn't used, and I need to check whether it's useful to actually keep. I doubt whether non-uniform grids are that useful in climate applications.
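For reference, the commented-out grid concatenation could be sketched as below. This is only an illustration: the body of `get_grid`, the uniform grid on [0, 1], and the `(batch, n_points, channels)` input layout are assumptions, not the code from this PR.

```python
import torch


def get_grid(shape: torch.Size, device: torch.device) -> torch.Tensor:
    """Return a uniform grid on [0, 1] with shape (batch, n_points, 1).

    Hypothetical helper: the uniform spacing and the interval are assumptions.
    """
    batch_size, n_points = shape[0], shape[1]
    grid = torch.linspace(0.0, 1.0, n_points, device=device)
    # Reshape to (1, n_points, 1), then copy the grid for every batch entry.
    return grid.reshape(1, n_points, 1).repeat(batch_size, 1, 1)


# Dummy input with the assumed layout (batch, n_points, channels).
u = torch.zeros(8, 32, 1)
grid = get_grid(u.shape, u.device)
x = torch.cat((u, grid), dim=-1)  # Add grid as an extra channel
```

With this layout, `x` has one more channel than `u`, so the first layer of the network would need its input width increased by one.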
! Infer
call torch_model_forward(model, in_tensors, out_tensors)

! write (*,*) out_data(:)
It's great that you've included the script for training, especially because we don't currently have this for the other examples. I ran the training script locally and it was pretty fast, so I'd be happy for this to be added to the CI. Besides, we'd need to add this in order to run the other ctests you add.
Note that you'll need the patch
diff --git a/examples/CMakeLists.txt b/examples/CMakeLists.txt
index 633cd6d..05c0efd 100644
--- a/examples/CMakeLists.txt
+++ b/examples/CMakeLists.txt
@@ -15,4 +15,5 @@ if(CMAKE_BUILD_TESTS)
add_subdirectory(7_MPI)
endif()
add_subdirectory(8_Autograd)
+ add_subdirectory(9_FNO)
endif()
diff --git a/run_test_suite.sh b/run_test_suite.sh
index 9fbac9e..aa09c7a 100755
--- a/run_test_suite.sh
+++ b/run_test_suite.sh
@@ -83,9 +83,9 @@ fi
# Run integration tests
if [ "${RUN_INTEGRATION}" = true ]; then
if [ -e "${BUILD_DIR}/examples/6_MultiGPU" ]; then
- EXAMPLES="1_Tensor 2_SimpleNet 3_ResNet 4_MultiIO 6_MultiGPU 7_MPI 8_Autograd"
+ EXAMPLES="1_Tensor 2_SimpleNet 3_ResNet 4_MultiIO 6_MultiGPU 7_MPI 8_Autograd 9_FNO"
else
- EXAMPLES="1_Tensor 2_SimpleNet 3_ResNet 4_MultiIO 7_MPI 8_Autograd"
+ EXAMPLES="1_Tensor 2_SimpleNet 3_ResNet 4_MultiIO 7_MPI 8_Autograd 9_FNO"
fi
export PIP_REQUIRE_VIRTUALENV=true
for EXAMPLE in ${EXAMPLES}; do
to do this, although these won't pass as-is.
I tried running python3 fno1d_train.py but got
Traceback (most recent call last):
File "/home/joe/software/ftorch/examples/9_FNO/fno1d_train.py", line 260, in <module>
main()
File "/home/joe/software/ftorch/examples/9_FNO/fno1d_train.py", line 235, in main
validate()
File "/home/joe/software/ftorch/examples/9_FNO/fno1d_train.py", line 183, in validate
loaded_model = torch.jit.load("fno1d_sine.pt")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/joe/.virtualenvs/ftorch/lib/python3.12/site-packages/torch/jit/_serialization.py", line 153, in load
raise ValueError(f"The provided filename {f} does not exist")
ValueError: The provided filename fno1d_sine.pt does not exist
It looks like the model needs to either be saved to file or passed to validate.
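One way to address this is sketched below: write the TorchScript file before validation tries to load it. `TinyModel`, `save_model`, and the `validate(path, x)` signature are hypothetical stand-ins, not the code in fno1d_train.py; only the `fno1d_sine.pt` filename is taken from the traceback.

```python
import os
import tempfile

import torch


class TinyModel(torch.nn.Module):
    """Hypothetical stand-in for the FNO model; not the PR's implementation."""

    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


def save_model(model: torch.nn.Module, path: str) -> None:
    """Convert the model to TorchScript and write it to `path`."""
    model.eval()
    scripted = torch.jit.script(model)
    scripted.save(path)


def validate(path: str, x: torch.Tensor) -> torch.Tensor:
    """Load the saved TorchScript model and run it on `x`."""
    loaded_model = torch.jit.load(path)
    loaded_model.eval()
    with torch.no_grad():
        return loaded_model(x)


model = TinyModel()
x = torch.ones(2, 4)

# A temporary directory keeps the sketch from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "fno1d_sine.pt")
    save_model(model, path)  # Without this step, torch.jit.load raises ValueError
    y_loaded = validate(path, x)

# The reloaded model should reproduce the in-memory model's output.
with torch.no_grad():
    y_direct = model(x)
```

The alternative mentioned above, passing the in-memory model straight to validate, avoids the file round trip but would not check that the saved file loads correctly, which is what the Fortran side relies on.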
Co-authored-by: Joe Wallwork <22053413+joewallwork@users.noreply.github.com>
This PR introduces an FNO example inspired by Hamid's code here. The example trains a model by sampling a sine wave and then loads and calls the model in Fortran.
I welcome comments on whether this should be merged into FTorch as is, or whether it needs to be adapted to showcase online training. Perhaps @jatkinson1000 can comment on this.
Ready for review once the following is completed: