Add local version of the [Colab notebook]( https://github.com/tensor… #847
It seems that a number of users are having problems running the profiler on their own setups.
Currently, the main source of documentation is a Colaboratory notebook, which is therefore limited to Google environments and relies on specific packages and Jupyter magics. It is also currently broken: it does not display any data, or even the Profile tab. The install-and-run script only verifies that TensorBoard launches. It is also not suitable for users who do not use virtualenv for their package management (I installed the TensorFlow packages through a mamba environment and conda-forge).

This commit adds a script that is currently just a copy of the notebook in vanilla Python, with one change: the number of epochs is matched to the setup in the training loop (as otherwise no data are collected).
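To illustrate why the epoch count matters, here is a minimal sketch of a check the script could run before training. It assumes the profiling window is given as a Keras-style `"start,stop"` string and is counted in global training steps across epochs. The helper name `profile_window_is_reachable` and the numbers used (60,000 examples, batch size 128, window `"500,520"`) are illustrative assumptions, not values taken from this PR.

```python
import math


def profile_window_is_reachable(profile_batch: str, steps_per_epoch: int, epochs: int) -> bool:
    """Return True if the whole 'start,stop' window lies inside the training run.

    Hypothetical helper: assumes steps are counted globally across epochs.
    """
    start, stop = (int(part) for part in profile_batch.split(","))
    total_steps = steps_per_epoch * epochs
    return 1 <= start <= stop <= total_steps


# Illustrative numbers only (assumed, not from the PR).
steps_per_epoch = math.ceil(60_000 / 128)  # 469 steps per epoch

print(profile_window_is_reachable("500,520", steps_per_epoch, epochs=1))  # False
print(profile_window_is_reachable("500,520", steps_per_epoch, epochs=2))  # True
```

With a single epoch the run ends before step 500, so the profiler never starts and no data are written, which would match the symptom described above.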
As a starting point, though, I think it ought to be expanded to include further test cases, with edits made as appropriate so that it produces the sort of data found in the /data folder of the repository.
In my case, for example, I can open that data in TensorBoard and the profiler displays it correctly. But if I then try to generate my own data, I get generic errors saying that data were not captured for the timesteps involved. It is not clear what the issue is, as I have resolved the problems flagged in the warning output (and if this script could detect some of those programmatically, e.g. not being able to find the CUDA PTI libraries, that would be great).
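As one possible form of such a programmatic check, the sketch below looks for a CUPTI shared library using only the Python standard library. It is a rough, Linux-oriented heuristic: it assumes the library is named `libcupti.so*` and is either known to the system loader or sits in a directory listed in `LD_LIBRARY_PATH`. The function name `find_cupti_candidates` is hypothetical, and an empty result does not prove that TensorFlow cannot load CUPTI by some other route.

```python
import ctypes.util
import os
from pathlib import Path


def find_cupti_candidates() -> list:
    """Return names or paths of CUPTI libraries found by a simple search.

    Hypothetical helper: a heuristic, not the lookup TensorFlow itself performs.
    """
    found = []

    # Ask the system loader's search mechanism first.
    name = ctypes.util.find_library("cupti")
    if name:
        found.append(name)

    # Then scan directories listed in LD_LIBRARY_PATH (assumed Linux layout).
    for directory in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if directory and Path(directory).is_dir():
            found.extend(str(p) for p in sorted(Path(directory).glob("libcupti.so*")))

    return found


candidates = find_cupti_candidates()
if candidates:
    print("CUPTI candidates:", candidates)
else:
    print("No CUPTI library found; GPU profiling may not capture data.")
```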
Hopefully if a script like this can be expanded, then that would help both users and maintainers with diagnosing issues, especially as it seems the whole TF ecosystem is changing in a way that is breaking links. I think a full reproducible example all the way through from verifying setup through to data generation and then visualisation, that works with GPUs and a user's current installation environment, would be a good thing to add.
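For the data-generation step, a check along the following lines could confirm that a run actually wrote profile data. This is a sketch that assumes the profiler writes `*.xplane.pb` files under `<logdir>/plugins/profile/<run>/`; the helper name `find_profile_traces` and the `logs` directory are hypothetical.

```python
from pathlib import Path


def find_profile_traces(logdir) -> list:
    """Return sorted trace files under <logdir>/plugins/profile/<run>/.

    Hypothetical helper: assumes the '*.xplane.pb' naming convention.
    """
    profile_root = Path(logdir) / "plugins" / "profile"
    return sorted(profile_root.glob("*/*.xplane.pb"))


traces = find_profile_traces("logs")  # "logs" is an assumed log directory
if traces:
    print(f"Found {len(traces)} profile trace file(s).")
else:
    print("No profile traces found; the profiler did not capture any data.")
```

A test script could fail at this point with a clear message, rather than leaving the user to discover an empty Profile tab in TensorBoard.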
Relevant issues: !578 !835 !653 !839 !613
===============================
Add local version of the Colab notebook as a starting point for more reproducible testing for a local installation. This file ideally should be fleshed out to include more tests as part of installation and to ensure that data are being gathered correctly.