Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix part of the problem with notebook losing trust. #6240

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions notebook/services/contents/filemanager.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,29 @@

_script_exporter = None

# workaround nbformat bug:
# nbformat save logic will mutate the notebook if it does not
# have cells ids, so if we sign the notebook,
# the signature of the saved notebook is changed.
# we thus need to add cell ids to the notebook before signing.
try:
from nbformat.validator import normalize
except ImportError:
from nbformat.corpus.words import generate_corpus_id
from copy import deepcopy

def normalize(nbdict, version, version_minor):
nbdict = deepcopy(nbdict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make the deepcopy conditional on version? Not sure how many pre-cell-id notebooks are out there but this would save some time/space for them.


if version >= 4 and version_minor >= 5:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a minor adjustment so lower minor versions are included for future major versions.

Suggested change
if version >= 4 and version_minor >= 5:
if (version == 4 and version_minor >= 5) or version >= 5:

# if we support cell ids ensure default ids are provided
for cell in nbdict["cells"]:
if "id" not in cell:
changes += 1
# Generate cell ids if any are missing
cell["id"] = generate_corpus_id()
return nbdict


def _post_save_script(model, os_path, contents_manager, **kwargs):
"""convert notebooks to Python script after save with nbconvert
Expand Down Expand Up @@ -473,6 +496,8 @@ def save(self, model, path=''):
try:
if model['type'] == 'notebook':
nb = nbformat.from_dict(model['content'])

nb = normalize(nb, nb.get("nbformat"), nb.get("nbformat_minor"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a use case where we would NOT want to pull the version info from the notebook? If not, I think it would be cleaner to access the version info within normalize().

self.check_and_sign(nb, path)
self._save_notebook(os_path, nb)
# One checkpoint should always exist for notebooks.
Expand Down