ToHold
Contributor

@ToHold ToHold commented Sep 12, 2025

Launch Checklist

  • Confirm your changes do not include backports from Mapbox projects (unless with compliant license) - if you are not sure about this, please ask!
  • Briefly describe the changes in this PR.
  • Link to related issues.
  • Include before/after visuals or gifs if this PR includes visual changes.
  • Write tests for all new functionality.
  • Document any changes to public APIs.
  • Post benchmark scores.
  • Add an entry to CHANGELOG.md under the ## main section.

Problem description:

The map is not properly restored after a contextRestored event. Some buffers and resources from the previous context are still being used, which causes the new context to crash and triggers WebGL warnings.

Changes:

Ensure that the style is properly cleared when a contextLost event occurs, so that the contextRestored event can work as expected.
On contextRestored, create a new Style object based on the existing one to fully reinitialize the style state.

Pull request note:

I'm not entirely confident about all the changes in this PR, as this area is new to me. A review with advice and suggestions would be greatly appreciated.
I also need help with testing, since it requires a real WebGL API. Could you provide some guidance or share a sample test setup?

Related issue:

#6242

@HarelM
Collaborator

HarelM commented Sep 12, 2025

Thanks for taking the time to open this PR!
There's a WebGL mocking library that is used for the unit tests. There are also integration tests if you need a real browser.
There are readme files scattered around the project in strategic places.
If there's more specific help you might need let me know.
Generally speaking, clearing everything seems risky, but if you can test it and make sure it works as expected, then I'm OK with this change.
I believe the convention around resource clearing is done in most cases with a method called "remove" if I'm not mistaken, but do validate this...

@ToHold ToHold force-pushed the fix-not-working-contextRestored-event branch 2 times, most recently from b6e716c to 509e68e Compare September 18, 2025 14:08
@ToHold
Contributor Author

ToHold commented Sep 18, 2025

Thanks for your response.
There is indeed a Puppeteer test environment that allows testing in a real WebGL context. I’ve set up two tests so far:

  • one checks that the map is not empty (i.e., it renders pixels other than just white),

  • the other ensures that there are no more warnings related to the context restore issue.
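
As a rough illustration of the first check, here is a minimal sketch of a "map is not empty" test over raw pixel data. The helper name `hasNonBlankPixel` is hypothetical, and it assumes the screenshot has already been decoded into a flat RGBA byte array; the actual test in this PR may do this differently.

```typescript
// Hypothetical helper, not part of the PR: returns true if any pixel in an
// RGBA buffer is neither pure white nor fully transparent.
// Assumes the layout [r, g, b, a, r, g, b, a, ...].
function hasNonBlankPixel(pixels: Uint8Array | Uint8ClampedArray): boolean {
    for (let i = 0; i + 3 < pixels.length; i += 4) {
        const isTransparent = pixels[i + 3] === 0;
        const isWhite =
            pixels[i] === 255 && pixels[i + 1] === 255 && pixels[i + 2] === 255;
        if (!isTransparent && !isWhite) {
            return true;
        }
    }
    return false;
}
```

Treating fully transparent pixels as blank is an assumption made here so that an untouched canvas does not count as rendered content.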

Regarding the naming convention for destruction methods, I noticed that several classes already use destroy, so it seemed consistent to follow the same pattern.

For clearing everything, I tested that the “polygons” and “custom” layers render correctly again afterward. Specifically, I copy the Style when the context lost event occurs, then recreate it on context restore. My thinking was that destroying everything first and then rebuilding it would help avoid memory leaks. However, I’m not completely sure if I’m destroying everything properly.
Is there someone familiar with the Style handling who could review this part?

Finally, do you think I should add more tests, such as rendering tests, to verify that the map looks identical after the restore?

@ToHold
Contributor Author

ToHold commented Sep 19, 2025

I added a commit introducing an operation called loseAndRestoreContext for test-render.
Should I add this operation to all test-render tests, or only to some of them?

@HarelM
Collaborator

HarelM commented Sep 19, 2025

I don't see added value in adding this to the render tests... Render tests are mostly static, and they should also be paired with native, where I'm not sure context loss exists...

@ToHold
Contributor Author

ToHold commented Sep 19, 2025

Render tests compare the rendered output as image data against the expected image data. They use Puppeteer, which runs a real headless browser (usually Chromium or Chrome). This allows the tests to use WebGL in the browser to render layers, styles, etc.
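
To make the comparison step concrete, here is a simplified sketch of an image-data comparison. The function `diffRatio` is hypothetical and uses exact per-channel equality; the real render test harness may use a dedicated image-diff library with a tolerance threshold instead.

```typescript
// Hypothetical helper, not the project's harness: fraction of pixels that
// differ between two RGBA buffers of identical size (exact match per channel).
function diffRatio(actual: Uint8Array, expected: Uint8Array): number {
    if (actual.length !== expected.length) {
        throw new Error('Images must have the same dimensions');
    }
    const pixelCount = actual.length / 4;
    if (pixelCount === 0) {
        return 0;
    }
    let differing = 0;
    for (let i = 0; i < actual.length; i += 4) {
        for (let c = 0; c < 4; c++) {
            if (actual[i + c] !== expected[i + c]) {
                differing++;
                break; // count each pixel at most once
            }
        }
    }
    return differing / pixelCount;
}
```

A test would then pass when the ratio stays at or below whatever allowed difference is configured for that test.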

Since this PR destroys and recreates the Style, I added an operation to check some render tests. Right now, I’m running it locally against all the style.json files (I added the LoseAndRestoreContext operation to each of them). Most tests pass, but a few specific ones time out. I’m investigating the cause and working on a fix.

@ToHold
Contributor Author

ToHold commented Sep 19, 2025

I was just asking whether you want me to add the LoseAndRestoreContext operation to all existing tests, since it could slow them down.

@ToHold
Contributor Author

ToHold commented Sep 19, 2025

In my opinion, if all render tests pass with the LoseAndRestoreContext operation enabled, the PR can be considered fully safe.

@ToHold ToHold force-pushed the fix-not-working-contextRestored-event branch from add3bad to 68f8044 Compare September 19, 2025 14:32
@HarelM
Collaborator

HarelM commented Sep 19, 2025

As a feature test, that's a great test, but I don't want to run it on every PR. The integration test you added is good enough to keep the functionality working in the long run...

@ToHold
Contributor Author

ToHold commented Sep 19, 2025

So I’ll remove the last commit with the LoseAndRestoreContext operation in the render tests?
I’ll still investigate why some of them were failing and fix those issues.

@HarelM
Collaborator

HarelM commented Sep 19, 2025

Yes please

@ToHold ToHold force-pushed the fix-not-working-contextRestored-event branch from e752487 to 953b44a Compare September 23, 2025 08:37
@ToHold
Contributor Author

ToHold commented Sep 23, 2025

I fixed an issue where images added via map.addImage were not being restored after a context restore.

Also, I should mention that custom layers need to be re-added after a context restore. In test/examples/add-a-3d-model-to-globe-using-threejs.html, this works well because the example adds the custom layer inside map.on('style.load', ...), so it automatically gets added again when the context is restored.
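
The event flow described above can be sketched with a tiny stand-in for the map. `FakeMap` and `simulateContextRestore` are hypothetical and only model the behaviour described in this comment (the style is rebuilt and 'style.load' fires again); they are not the library's API.

```typescript
type Listener = () => void;

// Hypothetical stand-in for the map, only to illustrate the event flow.
class FakeMap {
    private listeners: Record<string, Listener[]> = {};
    layers: string[] = [];

    on(event: string, listener: Listener): void {
        (this.listeners[event] = this.listeners[event] || []).push(listener);
    }

    fire(event: string): void {
        for (const listener of this.listeners[event] || []) {
            listener();
        }
    }

    addLayer(id: string): void {
        if (this.layers.indexOf(id) === -1) {
            this.layers.push(id);
        }
    }

    // Assumption: restoring the context rebuilds the style, so layers added at
    // runtime are gone until 'style.load' fires again.
    simulateContextRestore(): void {
        this.layers = [];
        this.fire('style.load');
    }
}

const fakeMap = new FakeMap();
// Registering inside 'style.load' means the layer is re-added on every reload.
fakeMap.on('style.load', () => fakeMap.addLayer('my-custom-layer'));
fakeMap.fire('style.load');          // initial style load
fakeMap.simulateContextRestore();    // layer is added again by the same listener
```

A custom layer added once outside of such a listener would not come back in this model, which is the case that still needs handling.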

@ToHold ToHold force-pushed the fix-not-working-contextRestored-event branch from 1a117a8 to 9f7f5ab Compare September 26, 2025 09:14
@HarelM
Collaborator

HarelM commented Sep 26, 2025

BTW, what about custom layers? I don't think they withstand context loss, do they?

@ToHold
Contributor Author

ToHold commented Sep 26, 2025

BTW, what about custom layers? I don't think they withstand context loss, do they?

They need to be restored either from the contextrestore event fired by the map or by listening to the style.load event, which is triggered when the style is restored.
Would you like me to propose another solution?


// Remove event listeners
this.setEventedParent(null);
this.dispatcher.remove();
Collaborator

What about unregistering from the worker events? Also, what about cleaning up the workers' data, etc.?

Contributor Author

I’m not very familiar with this part of the code yet; I need to study it more to fully understand what you’re suggesting.

Contributor Author

Were you thinking about something like this?

// Clear workers 
this.dispatcher.workerPool.release(this.map._mapId);
this.dispatcher.actors.forEach((actor) => {
    actor.remove();
});
this.dispatcher.workerPool.workers.forEach((worker) => {
    worker.terminate();
});

Collaborator

The dispatcher is registering events in style.ts file:

this.dispatcher.registerMessageHandler(MessageType.getGlyphs, (mapId, params) => {

I'm guessing these need to be unregistered to avoid receiving messages for a destroyed style.
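
A minimal sketch of that idea, assuming a registry with matching register and unregister calls. `HandlerRegistry`, `StyleWithHandlers`, and the string message types are hypothetical stand-ins and do not claim to mirror the real dispatcher's API.

```typescript
type MessageHandler = (mapId: string, params: unknown) => unknown;

// Hypothetical minimal registry standing in for the dispatcher.
class HandlerRegistry {
    private handlers: Record<string, MessageHandler | undefined> = {};

    register(type: string, handler: MessageHandler): void {
        this.handlers[type] = handler;
    }

    unregister(type: string): void {
        delete this.handlers[type];
    }

    has(type: string): boolean {
        return this.handlers[type] !== undefined;
    }
}

// Hypothetical style-like object that remembers what it registered so that
// destroy() can undo exactly those registrations.
class StyleWithHandlers {
    private registry: HandlerRegistry;
    private registeredTypes: string[] = [];

    constructor(registry: HandlerRegistry) {
        this.registry = registry;
        // Placeholder message types and payloads, for illustration only.
        this.listen('getGlyphs', () => 'glyph data');
        this.listen('getImages', () => 'image data');
    }

    private listen(type: string, handler: MessageHandler): void {
        this.registry.register(type, handler);
        this.registeredTypes.push(type);
    }

    destroy(): void {
        for (const type of this.registeredTypes) {
            this.registry.unregister(type);
        }
        this.registeredTypes = [];
    }
}
```

Tracking the registered types in one place keeps the teardown symmetric with the setup, so a handler added later is not forgotten in destroy().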

Collaborator

Regarding workers, these are created when the app starts, but I'm not sure how stateless they are, so I would recommend looking at what is stored in the worker.ts file to see whether cleanup is needed there as well:
https://github.com/maplibre/maplibre-gl-js/blob/main/src/source/worker.ts

@HarelM
Collaborator

HarelM commented Sep 26, 2025

I've added a few comments.

@ToHold ToHold force-pushed the fix-not-working-contextRestored-event branch from ead05d6 to 903c34b Compare October 3, 2025 08:43
Style should be cleared properly on contextLost event to ensure contextRestored event works as expected.
contextRestored should create a new Style object from the existing one.
@ToHold ToHold force-pushed the fix-not-working-contextRestored-event branch from 903c34b to 47aea35 Compare October 3, 2025 08:49
this._serializedLayers = {};

// Reset other internal state
this.stylesheet = null;
Collaborator

Consider moving all this initialization logic to a method and maybe use the same method in the constructor.
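
One way this suggestion could look, as a sketch only: a single private method sets the default state and is called from both the constructor and the teardown path. `ResettableStyle`, `_resetState`, `load`, and `_loaded` are hypothetical names; `_serializedLayers` and `stylesheet` are taken from the diff above.

```typescript
// Hypothetical class illustrating a shared initialization/reset method.
class ResettableStyle {
    // `!` tells the compiler these are assigned in _resetState().
    _serializedLayers!: Record<string, unknown>;
    stylesheet!: object | null;
    _loaded!: boolean; // hypothetical extra field

    constructor() {
        this._resetState();
    }

    // Hypothetical loader, only here so there is state to clear.
    load(stylesheet: object): void {
        this.stylesheet = stylesheet;
        this._serializedLayers = {background: {type: 'background'}};
        this._loaded = true;
    }

    destroy(): void {
        this._resetState();
    }

    // Single source of truth for the default state.
    private _resetState(): void {
        this._serializedLayers = {};
        this.stylesheet = null;
        this._loaded = false;
    }
}
```

With this shape, a field added later only needs its default written once to be covered by both construction and cleanup.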

centerClampedToGround: true
};

let oldContextStyle: {
Collaborator

I don't think we should define a global variable... I'm guessing that this won't work when there are multiple maps in the same page...
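
A sketch of the alternative, assuming the snapshot can live on the map instance itself. `MapWithSnapshot`, `_savedStyle`, and the two handler names are hypothetical and only illustrate why per-instance state avoids the multi-map problem.

```typescript
interface SavedStyle {
    stylesheet: object;
}

// Hypothetical map-like class: the snapshot is an instance field, so two maps
// on the same page cannot overwrite each other's saved style.
class MapWithSnapshot {
    private _savedStyle: SavedStyle | null = null;
    stylesheet: object | null;

    constructor(stylesheet: object) {
        this.stylesheet = stylesheet;
    }

    onContextLost(): void {
        if (this.stylesheet) {
            this._savedStyle = {stylesheet: this.stylesheet};
        }
        this.stylesheet = null;
    }

    onContextRestored(): void {
        if (!this._savedStyle) {
            return;
        }
        this.stylesheet = this._savedStyle.stylesheet;
        this._savedStyle = null;
    }
}
```

With a module-level variable, the second map to lose its context would replace the first map's snapshot, and both would be restored with the same style.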

@HarelM
Collaborator

HarelM commented Oct 5, 2025

They need to be restored either from the contextrestore event fired by the map or by listening to the style.load event, which is triggered when the style is restored.
Would you like me to propose another solution?

If this is not handled automatically, at least a warnOnce should be used so that a developer who sees this knows they need to handle the problem.
If there's a way to actually solve this without any interaction from the developer it might be a better DX.
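
For readers unfamiliar with the pattern, a warn-once helper can be sketched as below. `createWarnOnce` and its injected `sink` are hypothetical; this is not the project's own warnOnce utility, and the warning text is only an example.

```typescript
// Hypothetical factory: returns a function that forwards each distinct
// message to `sink` only the first time it is seen.
function createWarnOnce(sink: (message: string) => void): (message: string) => void {
    const seen: Record<string, boolean> = {};
    return (message: string): void => {
        if (seen[message]) {
            return;
        }
        seen[message] = true;
        sink(message);
    };
}

// Example usage with an in-memory sink; a real caller would pass a logger.
const emitted: string[] = [];
const warnOnceExample = createWarnOnce((message) => {
    emitted.push(message);
});
warnOnceExample('Custom layers must be re-added after a context restore.');
warnOnceExample('Custom layers must be re-added after a context restore.');
```

The second call is dropped, so a restore that happens repeatedly would not flood the console with the same hint.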

In what scenario does this happen, such that all this code is needed? I'm a bit lost about the motivation for this PR, which is slowly becoming bigger and bigger...
