Skip to content

Enable probes. #3268

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Apr 24, 2025
Merged

Enable probes. #3268

merged 24 commits into from
Apr 24, 2025

Conversation

polina-c
Copy link
Contributor

@polina-c polina-c commented Apr 22, 2025

Contributes to #3254

@polina-c polina-c marked this pull request as draft April 22, 2025 01:41
Copy link

github-actions bot commented Apr 22, 2025

PR Health

Breaking changes ✔️
Package Change Current Version New Version Needed Version Looking good?
Changelog Entry ✔️
Package Changed Files

Changes to files need to be accounted for in their respective changelogs.

Coverage ✔️
File Coverage

This check for test coverage is informational (issues shown here will not fail the PR).

API leaks ✔️

The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.

Package Leaked API symbols
License Headers ✔️
// Copyright (c) 2025, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
Files
no missing headers

All source files should start with a license header.

Unrelated files missing license headers
Files
pkgs/samples/lib/brick_breaker.dart
pkgs/samples/lib/fibonacci.dart
pkgs/samples/lib/google_ai.dart
pkgs/samples/lib/hello_world.dart
pkgs/samples/lib/main.dart
pkgs/samples/lib/sunflower.dart

@polina-c polina-c marked this pull request as ready for review April 22, 2025 02:40
@polina-c polina-c requested a review from johnpryan April 22, 2025 02:40
@polina-c polina-c marked this pull request as draft April 22, 2025 02:44
@polina-c polina-c marked this pull request as ready for review April 22, 2025 02:47
@polina-c polina-c marked this pull request as draft April 22, 2025 02:52
@polina-c polina-c removed the request for review from johnpryan April 22, 2025 03:07
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding testing to the dartpad front-end is fantastic!

I'm curious - how stable are these golden images? They seem relatively semantic (likely pretty stable?).

Copy link
Contributor Author

@polina-c polina-c Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The images are little different on different environments. And, with flutter changes the images may modify.
So, yes, there is some level of flackiness. It is worth it, because of number of errors the goldens are catching.

The flakiness can be tuned by golden diff tolerance that is set in flutter_test_config.dart.

It is how most apps implement it.

@polina-c polina-c requested a review from johnpryan April 22, 2025 18:23
@polina-c polina-c marked this pull request as ready for review April 22, 2025 18:24
@polina-c polina-c requested a review from devoncarew April 22, 2025 18:27

import 'model.dart';

export 'model.dart';

@visibleForTesting
int activeHttpRequests = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this variable? Is there a way to test the app without it? I'm worried that this will cause some maintenance overhead in the future. For example, if we forget to increment / decrement this value in other places in the code where we are making HTTP requests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment why it is needed.

I do not see other simple way how to test without it.

Another option is to to create DartPadHttpClient, that will incapsulate this variable.

But I do not see large difference with what we have now: ServicesClient incapsulates it pretty well and updates for all types of the http requests: get, post, stream etc.

If things change and we will forget to support this variable, we will get tests failing, debug them and figure out how to make it better.

For now it seems to be ok.

I am open for other suggestions.

Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a comment explaining why it's needed - could you help me understand why we need this? Does pumpAndSettle not run the app long enough or something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my machine, I'm able to comment out the await waitForRequestsToComplete(tester); line, and the test still passes, so I'm wondering if we can take this out for now?

Copy link
Contributor Author

@polina-c polina-c Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If failed for me number of times. It will introduce flakiness. You do not want it. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea, maybe we could run the tests against a local server instead? Maybe that would reduce flakiness here.

Copy link
Contributor Author

@polina-c polina-c Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reduced flackiness is still flackiness. I wrapped standard http client with DartPadHttpClient. So, now no way to forget something.

About what is 'right': there is no 'right' and 'wrong' here. Every app and product choose their way to test things. Some test environments to not allow http requests, some allow them. Flutter tests allow them, and some flutter developers are doing them.

Yes, I see howserver_test.dart is ramping up the backend locally: server = await EndpointsServer.serve(0, sdk, null, 'nnbd_artifacts');, but:

  • it will fail for AI requests
  • If i test UI against it, it will not ensure backward compatibility with what is hosted in prod. Yes, we try to release backend and front end roughly at the same time, but there is race condition, and one of the releases may fail.

For v0 of testing I would stick with real requests because:

  1. It will double check our backend is working as expected. If we switch to local backend, we will stop testing prod is working.
  2. I can take care about cloud logging (is it _logger.info or something else?). However, while DartPad internal and external developers people work with DartPad, they generate more fake traffic, than unit tests would do.
  3. To ensure ui/backend parity we just need changes to be backward compatible. It is good thing, because it naturally enforces the the PRs to be roll-back safe.
  4. I am not sure how to ramp up local backend in unit tests without exposing api keys. Or, we can ramp it up knowing that we will not test AI, that may start failing any time. It seems to be good thing that we will be alerted about AI failures by daily test run (do we want to make it hourly?).

It is possible to have testing for both local and prod backend, with local one skipping ai testing, but, assuming we need testing for prod anyway, I do not see why we may want to add testing for fake backend.

Thoughts?

Copy link
Contributor Author

@polina-c polina-c Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general my point is:

  1. This solution is simplest and cheapest possible, covers things end-to-end, and ready to merge
  2. It is better than no UI testing at all
  3. We can turn off testing or improve it any time when we see something is wrong with it
  4. But before improving it I want to know where is the issue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, thank you for offering to improve our testing in DartPad. This PR was not quite what I was expecting after our discussion last week, so I want to make sure we are aligned on the overall testing strategy first before we jump to solutions that don't work in the long-term.

Overall, I don't think this is a very valuable test for a few reasons:

  1. It's flaky, and requires extra code to track HTTP requests to reduce the flakiness
  2. It's a golden test, which is primarily used for testing UI fidelity. It's not the right tool for testing the backend or other UI functionality.
  3. It uses the actual DartPad backend, which messes up our user metrics (which we desperately need right now, since we don't have a proper Dart analytics package to use).

About what is 'right': there is no 'right' and 'wrong' here. Every app and product choose their way to test things. Some test environments to not allow http requests, some allow them. Flutter tests allow them, and some flutter developers are doing them.

I worry that it's not maintainable in the long-term, and it's important that we make sure the changes we are making align with our overall testing strategy, since we are interested in adding tests so that we can confidently make changes to the codebase over the long-term (starting with #3235). There's a lot of functionality that this PR doesn't include that I think are important (application state, compilation + running, hot reload, user interaction, having testable code structure, etc.)

It will double check our backend is working as expected. If we switch to local backend, we will stop testing prod is working.
I am not sure how to ramp up local backend in unit tests without exposing api keys. Or, we can ramp it up knowing that we will not test AI, that may start failing any time. It seems to be good thing that we will be alerted about AI failures by daily test run (do we want to make it hourly?).

This is why we need to align on our testing strategy, I don't think this is the right approach for testing the genUI backend. We should have a separate tool for a backend health-check. I don't think a golden test is the right tool for the job here.

We can turn off testing or improve it any time when we see something is wrong with it

Sure, but I see some problems that I'd like to resolve before we jump to that step.

But before improving it I want to know where is the issue

I think we should first specify how we want our UI tests to work before we start fixing issues with them. Otherwise, we will be moving in the wrong direction.

Copy link
Contributor Author

@polina-c polina-c Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version took couple hours to put together. Now I know what is possible and we kicked of the discussion.

Thank you for your comments and, yes, let's align on the test strategy.

@polina-c polina-c requested a review from johnpryan April 23, 2025 01:24
@polina-c polina-c closed this Apr 23, 2025
@polina-c
Copy link
Contributor Author

polina-c commented Apr 24, 2025

Aligned on tech details here: #3270

@polina-c polina-c reopened this Apr 24, 2025
@polina-c polina-c changed the title Enable test widgets. Enable probes. Apr 24, 2025
@@ -653,7 +665,7 @@ class DartPadAppBar extends StatelessWidget implements PreferredSizeWidget {
bottom: bottom,
actions: [
// Hide the Install SDK button when the screen width is too small.
if (constraints.maxWidth > smallScreenWidth)
if (constraints.maxWidth >= minLargeScreenWidth)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these changes intentional? What's the reason behind changing the breakpoint behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to set width to this value and I want screen to be still wide, to test for overload.
This was my idea, that did not work out, because test rendering sizes are little different than on mac and web, may be because they are on ubuntu.
But it is still good, because eventually we may want to test for overload for web platform.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this affect the user experience?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, not 'overload', but 'overflow'.

Overflow means there may be visual cuts that developer did not expect. In debug mode it is error, in prod it is not.

See: https://medium.com/@ugamakelechi501/solving-widget-overflow-issues-in-flutter-layouts-a-beginner-s-guide-d56b5de5af2d

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change of > to >= will not change user experience.

@polina-c polina-c requested a review from johnpryan April 24, 2025 17:28
Copy link
Contributor

@johnpryan johnpryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@polina-c polina-c merged commit ed5c28b into dart-lang:main Apr 24, 2025
13 checks passed
@polina-c polina-c deleted the test-widgets branch April 24, 2025 22:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants