-
Notifications
You must be signed in to change notification settings - Fork 2
feat: add lazy loading #81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -20,7 +20,7 @@ class DimensionType(Enum, metaclass=FlexibleEnumMeta): | |||
) | |||
|
|||
|
|||
@dataclass(frozen=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since models can change now after they're lazy loaded, dataclasses aren't frozen anymore.
assert dims == metric.dimensions | ||
assert model_list_equal(dims, metric.dimensions) | ||
|
||
with subtests.test("measures"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were missing an integration test for measures, oops...
} | ||
notLazyOptionalA { | ||
...fragmentA | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note how the .gql_fragments()
method does not return lazyA
nor manyA
if lazy=True
@mirnawong1 I think we might need to update our docs for this once it gets merged and released! |
no worries, thanks for the tag @serramatutu ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💥 Awesome stuff!
This commit introduces lazy fetching of GraphQL fields. Now, the `GraphQLFragmentMixin.get_fragments()` method has a new `lazy` argument, which will make it skip certain fields that are considered "large". A field is lazy-loadable when: 1. It is a `List`, `Union` or `Optional` of `GraphQLFragmentMixin`. 2. It is not marked as `NOT_LAZY`. This will make a difference when fetching things like metrics. In "eager" mode, the client will fetch all subfields of each metrics, including dimensions and entities, which makes the response potentially very large. Now, if the client is "lazy", the `.metrics()` method will only return the metrics themselves, and the `dimensions` and `entities` fields will be empty. Certain things like saved query exports don't need lazy fields as their child objects are not large, so it's worth it to just fetch everything in one go. I added two tests for this functionality. One is to make sure that the `get_fragments()` method returns the expected GraphQL fragments for lazy fields. The other is to ensure that all lazy-loadable fields have a default value which can be used to initialize the property locally when it's not initialized from server data. In the next commit, I'll wire this through the client to make it actually work in the APIs.
This commit adds a private `_client` property to all `GraphQLFragmentMixin` which will get auto-populated by the loading client. This is so that methods such as `Metric.load_dimensions()` will be able to refer back to the client to make requests.
This is for type checking
35870f8
to
9c37e03
Compare
hey @serramatutu - looks like this is merged now and will work on a pr! |
thank you @mirnawong1 !!! |
hey @serramatutu , i've created a docs pr to add lazy loading to the python sdk docs - can you give a look when you have a chance just to make sure it's looking ok? |
Summary
This PR introduces new functionality to allow models to lazy load certain large fields if users want to. This is only used in
Metric.dimensions
,Metric.entities
andMetric.measures
for now, but can easily be expanded to other fields in the future if we deem necessary.There are no breaking changes.
End-user API
This is what it looks like from an end user perspective:
The
asyncio
equivalent is:Base implementation
The bulk of the work happened in
base.py
, andmetric.py
uses the new functionality in theMetric
model (more on that later). Other changes are tests and "plumbing" of thelazy
paramater which needs to get passed around.I added a
_lazy_loadable_fields
property that gets added to each subclass ofGraphQLModelMixin
onGraphQLModelMixin._register_subclasses()
. When registering a new subclass, it will iterate over all fields in that subclass and determine whether the field is lazy-loadable. It makes that decision by:NOT_LAZY
in the metadata. If that's true, the field is not added to_lazy_loadable_fields
.Optional[...]
,Union[...]
orList[...]
, then it's also considered not lazy-loadable.GraphQLModelMixin
, then it's considered lazy-loadable and is added to_lazy_loadable_fields
.This makes it possible for us to tag "light" fields as
NOT_LAZY
[1], like I did for saved queries. In that case, I believe it's better to just fetch everything at once to minimize round trip time. I did this mainly because otherwise it would be pretty annoying for the user in some cases where our API has multiple levels of nested objects likesaved_query.query_params.metrics.name
.Then, I made a minor modification to
GraphQLModelMixin.gql_fragments()
, which now accepts alazy: bool
param. Iflazy=True
, it will omit lazy fields from the fragment definition.Finally,
GraphQLModelMixin._register_subclasses()
will createload_{field}()
methods in the object, which will wrap a_load_{field_name}
method with a sync/async loader that will also set the field after the loader returns. These_load_{field_name}
methods will be specific of each model implementation, which can now useself._client
to make requests.To test all this, I added some tests on
test_base.py
to make sure that the_lazy_loadable_fields
property gets initialized properly for subclasses, and that the GraphQL fragments contain the expected GraphQL text and dependencies depending if we're usinglazy
or not.I also added some sanity check tests to assist developers in catching bad implementations in unit tests instead of having to wait for integration tests to fail with runtime errors. It ensures every lazy loadable field needs a default value (like an empty list, or
None
etc) and a corresponding_load_{field}
method.[1] It is perfectly valid to ask why I made the default be lazy, while
NOT_LAZY
is opt-in, and not the other way around. My rationale was that if we ever add new fields to the API, we want to autodetect if they are supposed to be lazy (i.e return a list), and make developers have to opt-out if they think it's unnecessary. This way we hopefully won't end up with slow fields in the future.Metric
implementationI made
dimensions
,entities
andmeasures
lazy. I had to create two new classes:SyncMetric
andAsyncMetric
, which are only for annotating the return type ofload_{field}
depending on the client. The sync clients returnSyncMetric
while the async clients returnAsyncMetric
. This is all for typing only, and at runtime everything is just regularMetric
. I made both of these classes inherit fromABC
to make sure users don't use them directly, and I added docs to warn them that they shouldn't.Integration testing
I added a new test for lazy loading dimensions, entities and measures of a metric in our integration test suite. The regular "eager" tests continue to work normally.
Docs
I added a new example to
examples/
, and I added a new section to our README briefly explaining when/why to use lazy-loading.