feat: add lazy loading #81

serramatutu · 2025-04-14T15:41:48Z

Summary

This PR introduces new functionality that lets models lazy load certain large fields if users opt in. This is only used in Metric.dimensions, Metric.entities and Metric.measures for now, but can easily be expanded to other fields in the future if we deem it necessary.

There are no breaking changes.

End-user API

This is what it looks like from an end user perspective:

from dbtsl import SemanticLayerClient

def main():
    sl = SemanticLayerClient(..., lazy=True)
    with sl.session():
        metrics = sl.metrics()
        metric = metrics[0]
        assert metric.dimensions == []
        loaded = metric.load_dimensions()
        assert len(loaded) > 0
        assert loaded == metric.dimensions
        
main()

The asyncio equivalent is:

import asyncio
from dbtsl.asyncio import AsyncSemanticLayerClient

async def main():
    sl = AsyncSemanticLayerClient(..., lazy=True)
    async with sl.session():
        metrics = await sl.metrics()
        metric = metrics[0]
        assert metric.dimensions == []
        loaded = await metric.load_dimensions()
        assert len(loaded) > 0
        assert loaded == metric.dimensions
        
asyncio.run(main())

Base implementation

The bulk of the work happened in base.py, and metric.py uses the new functionality in the Metric model (more on that later). Other changes are tests and "plumbing" of the lazy parameter, which needs to get passed around.

I added a _lazy_loadable_fields property that gets added to each subclass of GraphQLModelMixin on GraphQLModelMixin._register_subclasses(). When registering a new subclass, it will iterate over all fields in that subclass and determine whether the field is lazy-loadable. It makes that decision by:

1. Checking if the field has NOT_LAZY in the metadata. If that's true, the field is not added to _lazy_loadable_fields.
2. If the field is not Optional[...], Union[...] or List[...], then it's also considered not lazy-loadable.
3. If the inner argument of the type annotation is also a GraphQLModelMixin, then it's considered lazy-loadable and is added to _lazy_loadable_fields.
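
For reviewers who want to see the three rules in one place, here is a minimal standalone sketch. It is not the code in base.py: `Model` stands in for GraphQLModelMixin, and the `NOT_LAZY` metadata key, `is_lazy_loadable` and the example dataclasses are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional, Union, get_args, get_origin, get_type_hints

# Hypothetical metadata key; the real constant lives in dbtsl.
NOT_LAZY = "not_lazy"


class Model:
    """Stand-in for GraphQLModelMixin."""


def is_lazy_loadable(cls, f) -> bool:
    # Rule 1: an explicit NOT_LAZY tag in the field metadata opts out.
    if f.metadata.get(NOT_LAZY, False):
        return False
    hint = get_type_hints(cls)[f.name]
    # Rule 2: only Optional/Union/List annotations are candidates.
    if get_origin(hint) not in (Union, list):
        return False
    # Rule 3: the wrapped type must itself be a model.
    return any(isinstance(a, type) and issubclass(a, Model) for a in get_args(hint))


@dataclass
class Dimension(Model):
    name: str


@dataclass
class Metric(Model):
    name: str
    description: Optional[str] = None
    dimensions: List[Dimension] = field(default_factory=list)
    owner: Optional[Dimension] = field(default=None, metadata={NOT_LAZY: True})


lazy_fields = [f.name for f in fields(Metric) if is_lazy_loadable(Metric, f)]
```

In this sketch only `dimensions` qualifies: `name` is a plain `str`, `description` wraps a non-model type, and `owner` is tagged NOT_LAZY.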

This makes it possible for us to tag "light" fields as NOT_LAZY [1], like I did for saved queries. In that case, I believe it's better to just fetch everything at once to minimize round trip time. I did this mainly because otherwise it would be pretty annoying for the user in some cases where our API has multiple levels of nested objects like saved_query.query_params.metrics.name.

Then, I made a minor modification to GraphQLModelMixin.gql_fragments(), which now accepts a lazy: bool param. If lazy=True, it will omit lazy fields from the fragment definition.
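
Roughly, the effect of the lazy flag on the generated selection can be pictured like this. The sketch is heavily simplified and the names (`ALL_FIELDS`, `LAZY_FIELDS`, `fragment_body`) are hypothetical; the real gql_fragments() also expands nested models into their own fragments, which is left out here.

```python
from typing import List

# Hypothetical field lists; in the PR they are derived from the dataclass.
ALL_FIELDS = ["name", "description", "dimensions", "entities", "measures"]
LAZY_FIELDS = {"dimensions", "entities", "measures"}


def fragment_body(lazy: bool) -> str:
    """Return the field selection, dropping lazy fields when lazy=True."""
    selected: List[str] = [f for f in ALL_FIELDS if not (lazy and f in LAZY_FIELDS)]
    return "{ " + " ".join(selected) + " }"


eager_body = fragment_body(lazy=False)
lazy_body = fragment_body(lazy=True)
```

With lazy=False every field is selected as before; with lazy=True only the non-lazy fields remain in the selection.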

Finally, GraphQLModelMixin._register_subclasses() creates load_{field}() methods on the object. Each one wraps a _load_{field} method with a sync or async loader that also sets the field after the loader returns. These _load_{field} methods are specific to each model implementation, which can now use self._client to make requests.
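
As an illustration of the wrapping idea, here is a sync-only sketch. `make_loader` and the toy `Metric` are hypothetical, the wrapper is attached by hand instead of at subclass registration, and the hard-coded return value stands in for a request made through self._client.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def make_loader(field_name: str) -> Callable:
    """Build load_<field>() that calls _load_<field>() and stores the result."""

    def loader(self):
        value = getattr(self, f"_load_{field_name}")()
        # Cache the loaded value on the model so later reads see it.
        setattr(self, field_name, value)
        return value

    loader.__name__ = f"load_{field_name}"
    return loader


@dataclass
class Metric:
    name: str
    dimensions: List[str] = field(default_factory=list)

    def _load_dimensions(self) -> List[str]:
        # A real implementation would make a request via self._client here.
        return ["metric_time", "customer__region"]


# Attached by hand here; the PR does this when registering subclasses.
setattr(Metric, "load_dimensions", make_loader("dimensions"))

metric = Metric(name="revenue")
before = list(metric.dimensions)
loaded = metric.load_dimensions()
```

An async variant would look the same except that the wrapper is a coroutine that awaits the underlying `_load_<field>()` call.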

To test all this, I added some tests to test_base.py to make sure that the _lazy_loadable_fields property gets initialized properly for subclasses, and that the GraphQL fragments contain the expected GraphQL text and dependencies depending on whether we're using lazy or not.

I also added some sanity check tests to help developers catch bad implementations in unit tests instead of having to wait for integration tests to fail with runtime errors. They ensure every lazy-loadable field has a default value (like an empty list, or None etc.) and a corresponding _load_{field} method.
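
The shape of such a sanity check could be something like the following. This is a sketch, not the test in the PR: `check_lazy_fields` is a hypothetical helper, and the list of lazy field names is passed in by hand instead of being read from _lazy_loadable_fields.

```python
from dataclasses import MISSING, dataclass, field, fields
from typing import List


@dataclass
class Metric:
    name: str
    dimensions: List[str] = field(default_factory=list)

    def _load_dimensions(self) -> List[str]:
        return []


def check_lazy_fields(cls, lazy_field_names) -> List[str]:
    """Return a list of problems; an empty list means the model is well-formed."""
    problems = []
    by_name = {f.name: f for f in fields(cls)}
    for name in lazy_field_names:
        f = by_name[name]
        # A lazy field must be constructible without server data.
        if f.default is MISSING and f.default_factory is MISSING:
            problems.append(f"{cls.__name__}.{name} has no default value")
        # A lazy field must have a matching private loader.
        if not callable(getattr(cls, f"_load_{name}", None)):
            problems.append(f"{cls.__name__} is missing _load_{name}()")
    return problems
```

A unit test can then assert that the returned list is empty for every registered model, so a forgotten default or loader fails fast.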

[1] It is perfectly valid to ask why I made lazy the default and NOT_LAZY the opt-out, and not the other way around. My rationale was that if we ever add new fields to the API, we want to autodetect whether they are supposed to be lazy (i.e. return a list), and make developers opt out if they think it's unnecessary. This way we hopefully won't end up with slow fields in the future.

`Metric` implementation

I made dimensions, entities and measures lazy. I had to create two new classes: SyncMetric and AsyncMetric, which are only for annotating the return type of load_{field} depending on the client. The sync clients return SyncMetric while the async clients return AsyncMetric. This is all for typing only, and at runtime everything is just regular Metric. I made both of these classes inherit from ABC to make sure users don't use them directly, and I added docs to warn them that they shouldn't.
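
To make the typing-only idea concrete, here is a rough sketch. It is not the code in metric.py: the toy `Metric`, the use of `@abstractmethod` to block direct instantiation, and the `sync_metrics` helper are all assumptions for illustration.

```python
from abc import ABC, abstractmethod
from typing import List, cast


class Metric:
    def __init__(self, name: str) -> None:
        self.name = name

    def load_dimensions(self):
        # Placeholder for the generated loader.
        return ["metric_time"]


class SyncMetric(Metric, ABC):
    """Typing-only view of Metric for sync clients; do not instantiate."""

    @abstractmethod
    def load_dimensions(self) -> List[str]: ...


def sync_metrics() -> List[SyncMetric]:
    # At runtime these are plain Metric objects; cast() only informs the type checker.
    return cast(List[SyncMetric], [Metric("revenue")])


metrics = sync_metrics()
```

An AsyncMetric counterpart would annotate the same method as returning an awaitable, so type checkers flag a missing `await` on the async client.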

Integration testing

I added a new test for lazy loading dimensions, entities and measures of a metric in our integration test suite. The regular "eager" tests continue to work normally.

Docs

I added a new example to examples/, and I added a new section to our README briefly explaining when/why to use lazy-loading.

serramatutu · 2025-04-14T16:09:38Z

dbtsl/models/dimension.py

@@ -20,7 +20,7 @@ class DimensionType(Enum, metaclass=FlexibleEnumMeta):
 )


-@dataclass(frozen=True)


Since models can change now after they're lazy loaded, dataclasses aren't frozen anymore.

serramatutu · 2025-04-14T16:12:48Z

tests/integration/test_sl_client.py

-        assert dims == metric.dimensions
+        assert model_list_equal(dims, metric.dimensions)
+
+    with subtests.test("measures"):


We were missing an integration test for measures, oops...

serramatutu · 2025-04-14T16:16:46Z

tests/test_models.py

+        }
+        notLazyOptionalA {
+            ...fragmentA
+        }


note how the .gql_fragments() method does not return lazyA nor manyA if lazy=True

serramatutu · 2025-04-14T16:19:21Z

@mirnawong1 I think we might need to update our docs for this once it gets merged and released!

mirnawong1 · 2025-04-14T16:29:52Z

no worries, thanks for the tag @serramatutu !

DevonFulcher

💥 Awesome stuff!

This commit introduces lazy fetching of GraphQL fields. Now, the `GraphQLFragmentMixin.get_fragments()` method has a new `lazy` argument, which will make it skip certain fields that are considered "large". A field is lazy-loadable when:
1. It is a `List`, `Union` or `Optional` of `GraphQLFragmentMixin`.
2. It is not marked as `NOT_LAZY`.

This will make a difference when fetching things like metrics. In "eager" mode, the client will fetch all subfields of each metric, including dimensions and entities, which makes the response potentially very large. Now, if the client is "lazy", the `.metrics()` method will only return the metrics themselves, and the `dimensions` and `entities` fields will be empty.

Certain things like saved query exports don't need lazy fields as their child objects are not large, so it's worth it to just fetch everything in one go.

I added two tests for this functionality. One is to make sure that the `get_fragments()` method returns the expected GraphQL fragments for lazy fields. The other is to ensure that all lazy-loadable fields have a default value which can be used to initialize the property locally when it's not initialized from server data.

In the next commit, I'll wire this through the client to make it actually work in the APIs.

This commit adds a private `_client` property to all `GraphQLFragmentMixin` which will get auto-populated by the loading client. This is so that methods such as `Metric.load_dimensions()` will be able to refer back to the client to make requests.

mirnawong1 · 2025-04-15T11:57:32Z

hey @serramatutu - looks like this is merged now and will work on a pr!

serramatutu · 2025-04-15T11:58:15Z

thank you @mirnawong1 !!!

mirnawong1 · 2025-04-15T12:38:04Z

hey @serramatutu , i've created a docs pr to add lazy loading to the python sdk docs - can you give a look when you have a chance just to make sure it's looking ok?

serramatutu requested a review from a team as a code owner April 14, 2025 15:41

serramatutu commented Apr 14, 2025

DevonFulcher approved these changes Apr 15, 2025

serramatutu added 8 commits April 15, 2025 13:24

feat: added SyncMetric and AsyncMetric

a081e29

This is for type checking

docs: changie

9dfc4ef

docs: add example for lazy loading

9b77df0

docs: add lazy loading to README

5dbdfee

fix: small error in README and _attach_self_to_parsed_response

48f71cd

test: remove useless test

9c37e03

serramatutu force-pushed the serramatutu/lazy branch from 35870f8 to 9c37e03 Compare April 15, 2025 11:37

serramatutu merged commit de0f9a1 into main Apr 15, 2025
4 checks passed

serramatutu deleted the serramatutu/lazy branch April 15, 2025 11:55

mirnawong1 mentioned this pull request Apr 15, 2025

Add lazy loading for Python SDK dbt-labs/docs.getdbt.com#7199

Merged
