Skip to content

1s -> 200ms response time fix #347

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Zaczero opened this issue Mar 31, 2025 · 16 comments
Closed

1s -> 200ms response time fix #347

Zaczero opened this issue Mar 31, 2025 · 16 comments

Comments

@Zaczero
Copy link
Contributor

Zaczero commented Mar 31, 2025

When profiling the application I have noticed that most of the time is spent on running jsonable encoder from FastAPI. This is a little known fact about FastAPI, but it's much more efficient for methods to return Response instances rather than Python objects which need to be processed through jsonable encoder (to handle nested Pydantic instances etc.). Having applied the fix below, we have noticed a significant performance improvement, especially on large search responses, and I would like to share this "fix" with the upstream:

https://gist.github.com/Zaczero/00f3a2679ebc0a25eb938ed82bc63553

It basically wraps all compatible endpoints, to return Response instances directly (thus skipping the jsonable encoder logic). Cons: this breaks Pydantic support for response objects but it could be worked around by improving the wrapper logic but since it's not used here (afaik), it doesn't really matter.

@jonhealy1
Copy link
Collaborator

Hi @Zaczero. I personally think this would be a good modification.

@rhysrevans3
Copy link
Collaborator

Could this be added to the core stac-fastapi sounds like it could be useful to other backends?

@jonhealy1
Copy link
Collaborator

We could add it here first and then see what they think?

@vincentsarago
Copy link
Member

It basically wraps all compatible endpoints, to return Response instances directly (thus skipping the jsonable encoder logic). Cons: this breaks Pydantic support for response objects but it could be worked around by improving the wrapper logic but since it's not used here (afaik), it doesn't really matter.

well this means that without modification no pydantic validation would be done 🤷

I'm a bit surprised about the title of this issue, could you share a bit more how you got this and which endpoint you're hitting?

Can you pin point exactly what/where fastapi is slow?
I think Backend Client should return Dictionaries so end up here https://github.com/fastapi/fastapi/blob/master/fastapi/encoders.py#L264-L298

@Zaczero
Copy link
Contributor Author

Zaczero commented Apr 7, 2025

I simply measured it. There is a pretty high cost to using JSONable encoder which supports encoding Pydantic models (which are not used currently), compared to encoding the response in the highly optimized orjson (ORJSONResponse) or the Python built-in JSON C module (JSONResponse). It makes such a big difference that in my projects I never write responses in Pydantic models, and I patch it out—it's simply not worth the performance loss (at least for production-scale apps). Sadly, FastAPI doesn't make that obvious; I don't know if it's even documented. The link you sent is correct: when an endpoint returns a Response object, all of this code is skipped (and as you read it you will notice it's not cheap).

@vincentsarago
Copy link
Member

in https://stac-utils.github.io/stac-fastapi/benchmarks.html we benchmark both with pydantic validation (dict -> pydantic model -> json) and without (dict -> json).

Note: we're not using ORJSON but in stac-fastapi the default JSONResponse will use orsjon if found in the env.

Image Image

for a /items with 1000 items there is not a huge difference between both responses time (number of iterations)

@Zaczero
Copy link
Contributor Author

Zaczero commented Apr 7, 2025

@vincentsarago I looked at the benchmark code and it doesn't seem to do measure what I'm describing here. But I could be wrong. It only affects the enable_response_models field but doesn't change whether endpoints return a dict/list object or Response (where the performance difference lies). I think it would be nice to run those benchmarks with/without the patch applied as they should produce different results. Even if you don't return a Pydantic model/enforce one, FastAPI will run the response data through the jsonable_encoder (which is slow) and the simplest way around that (I found so far) is to make all endpoints return Response objects directly.

@vincentsarago
Copy link
Member

I looked at the benchmark code and it doesn't seem to do measure what I'm describing here

Yeah sorry I realized that after posting 😓, but if you wanted to benchmark your solution you could use the same approach.

I don't work on the elasticsearch backend, I'm only here because it was mentioned that we could implement that in stac-fastapi directly, which I'm not sure it would be 1. easy 2. should be encouraged.

IMO users wanting to avoid fastapi serialization should create their own client

class NoSerialCoreClient(CoreClient):
    def item_collection(
        self,
        collection_id: str,
        bbox: Optional[List[Union[float, int]]] = None,
        datetime: Optional[Union[str, datetime]] = None,
        limit: int = 10,
        token: str = None,
        **kwargs,
    ) -> GeoJSONResponse:  # type: ignore
        data = super().item_collection(collection_id, bbox, datetime, limit, token, **kwargs)
        return GeoJSONResponse(data)

FYI: in stac-utils/stac-fastapi#650 we removed direct response from the endpoint to make sure the users could use validation

Here is a quick benchmark which I've done modifying the stac-fastapi benchmark.

------------------------------- benchmark 'Items (1000)': 3 tests --------------------------------
Name (time in ms)           Min                Max               Mean             Median          
--------------------------------------------------------------------------------------------------
direct_response          1.4680 (1.0)       2.7764 (1.0)       1.5690 (1.0)       1.5281 (1.0)    
fastapi                 29.6050 (20.17)    53.4827 (19.26)    31.9962 (20.39)    29.9022 (19.57)  
pydantic_validation     30.0490 (20.47)    57.5301 (20.72)    42.3902 (27.02)    30.5405 (19.99)  
--------------------------------------------------------------------------------------------------

--------------------------------------- benchmark 'Items (250)': 3 tests ---------------------------------------
Name (time in us)              Min                    Max                   Mean                Median          
----------------------------------------------------------------------------------------------------------------
direct_response           696.4999 (1.0)       1,050.0000 (1.0)         762.5093 (1.0)        751.2080 (1.0)    
fastapi                 7,755.6659 (11.14)    11,278.7921 (10.74)     8,088.3039 (10.61)    7,868.5409 (10.47)  
pydantic_validation     7,769.4170 (11.15)    30,841.5830 (29.37)    10,509.7958 (13.78)    7,979.8329 (10.62)  
----------------------------------------------------------------------------------------------------------------

@Zaczero
Copy link
Contributor Author

Zaczero commented Apr 7, 2025

users wanting to avoid fastapi serialization should create their own client

In my opinion, most users would prefer high performance over some rarely used serialization features (to my knowledge, I've yet to see a project using them). It's much better to make this an opt-in rather than opt-out feature. As you can see, we're talking about ~500% performance improvement here. The STAC is also intended to operate at scale, so I think it's a no-brainer. The patch code I uploaded in the issue can easily be switched on/off behind a boolean flag without any code changes. In case someone accidentally or intentionally uses Pydantic-specific serialization features, it will simply error out, making it easy to spot.

@vincentsarago
Copy link
Member

vincentsarago commented Apr 7, 2025

https://github.com/falkben/fastapi/blob/2106ac47872bd3ad7a7f9072b71f4cbc63d00983/docs/en/docs/advanced/custom-response.md#use-orjsonresponse

it was part of the FastAPI doc at one point 😅

ref fastapi/fastapi#8165 (comment)

I agree that performance could be better

@Zaczero @jonhealy1 how do you feel about moving this issue to stac-fastapi?

@Zaczero
Copy link
Contributor Author

Zaczero commented Apr 7, 2025

I don't mind it as long as it doesn't take too much time to integrate. My end goal is to simply have faster response times because currently they're not very good.

@vincentsarago
Copy link
Member

I don't mind it as long as it doesn't take too much time to integrate

well that depends on quality of PR/Tests and discussion with maintainers of project 😅

My end goal is to simply have faster response times because currently they're not very good.

As mentioned, you can already achieve this with a customized application 🤷

@Zaczero
Copy link
Contributor Author

Zaczero commented Apr 7, 2025

We applied this patch privatelly. But ideally, we would want to avoid divergence from the upstream, if that makes sense.

@jonhealy1
Copy link
Collaborator

@vincentsarago maybe we could duplicate this issue? stac-fastapi-elasticsearch was partly created in the hopes of improving on the performance of stac-fastapi-pgstac. I think this is a change we would definitely want to see here, even if it doesn't make it to the stac-fastapi parent repository.

@vincentsarago
Copy link
Member

@Zaczero would you be willing to add a PR in stac-fastapi to add your patch? I can do it but I don't want to spoil you from being the author 🤷

@vincentsarago
Copy link
Member

I've implemented this in stac-utils/stac-fastapi#817

please feel free to review 🙏

jonhealy1 added a commit that referenced this issue Apr 22, 2025
**Related Issue(s):**

- #347 

**Description:**

#### v4.0.0a2 release 

#### Added
- Added support for high-performance direct response mode for both
Elasticsearch and Opensearch backends, controlled by the
`ENABLE_DIRECT_RESPONSE` environment variable. When enabled
(`ENABLE_DIRECT_RESPONSE=true`), endpoints return Starlette Response
objects directly, bypassing FastAPI's jsonable_encoder and Pydantic
serialization for significantly improved performance on large search
responses. **Note:** In this mode, all FastAPI dependencies (including
authentication, custom status codes, and validation) are disabled for
all routes. Default is `false` for safety. A warning is logged at
startup if enabled. See [issue
#347](#347)

#### Changed
- Updated test suite to use `httpx.ASGITransport(app=...)` for FastAPI
app testing (removes deprecation warning).
- Updated stac-fastapi parent libraries to 5.2.0. 
- Migrated Elasticsearch index template creation from legacy
`put_template` to composable `put_index_template` API in
`database_logic.py`. This resolves deprecation warnings and ensures
compatibility with Elasticsearch 7.x and 8.x.
- Updated all Pydantic models to use `ConfigDict` instead of class-based
`Config` for Pydantic v2 compatibility. This resolves deprecation
warnings and prepares for Pydantic v3.
- Migrated all Pydantic `@root_validator` validators to
`@model_validator` for Pydantic v2 compatibility.
- Migrated startup event handling from deprecated
`@app.on_event("startup")` to FastAPI's recommended lifespan context
manager. This removes deprecation warnings and ensures compatibility
with future FastAPI versions.

**PR Checklist:**

- [x] Code is formatted and linted (run `pre-commit run --all-files`)
- [x] Tests pass (run `make test`)
- [x] Documentation has been updated to reflect changes, if applicable
- [x] Changes are added to the changelog
@jonhealy1 jonhealy1 mentioned this issue Apr 24, 2025
4 tasks
jonhealy1 added a commit that referenced this issue Apr 24, 2025
**Description:**

**Changes from 3.2.5:**

#### Added
- Added support for dynamically-generated queryables based on
Elasticsearch/OpenSearch mappings, with extensible metadata augmentation
[#351](#351)
- Included default queryables configuration for seamless integration.
[#351](#351)
- Added support for high-performance direct response mode for both
Elasticsearch and Opensearch backends, controlled by the
`ENABLE_DIRECT_RESPONSE` environment variable. When enabled
(`ENABLE_DIRECT_RESPONSE=true`), endpoints return Starlette Response
objects directly, bypassing FastAPI's jsonable_encoder and Pydantic
serialization for significantly improved performance on large search
responses. **Note:** In this mode, all FastAPI dependencies (including
authentication, custom status codes, and validation) are disabled for
all routes. Default is `false` for safety. A warning is logged at
startup if enabled. See [issue
#347](#347)
and [PR
#359](#359).
- Added robust tests for the `ENABLE_DIRECT_RESPONSE` environment
variable, covering both Elasticsearch and OpenSearch backends. Tests
gracefully handle missing backends by attempting to import both configs
and skipping if neither is available.
[#359](#359)

#### Changed
- Refactored database logic to reduce duplication
[#351](#351)
- Replaced `fastapi-slim` with `fastapi` dependency
[#351](#351)
- Changed minimum Python version to 3.9
[#354](#354)
- Updated stac-fastapi api, types, and extensions libraries to 5.1.1
from 3.0.0 and made various associated changes
[#354](#354)
- Changed makefile commands from 'docker-compose' to 'docker compose'
[#354](#354)
- Updated package names in setup.py files to use underscores instead of
periods for PEP 625 compliance
[#358](#358)
  - Changed `stac_fastapi.opensearch` to `stac_fastapi_opensearch`
  - Changed `stac_fastapi.elasticsearch` to `stac_fastapi_elasticsearch`
  - Changed `stac_fastapi.core` to `stac_fastapi_core`
  - Updated all related dependencies to use the new naming convention
- Renamed `docker-compose.yml` to `compose.yml` to align with Docker
Compose V2 conventions
[#358](#358)
- Removed deprecated `version` field from all compose files
[#358](#358)
- Updated `STAC_FASTAPI_VERSION` environment variables to 4.0.0 in all
compose files
[#362](#362)
- Bumped version from 4.0.0a2 to 4.0.0 for the PEP 625 compliant release
[#362](#362)
- Updated dependency requirements to use compatible release specifiers
(~=) for more controlled updates while allowing for bug fixes and
security patches
[#358](#358)
- Removed elasticsearch-dsl dependency as it's now part of the
elasticsearch package since version 8.18.0
[#358](#358)
- Updated test suite to use `httpx.ASGITransport(app=...)` for FastAPI
app testing (removes deprecation warning).
[#359](#359)
- Updated stac-fastapi parent libraries to 5.2.0.
[#359](#359)
- Migrated Elasticsearch index template creation from legacy
`put_template` to composable `put_index_template` API in
`database_logic.py`. This resolves deprecation warnings and ensures
compatibility with Elasticsearch 7.x and 8.x.
[#359](#359)
- Updated all Pydantic models to use `ConfigDict` instead of class-based
`Config` for Pydantic v2 compatibility. This resolves deprecation
warnings and prepares for Pydantic v3.
[#359](#359)
- Migrated all Pydantic `@root_validator` validators to
`@model_validator` for Pydantic v2 compatibility.
[#359](#359)
- Migrated startup event handling from deprecated
`@app.on_event("startup")` to FastAPI's recommended lifespan context
manager. This removes deprecation warnings and ensures compatibility
with future FastAPI versions.
[#361](#361)
- Refactored all boolean environment variable parsing in both
Elasticsearch and OpenSearch backends to use the shared `get_bool_env`
utility. This ensures robust and consistent handling of environment
variables such as `ES_USE_SSL`, `ES_HTTP_COMPRESS`, and
`ES_VERIFY_CERTS` across both backends.
[#359](#359)

#### Fixed
- Improved performance of `mk_actions` and `filter-links` methods
[#351](#351)
- Fixed inheritance relating to BaseDatabaseSettings and ApiBaseSettings
[#355](#355)
- Fixed delete_item and delete_collection methods return types
[#355](#355)
- Fixed inheritance relating to DatabaseLogic and BaseDatabaseLogic, and
ApiBaseSettings
[#355](#355)

**PR Checklist:**

- [x] Code is formatted and linted (run `pre-commit run --all-files`)
- [x] Tests pass (run `make test`)
- [x] Documentation has been updated to reflect changes, if applicable
- [x] Changes are added to the changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants