Support Unified Tracing with `LangfuseConnector` Across Main and Sub-Pipelines in Haystack #1605

immortal3 · 2025-04-04T04:07:14Z

Please provide guidance or documentation on how to configure Haystack pipelines with LangfuseConnector to ensure a single, unified trace across both main pipelines and sub-pipelines. If this functionality is not currently supported, I would like to propose it as a feature request to enhance traceability in complex pipeline setups.

Description:
When using Haystack pipelines that include sub-pipelines alongside the LangfuseConnector, it's currently unclear how to maintain a consistent trace context throughout the entire execution flow.

At present, it appears that each pipeline (main and sub-pipelines) may generate separate traces, which makes it difficult to monitor or debug the full journey of a request from the top-level pipeline into its nested components in Langfuse.

Reproducible Example:

Here's a simplified example to demonstrate the issue. We have a main pipeline and two sub-pipelines. A LangfuseConnector is included in the main pipeline for tracing purposes.

from haystack import Pipeline
...

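# Sub-pipeline 1: Embedding and Retrieval
@component
class RetrievalPipeline:
    @component.output_types(documents=list[Document])
    def run(self, query: str):
        pipeline = Pipeline()
        document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")
        text_embedder = SentenceTransformersTextEmbedder()
        retriever = InMemoryEmbeddingRetriever(document_store=document_store)

        pipeline.add_component("text_embedder", text_embedder)
        pipeline.add_component("retriever", retriever)
        pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

        # Run the nested pipeline and expose the retrieved documents as this component's output
        result = pipeline.run({"text_embedder": {"text": query}})
        return {"documents": result["retriever"]["documents"]}

# Sub-pipeline 2: Answer Generation
@component
class GenerationPipeline:
    @component.output_types(replies=list[str])
    def run(self, query: str, documents: list[Document]):
        pipeline = Pipeline()
        generator = OpenAIGenerator()  # Reads the key from the OPENAI_API_KEY env var by default
        pipeline.add_component("generator", generator)

        # Build a simple prompt from the retrieved documents and the query
        context = "\n".join(doc.content or "" for doc in documents)
        prompt = f"Context:\n{context}\n\nQuestion: {query}"
        result = pipeline.run({"generator": {"prompt": prompt}})
        return {"replies": result["generator"]["replies"]}

# Main pipeline
main_pipeline = Pipeline()

langfuse_connector = LangfuseConnector("Main pipeline")  # The trace name is a required argument
retrieval_pipeline = RetrievalPipeline()
generation_pipeline = GenerationPipeline()

main_pipeline.add_component("langfuse", langfuse_connector)
main_pipeline.add_component("retrieval", retrieval_pipeline)
main_pipeline.add_component("generation", generation_pipeline)

main_pipeline.connect("retrieval.documents", "generation.documents")

query = "What is Haystack?"
main_pipeline.run({"retrieval": {"query": query}, "generation": {"query": query}})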

Expected Behavior:
We would expect a single trace in Langfuse that includes the full execution of the main pipeline along with its sub-pipelines (retrieval and generation steps). This would provide an end-to-end view of the request lifecycle in a unified format.

Additional Notes:
It seems there is a span_handler parameter available in the LangfuseConnector, and we wonder if customizing it could allow propagation of trace context across sub-pipelines. If there are any recommended approaches or workarounds using this, documentation or examples would be greatly appreciated.

Feature Request (if not currently supported):
If unified trace context propagation is not currently supported out of the box, please consider adding this functionality. It would significantly improve the developer experience and observability for users building modular, nested pipelines in Haystack.


sjrl · 2025-04-04T06:44:38Z

Hey @immortal3 thanks for raising!

One request:

Would it be possible for you to try the `OpenTelemetryTracer` (`from haystack.tracing.opentelemetry import OpenTelemetryTracer`) to see if that works? I'm trying to understand whether the workaround shown here also applies in your case. We ran into this issue when trying to capture the traces of Haystack pipelines executed from within SuperComponent, our built-in component that lets you wrap a complete pipeline and use it like a single component.

immortal3 · 2025-04-04T09:29:55Z

@sjrl I tried OpenTelemetryTracer with the Langfuse endpoint, as mentioned in the thread above, but it still does the same thing (multiple traces). On top of that, it logs other operations such as ES, which we don't want since pricing depends on events.

To give you more context, we are already on LangSmith, and with Pipeline everything works quite well, including the sub-pipeline setup mentioned above. Traces come in as a single waterfall structure.

Since we want to optimize latency, we tried switching to AsyncPipeline and tried Langfuse, since it's officially documented in Haystack. So both of these issues are intertwined: #1604

For AsyncPipeline, it would be great if we could get LangSmith working, because UI-wise Langfuse is missing quite a lot. We are using @traceable from LangSmith directly on top of Component.run methods (https://docs.smith.langchain.com/reference/python/run_helpers/langsmith.run_helpers.traceable).

If you're willing to help with LangSmith, should I create a new issue for it?

julian-risch · 2025-04-04T13:10:51Z

(related to deepset-ai/haystack-experimental#217 )

sjrl · 2025-04-08T09:57:42Z

Hey @immortal3 I've opened a PR, #1624, that contains the fix for this issue. Once that's merged and released, our Langfuse integration will properly collect the traces of sub-pipelines under the main one.

immortal3 · 2025-04-08T14:15:08Z

@sjrl After the release, do we have to use SuperComponent to wrap our pipelines to get a single trace, or will the current setup work?

sjrl · 2025-04-08T14:39:57Z

@immortal3 no, your current setup should also work. You can see that the test I made here doesn't use SuperComponents, just custom components that wrap a pipeline.

sjrl · 2025-04-11T05:42:04Z

Hey @immortal3 this has been merged now and is available in langfuse-haystack==0.10.1 here
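For anyone who wants to confirm that their environment already includes the fix, a small check along these lines may help. It is only a sketch: the helper names (`parse_version`, `installed_version`, `has_sub_pipeline_fix`) are made up for illustration, and it assumes plain `X.Y.Z`-style version strings, ignoring any pre-release suffix.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

FIXED_IN = (0, 10, 1)  # the langfuse-haystack release named in this thread


def parse_version(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric parts of an 'X.Y.Z'-style version string."""
    parts = []
    for part in version.split(".")[:3]:
        match = re.match(r"\d+", part)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def installed_version(package: str = "langfuse-haystack") -> Optional[str]:
    """Return the installed version of the package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_sub_pipeline_fix(version: Optional[str]) -> bool:
    """True if the given version is at least the release that contains the fix."""
    return version is not None and parse_version(version) >= FIXED_IN


print(has_sub_pipeline_fix(installed_version()))
```

If this prints `False`, upgrading the package to 0.10.1 or later should pick up the change.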

immortal3 · 2025-04-11T05:44:09Z

Great. Thanks for quickly jumping in and fixing it. 🙌

immortal3 added the feature request Ideas to improve an integration label Apr 4, 2025

immortal3 mentioned this issue Apr 4, 2025

AsyncPipeline Creates Multiple Traces Instead of a Single Unified Trace in Langfuse #1604

Closed

julian-risch added the P1 label Apr 4, 2025

julian-risch assigned sjrl Apr 7, 2025

sjrl mentioned this issue Apr 8, 2025

feat: Unify traces of sub-pipelines within pipelines with Langfuse #1624

Merged

sjrl linked a pull request Apr 9, 2025 that will close this issue

feat: Unify traces of sub-pipelines within pipelines with Langfuse #1624

Merged

sjrl closed this as completed in #1624 Apr 11, 2025
