Community Elixir SDK for Langfuse - Open source LLM observability, tracing, and prompt management.
Note: This is an unofficial community-maintained SDK, not affiliated with or endorsed by Langfuse GmbH.
- Tracing - Create traces, spans, generations, and events for LLM observability
- Scoring - Attach numeric, categorical, and boolean scores to traces and observations
- Sessions - Group related traces into conversations
- Prompts - Fetch, cache, and compile version-controlled prompts
- Client API - Full REST API access for datasets, models, and management
- OpenTelemetry - Optional integration for distributed tracing
- Instrumentation - Macros for automatic function tracing
- Data Masking - Redact sensitive data before sending to Langfuse
- Async Batching - Non-blocking event ingestion with configurable batching
Add `langfuse` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:langfuse, "~> 0.1.0"}
  ]
end
```

For OpenTelemetry integration, add the optional dependencies:
```elixir
def deps do
  [
    {:langfuse, "~> 0.1.0"},
    {:opentelemetry_api, "~> 1.4"},
    {:opentelemetry, "~> 1.5"}
  ]
end
```

Configure Langfuse in your `config/config.exs`:
```elixir
config :langfuse,
  public_key: "pk-...",
  secret_key: "sk-...",
  host: "https://cloud.langfuse.com"
```

Or use environment variables:
```shell
export LANGFUSE_PUBLIC_KEY="pk-..."
export LANGFUSE_SECRET_KEY="sk-..."
export LANGFUSE_HOST="https://cloud.langfuse.com"
```

| Option | Type | Default | Description |
|---|---|---|---|
| public_key | string | - | Langfuse public key (or LANGFUSE_PUBLIC_KEY) |
| secret_key | string | - | Langfuse secret key (or LANGFUSE_SECRET_KEY) |
| host | string | https://cloud.langfuse.com | Langfuse API host |
| environment | string | nil | Environment tag (e.g., "production", "staging") |
| enabled | boolean | true | Enable/disable SDK |
| flush_interval | integer | 5000 | Batch flush interval in ms |
| batch_size | integer | 100 | Maximum events per batch |
| max_retries | integer | 3 | HTTP retry attempts |
| debug | boolean | false | Enable debug logging |
| mask_fn | function | nil | Custom function for masking sensitive data |
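Putting the options together, a per-environment setup might look like the sketch below. The option names mirror the table above; the specific values and the choice of `config/prod.exs` are illustrative assumptions, not requirements.

```elixir
# config/prod.exs — tag events from this node as "production" and
# tune batching for higher throughput (example values, not defaults).
import Config

config :langfuse,
  environment: "production",
  flush_interval: 2_000,
  batch_size: 200
```

Because these are compile-time config entries, secrets are better supplied via the `LANGFUSE_*` environment variables shown above.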
```elixir
# Start a trace for the request
trace = Langfuse.trace(
  name: "chat-request",
  user_id: "user-123",
  metadata: %{source: "api"},
  version: "1.0.0",
  release: "2025-01-15"
)

# Create a span for a retrieval step
span = Langfuse.span(trace,
  name: "document-retrieval",
  type: :retriever,
  input: %{query: "test"}
)

span = Langfuse.update(span, output: retrieved_docs)
span = Langfuse.end_observation(span)

# Record an LLM generation
generation = Langfuse.generation(trace,
  name: "chat-completion",
  model: "gpt-4",
  input: [%{role: "user", content: "Hello"}],
  model_parameters: %{temperature: 0.7}
)

generation = Langfuse.update(generation,
  output: %{role: "assistant", content: "Hi there!"},
  usage: %{input: 10, output: 5, total: 15}
)

generation = Langfuse.end_observation(generation)

# Score the trace
Langfuse.score(trace, name: "quality", value: 0.9)
```

Spans support semantic types for better organization in the Langfuse UI:
```elixir
Langfuse.span(trace, name: "agent-loop", type: :agent)
Langfuse.span(trace, name: "tool-call", type: :tool)
Langfuse.span(trace, name: "rag-chain", type: :chain)
Langfuse.span(trace, name: "doc-search", type: :retriever)
Langfuse.span(trace, name: "embed-text", type: :embedding)
Langfuse.span(trace, name: "generic-step", type: :default)
```

Group related traces into sessions:
```elixir
session_id = Langfuse.Session.new_id()

trace1 = Langfuse.trace(name: "turn-1", session_id: session_id)
trace2 = Langfuse.trace(name: "turn-2", session_id: session_id)

Langfuse.Session.score(session_id, name: "satisfaction", value: 4.5)
```

Fetch and use prompts from Langfuse:
```elixir
# Fetch the latest version, a specific version, or a labeled version
{:ok, prompt} = Langfuse.Prompt.get("my-prompt")
{:ok, prompt} = Langfuse.Prompt.get("my-prompt", version: 2)
{:ok, prompt} = Langfuse.Prompt.get("my-prompt", label: "production")

# Compile the template with variables
compiled = Langfuse.Prompt.compile(prompt, %{name: "Alice", topic: "weather"})

# Link the prompt to a generation
generation = Langfuse.generation(trace,
  name: "chat",
  prompt_name: prompt.name,
  prompt_version: prompt.version,
  input: compiled
)
```

Prompts are cached by default. To invalidate:
```elixir
Langfuse.Prompt.invalidate("my-prompt")
Langfuse.Prompt.invalidate("my-prompt", version: 2)
Langfuse.Prompt.invalidate_all()
```

Use fallback prompts when fetching fails:
```elixir
fallback = %Langfuse.Prompt{
  name: "my-prompt",
  prompt: "Default template: {{name}}",
  type: :text
}

{:ok, prompt} = Langfuse.Prompt.get("my-prompt", fallback: fallback)
```

Score traces, observations, or sessions:
```elixir
# Numeric score
Langfuse.score(trace, name: "quality", value: 0.85)

# Categorical score
Langfuse.score(trace,
  name: "sentiment",
  string_value: "positive",
  data_type: :categorical
)

# Boolean score
Langfuse.score(trace,
  name: "hallucination",
  value: false,
  data_type: :boolean
)

# Score with a comment and metadata
Langfuse.score(trace,
  name: "feedback",
  value: 5,
  comment: "Excellent response",
  metadata: %{reviewer: "human"}
)
```

This SDK covers the core Langfuse API. See the Langfuse API Reference for full documentation.
| Feature | Function | Status |
|---|---|---|
| Create trace | Langfuse.trace/1 | Supported |
| Create span | Langfuse.span/2 | Supported |
| Create generation | Langfuse.generation/2 | Supported |
| Create event | Langfuse.event/2 | Supported |
| Create score | Langfuse.score/2 | Supported |
| Update observation | Langfuse.update/2 | Supported |
| End observation | Langfuse.end_observation/1 | Supported |
| Batch ingestion | Langfuse.Ingestion | Supported |

| Operation | Function | Status |
|---|---|---|
| Get prompt | Client.get_prompt/2 | Supported |
| List prompts | Client.list_prompts/1 | Supported |
| Create prompt | Client.create_prompt/1 | Supported |
| Update labels | Client.update_prompt_labels/3 | Supported |

| Operation | Function | Status |
|---|---|---|
| Create dataset | Client.create_dataset/1 | Supported |
| Get dataset | Client.get_dataset/1 | Supported |
| List datasets | Client.list_datasets/1 | Supported |
| Delete dataset | Client.delete_dataset/1 | Supported |

| Operation | Function | Status |
|---|---|---|
| Create item | Client.create_dataset_item/1 | Supported |
| Get item | Client.get_dataset_item/1 | Supported |
| Update item | Client.update_dataset_item/2 | Supported |
| List items | Client.list_dataset_items/1 | Supported |
| Delete item | Client.delete_dataset_item/1 | Supported |

| Operation | Function | Status |
|---|---|---|
| Create run | Client.create_dataset_run/1 | Supported |
| Get run | Client.get_dataset_run/2 | Supported |
| List runs | Client.list_dataset_runs/2 | Supported |
| Delete run | Client.delete_dataset_run/2 | Supported |
| Create run item | Client.create_dataset_run_item/1 | Supported |
| List run items | Client.list_dataset_run_items/1 | Supported |

| Operation | Function | Status |
|---|---|---|
| Get trace | Client.get_trace/1 | Supported |
| List traces | Client.list_traces/1 | Supported |
| Get session | Client.get_session/1 | Supported |
| List sessions | Client.list_sessions/1 | Supported |

| Operation | Function | Status |
|---|---|---|
| Get observation | Client.get_observation/1 | Supported |
| List observations | Client.list_observations/1 | Supported |

| Operation | Function | Status |
|---|---|---|
| Create score | Langfuse.score/2 | Supported |
| Get score | Client.get_score/1 | Supported |
| List scores | Client.list_scores/1 | Supported |
| Delete score | Client.delete_score/1 | Supported |

| Operation | Function | Status |
|---|---|---|
| Create config | Client.create_score_config/1 | Supported |
| Get config | Client.get_score_config/1 | Supported |
| List configs | Client.list_score_configs/1 | Supported |

| Operation | Function | Status |
|---|---|---|
| Create model | Client.create_model/1 | Supported |
| Get model | Client.get_model/1 | Supported |
| List models | Client.list_models/1 | Supported |
| Delete model | Client.delete_model/1 | Supported |

| Operation | Function | Status |
|---|---|---|
| Auth check | Langfuse.auth_check/0 | Supported |
| Health check | Client.get("/api/public/health") | Via raw API |
The following Langfuse API features are not yet implemented but can be accessed via Client.get/2, Client.post/2, Client.patch/2, and Client.delete/1:
- Annotation Queues
- Comments
- Media (file uploads)
- Metrics
- Projects management
- Organizations management
- SCIM provisioning
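As a sketch of that raw access, the example below fetches and creates comments. The `/api/public/comments` path and parameter names are assumptions based on the Langfuse REST API and have not been verified against this SDK; check the Langfuse API Reference before relying on them.

```elixir
# Hypothetical: read comments attached to a trace via the raw client.
# The endpoint path and query/body shapes are assumptions.
{:ok, comments} =
  Langfuse.Client.get("/api/public/comments",
    objectType: "TRACE",
    objectId: trace.id
  )

# Hypothetical: create a comment on the same trace.
{:ok, _comment} =
  Langfuse.Client.post("/api/public/comments", %{
    objectType: "TRACE",
    objectId: trace.id,
    content: "Needs review"
  })
```

The same pattern applies to the other unimplemented areas (media, metrics, projects, and so on) by swapping in the corresponding endpoint path.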
```elixir
# Verify credentials
{:ok, _} = Langfuse.auth_check()

# Datasets
{:ok, dataset} = Langfuse.Client.create_dataset(name: "eval-set")
{:ok, datasets} = Langfuse.Client.list_datasets()

{:ok, item} = Langfuse.Client.create_dataset_item(
  dataset_name: "eval-set",
  input: %{query: "test"},
  expected_output: %{answer: "response"}
)

{:ok, _} = Langfuse.Client.update_dataset_item(item["id"], status: "ARCHIVED")

{:ok, run} = Langfuse.Client.create_dataset_run(
  dataset_name: "eval-set",
  name: "experiment-1"
)

# Models
{:ok, model} = Langfuse.Client.create_model(
  model_name: "gpt-4-turbo",
  match_pattern: "(?i)^(gpt-4-turbo)$",
  input_price: 0.01,
  output_price: 0.03,
  unit: "TOKENS"
)

{:ok, models} = Langfuse.Client.list_models()

# Observations
{:ok, observations} = Langfuse.Client.list_observations(trace_id: trace.id)
{:ok, observation} = Langfuse.Client.get_observation(observation_id)

# Prompts
{:ok, prompt} = Langfuse.Client.get_prompt("my-prompt", version: 1)

# Score configs
{:ok, config} = Langfuse.Client.create_score_config(
  name: "quality",
  data_type: "NUMERIC",
  min_value: 0,
  max_value: 1
)
```

Use macros for automatic function tracing:
```elixir
defmodule MyApp.Agent do
  use Langfuse.Instrumentation

  @trace name: "agent-run"
  def run(input) do
    process(input)
  end

  @span name: "process-step", type: :chain
  def process(input) do
    call_llm(input)
  end

  @generation name: "llm-call", model: "gpt-4"
  def call_llm(input) do
    # LLM call here
  end
end
```

For applications using OpenTelemetry, Langfuse can receive spans via a custom span processor:
```elixir
config :opentelemetry,
  span_processor: {Langfuse.OpenTelemetry.SpanProcessor, []}
```

Or configure programmatically:
```elixir
Langfuse.OpenTelemetry.Setup.configure()
```

Map OpenTelemetry attributes to Langfuse fields:
```elixir
require OpenTelemetry.Tracer, as: Tracer

Tracer.with_span "llm-call", %{attributes: %{
  "langfuse.type" => "generation",
  "langfuse.model" => "gpt-4",
  "langfuse.input" => Jason.encode!(messages),
  "langfuse.output" => Jason.encode!(response)
}} do
  # Your code here
end
```

See `Langfuse.OpenTelemetry` for full documentation.
Redact sensitive data before sending to Langfuse:
```elixir
config :langfuse,
  mask_fn: &MyApp.Masking.mask/1
```

```elixir
defmodule MyApp.Masking do
  def mask(data) do
    Langfuse.Masking.mask(data,
      patterns: [
        # Email addresses
        ~r/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/,
        # US SSN-style identifiers
        ~r/\b\d{3}-\d{2}-\d{4}\b/
      ],
      replacement: "[REDACTED]"
    )
  end
end
```

Or use the built-in masking:
```elixir
config :langfuse,
  mask_fn: {Langfuse.Masking, :mask, [[
    patterns: [~r/secret_\w+/i],
    keys: ["password", "api_key", "token"]
  ]]}
```

The SDK emits telemetry events for observability:
| Event | Measurements | Metadata |
|---|---|---|
| `[:langfuse, :ingestion, :flush, :start \| :stop \| :exception]` | `duration` | `batch_size` |
| `[:langfuse, :http, :request, :start \| :stop \| :exception]` | `duration` | `method`, `path`, `status` |
| `[:langfuse, :prompt, :fetch, :start \| :stop \| :exception]` | `duration` | `name`, `version` |
| `[:langfuse, :prompt, :cache, :hit \| :miss]` | - | `name`, `version` |
```elixir
require Logger

:telemetry.attach(
  "langfuse-logger",
  [:langfuse, :http, :request, :stop],
  fn _event, measurements, metadata, _config ->
    duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
    Logger.info("Langfuse HTTP #{metadata.method} #{metadata.path}: #{duration_ms}ms")
  end,
  nil
)
```

Or attach the built-in logger:

```elixir
Langfuse.Telemetry.attach_default_logger()
```

The SDK provides helpers for testing applications that use Langfuse:
```elixir
config :langfuse, enabled: false
```

```elixir
defmodule MyApp.TracingTest do
  use ExUnit.Case
  import Langfuse.Testing

  setup do
    start_supervised!({Langfuse.Testing.EventCapture, []})
    :ok
  end

  test "traces are created" do
    MyApp.Agent.run("test input")

    assert_traced("agent-run")
    assert_generation_created("llm-call", model: "gpt-4")
  end
end
```

For mocking HTTP calls:
```elixir
Mox.defmock(Langfuse.HTTPMock, for: Langfuse.HTTPBehaviour)

config :langfuse, http_client: Langfuse.HTTPMock
```

The SDK automatically flushes pending events on application shutdown. For explicit control:
```elixir
# Flush with the default timeout
Langfuse.flush()

# Flush with a custom timeout (ms)
Langfuse.flush(timeout: 10_000)

# Shut down the SDK (pending events are flushed)
Langfuse.shutdown()
```

Reload configuration at runtime (useful for feature flags):
```elixir
Application.put_env(:langfuse, :enabled, false)
Langfuse.Config.reload()
```

MIT License - see LICENSE for details.