
EventHub Writer fails due to Throttling of EventHub, configuration settings have no impact.  #679

@steffenmarschall

Bug Report:

Actual Behavior

We have a rather large streaming DataFrame (42,000,000 rows) that we want to send to our Azure Event Hub. The Event Hub is scaled to 15 throughput units (TUs).
However, every run that tries to send this data fails because the Event Hub throttles the requests. The exception shown is:

StreamingQueryException: [STREAM_FAILED] Query [id = ..., runId = ...] terminated with exception: Job aborted due to stage failure: Task XX in stage 9.0 failed 4 times, most recent failure: Lost task 61.3 in stage 9.0 (TID 1963) (10.179.0.21 executor 7): com.microsoft.azure.eventhubs.ServerBusyException: The request was terminated because the entity is being throttled. Error code : 50002. Sub error : 101. Please wait 4 seconds and try again. To know more visit https://aka.ms/sbResourceMgrExceptions and https://aka.ms/ServiceBusThrottling
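
The exception text itself asks the caller to wait and try again. As a rough illustration only, here is a minimal Python sketch of retrying a send with exponential backoff. `send_batch`, `ThrottledError`, and the delay values are hypothetical stand-ins and are not part of the spark-eventhubs connector; whether the connector retries internally is not established in this report.

```python
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error such as ServerBusyException."""


def send_with_backoff(send_batch, batch, max_attempts=5, base_delay=4.0, sleep=time.sleep):
    """Call send_batch(batch), retrying on ThrottledError with exponential backoff.

    base_delay defaults to 4 seconds, the wait the error message asks for.
    Returns the number of attempts used; re-raises after max_attempts failures.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send_batch(batch)
            return attempt
        except ThrottledError:
            if attempt == max_attempts:
                raise
            # Wait 4 s, 8 s, 16 s, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.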

We tried to lower the sending rate with the following options:

  • maxEventsPerTrigger (e.g. to 100)
  • eventhubs.threadPoolSize (e.g. to 1)
  • eventhubs.operationTimeout (e.g. to 15 minutes)

However, none of these had any measurable impact on the sending rate to the Event Hub.
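
For comparison, explicit client-side pacing would look roughly like the sketch below: a small Python helper that caps the number of events released per second. It is a hypothetical illustration of rate limiting in general (the `Pacer` class and its parameters are invented for this example) and does not describe how the connector interprets the options listed above.

```python
import time


class Pacer:
    """Hypothetical fixed-window limiter: at most max_per_second events per one-second window."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.max_per_second = max_per_second
        self.clock = clock
        self.sleep = sleep
        self.window_start = clock()
        self.sent_in_window = 0

    def acquire(self, n=1):
        """Block until n more events fit into the current one-second window."""
        now = self.clock()
        if now - self.window_start >= 1.0:
            # A new window has begun; reset the counter.
            self.window_start = now
            self.sent_in_window = 0
        if self.sent_in_window + n > self.max_per_second:
            # Window is full: wait for it to end, then start a fresh one.
            self.sleep(max(0.0, 1.0 - (now - self.window_start)))
            self.window_start = self.clock()
            self.sent_in_window = 0
        self.sent_in_window += n
```

A sender would call `acquire(len(batch))` before each send, so the cap is enforced regardless of how fast upstream data arrives.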

Additional Info:

We stream from a Delta table; each version usually adds ~42,000,000 rows.
We use the AvailableNow trigger and try to checkpoint. However, the job usually fails before reaching any checkpoint.
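
To put these numbers in perspective, here is a back-of-the-envelope estimate in Python. It assumes the standard-tier ingress limit of 1,000 events per second per TU, that each row becomes exactly one event, and that events are small enough for the event-count limit (rather than the 1 MB/s per TU limit) to bind. None of these assumptions is confirmed by the report, and `rows_per_batch` is a hypothetical cap.

```python
import math

ROWS = 42_000_000              # rows per Delta table version, from the report
THROUGHPUT_UNITS = 15          # from the report
EVENTS_PER_TU_PER_SEC = 1_000  # assumption: standard-tier ingress limit per TU


def min_send_seconds(rows, tus, events_per_tu=EVENTS_PER_TU_PER_SEC):
    """Lower bound on send time if every row is one event and the limit is fully used."""
    return rows / (tus * events_per_tu)


def batches_needed(rows, rows_per_batch):
    """Number of micro-batches if each batch is capped at rows_per_batch rows."""
    return math.ceil(rows / rows_per_batch)


seconds = min_send_seconds(ROWS, THROUGHPUT_UNITS)
print(f"at least {seconds:.0f} s (~{seconds / 60:.1f} min) per version")
```

Under these assumptions a single version cannot be sent in much less than about 47 minutes without exceeding the limit. Because progress is checkpointed per micro-batch, smaller batches would give the job more frequent points to resume from.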

Expected behavior

Adjusting these settings lowers or raises throughput when writing to Azure Event Hubs.

Please let us know how to configure the EventHubWriter so that we can send large amounts of data without failing due to throttling.

Configuration

  • Databricks/Spark version: 12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12)
  • spark-eventhubs artifactId and version: com.microsoft.azure:azure-eventhubs-spark_2.12:2.3.22
