
Event Hubs

First published by Atif Alam

Azure Event Hubs is a fully managed, real-time data streaming platform capable of processing millions of events per second. It’s Azure’s equivalent of Apache Kafka or AWS Kinesis.

| Service | Model | Throughput | Retention | Best for |
|---|---|---|---|---|
| Event Hubs | Streaming (partitioned log) | Millions/sec | 1–90 days (or Capture) | Telemetry, logs, analytics |
| Service Bus | Message broker (queues/topics) | Thousands/sec | Up to 14 days | Transactional messaging |
| Event Grid | Event routing (push) | Millions/sec | 24 hr retry | Reacting to events |
| Queue Storage | Simple queue | Moderate | 7 days | Basic task queuing |

Rule of thumb: Use Event Hubs when you need high-throughput, ordered, replayable event streaming. Use Service Bus when you need reliable message processing with features like sessions, dead-letter, and transactions.

Namespace, Event Hub, Partitions, Consumer Groups

Event Hubs Namespace (container, billing unit)
└── Event Hub: "telemetry-events" (like a Kafka topic)
    ├── Partition 0: [e1] [e4] [e7] [e10] ...
    ├── Partition 1: [e2] [e5] [e8] [e11] ...
    ├── Partition 2: [e3] [e6] [e9] [e12] ...
    ├── Consumer Group: "$Default" (each CG reads all partitions independently)
    └── Consumer Group: "analytics"
| Concept | Description |
|---|---|
| Namespace | Container for one or more Event Hubs; defines the billing and throughput tier |
| Event Hub | A named stream (equivalent to a Kafka topic) |
| Partition | Ordered sequence of events; enables parallel reads; partition key determines placement |
| Consumer group | Independent view of the stream; each group tracks its own offset per partition |
| Event | A data record (body + properties + metadata) |

Events with the same partition key go to the same partition, guaranteeing order for that key:

Partition key: "device-123" → always goes to Partition 1
Partition key: "device-456" → always goes to Partition 0
Within Partition 1, events for device-123 are strictly ordered.
Events across partitions have no ordering guarantee.
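The key-to-partition mapping is deterministic. A toy sketch of the idea — Event Hubs uses its own internal hash function, so the specific partition numbers below are illustrative, not what the service would actually assign:

```python
import hashlib

def assign_partition(partition_key: str, partition_count: int) -> int:
    """Toy illustration only: Event Hubs' real hash is internal to the service.

    The property it demonstrates is real, though: a stable hash of the
    partition key, mod the partition count, sends the same key to the
    same partition every time.
    """
    # hashlib gives a stable digest (Python's built-in hash() is salted per process)
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count

# Same key -> same partition, which is what makes per-key ordering possible
assert assign_partition("device-123", 4) == assign_partition("device-123", 4)
```

This is also why changing the partition count reshuffles key placement: the modulus changes, so existing keys can land on different partitions.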
# Create a namespace
az eventhubs namespace create \
  --resource-group myapp-rg \
  --name myapp-events \
  --sku Standard \
  --location eastus

# Create an Event Hub with 4 partitions, 7-day retention
az eventhubs eventhub create \
  --resource-group myapp-rg \
  --namespace-name myapp-events \
  --name telemetry \
  --partition-count 4 \
  --message-retention 7

# Create a consumer group
az eventhubs eventhub consumer-group create \
  --resource-group myapp-rg \
  --namespace-name myapp-events \
  --eventhub-name telemetry \
  --name analytics
Sending events with the Python SDK:

from azure.eventhub import EventHubProducerClient, EventData
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
producer = EventHubProducerClient(
    fully_qualified_namespace="myapp-events.servicebus.windows.net",
    eventhub_name="telemetry",
    credential=credential,
)

# Note: exiting the `with` block closes the client, so do all sends inside one block
with producer:
    # Send a batch of events
    batch = producer.create_batch()
    batch.add(EventData('{"device": "sensor-1", "temp": 22.5}'))
    batch.add(EventData('{"device": "sensor-2", "temp": 23.1}'))
    producer.send_batch(batch)

    # Send with a partition key (order guarantee for this key)
    batch = producer.create_batch(partition_key="device-123")
    batch.add(EventData('{"temp": 22.5, "ts": "2026-02-17T10:00:00Z"}'))
    batch.add(EventData('{"temp": 22.7, "ts": "2026-02-17T10:01:00Z"}'))
    producer.send_batch(batch)
Other ways to produce events:

  • Azure CLI — az eventhubs eventhub send
  • Kafka protocol — Event Hubs supports the Apache Kafka protocol (no code changes for Kafka producers).
  • Azure Functions — Event Hubs output binding.
  • Application Insights / Azure Monitor — can export to Event Hubs.
Receiving events with checkpointing:

from azure.eventhub import EventHubConsumerClient
from azure.eventhub.extensions.checkpointstoreblob import BlobCheckpointStore
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()

# Checkpoint store: tracks which events each consumer group has processed
checkpoint_store = BlobCheckpointStore(
    blob_account_url="https://myappstorage.blob.core.windows.net",
    container_name="eventhub-checkpoints",
    credential=credential,
)

consumer = EventHubConsumerClient(
    fully_qualified_namespace="myapp-events.servicebus.windows.net",
    eventhub_name="telemetry",
    consumer_group="analytics",
    credential=credential,
    checkpoint_store=checkpoint_store,
)

def on_event(partition_context, event):
    body = event.body_as_str()
    print(f"Partition {partition_context.partition_id}: {body}")
    # Checkpoint: save progress so we don't re-process on restart
    partition_context.update_checkpoint(event)

with consumer:
    consumer.receive(on_event=on_event, starting_position="-1")  # "-1" = beginning

The consumer SDK uses Azure Blob Storage to store checkpoints — the last processed offset per partition per consumer group. This enables:

  • Resume after restart — Pick up where you left off.
  • Load balancing — Multiple consumer instances share partitions automatically.
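Conceptually, a checkpoint store is just a durable map from (consumer group, partition) to the last processed offset. A minimal in-memory sketch of that idea — the real BlobCheckpointStore persists the same mapping in Azure Blob Storage so it survives process restarts:

```python
class InMemoryCheckpointStore:
    """Illustration only: a real store must be durable and shared across instances."""

    def __init__(self):
        # (consumer_group, partition_id) -> last processed offset
        self._offsets = {}

    def update_checkpoint(self, consumer_group: str, partition_id: str, offset: int):
        self._offsets[(consumer_group, partition_id)] = offset

    def starting_position(self, consumer_group: str, partition_id: str):
        # Resume from the last checkpoint, or from the beginning ("-1")
        return self._offsets.get((consumer_group, partition_id), "-1")

store = InMemoryCheckpointStore()
store.update_checkpoint("analytics", "0", 41)

# After a restart, the consumer asks the store where to resume each partition:
assert store.starting_position("analytics", "0") == 41    # pick up where we left off
assert store.starting_position("analytics", "1") == "-1"  # never checkpointed: start over
```

Because each consumer group keys its own offsets, "analytics" and "$Default" can sit at completely different positions in the same stream.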
function_app.py
import azure.functions as func
import logging

app = func.FunctionApp()

@app.event_hub_message_trigger(
    arg_name="events",
    event_hub_name="telemetry",
    connection="EventHubConnection",
    consumer_group="$Default",
    cardinality="many",
)
def process_telemetry(events: list[func.EventHubEvent]):
    for event in events:
        logging.info(f"Event: {event.get_body().decode()}")

Event Hubs exposes a Kafka-compatible endpoint — existing Kafka producers and consumers work with zero code changes:

# Kafka producer config pointing to Event Hubs
bootstrap.servers=myapp-events.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=OAUTHBEARER
# ... (Azure identity-based auth)
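In Python, the same settings are just a config dict passed to an unchanged Kafka client. A sketch assuming the confluent-kafka package; it shows the connection-string variant, which uses SASL PLAIN with the literal username "$ConnectionString" (the OAUTHBEARER variant above instead wires in an Azure AD token callback):

```python
def eventhubs_kafka_config(namespace: str, connection_string: str) -> dict:
    """Build a Kafka client config targeting an Event Hubs namespace.

    `namespace` is the Event Hubs namespace name (e.g. "myapp-events");
    `connection_string` is an Event Hubs connection string with send/listen rights.
    """
    return {
        # Event Hubs exposes its Kafka endpoint on port 9093 of the namespace host
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "security.protocol": "SASL_SSL",
        # Connection-string auth: SASL PLAIN, username is literally "$ConnectionString"
        "sasl.mechanism": "PLAIN",
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
    }

conf = eventhubs_kafka_config("myapp-events", "Endpoint=sb://...")
# producer = confluent_kafka.Producer(conf)  # existing Kafka producer code, unchanged
```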

Use cases:

  • Migrate from self-managed Kafka to a fully managed service.
  • Use Kafka ecosystem tools (Kafka Connect, Kafka Streams) with Event Hubs as the backend.

Capture automatically writes events to Azure Blob Storage or Data Lake Storage in Avro format — zero-code archival for batch analytics:

# Enable Capture: flush every 5 minutes (300 s) or every 300 MB
# (314,572,800 bytes), whichever comes first
az eventhubs eventhub update \
  --resource-group myapp-rg \
  --namespace-name myapp-events \
  --name telemetry \
  --enable-capture true \
  --destination-name EventHubArchive.AzureBlockBlob \
  --storage-account myappstorage \
  --blob-container event-capture \
  --capture-interval 300 \
  --capture-size-limit 314572800

Captured data can be queried with Azure Synapse, Databricks, or Data Lake Analytics.
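Captured blobs follow a predictable path convention, so batch jobs can target an exact time slice without listing the whole container. A sketch of the default archive name format (`{Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}`), using the example resource names from earlier; the helper name is illustrative:

```python
from datetime import datetime, timezone

def capture_blob_path(namespace: str, eventhub: str, partition_id: int,
                      window_start: datetime) -> str:
    """Build the blob prefix for a Capture window under the default name format."""
    t = window_start
    # Date/time components are zero-padded, one path segment each
    return (f"{namespace}/{eventhub}/{partition_id}/"
            f"{t.year:04}/{t.month:02}/{t.day:02}/"
            f"{t.hour:02}/{t.minute:02}/{t.second:02}")

path = capture_blob_path("myapp-events", "telemetry", 0,
                         datetime(2026, 2, 17, 10, 0, 0, tzinfo=timezone.utc))
# -> "myapp-events/telemetry/0/2026/02/17/10/00/00"
```

A Synapse or Databricks job can then read only the prefixes for the hours it cares about.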

| Tier | Capacity | Partitions | Retention | Key features |
|---|---|---|---|---|
| Basic | 1–20 TUs | 32 max | 1 day | Low cost, limited features |
| Standard | 1–40 TUs | 32 max | 1–7 days | Consumer groups, Capture, Kafka |
| Premium | Processing Units (PUs) | 100 max | Up to 90 days | Dedicated resources, VNet, dynamic partitions |
| Dedicated | Capacity Units (CUs) | Unlimited | Up to 90 days | Single-tenant, highest throughput |

Throughput Unit (TU): 1 TU = 1 MB/s ingress + 2 MB/s egress (or 1,000 events/s ingress).
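Sizing is then a max-of-three-constraints calculation: whichever of ingress bandwidth, event rate, or egress bandwidth needs the most TUs wins. A quick sketch (the helper name is illustrative):

```python
import math

def required_tus(ingress_mb_s: float, events_per_s: float, egress_mb_s: float) -> int:
    """TUs needed given the three per-TU limits:
    1 MB/s ingress, 1,000 events/s ingress, 2 MB/s egress."""
    return max(
        math.ceil(ingress_mb_s / 1.0),     # ingress bandwidth constraint
        math.ceil(events_per_s / 1000.0),  # ingress event-rate constraint
        math.ceil(egress_mb_s / 2.0),      # egress bandwidth constraint
    )

# e.g. 5 MB/s in, 8,000 events/s, 6 MB/s out -> the event rate dominates: 8 TUs
assert required_tus(5, 8000, 6) == 8
```

Note that many small events can exhaust the 1,000 events/s limit long before the 1 MB/s bandwidth limit, which is one reason to batch.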

Auto-inflate automatically scales TUs up when traffic increases:

az eventhubs namespace update \
  --resource-group myapp-rg \
  --name myapp-events \
  --enable-auto-inflate true \
  --maximum-throughput-units 20
telemetry Event Hub
├── Consumer Group: "realtime" ──► Dashboard service (live metrics)
├── Consumer Group: "analytics" ──► Data pipeline (Synapse, Databricks)
└── Consumer Group: "alerts" ──► Alerting service (threshold checks)

Each consumer group reads the full stream independently — no message loss.

Azure Stream Analytics can process Event Hubs data in real time with SQL-like queries:

-- Detect high temperature from IoT sensors
SELECT DeviceId, AVG(Temperature) as AvgTemp, System.Timestamp() as WindowEnd
FROM telemetry TIMESTAMP BY EventEnqueuedUtcTime
GROUP BY DeviceId, TumblingWindow(minute, 5)
HAVING AVG(Temperature) > 30
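The same tumbling-window aggregation can be sketched in plain Python, which makes the query's semantics concrete: bucket events into fixed, non-overlapping 5-minute windows per device, then keep only the windows whose average exceeds the threshold. Names and sample values below are illustrative:

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # 5-minute tumbling window, like TumblingWindow(minute, 5)

def high_temp_alerts(events, threshold=30.0):
    """events: iterable of (device_id, temperature, epoch_seconds).
    Returns (device_id, window_end, avg_temp) for windows exceeding the threshold."""
    buckets = defaultdict(list)
    for device, temp, ts in events:
        # Tumbling windows are fixed and non-overlapping: each event falls in
        # exactly one window, identified by its aligned start time
        window_start = ts - (ts % WINDOW_SECONDS)
        buckets[(device, window_start)].append(temp)
    return [
        (device, start + WINDOW_SECONDS, sum(temps) / len(temps))
        for (device, start), temps in buckets.items()
        if sum(temps) / len(temps) > threshold  # the HAVING clause
    ]

alerts = high_temp_alerts([
    ("sensor-1", 31.0, 0), ("sensor-1", 33.0, 60),   # avg 32.0 -> alert
    ("sensor-2", 22.0, 0), ("sensor-2", 23.0, 120),  # avg 22.5 -> no alert
])
# -> [("sensor-1", 300, 32.0)]
```

Stream Analytics does this continuously over the live stream; the sketch only shows the per-window math.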
A common pipeline combines Capture for archival with a Function for live processing:

Application ──► Event Hubs ──► Capture ──► Blob Storage (archive)
                    └──► Azure Function ──► Log Analytics / Elastic
| Feature | Event Hubs | Apache Kafka | AWS Kinesis |
|---|---|---|---|
| Managed | Fully managed | Self-managed (or Confluent) | Fully managed |
| Protocol | AMQP + Kafka | Kafka | AWS proprietary |
| Partitions | Up to 100 (Premium) | Unlimited | Up to 500 shards |
| Retention | Up to 90 days | Configurable | Up to 365 days |
| Throughput | TU-based (auto-inflate) | Broker-based | Shard-based |
| Capture/archive | Built-in (Avro to Blob) | Kafka Connect | Firehose (separate service) |
| Consumer groups | Up to 20 (Standard) | Unlimited | Shared/Enhanced fan-out |
  • Event Hubs is for high-throughput event streaming — telemetry, logs, real-time analytics.
  • Events are distributed across partitions; use a partition key for ordering guarantees per key.
  • Consumer groups allow multiple consumers to independently read the full stream.
  • Checkpointing (in Blob Storage) enables resume-after-restart and load balancing.
  • Capture archives events to Blob/Data Lake for batch analytics — zero code.
  • Event Hubs supports the Kafka protocol — migrate from self-managed Kafka with no code changes.
  • Use Event Hubs for streaming; use Service Bus for reliable message processing.