Apache Druid
Real-time analytics database for sub-second queries
This profile covers Apache Druid as a backend tool for engineering teams. Read about its features, pricing, and how it compares to related options in the tools directory.
Description
Apache Druid is an open-source real-time analytics database, created at Metamarkets by Eric Tschetter and Fangjin Yang, designed for fast aggregation queries over large event streams. It ingests data from streaming sources like Kafka and from batch files, then serves slice-and-dice queries with sub-second latency across high-cardinality data. Druid powers user-facing analytics and operational dashboards where many users run interactive queries at once, and its distributed design scales ingestion and querying independently.
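As a rough illustration of the slice-and-dice queries described above, here is a minimal sketch that builds a request for Druid's SQL HTTP endpoint (`/druid/v2/sql`). The cluster address, the `page_events` datasource, and the `country` column are assumptions made up for the example, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: a Druid router or broker is reachable here; adjust for your cluster.
DRUID_URL = "http://localhost:8888"


def build_sql_request(sql: str, base_url: str = DRUID_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request to Druid's SQL endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/druid/v2/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical datasource "page_events" with a "country" dimension.
# __time is Druid's built-in primary timestamp column.
SQL = """
SELECT country, COUNT(*) AS events
FROM page_events
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR
GROUP BY country
ORDER BY events DESC
LIMIT 10
"""

request = build_sql_request(SQL)

# To run against a live cluster, uncomment:
# with urllib.request.urlopen(request) as resp:
#     rows = json.loads(resp.read())
```

Filtering on `__time` first is the usual pattern, since Druid partitions data by time and can skip segments outside the requested interval.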
Key Capabilities:
- Sub-second aggregation queries over large, high-cardinality datasets
- Streaming ingestion from Kafka and Kinesis with exactly-once ingestion semantics
- Batch ingestion from files and object storage
- A columnar storage format with bitmap indexes tuned for analytics
- A distributed architecture that scales ingestion and querying separately
- Native support for time-series and event data
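To make the streaming ingestion capability more concrete, here is a minimal sketch of a Kafka supervisor spec built as a Python dictionary. The helper name `kafka_supervisor_spec`, the datasource, topic, broker address, and column names are all hypothetical, and the field layout reflects the Kafka supervisor spec as commonly documented; check it against the Druid documentation for your version before use.

```python
import json


def kafka_supervisor_spec(datasource, topic, bootstrap_servers, dimensions):
    """Return a dict shaped like a Druid Kafka supervisor spec (illustrative only)."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumption: each event carries an ISO-8601 "timestamp" field.
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": list(dimensions)},
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "NONE",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "type": "kafka",
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": True,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


# Hypothetical values for illustration.
spec = kafka_supervisor_spec(
    datasource="page_events",
    topic="page-events",
    bootstrap_servers="localhost:9092",
    dimensions=["country", "browser"],
)
spec_json = json.dumps(spec, indent=2)
```

A spec like this would typically be submitted with a POST to the supervisor endpoint (`/druid/indexer/v1/supervisor`), after which Druid manages the ingestion tasks that read from the topic.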
Alternative tools
- Anomalo
Automated data quality monitoring with machine learning
- RudderStack
Warehouse-native customer data pipeline and Segment alternative
- Storj
Distributed S3-compatible storage across a global network
- Wasabi
S3-compatible hot cloud storage without egress fees
- Better Auth
Framework-agnostic authentication library for TypeScript
- Ory
Open-source identity, authentication, and access control
