dlt
Open-source Python library for building data pipelines
dlt is profiled here as a Data Ingestion tool for engineering teams. Read about features, pricing, and how it compares to related options in the tools directory.
Description
dlt, the data load tool, is an open-source Python library from dltHub, the company founded by Matthaus Krzykowski and Adrian Brudaru. It loads data from APIs, files, and databases into warehouses and other destinations, handling schema inference, schema evolution, and normalization in code, so pipelines live in the team's own repository. Engineers add a decorator to a Python function and get a pipeline with retries, state, and incremental loads handled for them. Because it runs anywhere Python runs, it fits inside notebooks, scripts, and orchestrators without standing up separate ingestion infrastructure. A REST API connector toolkit lets teams build sources for in-house services in a few lines of code.
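To make "schema inference and evolution" concrete, here is a minimal, standard-library sketch of the idea: column types are guessed from the first value seen, and columns that appear in later rows are added to the schema. The function names (`infer_type`, `evolve_schema`) and the type labels are hypothetical and chosen for illustration; this is not dlt's API or its internal implementation.

```python
from typing import Any

# Hypothetical type labels for illustration; not dlt's type system.
def infer_type(value: Any) -> str:
    """Guess a column type from a single Python value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "bool"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    return "text"

def evolve_schema(schema: dict[str, str], row: dict[str, Any]) -> list[str]:
    """Add columns seen for the first time and return their names."""
    added = []
    for column, value in row.items():
        if value is None:
            continue  # a null carries no type information
        if column not in schema:
            schema[column] = infer_type(value)
            added.append(column)
    return added

schema: dict[str, str] = {}
first = evolve_schema(schema, {"id": 1, "name": "Ada"})
# The second row introduces a new column, so the schema grows.
second = evolve_schema(schema, {"id": 2, "name": "Grace", "score": 9.5})
print(schema)
```

A real loader also has to deal with nested data, type conflicts, and migrating the destination table, all of which this sketch leaves out.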
Key Capabilities:
Pipelines defined as Python functions with minimal boilerplate
Automatic schema inference and evolution on load
Incremental loading with state tracking
Verified sources for common APIs plus a REST API connector toolkit
Destinations including BigQuery, Snowflake, DuckDB, and vector stores
Apache 2.0 licensed; runs anywhere Python runs
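The "incremental loading with state tracking" capability listed above can be pictured with a small standard-library sketch: a cursor value is remembered between runs, and only rows newer than that cursor are loaded. The helper `load_new_rows`, the `updated_at` field, and the `last_value` state key are assumptions made for this example; they do not describe dlt's actual interface.

```python
from typing import Any

def load_new_rows(
    rows: list[dict[str, Any]],
    state: dict[str, Any],
    cursor_field: str = "updated_at",  # hypothetical cursor column
) -> list[dict[str, Any]]:
    """Return rows newer than the stored cursor, then advance the cursor."""
    last_seen = state.get("last_value")
    fresh = [r for r in rows if last_seen is None or r[cursor_field] > last_seen]
    if fresh:
        # Persisting this value between runs is what makes the load incremental.
        state["last_value"] = max(r[cursor_field] for r in fresh)
    return fresh

source = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 2, "updated_at": "2024-01-02"},
]
state: dict[str, Any] = {}

first_run = load_new_rows(source, state)   # nothing stored yet: loads both rows
source.append({"id": 3, "updated_at": "2024-01-03"})
second_run = load_new_rows(source, state)  # loads only the row added since
print([r["id"] for r in second_run])
```

ISO-formatted date strings are used so that plain string comparison orders them chronologically. In a real pipeline the state would be stored durably rather than kept in an in-memory dict.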
Alternative tools
- Fivetran
Fully managed, automated data movement at scale
