Procedural vs Declarative ETL: Delta Live Tables as a Declarative ETL Solution

VivekR
3 min read · Apr 18, 2023


Extract, Transform, Load (ETL). Source: datachannel

ETL (Extract, Transform, Load) is a critical process for organizations that need to move data from one system to another. There are two main approaches to building ETL pipelines: procedural and declarative ETL. In this article, we will explore the differences between these two approaches and provide real-world analogies for each.

Procedural ETL

Procedural ETL involves writing code that explicitly spells out the steps required to move data from source to target. It is a hands-on approach in which developers define each stage of the process: extracting data from source systems, transforming it into a format suitable for the target system, and loading the transformed data into the target. Because it requires extensive coding, procedural ETL is typically more time-consuming and error-prone. Tools commonly used to build procedural pipelines include Apache NiFi, Apache Airflow (an orchestrator for hand-written tasks), and Talend.
For example, suppose you want to build an ETL pipeline to extract data from a CSV file and load it into a SQL database. With procedural ETL, you would have to write code to read the CSV file, parse the data, transform it into SQL format, and then load it into the database.
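The CSV-to-database example above can be sketched in Python. This is a minimal illustration, not a production pipeline: the inline CSV data, the `sales` table, and the column names are all hypothetical, and SQLite stands in for the SQL database. Note how every extract, transform, and load step is written out by hand:

```python
import csv
import io
import sqlite3

# Hypothetical CSV input; in practice this would be a file on disk.
raw = io.StringIO("id,name,amount\n1,alice,10.5\n2,bob,3.25\n")

# Extract: read and parse the CSV row by row.
rows = list(csv.DictReader(raw))

# Transform: cast types and normalize values -- each step is explicit.
records = [(int(r["id"]), r["name"].title(), float(r["amount"])) for r in rows]

# Load: create the target table and insert the transformed rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, name TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
conn.commit()

print(conn.execute("SELECT name, amount FROM sales ORDER BY id").fetchall())
# [('Alice', 10.5), ('Bob', 3.25)]
```

Every change to the source format or target schema means revisiting this code by hand, which is exactly the maintenance burden procedural ETL carries.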

Declarative ETL

Declarative ETL is a more abstract approach that focuses on the desired outcome of the process. The developer declares the end state of the transformed data, and the ETL tool generates the steps needed to reach it, so there is no need to write low-level code. Examples of tools that support declarative ETL include AWS Glue, Google Cloud Dataflow, and Azure Data Factory.
For example, suppose you want to build an ETL pipeline to extract data from a CSV file and load it into a SQL database. With declarative ETL, you would define the desired end state of the transformed data in terms of the schema of the SQL database. The ETL tool would then automatically generate the code to read the CSV file, transform the data, and load it into the database.
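The declarative idea can be illustrated with a toy Python "engine". Everything here is an assumption for illustration: the `target_spec` dictionary, the `load_csv` helper, and the `sales` schema are invented to show the pattern, not taken from any real tool. The developer states only *what* the target should look like; the engine derives the extract, transform, and load steps:

```python
import csv
import io
import sqlite3

# Declarative spec: WHAT the target should look like, not HOW to build it.
# Table name, columns, and types here are illustrative assumptions.
target_spec = {
    "table": "sales",
    "columns": {"id": int, "name": str, "amount": float},
}

def load_csv(source, spec, conn):
    """Toy engine: derive extract/transform/load steps from the spec."""
    cols = spec["columns"]
    ddl_types = {int: "INTEGER", str: "TEXT", float: "REAL"}
    ddl = ", ".join(f"{c} {ddl_types[t]}" for c, t in cols.items())
    conn.execute(f"CREATE TABLE {spec['table']} ({ddl})")
    rows = [
        tuple(cast(r[c]) for c, cast in cols.items())
        for r in csv.DictReader(source)
    ]
    placeholders = ", ".join("?" for _ in cols)
    conn.executemany(f"INSERT INTO {spec['table']} VALUES ({placeholders})", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load_csv(io.StringIO("id,name,amount\n1,alice,10.5\n"), target_spec, conn)
print(conn.execute("SELECT * FROM sales").fetchall())
# [(1, 'alice', 10.5)]
```

Changing the target schema now means editing the spec, not rewriting pipeline code; real declarative tools apply the same principle at much larger scale.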

Real World Analogies

A real-world analogy for procedural ETL could be building a car from scratch. You would have to source all the individual components, such as the engine, transmission, wheels, and seats, and then assemble them together to build the car. Similarly, in procedural ETL, you would have to write code to extract, transform, and load the data.

A real-world analogy for declarative ETL could be ordering a custom-built car. You would specify the desired features, such as the color, engine size, and interior, and the car manufacturer would build the car to those specifications. Similarly, in declarative ETL, you would define the desired end state of the transformed data, and the ETL tool would automatically generate the code to transform the data into that end state.

Delta Live Tables as Declarative ETL

Delta Live Tables is a declarative ETL framework because it allows developers to define the desired end state of the data, and the runtime automatically generates and manages the processing needed to reach that state. With Delta Live Tables, developers simply declare each target table and the query that produces it, and the framework handles the complex tasks of data ingestion, dependency resolution between tables, and data transformation. This enables developers to focus on high-level data modeling, without worrying about the low-level details of the ETL process.
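In Databricks, a Delta Live Tables pipeline is typically declared in Python with the `dlt` module. The sketch below only runs inside a configured DLT pipeline (where `spark` is provided implicitly), and the table name and source path are hypothetical. The developer declares the table and its defining query; ingestion, orchestration, and dependency tracking are left to the framework:

```python
import dlt
from pyspark.sql.functions import col

@dlt.table(comment="Cleaned sales records")  # declares a target table
def clean_sales():
    # We state only the query that defines the table; DLT decides
    # how and when to run it and tracks dependencies between tables.
    return (
        spark.read.format("csv")
        .option("header", "true")
        .load("/data/sales.csv")  # hypothetical source path
        .withColumn("amount", col("amount").cast("double"))
    )
```

This mirrors the declarative pattern above: the function body describes the desired end state of `clean_sales`, not the step-by-step mechanics of producing it.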

If you found the article to be helpful, you can buy me a coffee here:
Buy Me A Coffee.


Written by VivekR

Data Engineer, Big Data Enthusiast and Automation using Python
