How much do you know about Domain-Driven Design (DDD)? It's an approach to software development in which the language and structure of the code match the business domain. The concept comes from Eric Evans' 2003 book Domain-Driven Design: Tackling Complexity in the Heart of Software, and it influences software architects, information architects, data engineers, and computer science professionals who organize code and solve some seriously stressful software problems.
Domain-Driven Design has been hugely successful, with real benefits for modeling business logic. But what happens when you add Extract, Transform, and Load (ETL) into the mix?
Are you looking for a powerful ETL tool to support domain-driven projects? Integrate.io has a solution. This no-code point-and-click platform streamlines ETL so you can focus on the datasets that grow your enterprise and improve decision-making. Learn more here.
- What is Domain-Driven Design?
- ETL in a DDD Context
- How Integrate.io Data Transformation Helps DDD
- Conclusion
What is Domain-Driven Design?
DDD aligns closely with the microservices architecture style, which structures an application as a cluster of independently deployable, loosely coupled services, in contrast to the traditional monolithic style that structures an application as a single unit. Think of a microservices application like a car made up of independent components: brakes, wheels, tires, and so on. DDD is similar, but its components are domain objects such as entities and value objects.
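To make those two building blocks concrete, here's a minimal sketch in Python (the `Money` and `Customer` classes are invented for illustration, not taken from any particular codebase). An entity has an identity that persists while its attributes change; a value object is defined entirely by its attributes.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4

@dataclass(frozen=True)
class Money:
    """Value object: immutable and defined only by its attributes.
    Two Money objects with the same amount and currency are
    interchangeable; there is no identity beyond the values."""
    amount_cents: int   # store cents to avoid float rounding
    currency: str

@dataclass
class Customer:
    """Entity: has a stable identity (id) that outlives attribute changes.
    Renaming the customer does not make it a different customer."""
    name: str
    id: UUID = field(default_factory=uuid4)

    def rename(self, new_name: str) -> None:
        self.name = new_name   # same entity, new attribute value

price_a = Money(amount_cents=1999, currency="USD")
price_b = Money(amount_cents=1999, currency="USD")
assert price_a == price_b      # value objects compare by value

alice = Customer(name="Alice")
alice.rename("Alice B.")       # entity keeps its identity
```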
The purpose of DDD? To create an abstract model of a problem domain (the body of knowledge a team must understand to solve a problem) that can then be implemented with various technologies. The methodology centers on the core domain and its logic, making it easier to build complex designs from domain models. There's less emphasis on documentation and more on a team's collective learning experience.
DDD boasts many benefits: a shared language between developers and domain experts, code that mirrors the business it serves, and loosely coupled modules that are easier to test and evolve.
But how does DDD apply to data?
Because DDD views an application as a cluster and not a single unit, proponents of this concept encourage decentralizing traditional monolithic architecture and rethinking data locality and ownership. Instead of moving data from domains to a centralized data lake or warehouse (like Snowflake or Microsoft Azure SQL), domains should serve and host their own datasets.
"Instead of imagining data flowing from media players into some sort of centralized place for a centralized team to receive, why not imagine a player domain owning and serving their datasets for access by any team for any purpose downstream?" says Zhamak Dehghani, a principal technology consultant at global software consultancy ThoughtWorks. She proposes we shift our thinking from "push and ingest" to a "serving and pull model" across domains. There will be fewer dependencies and better data ownership execution and orchestration.
This revolutionary approach turns the concept of data ownership on its head, suggesting teams duplicate data across domains and transform each copy into the shape that suits its domain.
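Here's a rough sketch of that difference in Python (the class and function names are invented for illustration). Instead of every domain pushing raw records into one central ingest path, each domain keeps its data and publishes a small, consumer-friendly read interface that downstream teams pull from.

```python
from typing import Iterable

# "Push and ingest": every domain ships raw records to one central store,
# and a central team owns all downstream transformation.
def central_ingest(central_store: list, records: Iterable[dict]) -> None:
    central_store.extend(records)

# "Serve and pull": the domain owns its data and publishes a read-only,
# consumer-shaped view that any team can pull for any downstream purpose.
class PlayerDomain:
    """Hypothetical media-player domain that serves its own dataset."""

    def __init__(self) -> None:
        self._events: list[dict] = []   # raw data stays inside the domain

    def record_play(self, user_id: str, track_id: str) -> None:
        self._events.append({"user": user_id, "track": track_id})

    def serve_play_counts(self) -> dict[str, int]:
        """The published dataset: shaped for consumers, not raw events."""
        counts: dict[str, int] = {}
        for event in self._events:
            counts[event["track"]] = counts.get(event["track"], 0) + 1
        return counts

player = PlayerDomain()
player.record_play("u1", "t42")
player.record_play("u2", "t42")
assert player.serve_play_counts() == {"t42": 2}   # downstream team pulls
```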
ETL in a DDD Context
Extract, Transform, Load is the bedrock of data integration, moving data from sources that aren't optimized for big data analytics into destination systems that are. Data-driven teams used ETL pipelines across countless business domains long before Eric Evans introduced DDD. The process works like this (a minimal code sketch follows the list):
- ETL extracts and aggregates data from a source that doesn't support analytics, such as metadata from a database, sales data from Oracle, or web services data from an open-source relational database.
- It transforms the data into a consistent, analytics-ready format.
- It loads the data to a final destination — typically a data warehouse.
- The result: data engineers have clean, usable data for analytics and business algorithms.
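Here's what those steps look like as a minimal Python sketch, using two in-memory SQLite databases to stand in for the source and the warehouse (the `sales` and `sales_fact` tables are invented for the example):

```python
import sqlite3

def extract(source: sqlite3.Connection) -> list[tuple]:
    """Extract: pull raw rows from a source not built for analytics."""
    return source.execute(
        "SELECT id, amount_cents, region FROM sales").fetchall()

def transform(rows: list[tuple]) -> list[tuple]:
    """Transform: normalize the rows into an analytics-friendly shape."""
    return [(row_id, cents / 100.0, region.upper())
            for row_id, cents, region in rows]

def load(warehouse: sqlite3.Connection, rows: list[tuple]) -> None:
    """Load: write the cleaned rows to the destination table."""
    warehouse.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    warehouse.commit()

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (id INTEGER, amount_cents INTEGER, region TEXT)")
source.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                   [(1, 1250, "east"), (2, 980, "west")])

warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales_fact (id INTEGER, amount REAL, region TEXT)")

load(warehouse, transform(extract(source)))
print(warehouse.execute("SELECT * FROM sales_fact").fetchall())
# [(1, 12.5, 'EAST'), (2, 9.8, 'WEST')]
```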
So far, so good.
But few proponents of the DDD approach mention the benefits of ETL. Or mention ETL at all.
ETL complements DDD in various ways. Teams implementing domain-driven projects deal with millions of objects and terabytes of data. Much of that data is historical, and it sits across many databases and sources. Instead of decentralization, where domains serve and host their own datasets, it's often far easier to process historical data through big data pipelines and keep the data behind domain models in a centralized destination. In this context, at least, "push and ingest" is more effective than "serve and pull."
Eric Evans conceptualized DDD back in 2003, and the principles of the approach have grown since. Teams adopting DDD now need to contextualize historical data to understand the logic and patterns behind their domain and data models. This is where ETL comes in. Teams can pull data from sources into a lake or warehouse and then into a data analytics tool without affecting the quality of their domain models. These additional data insights help, rather than hinder, the domain-driven approach.
ETL processes aren't perfect. For example, problems persist when representing data that comes from relational databases, probably the most common data source in ETL. Applications written in object-oriented programming (OOP) languages like Python or C++ represent data in very specific ways, and those representations map awkwardly onto tables. When some objects hold by-reference attributes, others are composed of other objects, and others do something else entirely, it gets messy.
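A small Python example illustrates the mismatch (the `Order` and `Customer` classes are invented for illustration). In the object model, an `Order` holds its `Customer` by reference; a relational store has to flatten that reference into a foreign-key column and split the nested object into its own table, and rebuilding the object graph afterward is exactly the kind of work an ETL pipeline has to get right.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    id: int
    name: str

@dataclass
class Order:
    id: int
    customer: Customer   # by-reference attribute: a nested object

alice = Customer(id=1, name="Alice")
order = Order(id=100, customer=alice)

# Relational representation: the object reference becomes a foreign key,
# and the nested object becomes a row in a separate table.
customers_row = (order.customer.id, order.customer.name)   # customers table
orders_row = (order.id, order.customer.id)                 # orders table (FK)

# Round-tripping back to objects means re-joining rows: the classic
# object-relational impedance mismatch that ETL has to handle.
rebuilt = Order(id=orders_row[0],
                customer=Customer(id=customers_row[0], name=customers_row[1]))
assert rebuilt == order
```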
Eric Evans talked about part of this problem in his book, calling for a "ubiquitous language" that improves communication between software developers and domain experts. NoSQL databases, non-tabular stores that organize data differently from relational databases, later emerged and sidestep many of these OOP mapping issues. (As it happens, the "NoSQL" name was popularized in 2009 by a different Eric Evans, a Rackspace developer, not the author of DDD.)
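By contrast with the relational layout above, a document store keeps the whole object graph together. Here's the same order as a single JSON document, sketched with Python's standard library (a real NoSQL database would store and index such a document natively):

```python
import json

# The nested Order/Customer graph from above as one self-contained document.
order_doc = {
    "id": 100,
    "customer": {"id": 1, "name": "Alice"},   # nested object, not a foreign key
}

# A document database stores this as-is; no join is needed to read it back.
stored = json.dumps(order_doc)
assert json.loads(stored)["customer"]["name"] == "Alice"
```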
Ultimately, it all depends on the type of ETL you use. The right one removes language ambiguities, handles data without compromising domain model integrity, and centralizes data in a way that optimizes the DDD lifecycle.
How Integrate.io Data Transformation Helps DDD
Integrate.io is a no-code, cloud-based, point-and-click ETL solution that moves data from various stores and sources, transforms it into usable, standardized formats, and loads it to a final destination for real-time business intelligence and analytics. Used in a DDD context, Integrate.io prepares raw data for domains driven by analytics, working alongside Domain-Driven Design rather than replacing it.
With Integrate.io's simple API, domain-driven teams can process relational data, CSV, weblogs, CRM data, SaaS data, JSON, and other formats for more effective analysis at scale. The platform also removes common ETL pain points, such as manual pipeline automation and the OOP mapping issues described above. As a technology-agnostic platform, this powerful ETL tool ingests data of all types, regardless of age or quality.
Integrate.io allows teams to be data-driven as well as domain-driven.
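For a sense of how triggering a pipeline from code can look, here is a deliberately generic sketch using Python's `requests` library. The endpoint, token, and payload fields below are placeholders, not Integrate.io's actual API; consult the platform's own API documentation for the real interface.

```python
import requests

# Placeholder values: swap in your ETL platform's real endpoint,
# authentication scheme, and payload schema from its API docs.
API_URL = "https://api.example-etl.com/v1/jobs"   # hypothetical endpoint
API_TOKEN = "YOUR_API_TOKEN"

def trigger_pipeline(pipeline_id: str) -> dict:
    """Kick off a pipeline run and return the job metadata."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={"pipeline_id": pipeline_id},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```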
Recommended Reading: How Can Integrate.io Assist You as a Solution Architect?
Conclusion
DDD and ETL exist in harmony, as long as you choose the right tool. While some DDD proponents call for a decentralized "serve and pull" approach to data flows, keeping data in a centralized destination proves more productive for many teams of solution architects, enabling the historical data metrics and visualizations that improve software processes and functionality.
While Eric Evans didn't mention the ETL workflow in his original book, the notion of DDD has since expanded, and data integration has become a necessity for most teams. ETL, therefore, adds another layer of benefits for domain-driven organizations.
How can Integrate.io benefit your domain-driven approach? Click here to schedule an intro call.