This is a guest post with exclusive content by Bill Inmon. Bill Inmon “is an American computer scientist, recognized by many as the father of the data warehouse. Inmon wrote the first book, first magazine column, held the first conference, and was the first to offer classes in data warehousing.” — Wikipedia.
Our critical points:
- Data warehousing requires data integration
- Integration is complex, risky, hard to do, imprecise, and requires research
- A data warehouse's major value is having a foundation of integrated data
Data warehouses are the whack-a-mole of technology. Like the carnival game in which the mole sticks its head up out of a random hole and you take a whack at it, data warehousing just keeps popping up, sometimes in the unlikeliest of places.
This whack-a-mole act of data warehousing is especially impressive because there is no vendor or organization behind data warehousing. Data warehouses are supported solely by the end user. There is no committee, no company, no organization that sits around and makes decisions about any data warehouse. The data warehouse has a life of its own.
So, who tried to kill the data warehouse? Who kept taking swings at the ever-reappearing mole that kept randomly popping out of the hole?
Gunning for Data Warehouses
What’s the problem with the data warehouse? Why did people want to kill the data warehouse? Well, there were a lot of issues. But the primary issue is that people dread data integration. Data warehousing requires data integration. Integration is complex, risky, hard to do, imprecise, and requires research. Integrating data requires using your brain and using elbow grease. Vendors and most IT professionals just hate doing that.
Corporations had huge silos of information that couldn't communicate. These silos were an impediment to analytical processing across the enterprise. The only way to break these silos apart was to integrate the data found in them and place the integrated data into a data warehouse. There simply was no other way.
No ifs, ands, or buts.
But vendors and most IT professionals just didn't have the backbone or the intellect to integrate the siloed data. So, the silos remained, and extracting enterprise-level data continued to be an elusive, unreachable goal.
Vendors would rather walk across a bed of fiery red-hot coals barefoot than go back and integrate data. The problem is that the major value of a data warehouse is in having a foundation of integrated data.
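To make the integration problem concrete, here is a minimal Python sketch of merging two silos into one integrated record per customer. Everything in it is an assumption made for illustration: the silo names, the field names, the sample rows, and the matching rule (treating the numeric part of the ID as the shared key) are hypothetical, not drawn from any real system.

```python
from dataclasses import dataclass

# Hypothetical silo extracts. Note that the two systems disagree on
# field names, key format, and how the company name is written.
BILLING = [
    {"cust_id": "00042", "name": "ACME CORP.", "revenue_usd": 1200.0},
    {"cust_id": "00077", "name": "Globex", "revenue_usd": 300.0},
]
SUPPORT = [
    {"customer": "42", "company": "Acme Corp", "open_tickets": 3},
    {"customer": "99", "company": "Initech", "open_tickets": 1},
]


@dataclass
class Customer:
    """One integrated record, as it might land in the warehouse."""
    key: int
    name: str
    revenue_usd: float = 0.0
    open_tickets: int = 0


def normalize_name(raw: str) -> str:
    # Assumed cleanup rule: trim, drop a trailing period, use title case.
    return raw.strip().rstrip(".").title()


def integrate(billing, support):
    """Merge both silos into a dict keyed by the numeric customer ID."""
    merged = {}
    for row in billing:
        key = int(row["cust_id"])  # "00042" -> 42
        merged[key] = Customer(
            key, normalize_name(row["name"]), revenue_usd=row["revenue_usd"]
        )
    for row in support:
        key = int(row["customer"])  # "42" -> 42
        # Customers known only to support still get a record.
        cust = merged.setdefault(
            key, Customer(key, normalize_name(row["company"]))
        )
        cust.open_tickets = row["open_tickets"]
    return merged


warehouse = integrate(BILLING, SUPPORT)
for key in sorted(warehouse):
    print(warehouse[key])
```

Even this toy case needs decisions that only research can settle: whether "00042" and "42" really are the same customer, which spelling of the name wins, and what to record for a customer who appears in only one silo.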
A Murder of Data
There have been several major attempts at exterminating or bypassing the data warehouse, among them data marts, Big Data, and data lakes.
Some of these efforts were well funded and well advertised. Other efforts were merely casual. But all of them failed to kill the data warehouse.
In fact, some of these efforts actually added to and bolstered data warehouse architecture.
The Resuscitation of Data Warehouses
People began to realize — data warehouses wouldn't die. Data warehousing was not dead. In fact, people found that adding data marts to a data warehouse was a very good thing to do. Data marts allowed you to customize data and, at the same time, ensure the data's integrity. So, Ralph Kimball’s contribution of data marts and the dimensional model unintentionally added to a data warehouse's value.
Big Data added a dimension of scalability for data warehouses that had not existed before. The data in the data warehouse with a low probability of access fit quite conveniently in Big Data. The people who promulgated Big Data never saw it that way, but that was a positive consequence of Big Data: one more unintentional value-add for data warehouses.
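As a rough sketch of that split, the Python below sorts tables into a frequently read set that stays in the warehouse and a rarely read set that could move to bulk storage. The catalog entries, the `split_by_access` helper, and the one-year idle threshold are hypothetical, and days since the last read is only an assumed stand-in for probability of access.

```python
from datetime import date

# Hypothetical catalog of warehouse tables with the date each was last read.
TABLES = [
    {"name": "orders_2024", "last_read": date(2024, 6, 1)},
    {"name": "orders_2015", "last_read": date(2019, 1, 15)},
    {"name": "clickstream_2012", "last_read": date(2016, 3, 9)},
]


def split_by_access(tables, today, max_idle_days=365):
    """Return (hot, cold) table names based on days since the last read."""
    hot, cold = [], []
    for table in tables:
        idle_days = (today - table["last_read"]).days
        if idle_days > max_idle_days:
            cold.append(table["name"])  # candidate for bulk storage
        else:
            hot.append(table["name"])  # stays in the warehouse proper
    return hot, cold


hot, cold = split_by_access(TABLES, today=date(2024, 7, 1))
print("keep in warehouse:", hot)
print("move to bulk storage:", cold)
```

A real policy would likely weigh query frequency and retention rules as well, but the principle is the same: the integrated data stays one logical warehouse while the seldom-touched portion sits on cheaper, more scalable storage.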
The people who pushed for data lakes inadvertently pushed new kinds of data into the data warehouse. With a data lake, analog, IoT, and textual data all found their way into a data warehouse.
The people who tried to kill data warehouses ended up unintentionally expanding the capabilities and usefulness of data warehouses.
So, here lies the data warehouse — RIP.
Resilient Information Processing, not Rest In Peace.
The data warehouse lives on despite the best efforts to kill or ignore it.
Bill Inmon, the father of the data warehouse, has authored 65 books and was named by Computerworld as one of the ten most influential people in the history of computing. Bill's company, Forest Rim Technology, is a Castle Rock, Colorado company. Bill Inmon and Forest Rim Technology provide a service to companies by helping them hear their customers' voices. See more at www.forestrimtech.com