This is a guest post for Integrate.io written by Bill Inmon, an American computer scientist recognized as the “father of the data warehouse.” Inmon wrote the first book and first magazine column about data warehousing, held the first conference about this topic, and was the first person to teach data warehousing classes.
Five things you should know about this topic:
- Bill Inmon, the father of the data warehouse, says organizations need to define what their technical architecture will encompass.
- A well-built architecture can improve focus, direction, and prioritization.
- Inmon recommends organizations refrain from building an architectural statement without answering all the questions listed below.
- Organizations should factor these questions into the design of their architecture.
- Integrate.io is the leading low-code data integration platform that moves data to a supported destination that makes up part of an organization’s technical architecture.
When doing long-term planning for an organization, it is really helpful to start with a statement of what the architecture is and what that architecture will encompass. The architectural definition of the technology that will be used will serve as a long-term guide for making technical decisions. In addition, a properly built architecture serves as an instrument of focus, direction, and prioritization for the organization.
There are many benefits to having a technology blueprint for the future, but how do you define your technical architecture? Learn more below.
Integrate.io is a low-code data warehouse integration platform for retailers like you. With its out-of-the-box connectors, you can move data to a supported data destination—a critical component of your technical architecture. Integrate.io’s philosophy is to simplify data integration with its jargon-free environment. Try Integrate.io yourself with a 14-day free trial.
An Architecture for the Future
There is an order in which a technology blueprint should be built. If your organization does not follow this order, the result will not be a complete or well-thought-out architectural statement.
So there are some basic questions that you need to answer before any consideration is given to technology.
WARNING: Do not try to build an architectural statement until you can fully and accurately answer all of the questions below. It is MANDATORY that you answer each of these questions.
The questions are not presented in any particular order. You must answer ALL of them and factor them into the design of your architecture. No one question is more important or less important than any other question.
Volume of Data
- What volume of data must the architecture encompass?
This question does not need a specifically calculated answer. Instead, you must consider the relative volumes to be encountered. It is impossible to know what specific volumes of data you will encounter at this point. But you should estimate the relative volumes. The volumes of data will help determine:
- What number of platforms you will need
- What types of platforms you will need
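A relative estimate of this kind can be reduced to an order-of-magnitude comparison. The sketch below is illustrative only: the function name `estimate_volume` and every figure in it (record counts, record sizes, retention period) are hypothetical assumptions, not measurements from any real system.

```python
import math


def estimate_volume(records_per_day, avg_record_bytes, retention_days):
    """Return (total_bytes, order_of_magnitude) for a rough volume guess."""
    total_bytes = records_per_day * avg_record_bytes * retention_days
    if total_bytes <= 0:
        raise ValueError("all inputs must be positive")
    # floor(log10(x)) gives the power of ten, which is enough for a relative comparison
    return total_bytes, math.floor(math.log10(total_bytes))


# Hypothetical inputs: an order system and a sensor feed, both kept for one year
orders_total, orders_magnitude = estimate_volume(1_000_000, 500, 365)
sensor_total, sensor_magnitude = estimate_volume(50_000_000, 100, 365)

print(orders_magnitude, sensor_magnitude)
```

Only the difference between the two magnitudes matters here: it indicates whether the two kinds of data are likely to fit the same class of platform.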
What Kind of Data?
- What kinds of data will you need?
There are three basic types of data – structured, textual, and analog/IoT data. Does the architecture need to include all of these types? Only one of these types?
Different types of data need different treatments:
- Analog/IoT data needs to be segmented into data with a low probability of access and data with a high probability of access. There is far too much analog/IoT data to even consider putting all of it into a single platform.
- Textual data needs to be transformed from text into a database for it to become useful. Typically, textual ETL is used for that purpose. In addition, textual ETL must be subject to the same type of analysis that analog/IoT data is subjected to. Not all textual data is of high importance, and some textual data is of very high importance. Much of what textual data contains is extraneous and must be edited out before the data is useful.
- Structured data is notoriously divided into silos of information that cannot be analyzed or otherwise used together. Before structured data can be used in an architecture, it must be transformed from application-oriented data into corporate data.
There are then multiple issues that relate to the inclusion of data in an architecture that will lead into the future. The types of data that will be encountered must be taken into consideration before any architectural plans can be made because the different types of data require different treatments.
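The segmentation of analog/IoT data by probability of access can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `Reading` class, the `access_probability` score, and the 0.5 threshold are all hypothetical, and how such a score would be derived is outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor_id: str
    value: float
    access_probability: float  # hypothetical score between 0 and 1


def split_by_access(readings, threshold=0.5):
    """Split readings into (high_probability, low_probability) lists."""
    high = [r for r in readings if r.access_probability >= threshold]
    low = [r for r in readings if r.access_probability < threshold]
    return high, low


# Made-up sample readings
readings = [
    Reading("pump-1", 71.2, 0.9),
    Reading("pump-1", 70.9, 0.05),
    Reading("valve-7", 3.4, 0.6),
]
high, low = split_by_access(readings)
print(len(high), len(low))
```

Each of the two lists could then be routed to a different platform, which is the point of the segmentation.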
The Unified Stack for Modern Data Teams
Get a personalized platform demo & 30-minute Q&A session with a Solution Engineer
Integrate.io makes it easy to move data, including structured and textual data, into a supported data destination for analysis. You can streamline the data integration process with Integrate.io’s ETL, ELT, Reverse ETL, and super-fast Change Data Capture (CDC) tools without assembling an expensive data engineering team. Set up an ETL trial or an ELT trial now!
What Types of Processing?
- What type of processing will the data be used for? What are the different processing types?
There are many different types of processing that can occur with a technical architecture. Some of these types include:
- Mission-critical up-to-the-second transaction processing
- Long-term analytical processing
- Conversation and sentiment analysis
- End user sandbox processing
- Casual conversations
- Heuristic processing
- Recreation
Each of these types of processing has its own implications for what technology should be used. To define a technical architecture, you need to understand what processing will occur.
What Is the Business Value?
- What is the business value/business usage of the data that the architecture will encompass?
Different types of data have different levels of business value. The business value of the data will go a long way to determine how the data will be used. The business value should be understood at the outset in order to set the stage for further design and architecture.
If data has low business value, it can be treated one way. If data has high business value, it can be treated another way. Understanding the business value of the data in the architecture will greatly influence how you configure your architecture.
Who Is the User of the Data?
- Who is the ultimate user of the data? Will it be management? Sales reps? Inventory teams? Will it be the end user?
Understanding exactly who the user is — all of the communities of users — is key to making many important architectural decisions.
Stated differently, an architect cannot make proper architectural decisions WITHOUT having a vision of who the ultimate users of the system will be.
Operational Considerations
- What operational considerations must be made and accounted for?
Different environments have different operational requirements.
- Transaction processing must account for response time, system availability, data integrity, transaction integrity, and data governance frameworks like GDPR and CCPA
- Companies specializing in healthcare products must account for HIPAA
- Textual-based systems must account for text-to-database transformations
- Analytical systems must support heuristic processing
- Analog/IoT data must account for the separation of data into high probability of usage and low probability of usage
The operational characteristics of how you process data must be known in advance, before you define your architecture or make any architectural decisions.
Archival of Data
- How is data to be archived? When, how, and under what conditions is archival of data to be accomplished?
Many architects never even consider the archiving of data. Yet data archiving is a necessary feature of architecture. If you don’t archive data, it grows without limit, and that is not sustainable. So archiving should not be an afterthought.
One aspect of data archiving is how the archived data can be retrieved if needed. This consideration is an important one. Without it, the archive turns into the “dead letter” office.
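One way to keep an archive retrievable is to record where each record went at the moment it is archived. The sketch below assumes in-memory structures and an age-based rule; the function names, the `created` field, and the 365-day cutoff are hypothetical choices for illustration, not a recommended policy.

```python
from datetime import date, timedelta


def archive_old_records(active, archive, index, today, max_age_days=365):
    """Move records older than max_age_days from active into archive."""
    cutoff = today - timedelta(days=max_age_days)
    for key in list(active):  # copy the keys because active is modified in the loop
        record = active[key]
        if record["created"] < cutoff:
            archive.append(record)
            index[key] = len(archive) - 1  # remember where the record now lives
            del active[key]


def retrieve(key, active, archive, index):
    """Look in active storage first, then fall back to the archive index."""
    if key in active:
        return active[key]
    if key in index:
        return archive[index[key]]
    return None


# Made-up records
active = {
    "order-1": {"created": date(2022, 6, 1), "total": 40},
    "order-2": {"created": date(2023, 12, 1), "total": 15},
}
archive, index = [], {}
archive_old_records(active, archive, index, today=date(2024, 1, 1))
print(sorted(active), retrieve("order-1", active, archive, index))
```

The index is what keeps the archive from becoming a dead end: a record that has left active storage can still be found by its key.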
Data Entrance
- How does data enter the architecture?
Data can enter the architecture in many ways. The architect needs to enumerate the ways that data can enter the architecture.
In some cases, the sheer volume of data that enters the architecture is an issue. In other cases, the integrity of the data is of paramount importance (for example, when dealing with credit card transactions). In still other cases, the speed with which data enters the system is of special importance.
The architect must account for the nature and integrity of data that is needed for the architecture.
Broad Relationships of Data
- What broad relationships are there between the types of data?
The broad relationships between one type of data and another greatly influence the architectural decisions that need to be made.
Different types of relationships require different treatments:
- How is analog/IoT data related to other data?
- How is textual data related to other data?
- How are traditional data relationships in structured data related to other data?
These questions and more are required for analysis before an architecture can be formulated.
Business Processes
- What business processes will rely on the data in the architecture?
The architect needs to know what business processes will be affected by the architecture that has been defined. If the data is business-critical, the architecture must account for that.
If the data is not business-critical, that too must be taken into account.
How Many Users?
- How many users will there be? What is the nature of the usage of data?
The architect needs to know approximately how many users of the system there will be. The number of users and the nature of the end users’ business have a great impact on the ultimate architecture.
Current Status
- What is the current state of the existing systems?
The existing systems may be siloed. The existing systems may be well established. The existing systems may be non-existent. The status of the existing systems to be included in your architecture needs to be assessed and factored into your architectural plans.
Level of Data Integrity and Transaction Integrity
- What level of integrity of data and transaction processing is required?
Different data and different types of processing require very different types of care when it comes to establishing an architecture.
Security and Privacy
- What level of security and privacy does the data require?
Different kinds of data require different levels of security and privacy. The issues of security and privacy must be factored into an architecture.
Data Model
- Has there been an architectural rendition — a model of the data — created?
A data model of the high-level components of the architecture is a good idea for several reasons:
- It is easy to overlook certain data. A data model makes it difficult to overlook certain data.
- It is easy to misinterpret the meaning or significance of certain data. A data model helps prevent this oversight.
- A data model allows there to be context for all of your data processes. Context is difficult to find or understand without a data model.
At this point, the data model should not be at a detailed level. Instead, the data model — for this purpose — should be at a high level, where only the major entities of the corporation are involved.
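A high-level model of this kind can be as small as a list of major entities and the entities each one refers to. The sketch below is an assumption-laden illustration: the entity names (`Customer`, `Order`, `Product`) and the helper `dangling_references` are hypothetical and stand in for whatever the major entities of a given corporation are.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    related_to: list = field(default_factory=list)  # names of other major entities


def build_model():
    """Return a hypothetical high-level model keyed by entity name."""
    entities = (
        Entity("Customer"),
        Entity("Order", related_to=["Customer", "Product"]),
        Entity("Product"),
    )
    return {e.name: e for e in entities}


def dangling_references(model):
    """List related entity names that the model itself never defines."""
    return sorted(
        {target
         for entity in model.values()
         for target in entity.related_to
         if target not in model}
    )


model = build_model()
model["Shipment"] = Entity("Shipment", related_to=["Order", "Carrier"])
print(dangling_references(model))
```

A check like this shows how a model makes overlooked data visible: `Shipment` refers to a `Carrier` entity that nobody has defined yet.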
In addition to the data model, if textual data is to be considered, has the ontological/taxonomical model been considered and identified?
Geographical Distribution of Data
- Data can be geographically distributed over a wide area.
Data spread over a wide area can have great implications. There may be different languages that need to be accommodated. There may be different privacy requirements. There may be different government regulations. In short, the wide spread of data over different countries may cause technical challenges.
Data Life Cycle
- The data life cycle must be considered.
Data has its own life cycle. The cycle needs to be identified and factored into the technical architecture.
Directory of Data
- In order to build an enterprise-level architecture, it will be necessary to have a directory of the types of data and their relationships to each other.
The directory needs to include structured data, textual data, and analog/IoT data. Ongoing usage and ongoing maintenance of the directory must be planned for as well.
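A directory like this can start as a simple registry of data sets, their type, and their relationships. The following is a minimal sketch, not a production catalog: the class names, the `owner` field, and the sample entries are hypothetical assumptions added for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class DataKind(Enum):
    STRUCTURED = "structured"
    TEXTUAL = "textual"
    ANALOG_IOT = "analog/IoT"


@dataclass
class DirectoryEntry:
    name: str
    kind: DataKind
    owner: str  # hypothetical field: who maintains this entry
    related: set = field(default_factory=set)


class DataDirectory:
    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def relate(self, first, second):
        """Record a two-way relationship between two registered entries."""
        for name in (first, second):
            if name not in self._entries:
                raise KeyError(f"unknown entry: {name}")
        self._entries[first].related.add(second)
        self._entries[second].related.add(first)

    def by_kind(self, kind):
        return sorted(e.name for e in self._entries.values() if e.kind is kind)

    def related_to(self, name):
        return sorted(self._entries[name].related)


directory = DataDirectory()
directory.register(DirectoryEntry("orders", DataKind.STRUCTURED, "sales-ops"))
directory.register(DirectoryEntry("support_emails", DataKind.TEXTUAL, "support"))
directory.register(DirectoryEntry("warehouse_sensors", DataKind.ANALOG_IOT, "logistics"))
directory.relate("orders", "support_emails")
print(directory.by_kind(DataKind.TEXTUAL), directory.related_to("orders"))
```

Refusing to relate an unregistered entry is a deliberate choice in this sketch: it forces every type of data to be listed before its relationships are recorded.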
Final Word
All of the questions above must be examined and factored into your architectural definition. If ANY of these issues or questions is omitted, the resulting architecture will not suffice and will be at risk. It might take a while to define your technical architecture, but doing so lets you make more valuable technical decisions for your organization. Schedule a 14-day demo today!
Databases, data lakes, and data warehouses are critical components of your technical architecture. Integrate.io is the low-code platform that moves data from sources to a supported destination via out-of-the-box connectors. That means you can integrate data for analysis without advanced programming or data engineering knowledge.
Bill Inmon, the father of the data warehouse, has authored 65 books. Computerworld named him one of the 10 most influential people in the history of computing. Inmon’s Castle Rock, Colorado-based company Forest Rim Technology helps companies hear the voice of their customers. See more at www.forestrimtech.com.