No matter the vintage or sophistication of your organization's data warehouse (DW) and the environment around it, it probably needs to be modernized in one or more ways. DW modernization takes many forms. Common scenarios range from software and hardware server upgrades to the periodic addition of new data subjects, sources, tables, and dimensions.
As data types and data velocities continue to diversify, many users are likewise diversifying their software portfolios to include tools and data platforms that are built for new and big data. A few organizations are even decommissioning current DW platforms to replace them with modern ones that are optimized for today's requirements in big data, analytics, real-time operations, and cost control.
Whatever modernization strategy is in play, it requires significant adjustments to the logical and systems architectures of the extended data warehouse environment.
Most of the trends that are driving the need for data warehouse modernization boil down to four broad issues:
Organizations demand business value from big data. In other words, users are not content to merely manage big data and other valuable data from new sources, such as Web applications, machines, devices, social media, and the Internet of Things. Because big data and new data tend to be exotic in structure and massive in volume, users need new platforms that scale with all data types if they are to achieve business value.
The age of analytics is here. Many firms are aggressively adopting a wide variety of analytic methods so they can compete on analytics and understand evolving customers, markets, and business processes. There is a movement from "analyst intuition" and statistics to empirical, data-science-driven insights. Furthermore, today's consensus says that the primary path to big data's business value is through the use of so-called "advanced" forms of analytics based on technologies for mining, predictions, statistics, and natural language processing (NLP). Each analytic technology has unique data requirements, and DWs must modernize to satisfy all of them.
Real-time data presents new challenges. Technologies and practices for real-time data have been around and successfully used for years. Yet many organizations are behind in this area, so it is a priority for their data warehouse modernization projects. Even organizations that have succeeded with real-time data warehousing and similar techniques will now need to refresh their solutions so that real-time operations scale up to exponentially growing data volumes, more data streams, and greater numbers of concurrent users and applications. Furthermore, real-time technologies must adapt to a wider range of data types, including schema-free and evolving ones.
Open source software (OSS) is now ensconced in data warehousing. Ten years ago, Linux was the only OSS product commonly found in the technology stack for DWs, BI, analytics, and data management. Today, TDWI regularly encounters OSS products for reporting, analytics, data integration, and big data management. This is because OSS has reached a new level of functional maturity while still retaining desirable economics. A growing number of user organizations are eager to leverage both characteristics.