Thursday, October 22, 2015

Time to Revitalize Your Data Warehouse



The convergence of information, technology and analytics in the era of Big Data is revolutionizing how we need to approach our data solutions. We now have the capability to collect all data into a scalable repository that can give people the information they need to make better decisions. The challenge is that many companies have been building data warehouses and reporting systems for years. Recent studies have shown that organizations, both small and large, are investing more in analytics while maintaining or reducing their traditional data warehouse spending. To me this seems an odd combination. How could one reduce data warehouse spending and increase analytics? My conclusion is that companies are maintaining their current data warehouses while building new data solutions to complement them. So, all things considered, overall data spending is probably increasing, just in new ways.

This split focus between analytics and data warehousing is at odds with an enterprise approach to data. Organizations must consider how they can evolve their existing data warehouses and enable better analytics, not through a revolution but through an evolution. We must look at how Big Data is impacting our business and data processes and adjust how we architect our data warehouses.
According to Forbes, the average spending on data projects in 2015 was $7.4 million. Enterprise organizations spent an average of $13.8 million, while SMBs spent an average of $1.6 million. With this level of funding, the value must be realized. So how can we evolve our data warehouses? The key is finding the space where Big Data makes the most sense.

Today many organizations are somewhere along the path toward developing data hubs and landing areas using Hadoop technology. We tend to concentrate on the landing and staging areas, where we can make the most impact with the least amount of disruption. By replacing these components in the data lifecycle, which were previously based in a relational database at a significant cost, we can build a new region where data is collected and prepared to meet analytic needs. The new, expanded landing and staging areas are built with Big Data and analytics in mind, in addition to the traditional needs of business reporting.

Data architects like myself are looking at creating an environment which collects all of the data and then prepares it into a conformed arrangement. From there, data can be served up in a structured manner to supply the data warehouse, while the same environment supplies raw structured and unstructured data to data scientists and data analysts for their analysis. This approach is quite intuitive, and it also enables a better data architecture because we are separating the various parts of our solution.

By separating the landing and staging areas from the rest of the warehouse, we can use the technology which best suits them today while being mindful of the future. The same applies to the high-performance analytic platforms. We may choose an RDBMS like Oracle or Netezza, which today is the most appropriate platform for traditional BI, but tomorrow could bring a new technology too appealing to ignore. By separating functionality from technology, we can evolve our data warehouse in a more agile way.
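To make the separation a little more concrete, here is a minimal sketch in plain Python of the flow described above: a landing area that keeps everything as it arrives, a staging step that conforms the structured records for the warehouse, and a raw view for data scientists. This is only an illustration under my own assumptions. The names `LandingZone`, `conform`, `stage_for_warehouse` and `raw_for_analysts`, and the `customer_id` and `amount` fields, are hypothetical, and an in-memory list stands in for what would really be Hadoop storage.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class LandingZone:
    """Hypothetical stand-in for a Hadoop landing area: keeps every record as-is."""
    records: list = field(default_factory=list)

    def ingest(self, source: str, payload: Any) -> None:
        # No schema is enforced on the way in; structured and
        # unstructured payloads are stored side by side.
        self.records.append({"source": source, "payload": payload})


def conform(record: dict) -> Optional[dict]:
    """Staging step: shape a structured record into the (assumed) warehouse layout.

    Returns None for payloads that do not fit the layout, such as free text.
    """
    payload = record["payload"]
    if not isinstance(payload, dict):
        return None
    if "customer_id" not in payload or "amount" not in payload:
        return None
    return {
        "source": record["source"],
        "customer_id": str(payload["customer_id"]),
        "amount": float(payload["amount"]),
    }


def stage_for_warehouse(landing: LandingZone) -> list:
    """Conformed rows only, ready to be loaded into the data warehouse."""
    conformed = (conform(record) for record in landing.records)
    return [row for row in conformed if row is not None]


def raw_for_analysts(landing: LandingZone) -> list:
    """Everything that landed, untouched, for data scientists and analysts."""
    return list(landing.records)
```

If a point-of-sale record and a line of web log text were both ingested, `stage_for_warehouse` would pass along only the conformed sale, while `raw_for_analysts` would return both. Because the warehouse loader only ever sees the output of `stage_for_warehouse`, the technology behind the landing area or the warehouse could be swapped without touching the other side.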

Replacing your landing and staging areas with Hadoop serves many purposes, including reducing costs and extending capacity, but primarily it creates a new environment to support modern advanced analytics. This data evolution is needed to ensure that your data warehouse changes with the times rather than getting left behind in a highly competitive business world. Now is the time to consider renewing your data warehouse architecture and to see how Big Data can help elevate your business reporting and analytics.