Analytics for business differentiation is not a new topic. However, the path to get to an effective and efficient analytics strategy, one that derives the best value out of data, can be daunting and risky.
In the past few years, some patterns have begun to emerge from successful implementations. These patterns offer insights on how businesses can establish some foundational principles that best leverage their investment in data. The following are the new rules for developing analytics capabilities:
There is no ‘big data’ and ‘enterprise data.’
Enterprises tend to split analytics between traditional data warehouse (DW) platforms for enterprise data and big data platforms for predictive analytics. Value comes from all data, especially when unstructured data blends with enterprise data (finance, supply chain, customer interactions, etc.). The trend is away from DWs and multiple scattered repositories and toward a single aggregated store of data.
Traditional DWs are no longer relevant.
They require too much structure, are complicated to develop and maintain, and don't move at the pace of business. Building new platforms on data warehouse architecture is likely not a sustainable investment. Single, data-lake-style repositories can house all enterprise data, as well as high-volume and high-velocity data. Yes, it is hard work to migrate to a new architecture, but the returns are significant.
All data is useful.
Within logical bounds and proper triage, if an enterprise produces any data, it needs to be in the enterprise data lake. On the flip side, if data isn't in the lake, it doesn't exist. This doesn't mean that practices such as data security, sensitivity classification and business relevance should be discarded, though. Mature information life cycle policies should also be established to retire, purge or cold-store data beyond a defined retention period; a lake can't become an infinite repository of data.
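A life cycle policy like the one described can be reduced to a simple age-based tiering rule. The sketch below is illustrative only; the tier names and retention windows (90 days hot, three years before purge) are assumptions, not recommendations.

```python
from datetime import datetime, timedelta

# Illustrative retention tiers: keep recent data "hot", move older data to
# cold storage, and purge anything past the retention horizon.
HOT_DAYS = 90
COLD_DAYS = 365 * 3

def lifecycle_action(last_accessed, now=None):
    """Return the life cycle action for a dataset given its last-access date."""
    now = now or datetime.utcnow()
    age = now - last_accessed
    if age <= timedelta(days=HOT_DAYS):
        return "keep-hot"
    if age <= timedelta(days=COLD_DAYS):
        return "move-to-cold"
    return "purge"

now = datetime(2024, 1, 1)
print(lifecycle_action(datetime(2023, 12, 1), now))  # keep-hot
print(lifecycle_action(datetime(2023, 1, 1), now))   # move-to-cold
print(lifecycle_action(datetime(2019, 1, 1), now))   # purge
```

In practice this kind of rule is usually enforced by the storage platform itself (for example, object-store life cycle configurations) rather than application code, but the policy decision is the same.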
The adoption of ML and AI is evolutionary.
However, it's not the first thing to do. Artificial intelligence (AI) and machine learning (ML) can be powerful tools, but they should only be adopted once there is a strong foundation of quality data that is being managed, secured and curated. The success of ML and AI capabilities relies heavily on the quality, variety and volume of data a business possesses. Powerful solutions such as enterprise information chatbots and enterprise cognitive capabilities can be developed much more easily once a robust data foundation is in place and the business has confidence in its reporting and business intelligence capabilities.
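One concrete way to make "a strong foundation of quality data" actionable is a quality gate that must pass before ML work begins. This is a minimal sketch; the thresholds, field names and list-of-dicts representation are illustrative assumptions, not a standard.

```python
# A minimal data-quality gate: check completeness and minimum volume on a
# tabular dataset before investing in ML. Thresholds here are assumptions.

def quality_gate(rows, required_fields, max_null_rate=0.05, min_rows=1000):
    """Return (ok, reasons) for a dataset represented as a list of dicts."""
    reasons = []
    if len(rows) < min_rows:
        reasons.append(f"too few rows: {len(rows)} < {min_rows}")
    for field in required_fields:
        nulls = sum(1 for r in rows if r.get(field) in (None, ""))
        rate = nulls / max(len(rows), 1)
        if rate > max_null_rate:
            reasons.append(f"{field}: {rate:.1%} null exceeds {max_null_rate:.0%}")
    return (not reasons, reasons)

# Hypothetical dataset where every tenth record is missing its region.
data = [{"customer_id": i, "region": "EMEA" if i % 10 else None}
        for i in range(2000)]
ok, reasons = quality_gate(data, ["customer_id", "region"])
print(ok, reasons)  # False, with the 'region' null rate flagged
```

Gates like this keep ML investment tied to the state of the underlying data rather than to optimism about the model.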
Extract-load-transform has replaced extract-transform-load.
The traditional practice of loading only data that has been fully vetted against a complete use case is being replaced by the idea that, because storage and compute are relatively cheap, data should be loaded first. Transformation becomes a deferred choice, done only when needed; this is also referred to as just-in-time transformation and analytics. Of course, the loading of data should still be done within reason and business relevance. Building just-in-time analytics engines requires unique skill sets, but it is a worthwhile investment.
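The load-first, transform-on-demand pattern can be sketched in a few lines. This is a toy illustration, not a real platform: the in-memory list stands in for cheap raw storage, and the record fields are hypothetical.

```python
import json

lake = []  # stand-in for cheap raw storage in a data lake

def extract_load(raw_records):
    """Load first: persist records verbatim, with no upfront schema or vetting."""
    lake.extend(json.dumps(r) for r in raw_records)

def transform_on_read(predicate, projection):
    """Transform later: parse, filter and shape records only at query time."""
    for blob in lake:
        record = json.loads(blob)
        if predicate(record):
            yield projection(record)

extract_load([
    {"order_id": 1, "amount": "19.99", "status": "shipped"},
    {"order_id": 2, "amount": "5.00", "status": "cancelled"},
])
shipped_totals = list(transform_on_read(
    lambda r: r["status"] == "shipped",
    lambda r: (r["order_id"], float(r["amount"])),
))
print(shipped_totals)  # [(1, 19.99)]
```

The design point is that the string-to-float conversion and the filter live in the read path, so new questions can be asked of old raw data without re-ingesting it.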
A data lake’s primary value is analytics, not operational reporting.
Most software-as-a-service (SaaS) providers offer reporting capabilities out of the box. Those tools, not the enterprise data lake, are usually the best place to develop operational reporting: the capability is built into the cost already paid to the provider, who will gladly offer more powerful solutions. This is not due to any lack of capability in the lake; real-time reporting can be part of a robust design. But using a data lake for operational reporting should be reserved for targeted, unique scenarios that do not duplicate vendor-offered solutions.
Enterprise data never moves to a vendor.
Every enterprise's IP and competitive advantage resides in its data. An enterprise is solely responsible for its cultivation and growth. A third-party provider is not. While it might seem enticing to move customer, finance, supply chain and other enterprise data into a SaaS provider's tool that already offers ERP or CRM solutions, it is important to recognize that potential value is lost when an enterprise is tied to a provider. Future migration, when your business needs change, will not be easy and is sometimes impossible.
Data mining and data science are nondeterministic.
Like any science, data mining efforts can yield fabulous results or nothing at all. If we knew what to expect, it would be called reporting, not mining. There is value in continually making the effort to find potentially unique insights that give your business a competitive advantage. Organizations need to be comfortable with trying out new things and accepting failure.
The best data scientists are developed within your own organization.
Cultivating, training and encouraging a subject matter expert (SME) within your organization to adopt data science practices has proven to be significantly valuable in the long term. There is an obvious need for a few well-trained data scientists in an enterprise, but the roster needs to be larger to effectively exploit the value of data. These numbers should come from retraining and repurposing internal SMEs. A combination of career data scientists and new ones trained within your organization can provide the expertise and experience needed to make data mining extremely effective.
Enterprises still need to put in the hard work of bringing their business stakeholders together to see the value in collective data. To do this, they must demonstrate tangible revenue gains or efficiency improvements. Many of the rules outlined above will help you do just that.