Musings of a Lifelong Learner

Data To Insights

In his book 'Theory of Constraints', Eliyahu M. Goldratt explains how every scientific discipline matures. Every science has gone through three distinct stages: classification, correlation and effect-cause-effect. The first stage, classification, includes the cataloguing of stars and galaxies by the Greeks, Indians and other ancient civilizations. The second stage is correlation; an example is Ptolemy of Alexandria, whose model placed the Earth at the center of the planets' revolutions. The third stage is credited to Sir Isaac Newton for asking the question: Why? Why do apples fall down instead of flying off in all directions? Only when this third stage is reached, only when cause and effect are established and logical deductions or explanations are suddenly mandatory, do we fully recognize that a subject matter is a science [1]. 'Common sense' is the highest praise for a logical derivation, for a very clear explanation. These ideas make a lot of sense in data science as well. In mankind's quest to gain more from available data and improve the quality of life, a lot has been achieved in recent years.

As shown in the figure, the DIKW (data, information, knowledge, wisdom) pyramid represents this quest. In this pyramid, data are created through abstractions or measurements taken from the world. Information is data that have been processed, structured or contextualized so that they are meaningful to humans. Knowledge is information that has been interpreted and understood by a person so that they can act on it if required. Wisdom is acting on knowledge in an appropriate and responsible way [2].

As shown in the figure, we need to climb the data science pyramid from data sources to decision making. The Cross Industry Standard Process for Data Mining (CRISP-DM) remains relevant as the best approach for doing so.

The CRISP-DM lifecycle consists of six steps. Business understanding involves determining the business objectives, assessing the situation, determining the goals and preparing a plan. Data understanding involves collecting initial data and then describing, exploring and verifying the quality of the data. Data preparation involves selecting, cleaning, constructing, integrating and formatting data. Modeling involves selecting modeling techniques, generating a test design, and building and assessing the model. Evaluation involves evaluating the results, reviewing the process and determining the next steps. Finally, deployment involves preparing a deployment plan, monitoring and maintaining the solution as per the plan, and preparing final reports so that the project can be reviewed. It is commonly reported that about 80 percent of the time is spent on gathering and preparing data.
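
For readers who like to see a process written down as data, the lifecycle described above can be sketched as a small Python structure. This is only an illustrative sketch: the names `Phase`, `CRISP_DM` and `checklist` are my own and are not part of CRISP-DM or of any library, and the task wording simply paraphrases the steps listed above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Phase:
    """One CRISP-DM phase and its tasks (hypothetical helper type)."""
    name: str
    tasks: Tuple[str, ...]


# The six phases, in order, with the tasks named in the text above.
CRISP_DM = (
    Phase("Business understanding", (
        "determine business objectives",
        "assess the situation",
        "determine the goals",
        "prepare a plan",
    )),
    Phase("Data understanding", (
        "collect initial data",
        "describe the data",
        "explore the data",
        "verify data quality",
    )),
    Phase("Data preparation", (
        "select data",
        "clean data",
        "construct data",
        "integrate data",
        "format data",
    )),
    Phase("Modeling", (
        "select modeling techniques",
        "generate test design",
        "build the model",
        "assess the model",
    )),
    Phase("Evaluation", (
        "evaluate the results",
        "review the process",
        "determine the next steps",
    )),
    Phase("Deployment", (
        "prepare a deployment plan",
        "monitor and maintain as per plan",
        "prepare final reports",
        "review the project",
    )),
)


def checklist(phases: Tuple[Phase, ...] = CRISP_DM) -> List[str]:
    """Flatten the phases into a numbered, indented checklist."""
    lines = []
    for number, phase in enumerate(phases, start=1):
        lines.append(f"{number}. {phase.name}")
        lines.extend(f"   - {task}" for task in phase.tasks)
    return lines


if __name__ == "__main__":
    print("\n".join(checklist()))
```

Running the script prints the six phases as a numbered checklist, which can serve as a starting template for a project plan.
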
While these concepts and tools exist, how best to leverage them to create customer value is discussed and elaborated in the book 'Big Data MBA'. The book discusses a big data maturity model with five stages: business monitoring, business insights, business optimization, data monetization and business metamorphosis [3]. Many data science algorithms and approaches are described on the websites KDnuggets and Data Science Central. An essential read in this area is the McKinsey report 'The Age of Analytics'.

References


