An outlier, or black swan (Black swan theory, 2012), is an unusual value whose occurrence is difficult to predict.
The black swan theory, or theory of black swan events, is a metaphor for an event that comes as a surprise (to the observer), has a major impact, and after the fact is often inappropriately rationalized with the benefit of hindsight. For instance (Taleb, 2007), the September 11 attacks were a kind of outlier: in theory nobody could foresee the event, it was the first of its kind, and it had a major impact on the whole world.
There is a huge discussion about whether or not you should apply this technology to your business. If your business is related to human behaviour and its dynamics, pattern discovery, or complex dynamic systems, we believe the answer could be “yes”: the whole body of information is essential, because the outliers are intrinsically inside it.
A few days ago we listened to Leslie Valiant from Harvard at an Alan Turing symposium at “Fundación Ramón Areces” (Valiant L, 2012), talking about evolution and mathematics, functions and objectives, time and space: minimal changes in the target function or in the nature of the algorithm could trigger unpredictable changes. A number of interesting papers discuss critical transitions too, such as (Lade SJ, Gross T, 2012) “Critical transitions are sudden, long-term changes in complex systems that occur when a threshold is crossed” or (Safarzyńska K et al., 2012) “… interest has arisen in the study of large-scale socio-technical transitions to an environmentally sustainable economy”.
Statistics provides us with a “Survey Methodology”, which can be useful for obtaining a representative sample. However, sometimes we cannot reach that ideal, and classical statistics tends to treat outliers as residue and often disregards them. Today we are able to harvest a huge amount of information and process it in near real time (e.g. by means of Storm and Hadoop), checking the whole set of data to discover potential outliers. Perhaps we are in the process of creating or applying specific algorithms to detect these outliers, black swans, etc., and to react in a time-constrained world.
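As an illustration of checking every value rather than a sample, here is a minimal sketch in plain Python. It is not tied to Storm or Hadoop and is not the method of any work cited here; it simply flags a value whose distance from the running mean exceeds a chosen number of standard deviations, using Welford's online update so that the data is read only once. The names `RunningOutlierDetector` and `find_outliers`, the threshold of 3 and the warm-up of 10 observations are all assumptions made for the example.

```python
import math


class RunningOutlierDetector:
    """Flag values that lie far from the running mean of the stream so far."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # z-score cut-off (assumed value)
        self.warmup = warmup        # observations required before flagging
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value):
        """Return True if value looks like an outlier, then absorb it."""
        is_outlier = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_outlier = True
        # Welford's online update of mean and squared deviations.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return is_outlier


def find_outliers(stream, threshold=3.0, warmup=10):
    """Return (index, value) pairs for every flagged element of the stream."""
    detector = RunningOutlierDetector(threshold, warmup)
    return [(i, x) for i, x in enumerate(stream) if detector.update(x)]


if __name__ == "__main__":
    readings = [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 9.9,
                10.3, 9.7, 10.0, 10.4, 9.6, 50.0, 10.1]
    print(find_outliers(readings))
```

Note that this sketch absorbs a flagged value into the running statistics, which inflates the variance afterwards; a real system would have to decide whether outliers should update the baseline at all, and a z-score is only one of many possible criteria.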
As Pedro Bernal Gutiérrez (former director of the Spanish Centre for National Defence Studies and Air Force Lt. Col.) explained a few days ago in an amazing cryptography talk about the Bomba and Colossus machines (Colossus, 2012) of the Second World War, reaction time is important from a military point of view. It matters just as much for our business, from fraud detection to changes in consumer habits. It is equally important to react in time to these crucial, strange and beautiful outliers or black swan events, and this could be achieved through big data technology, exploring the whole search space looking for them.
(Black swan theory, 2012) Black swan theory.
(Lade SJ, Gross T, 2012) Early warning signals for critical transitions: a generalized modeling approach.
(Safarzyńska K et al., 2012) Evolutionary theorizing and modeling of sustainability transitions.
(Colossus, 2012) Colossus computer.
(Meta S. Brown, 2012) Big Data Blasphemy: Why Sample?