Part I. Finance as a Data-Driven Ecosystem: Time More than Data is the Ultimate Limit, or Why Data Have Angles.
I. Introduction.
The artificial intelligence (AI) revolution is touching the asset management (AM) industry, and plenty of actors seem ill-prepared to fully embrace this major change. Why does AI feel more like a major threat than an opportunity to move forward in the asset allocation industry? Why is it so difficult to envision a harmonious integration?
Part of the answer is that the change involves decision makers whose skills were long considered out of reach for algorithm-based devices, even though the digitalization of some aspects of the AM business, such as the streamlining of several repetitive but “seemingly expert” tasks, had already taken place. In any case, the arrival of AI is a quite logical consequence of the steady increase in computational capabilities associated with Moore’s law. If computation is becoming cheaper and more powerful, and tons of additional digitalized data are available, then it would be absurd not to take advantage of them in a domain where human decision making is complex and the expertise is highly sophisticated and, ideally, quickly delivered.
This discourse seems to fit the reality of the financial sector perfectly. Indeed, more than in other sectors, decision makers operate in an ecosystem that creates and digests huge amounts of data, and everyone is eagerly figuring out how to use those data in forecasting to optimize their decision processes.
On that note, it is well known that the AM industry has always been keen on convincing its customers with data-driven decision-making tools. For instance, what could be more convincing to an investor regarding the validity of an investment proposal (bearing in mind that before becoming a choice, an investment is always a proposal) than a historical graph of the past performance of a selected financial vehicle, or a series of historical financial ratios associated with a given asset?
Generally speaking, two features often characterize a standard AM offer:
On one hand, any AM investment proposal implies a subsequent customer choice; on the other hand, the proposal is often built around data-driven arguments, which take for granted that clients want to receive this sort of information. Yet despite all the recent “know your customer” efforts and clients’ digital profiles, the idea of providing clients with a proposal based on data-driven information is at most an educated guess about each client’s expectations. Basically, we are guessing about the customer’s ability to understand a proposal, and we assume that the client would readily approve a data-driven offer.
Moving away from quantitative-based suggestions implies opening a Pandora’s box of tricky questions.
If numerical arguments are not enough, then what should we use instead as additional arguments?
Should we open a transparent discussion about a proposal by discussing the data used, and how far should we be ready to push? To what extent do we meet this challenge? Is there a natural boundary in our ability to discuss a proposal? If yes, based on what considerations?
In a sense, the rest of this paper could be read as an attempt to answer those questions and explain why they will remain significant even in a hyper-connected and soon heavily disintermediated financial world.
Depending too much on data-driven solutions has led to an infinitely more debatable and dangerous message among investors:
If a series of wealth allocation decisions is made based on solidly grounded, data-driven analyses, then clients become inclined to expect positive returns and properly mastered risk.
In other words, humans are easily impressed by numbers because, on one hand, they seem clear and straightforward; numbers do not lie. Thus, if an action is based on them, it ought to be right. On the other hand, a presentation based on data tends to blur the boundary between simple knowledge and full understanding of a given phenomenon. The core message of this paper is precisely to offer a series of hints allowing a more fruitful discussion of this second aspect.
II. From Data to Data Biases and Data Angles.
II.A. The Notion of Data Biases and Why It Is So Important in Science.
In a rigorous, scientific approach, when data are used to describe an economy in a given time period, the information conveyed by the data is considered in a strictly unidimensional way: given the data definition and the methodology used to determine a value, that value must be taken as a cornerstone from which science and knowledge can be built. Yet financial and social “data” are eminently historical, i.e. contextual, and thus call for a much more cautious approach.
In practice, the way in which a measure is defined is infinitely more convoluted, because the numerical “rock” is, by definition, fragile, owing mainly to its historical uniqueness: it is impossible to recreate the same measurement under the same conditions, and nothing ensures that the numerical value is generated by a random variable that is stable, at least for a period of time.
Finance and economics are social sciences: Everything we collect as a number is unique because the ecosystem surrounding each value appears, by definition, only once. Any statistical (thus mathematical) analysis of these data should always acknowledge this key feature.
In this context, the famous Nate Silver quote “there is no such thing as unbiased data. Bias is the natural state of all data [added emphasis is mine]” takes on its full meaning. This claim sounds intuitive and easily believable (which is why it is so often quoted) but, in the end, fully grasping its essence is very difficult.
For instance, what is the exact meaning of the term “bias” in this sentence? Are we likely to face more bias in the social data realm?
If this is the case, then is it a sign of a high degree of complexity? Why is it important in finance and economics to accept and carry with us this complexity instead of refusing it?
To answer these questions, I point to a key fact:
While numerical data measure “something” of a phenomenon, the measure will never be fully accurate because we define it using a historically grounded data definition. This is not meant to disparage the measure; it simply reminds us of the importance of the historical context and of the fragility of any measured value.
II.B. From Data Biases to Data Angles.
Let’s consider an example of bias due to a rigid, inflexible definition of a measure.
The unemployment rate was statistically defined given a certain ecosystem and a (quite) strong separation between employed and unemployed statuses.
Today, that barrier is blurred and various types of work relationships exist: how valid, then, is the unemployment rate as an indicator of job market frictions in a given country? Basically, we should try to modify the measure. But then by how much? How do we determine the new, better-performing definition? Moreover, how do we judge and redefine the historical values? Would it not be better to create and elaborate a web of new measures to describe the complexity? A fair solution, but if we do this, then we will lack historical values.
To create a valid answer for this evolution we need time: The phenomenon is evolving, so we must describe it and we must agree on a new theoretical frame.
Why? Because as Albert Einstein once wrote:
“It is the theory [i.e. the concepts we set and how we use them] that decides what can be observed”.
Now, a detailed discussion of this key methodological aspect would take too long and is too complex; we will not fiddle while Rome burns [1]. (All notes appear at the end of the paper.)
At this point, I must stress two points:
First, despite the historical definition of a measure, its evolution still tells us something. For example, regardless of the definition of “unemployment rate” we still consider that an increasing unemployment rate is a bad signal.
Second, given the definition and the value associated with a measure, the value is, a priori, true. Our discourse here is not an unscientific one; no government plot is in motion to produce false or tainted data.
What matters is that a data series is likely to lose its value with time because it decays (e.g., as reality changes, the old definition becomes increasingly irrelevant) and no longer describes with sufficient precision what it is supposed to.
Yet there is no plot forcing the acceptance of this measure, and we can always improve our knowledge of a phenomenon by introducing, for instance, another measure, which basically implies embracing another theoretical frame, or at least amending the old one. [2]
It becomes clear that any data series implies a definition which carries an angle:
The data are biased because we establish a single “socially” accepted and shared way to interpret them, and we keep carrying that interpretation forward.
Numerical data define a value and, simultaneously, a theoretical and historical background. By definition, a data point, or even a vast set of data points, cannot define the “full reality”, because the real world is complex and dynamic, and so should be our representation of the world.
Therefore, all data come through a filter (more precisely, a series of filters), because they take for granted the validity of a vast number of definitions used to set the measures in use. The historical time frame is a huge burden that we cannot refuse to carry with us. The real challenge is to constantly acknowledge the presence of this burden and thus to remain modest about what we know (and how) and what we do not know (and how much).
III. From Data Angles to Ethical Concerns.
III.A. Why Discourses and Ethical Considerations Matter.
In the previous paragraphs we only began to grasp that, too often, we tend to forget that a given number is polyhedral in essence, so its significance might change in the blink of an eye. Therefore, its usage in any decision-making process, thus also in finance, needs to be constantly challenged.
Why? Because an allocation proposal must always be accepted by an (often reluctant) investor; therefore, words matter as much as numbers when a proposal is analysed at a given time. Again, words are needed to picture the context.
On that note, we currently have the chance to witness one of the most extreme displays of the power of words and discourses overcoming that of numbers and data: the Tesla case.
Here, we have a firm whose market capitalization is greater than those of well-established and profitable car producers, e.g. Ford and GM, despite past and present financial losses. The value of the stock is used to holistically evaluate Tesla’s future and thus the expected (great) success of a hypothetical (for now) mass production of its cozy electric cars; a big bet, albeit one so typical of a capitalist, idealized saga.
Now, the foreseen numbers and data are interpreted and considered from a single, very optimistic but fragile, angle:
Consider a future change, for instance a more serious cost assessment or possible turmoil in the supply chain of batteries, the core element of an electric car. Such a change in the scenario priced into a single stock might generate a domino effect: in Tesla’s case, does the shift affect the car industry or the tech industry? How would the entire tech ecosystem be affected if a catastrophic scenario materialized?
This extreme case shows how numbers or their absence are always used in sync with a discourse, which embraces a broad holistic view of reality and in particular of the future.
As individuals, we accept or refuse the angle from which data (or even their absence) are interpreted depending on whether we perceive the picture of the future associated with that choice in a particular domain as good. Our choices are based on a biased (e.g. gloomy or rosy) judgement.
Finally, the essence, the real engine motivating an investor’s choice, appears, and it is always based on ethical and therefore moral considerations: it is always a battle between good and bad that determines any AM customer’s choices.
One of the main illusions of the AM industry is the belief that clients do not constantly wear ethical lenses: They are always on investors’ noses, even more so when they are evaluating and discussing numbers.
Here is a hint to consider: what a client is looking for (and this will be increasingly the case, as the data-overflow environment is just ahead of us) is a series of ethical considerations allowing him or her to look beyond the numerical presentations and results.
This is basically why a client (and we are all clients) at the moment of an important decision wants to understand and not simply know.
Let’s stop here and breathe deeply. The last couple of paragraphs shed some light on the element that is easily and often drowned in numbers when an AM proposal is delivered. Now, in defence of the usual AM proposal, ethical considerations are terribly difficult to determine and set a priori. But are we really so sure about this? Or can we try to highlight a relatively easy procedure? Is it out there?
Clearly, we believe this is the case. To illustrate some of its main features, let’s analyse a few more examples in depth.
III.B. From Data Angles to Ethical Concerns: Some Further Examples.
III.B.i. The Facebook case: What is good or bad in a given number?
To start with, we can refer to a recent case in which a key number that was used to justify the success of a firm’s strategy suddenly became a major sign of its weakness. Unsurprisingly, we highlight the famous Facebook case. Historically, the existence of a broad (and growing) user base was a strength and a sign of Facebook’s success. Everything was based on a simple equation: more users imply more possibilities to sell user profiles to other firms keen on organizing tailored marketing campaigns on the platform. Besides, as any economic management textbook will confirm, big is always a priori better, because production costs can be scaled with large numbers.
Clearly, any financial analyst seriously following Facebook’s stock was aware of and (openly) pleased with this definition of Facebook’s core business plan.
However, quite certainly, some of those analysts were worried about this business plan’s resilience.
Indeed, any wise and therefore sceptical analysis of a (mainly) new phenomenon is always based on a series of basic questions:
How long will the “good star” shine over the new business?
What factors are likely to dim this light, and when?
Ultimately, time is the sharp sword that will cut the Gordian numerical knot and determine if those numbers were “good” or “bad”.
Those analysts were justly concerned with how the company managed its users’ profiles day after day and with the criteria for selling those profiles to external companies: they saw the possible source of poison, but would they leave the business and refuse to buy its stock while the company was surfing on its momentum? No way. After all, historically, the firm’s data (and chiefly the increasing number of users) were there to prove that this part of operations was smoothly managed (if we exclude the European buzz raised by an obscure Austrian dude) and under control.
So why search for fire if (at this point in time) there is no smoke? Why not follow the majority? Following the group is not only tempting, it is also economically rewarding, because investors gain money and please their clients.
Thus, the sceptical analysts were simply silenced, drowned out by numbers and by Mark Zuckerberg’s globally recognized genius.
Still, in the blink of an eye, the same number became a weakness, following the simple principle that more users mean more potential damage. One starts with Brexit but ends up with Trump, the German election and you name it, in a typical avalanche effect.
Thus, the entire structure becomes more fragile: if Facebook’s number of users had been “small”, the fire would have been easier to contain. In other words, something that was seen as a benefit suddenly became a major source of problems because the number of users had been interpreted from a single, restrictive angle.
By doing so, the firm and the majority of the analysts devoted to following and tackling the platform giant forgot an ancient warning raised when this business was in its infancy: Facebook is a social medium and, as a medium, it necessarily educates its users. Facebook clients are informed via their feeds, which subjects the clients’ perceptions and interpretations of the world to a series of filters. Here, one needs to be extremely careful when manipulating those filters for commercial reasons: Facebook users use the platform because they trust the firm and its ethics. Put differently, the vast majority of users are ready to accept a certain degree of commercial noise on the feed, and Web-personalized ads may even appear funny and sometimes useful.
We all tolerate this noise in the real world; we are used to it, and we know it is the way firms earn their money. But users do not accept fallacious attempts to reshuffle their political opinions. This is not about suggesting the best car given that I live in Geneva and often drive on mountain roads; this is about what I am supposed to think about other human beings and the way in which I want to organize my life with others!
Here, the trust factor is disrupted, damaged and weakened, putting the survival of the entire network at risk.
III.B.ii. The Food Industry Case and the Short-Term/Long-Term Risk Fallacy.
At this stage, one could always say that this sceptical data analysis does not apply often.
But can one be sure? Consider the food industry. All major actors there use too much sugar in many precooked dishes. We all know this; hundreds of news pieces, scientific studies and details about processed foods are coming from everywhere.
However, companies in this sector have started to recognize the problem, as indicated by the famous Milan Declaration of two years ago, in which all producers agreed to voluntarily reduce the usage of sugar. Still, the presence of sugar remains unhealthily high by any reasonable standard (chiefly in processed foods sold in developing countries), because sugar is very addictive to consumers. Its presence ensures a positive consumer experience and consumer loyalty, and the producer can use a very cheap ingredient to ensure plenty of nice features in any precooked dish.
But do many financial analysts covering this industry track the risks inherent in continuing this policy? Is it not time to evaluate the likelihood of a possible legal action and its consequences for the entire industry?
One of the roles of the financial apparatus, after all, is to help allocate savings efficiently, which entails transparent judgements and the foresight to avoid the risks and threats characterizing a particular business activity, considering that those activities take place in real time.
A discussion needs to begin at this point about the misleading usage of “short term” and “long term” in the analysis of human activities.
Take the last example we discussed. Again, adding sugar represents a risk factor for the food industry, but is it a short-term or a long-term risk?
No one knows. If tomorrow a new, important study proves that the majority of our diabetes problems come from the food industry, and this study happens to be backed by several experts and spreads over the Internet, would it be the expression of a short-term or a long-term risk?
Honestly, it is just something that explodes thanks to the normal flow of time and its endless supply of surprises.
When we analyse human activities and risks, we should minimize the use of those two concepts. The materialization of something happens in time but remains a question of fatalism more than of any short- or long-term horizon. Therefore, if an accident does not happen, there is only one radical explanation: luck. The reader should note that, without any accident, we cannot know the system’s resilience. Facing some disasters (on a mild scale, if possible) here and there is favourable for the survival of humankind and its technological progress.
The last Japanese tsunami provides a very telling case study of a poorly anticipated fatal accident. The event materialized, and the infrastructure was apparently ready to withstand it up to a certain threshold: the surprise lay in the tsunami’s intensity, and the problem’s depth was proven by the tragedy of the Fukushima nuclear plant.
This extreme case had exactly the same chance of occurring the day after the Fukushima plant opened as it did years afterwards. There is (was) no short-term/long-term argument here; there is (was) only the possibility that a disastrous event would occur whose impact would initiate a chain reaction far too complex to be fully understood and mastered.
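For readers who like a formal shorthand, this “no short-term/long-term” claim can be sketched under a deliberately simple assumption of mine (not part of the original argument): suppose the waiting time T until such an extreme event follows an exponential law with constant hazard rate \(\lambda\). Then

\[
P(T > t + s \mid T > t) \;=\; \frac{e^{-\lambda (t+s)}}{e^{-\lambda t}} \;=\; e^{-\lambda s} \;=\; P(T > s),
\]

i.e. under this stylized assumption the chance that the event strikes within the next s years is the same whether the plant has been running for one day or for decades.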
The Japanese case clearly illustrates how the problem lay in the plant’s structure (its business model, so to speak) rather than in the extreme event itself.
The extreme event and its full destructive potential had been there since the plant was built. Moreover, the very fact that the plant was built created the probability of an accident (before the plant, that risk was clearly zero): generally speaking, it is the fact of doing something new that generates a chain reaction which, eventually, might end up in a serious issue. The natural disaster was magnified by a technological achievement, which in the blink of an eye was revealed as a nightmare.
But after all, is this not the usual way of any human endeavor? If we decide to do something, then we modify (often more so at the beginning) our risk exposure, and if the technology we decide to use is complex, then it is likely that the problem, in the case of an accident, will be very difficult to handle. Here, we can better appreciate Pascal’s quote: “All of humanity’s problems stem from man’s inability to sit quietly in a room alone.”
We are constantly searching for new solutions and new ways of doing and mastering processes, and this generates future rewards and risks; risks, in turn, that cannot be fully evaluated because the extent of the damage remains unknown: we know that a risk exists, but we do not fully understand the mechanism that generates it (e.g. how an extremely powerful earthquake will occur).
Considering this fact, I strongly believe that a serious data-driven AI investigation may shed some fresh light, helping us evaluate all sorts of scenarios and prepare better answers to potential problems and related risks. Specifically, rather than spending too much time trying to foresee a chimerical “when”, we can better prepare for the “after”. In other words, an AI approach to envisioning and preparing contingency plans seems very interesting because it would bring to light many more side effects and possibly hidden correlations.
Returning to the food industry example, if one refers to the numerical results of a firm that sells processed sugary foods, then one can determine how much profit was made from an unhealthy product. Here, once again, is this number positive or negative? If the sales trend is upward, then what social consequences should the firm prepare for? Which angle does the firm prefer? Which angle does it decide to stress? And which angle should a financial analyst consider?
Usually, since the “water” is quiet, sceptical and superficial sector analysts alike accept senior management’s smooth and continuous discourse. Everything is fine, folks, just enjoy. The numbers do not lie, and they are all positive. Why bother, then? No smoke, no fire.
It is the reign of “business as usual” complacency, reinforced by the firms’ numbers and data widely accepted by the financial community. We can see a spiral created by the usual force, that is, trust. Indeed, ultimately, numbers are trusted to be “valid” signals of the present and future status of the business.
III.B.iii. The GDP Saga: A Bad Interpretation of a Historical Evolution Can Be Very Costly.
Consider one last example.
The most debated macro numbers used to indicate an economy’s health are its GDP and its historical evolution. Everyone knows that today’s GDP is something we cannot read with the perspective of 20, or even 5, years ago. Indeed, any economist readily recognizes that elements like major sector shifts affect overall economic activity, such as the transition of developed economies from an industrial base to a digitalized, immaterial service base (it is much more difficult to properly capture the value of a service than the value of an industrial item), or a changing demographic regime, with fewer young people and more retired workers.
Those cases are constant sources of new questions about the type of information we can grasp from this statistic. For example, can we really judge overall economic activity when services are valued at their cost of production? What about all the work done without direct compensation because, for instance, there is no longer a clear cut between working and non-working hours? And what about activities like the personal marketing and networking value added on platforms like LinkedIn: how can all this be evaluated?
Nonetheless, the election of President Trump and the Brexit vote stressed another, more hidden and often forgotten element: although GDP is an indicator of economic activity, it is, by definition, silent about the distribution or concentration of that activity across a given territory.
In a famous anecdote, Mark Blyth recounts attending a Remain (anti-Brexit) rally in northern England. During the Q&A session, a member of the audience said, “You guys for Remain have a serious issue here: you keep saying that GDP has grown as never before since the UK has been in the EU. Still, that is your GDP in London, not ours here!”
This story reveals the core of the issue. A politician’s objective is definitely to ensure GDP growth but, simultaneously, he or she must ensure that the whole territory enjoys it.
In other words, once again, the number tells us something which, a priori, is uncontroversial, but we may miss the forest for the trees by paying too much attention to it. Its value should always be considered alongside other data so as to embrace a more holistic view. Our understanding is not fixed; it keeps changing and begging for new perspectives. In short, we should always remember Henry Miller’s wise words, not so far from Einstein’s quoted earlier: “What goes wrong is not the world, it’s our way of looking at it”.
IV. Data, Time and Data-Decision Process: A First Assessment.
All these examples share a common theme: data are not tricky per se; they become tricky as time goes by. Thus, if I want to define an optimal data-driven choice, then I must assume that data definitions remain constant, which, in turn, implies abstracting away from the time that must inevitably go by.
Here, the core of the problem is defined.
If we acknowledge that agents live in a truly dynamic ecosystem, then data definitions, or the way in which agents see and understand the data, should not be written in stone; but if agents’ choices are to be considered optimal, maximizing their objective, then those definitions must be set in stone.
This is a paradox: widespread, fully coherent maximization behaviour among agents requires a set of fixed data definitions, but a satisfactory recognition of the real world and its dynamics implies the changeable nature of those definitions. We need time to understand everything that surrounds us but, because of our maximization choice, we do not have it, since an optimal choice needs to be implemented.
We are all facing this issue. On one hand, we set an action to maximize our objective, which requires stability, while, on the other hand, by carrying out the optimal action we embrace the presence of time and its dynamics, and therefore the fact that nothing is really “stable” or fixed as assumed, which implies risks, failures and misjudgements.
The Facebook saga perfectly illustrates this dilemma. Since its inception, the number of users has been the measure used to judge its success. This was, and is, right: Facebook runs on ads, to paraphrase Mark Zuckerberg’s words before the US Congress, and more users imply more business and therefore more money.
But more business implies successful Web-ad campaigns, which require tailored commercial messages, which in turn imply the creation and maintenance of a state-of-the-art user database (rich in numerical details), as tailored ad campaigns require deep knowledge of users.
The simple reference to the number of users was hiding something much more complex and highly sensitive: the presence of a database whose content is not transparently presented to the public but is frequently used and presented to corporations ready to run their Web ad campaigns on the platform. In Facebook’s eyes, more users are beneficial if and only if more details about them are simultaneously extracted and added to the database.
Here lies the business plan’s weakness: to maximize its profit, the firm needs to show third parties ready to run “campaigns” that it owns “rich” user profiles. But who guarantees the morally “fair” usage of those “rich” user data once a third party is allowed to collect or access users’ Facebook data? A “strict” Facebook policy would quite certainly dent its profits seriously, by dramatically increasing its costs and by preventing some companies from running their campaigns.
In any case, considering a numerical value from a single, apparently indisputable angle is a fallacy. Indisputable angles do not survive the passage of time. Everything is doomed to be reviewed through various perspectives. Basically, time’s main role is to reveal the fragility and ambiguities of historical numerical data and, by doing so, to degrade and possibly destroy the knowledge built on those numbers.
Hence, we might clearly know that Facebook is a firm running on ads, and that a growing number of users was positive, but it was only with time that the features of Mark Zuckerberg’s business were eventually unveiled and therefore understood by many.
Ultimately, the understanding was there to be acquired from day one with minimal effort, but the knowledge that a growing number of users was correlated with an increase in revenue was enough.
One could simply compare Facebook’s ad revenue with Google’s to judge the quality of the two business plans. But few understood how these new forms of business extracted their revenue, that is, why having a database of digital profiles is so important.
Time is simply a mechanism pushing people toward the understanding of a phenomenon instead of accepting “simple” knowledge, with the help of human nature and qualities like curiosity, willingness, pride and the fear of being the only one who doesn’t understand.
V. Data, Time and Data-Decision Process: Conclusions.
One can draw at least two main conclusions:
A. The real “magic” word that does not appear in any discussion about data but should always be considered is trust. A person does something because he or she has a project or a plan that will be completed in the future. To realize that plan, one needs a “trusty” vision of the past and present, so as to confidently project the future. Data and their analysis provide a way to establish these cornerstones which, in principle, should help build a solid decision.
The problem is that we tend to forget how biased the data are and how data might fail to convey a true understanding of a phenomenon, despite offering an “easy” (superficial) knowledge of it.
Everyone can acknowledge the extent of uncertainty carried by the market due to the presence of these multiple data biases. In the Facebook case, savvy analysts, equipped with a solid understanding, were certainly aware that the firm was/is running an “addictive” algorithm to increase the time users spend on its service and therefore guarantee high ad campaign visibility.
To develop such an engine, one must acknowledge an underlying database holding a great deal of individual information. On that note, Google’s success indicates that this firm also owns a huge record of our Internet habits and therefore lives under very similar constraints [3]. From those elements, one can conclude that the firm’s weakness was/is its database and that protecting it was a natural step to take.
Why did the financial markets not pay attention to this argument until the scandal?
There were warnings, and there were discussions, but the revenue was growing steadily, and the firm was wisely using the money to diversify its audience by investing in other platform-related applications.
There was no smoke and therefore no fire!
Ultimately, financial markets and data-driven decision makers (human or machine) do not like to review their notions or their hard data.
Again, if one pushes the analysis beyond the point where most stop, such a sceptical analysis could be very rewarding intellectually, but what about present decision making?
Basically, the problem is always the same: If one is preaching alone that something is wrong out there, it may cost him or her a lot of money. For instance, before the 2007 subprime crisis, there were at least two years of frequent discussions on business channels like Bloomberg and CNBC about the existence of a possible bubble in the US housing market. There was no “Black Swan” here: People like Michael Burry, who took a huge bet against the market by forecasting a major crash, were losing money until the crisis began.
This scenario shows how turning points in data’s meanings cannot be forecast.
Would AI have any advantage on that side? To be honest, no, which is why, in the active fund world, AI is by default “quantamental”: the human touch is likely to intervene even after the AI’s recommendations are established.
It is clear that social-network “Big Data” should help build better sentiment indicators, which in turn should predict those famous turning points.
Nonetheless, those new indicators are likely to be very biased.
After all, they are based on “reactions” or approval-disapproval rates to news or discussions on social media by a “special” subset of the population likely to be far from representative. Hence, the real sentiment is likely to remain well-hidden among a silent majority.
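As a rough illustration of how crude such indicators can be, here is a minimal sketch, in Python, of a naive sentiment index built from daily approval/disapproval counts; the column names and the smoothing window are hypothetical choices of mine, not a description of any existing product:

```python
import pandas as pd

def naive_sentiment_index(reactions: pd.DataFrame) -> pd.Series:
    """Turn daily approval/disapproval counts into a score in [-1, 1].

    `reactions` is assumed to have columns 'date', 'positive' and
    'negative' (hypothetical names), with one row per day.
    """
    daily = reactions.set_index("date")
    score = (daily["positive"] - daily["negative"]) / (
        daily["positive"] + daily["negative"]
    )
    # Smooth over a week to dampen day-to-day noise (arbitrary window).
    return score.rolling(7, min_periods=1).mean()
```

Note that nothing in this computation corrects for who reacts: the index mechanically reflects the loud subset of users, which is precisely the sampling bias discussed above.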
Here, despite all our efforts, the choice of standing with or against the market remains a bet:
Fundamental analysis is basically silent about precise investment timing.
B. All these caveats concern so-called “fundamental” data, i.e. the data describing and defining the environment in which market actors operate: business results, economic data, etc. Those caveats should not apply to market data themselves: prices and the various analytics we can derive from them, from price averages to stochastic oscillators. These are the real hard data, right? A price is a price?
Libor was supposedly a “fair” and transparent rate until it was revealed that some banks were manipulating its calculation by submitting unfair rates, well aware of the math used to define this interbank rate. So, a price is a price, but it will always call for close scrutiny and a dose of uncertainty.
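For completeness, here is what one of the “derived analytics” mentioned above looks like in practice: a minimal sketch of the raw stochastic oscillator %K, computed from hypothetical daily high, low and close series. The point is that even this purely mechanical “hard” number inherits whatever flaws sit in the underlying prices, as the Libor episode reminds us.

```python
import pandas as pd

def stochastic_k(close: pd.Series, high: pd.Series, low: pd.Series,
                 n: int = 14) -> pd.Series:
    """Raw %K of the stochastic oscillator over an n-day window.

    %K = 100 * (close - lowest low) / (highest high - lowest low)
    """
    lowest_low = low.rolling(n).min()
    highest_high = high.rolling(n).max()
    return 100 * (close - lowest_low) / (highest_high - lowest_low)
```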
AI or, more specifically, Machine Learning is certainly very useful but, even with these more solid data, the outcomes obtained should be taken with a pinch of salt and in a modest, humble mood.
On that note, recent events, chiefly the return of “normal” (i.e. in line with historical values) financial market volatility, seem to prove the need for this humble attitude. Loosely speaking, and as we will discuss in more detail in the second part, AI is a trend-hunting device, but to be useful it needs “stable” trends that keep running long enough. Now, if the turmoil is significant, as in the first quarter of this year and likely for the remainder of it, trends tend to become short-lived, and this can lead to substantial losses. [4]
Why?
Because, in short, when a market regime change occurs, psychological and common-knowledge aspects are likely to play a far more important role than quantitative analysis. Again, we will come back to this in detail in the next part.
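To make the trend-hunting argument above concrete, here is a minimal sketch of a toy trend-following rule, a simple moving-average crossover, which I use only as a stand-in for the kind of engine such funds may run; the window lengths are arbitrary assumptions of mine:

```python
import pandas as pd

def ma_crossover_pnl(prices: pd.Series, fast: int = 20, slow: int = 100) -> pd.Series:
    """Daily P&L of a toy trend-following rule: long when the fast
    moving average sits above the slow one, flat otherwise."""
    fast_ma = prices.rolling(fast).mean()
    slow_ma = prices.rolling(slow).mean()
    # Lag the signal by one day so today's decision trades tomorrow.
    position = (fast_ma > slow_ma).astype(float).shift(1).fillna(0.0)
    returns = prices.pct_change().fillna(0.0)
    return position * returns
```

On a long, persistent trend such a rule rides most of the move; in a choppy regime of short-lived trends it is systematically whipsawed, buying after rallies and selling after drops, which is exactly the kind of loss pattern described above.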
We should always remember that we are living in the era of unicorn tech firms, of the amazing growth of private equity funds, and of plenty of other off-market activities. Are they a sign of fatigue vis-à-vis the market mechanism? Equivalently, is the fact that plenty of financial activities are set up and carried out outside the markets not a sign of a lack of trust in this institutional arrangement? On that note, during the last ten years, a “great distortion” of the financial market price mechanism took place because of central banks’ policies, which has certainly helped erode trust in this institution.
To conclude, what we can really say is that, when markets become too complex, when prices are too difficult to figure out (that is, to understand, not simply to know), chiefly in their short-term evolution, several actors may be tempted to move away and find other institutional frameworks in which investments are evaluated on the basis of calculus and numerical data, no doubt, but also, and primarily, through human and shared long-term visions and considerations.
Notes.
[1] The interested reader can refer to this excellent article: http://creativethinking.net/your-theory-determines-what-you-observe/#sthash.goBkQ9KR.dpbs. Basically, in Einstein’s tale the professor is a knowledgeable person, but he does not understand (figure out) the concepts he is manipulating. Very often we are victims of the same trap: by referring to our knowledge we pretend to understand (we know how our television set works, but we do not understand it). We will disseminate hints about this key distinction between knowledge and understanding throughout this paper.
[2] It goes without saying that here we see one of the main advantages of living in a free society: a datum’s definition can freely be discussed and challenged, which implies a critical review and therefore favours the formulation of an alternative, which might lead to better knowledge of a phenomenon. We also know that trouble often starts when people refer to a single value as the right guide and the right tool to approach and solve an issue. Again, democratic and free-speech regimes allow the denouncement of this obtuse attitude and push us to consider alternative views and angles. Clearly, the same remarks apply when society is facing a new phenomenon: a priori, open societies are better equipped to elaborate and fine-tune measures, tools and, finally, solutions.
[3] It is interesting to note that one’s Google profile is likely to differ from one’s Facebook profile. When I am searching on the Web, for example, I am more open to a “guess, learn and review” behaviour, but on Facebook I am facing a medium whose usage reveals more details about my real preferences. Funnily enough, even Big Data are likely to be biased: Google’s data are shaped by and contain features that we will not find in Facebook’s, and vice versa.
[4] On that note, this February AI funds posted their worst monthly performance ever (see https://www.bloomberg.com/news/articles/2018-03-12/robot-takeover-stalls-in-worst-slump-for-ai-funds-on-record), which has reopened the question of whether AI and Machine Learning are overhyped, as discussed here: https://www.zerohedge.com/news/2017-10-23/machine-learnings-overhyped-potential-headed-toward-trough-disillusionment