The Active Stock-Picking Fund Industry: Current Status and a Few Open Questions on the Consequences of AI & Big Data


Clifford Stoll: “Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom.”

The latest available data seem to confirm that the success of the passive stock fund industry is built on further money outflows from the active fund industry [1].

Apparently, the sunny days of some are the rainy days of others. What is certain, in 2017 at least, is that the capital inflow into this sort of vehicle seems as unstoppable as the bull stock market itself.
In addition, during the summer some active stock-picking fund managers became vocal about their daily struggle to keep their clients. What needs to be done to save the active stock fund business? Should managers simply await “the great correlation collapse”, the name analysts at US money manager Bernstein give to the end of the unprecedented synchronized movement of stocks that has characterized the last ten years? After all, a few months ago, this was the magic recipe put forward to finally fight effectively the mounting threat posed by the passive fund industry [1]. On that note, we will likely find out soon: since the beginning of this year, stock volatility seems to be firmly back at normal levels.

But will this be enough? Will this performance, if confirmed at the end of this year, last long enough to bring back a portion of the money lost in favor of passive funds?

In response to these existential threats, most active fund groups are structuring their answers around a central focal element: The only solution available is to keep investing in technology and data to beat the market.

From an outsider’s purposefully naïve perspective we can then raise a further, deeper series of questions:
Are we sure this choice is the right answer? Why are fund managers considering only these two pillars?
Two main and, apparently, unquestionable facts seem to reinforce the idea that this is the correct path to follow:

1. The active fund industry needs to lower its fees because passive index-tracking funds are, by definition, very cheap [2].
To fulfill this requirement, there is only one option: generally speaking, fund managers must use software instead of humans. In other words, they must increase the proportion of their choices made via quantitative analysis of digital data, which means increasing the share of systematic decisions. A priori, by so doing, more efficient actions, that is, actions based on larger sets of information, would be implemented while keeping costs down.
2. Simultaneously, by increasing its technical endowment, a fund’s organization would become ready to embrace the AI-driven stock-picking revolution. It would then automatically be ready to absorb, and to fully profit from, the next data wave: the huge amount of often highly granular, but also highly unstructured, Big Data collected mainly thanks to web 2.0 services.

Once again, from a purely outside perspective, both points seem to boil down to a simple and general idea: to beat the market, to outperform your benchmark year after year (that is, to create the famous alpha), you, as an active fund manager, need more data and more data analysis capabilities. Nowadays, “in data we believe” seems to be the only motto widely accepted among active fund managers, instead of a healthier “in the future we believe”.
That said, one well-known shortcoming is widely acknowledged among fund managers and experts:
Although they have access to an unbelievable and fast-growing amount of digital data, data are, by definition, not information. A detailed discussion of this fact would take too long and is too complex (see the famous Clifford Stoll quote above); we will not fiddle while Rome burns.

What matters is that two main insights arise once we acknowledge the existence of a difference between data and information:
i) Data need to be assembled and ordered before they can be properly analyzed and then encapsulated in an informational framework. This is, loosely defined, the process of extracting information from data.
Often, this implies answering questions like the following: Is the raw data sufficient to describe the phenomenon in informative terms? What is the most appropriate way to process the data? Which intervals should be used to analyze them? Should we transform the raw data (e.g. into a rate of growth) to obtain information? Many other questions in this vein exist and need to be explored.
Clearly, this sort of investigation is not free of cost. The cost arises from what is called, to use a buzzword, data mining on huge and unstructured data sets, and it is likely to become more and more significant once web-based Big Data sets come to be considered.
ii) Some information cannot and will never be available in digital form or, more precisely, will never be obtained by working on data sets. Concerning this problem, the formula developed by Emmanuel Tahar is spot on and deserves to be quoted: “After all, out of all information out there, only some of it is indexable [i.e., can be read and used by computers]. Out of all that is indexable, only some of it is indexed [i.e., ready for a digital analysis]” [3].
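To make the questions raised under point i) a little more concrete, here is a minimal sketch in Python of one such transformation: turning raw prices into rates of growth over a chosen interval. The function name `growth_rates` and the price series are hypothetical and purely illustrative; they are not taken from any real fund's process or data.

```python
def growth_rates(prices, interval=1):
    """Return simple growth rates of `prices` measured over `interval` steps."""
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    # Each rate compares a price with the price `interval` steps earlier.
    return [
        prices[i] / prices[i - interval] - 1.0
        for i in range(interval, len(prices))
    ]


# Hypothetical daily closing prices (illustrative only, not real data).
raw_prices = [100.0, 102.0, 101.0, 105.0, 110.0]

# The same raw data yields different "information" depending on the interval.
daily = growth_rates(raw_prices, interval=1)
two_day = growth_rates(raw_prices, interval=2)
```

Even in this toy case, the choice of interval changes what the analyst sees: the one-day series shows a dip that the two-day series smooths away. Deciding which view is the informative one is exactly the kind of costly judgment described above.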

In several hedge funds where the most sophisticated variant of AI, Machine Learning (ML), has already been used for years, those concerns have already been addressed and answered:
They use a blended solution of AI alongside Human Intelligence (HI) to guarantee that the suggestions of the quantitative AI/ML programs are always, ultimately, validated by human eyes and thought.
You might think this is common wisdom: by definition, not all information can first be extracted and then processed by software. Therefore, AI choices cannot encapsulate the whole “present-informational-reality”. There is always a “mechanical” part in any AI solution, a lack of finesse that arises because AI is not fully present and therefore not fully involved. Basically, AI answers using a “real” built and derived from data, but data, despite all the effort put into preparing them, never capture the full “reality”, and the missing part may play a huge role if the system is dynamic.
This point will be analyzed in a further paper.
Some authors have nicknamed these sorts of funds’ strategies “quantamental”, which combines the traditional stock-picking skills of fund managers with the use of data and computing power [4].

I do not fully agree with this labeling: HI is there because some information cannot be extracted from data at the moment it needs to be considered. More than a combination of insights, a human is the only one who can really feel the presence and the weight of what is going on and then make a holistic decision, full stop.

As a side note, another argument in favor of a quantamental solution is the well-known fact that ML suggestions may be very tricky for humans to fully comprehend. This is the famous black-box case, in which the relationship found by the machine is too complex for a human mind to validate: a sort of “ML-intuition” that, as with human thinking, is difficult to explain in plain English.
Here, humans can decide to follow the machine’s insight, even without having a full understanding of the procedure [4].

In any case, once the decision is made, it can always be managed and followed using the standard risk procedures: although the starting point is often deemed unclear, the outcome is simply a change in the holdings of a fund, whose risk can be handled wisely like that of any other modification.
In summary, what defines the current status of the active stock-picking fund industry? It is a paradox.
On the one hand, active fund managers need to lower fees to stop the money outflow toward the passive industry and, on the other hand, they must invest heavily in data analysis and in teams that are well equipped to handle this new task, which implies an increase in the cost base. Additionally, we have seen some crucial reasons explaining why HI has a key role in any new AI-oriented fund offer. Rather than sticking solely to AI, the fund industry is enjoying an augmented form of intelligence by combining HI and AI. This is the right paradigm to embrace in our industry: AI will enhance HI, but HI will simultaneously cover some AI blind spots that are there to stay, whichever data set we consider.

Here, a strict approach that accepts only purely data-driven stock-picking decisions is condemned to fail sooner or later. Two main reasons can be highlighted to explain this failure:
1. There are plenty of cases in which the data do not tell enough or, paradoxically, where they show too much: they are just too noisy.
This is why statistics was created in the first place, and data science cannot entirely dismiss its skeptical core: critical thinking remains key to highlighting the limits and to building feasible and acceptable workarounds. Once again, we will soon publish a further paper devoted to those arguments, but generally speaking, data are like a streetlight in the night: we can see a “reality” around it, but not the “real”, which, by definition, also includes the surrounding dark.
Besides, if plenty of data are used to analyze a share price, then its past movements are “explained” using this huge set. This implies plenty of possible trends and correlations, any of which might persist: which one should a fund manager choose?
A data-driven procedure may easily become a Pandora’s box that, instead of calming down a noisy market, increases complexity when prices need to be understood.
Such procedures are likely to add further noise to an already hyper-complex (and therefore noisy) market frame: how and why some movements occur and persist will become more and more complicated to explain.
2. As simple as it may appear, and as in the case of autonomous cars driving alongside human drivers, the real issue with purely data-driven stock picking is that humans will still be acting in the market. The “competitive” coexistence of humans and AI-driven participants, like cars sharing the road, is likely to induce both accidents and misunderstandings and, paradoxically, new opportunities for drivers of fast cars!
Indeed, is a human decision always based on data? Certainly not. What about, for instance, decisions based on strategic, and thus perfectly rational, considerations? To forecast the outcome of our decisions, we make assumptions about other participants’ behavior: are we sure to get them right? If a fund manager follows a purely discretionary strategy, what about taking action in anticipation of a series of “mechanical” reactions from the AI-driven “competitor” side? This already happens today, with traders speculating on the regular readjustments of passive index trackers.
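The first of the two reasons above, that a huge data set offers plenty of apparent correlations to choose from, can be illustrated with a small simulation. The sketch below is in Python and uses only simulated random numbers, not market data; the helper names `pearson` and `strongest_spurious_link`, the seed, and the sample sizes are all arbitrary assumptions made for illustration. Every candidate series is pure noise, so any correlation with the target is spurious by construction.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_link(target, candidates):
    """Largest absolute correlation between `target` and any candidate."""
    return max(abs(pearson(target, c)) for c in candidates)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_days = 60             # hypothetical length of each series

# A simulated "return" series and 500 unrelated simulated "signals".
target = [rng.gauss(0.0, 1.0) for _ in range(n_days)]
candidates = [[rng.gauss(0.0, 1.0) for _ in range(n_days)] for _ in range(500)]

best_of_few = strongest_spurious_link(target, candidates[:5])
best_of_many = strongest_spurious_link(target, candidates)

print(f"strongest |correlation| among 5 signals:   {best_of_few:.2f}")
print(f"strongest |correlation| among 500 signals: {best_of_many:.2f}")
```

Because the five signals are a subset of the five hundred, the strongest link found in the larger set can never be weaker than the one found in the smaller set. The more data a manager screens, the more convincing the best meaningless pattern looks, which is why the choice among candidate correlations cannot be left to the data alone.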
Furthermore, a more fundamental query remains: Are active fund powerhouses truly obliged to follow this path of investing so massively in AI? Are those strategies really well-founded?

In our Why Mr. Keynes still matters: The stock market and its dysfunction (here), we will present some arguments showing how active fund powerhouses ultimately cannot escape their data-AI-driven fate, given their objective of beating the market.


[1.] Just a few days ago, the FT online reported that 2017 was a further record year for the passive fund industry, with global growth of 460 billion. All details here:

[2.] For an overview and in-depth analysis of the passive industry, I suggest that the reader consult my most recent article,

[3.] Emmanuel Tahar, “Busting the Bot Myth: Why the Investing World Still Needs Humans”, is available here:

[4.] For more details, please refer to Joshua Maxey, “The Rise of the Quant Fund: It’s Not Only About the Machines”, which can be found here:

[5.] This point has been analyzed in one of my previous papers and in a very interesting article. My contribution can be found here: The article, which describes stock-picking decision-making in one of the first hedge fund powerhouses where AI/ML was used alongside humans, can be found here:
