For someone so practised a Luddite (still writes by hand; likes receiving the occasional, non-bill letter; subscribes to a lot of print media), it is always something of a surprise to me that I have an ongoing interest in machine learning.
The reason is that, for a few years here in Berlin, I worked for a tech company called Retresco, which specialises in this area. I was a linguistic data architect, helping the company build its natural language generation (NLG), or automated content, projects.
The key to much of our work was taking on large projects that took clean, structured data and turned it, through the power of computers, into content. Once built, these projects could run ad infinitum, churning out brief, data-driven news stories.
So it was with some interest that I read, on the front page of Handelsblatt this week, an analysis by that paper which found that AI has achieved double-digit percentage returns in the past 12 months. According to the authors, the DWS index fund ‘Xtrackers AI & Big Data Ucits ETF’ returned 45%, against 29% for the world stock index.
It is not the first time that such good news about this subject has been disseminated. In December, the Harvard Business Review published its own study, ‘What Machine Learning Will Mean for Asset Managers’. This followed a similar study it published the month before.
In that previous study, the authors set out to build an investment algorithm and compare its performance with the returns of 255 angel investors. “Utilising state-of-the-art machine learning techniques,” the authors wrote, “we trained the algorithm to select the most promising investment opportunities among 623 deals from one of the largest European angel networks.
“The algorithm’s decisions were based on the same data that was available to the angel investors at the time, which included pitch material, social media profiles, websites, and so on. We used this data to predict a startup’s survival prospects—instead of measures such as valuation, which investors often favour—because it allowed us to train the algorithm with a much larger and more reliable dataset.”
Not only did the researchers compare the algorithm with angel investors, but they also threw investors with varying degrees of experience into the mix. The authors wrote: “We further investigated how angel investors of varying experience (novices with fewer than 10 investments vs expert investors with at least 10 investments) fared relative to the algorithm’s performance. Expert angel investors in our sample, on average, made about twice as many investments as novices (12.2 vs. 5.2) and invested double the amount per startup (€10,530 vs. €4,548).”
The results, on the surface, were stark. “The algorithm achieved an average internal rate of return (IRR) of 7.26%, the 255 angel investors — on average — yielded IRRs of 2.56%. Put another way, the algorithm produced an increase of more than 184% over the human average.”
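For anyone who wants to check the arithmetic behind that comparison, here is a minimal sketch. It assumes only the two rounded IRR figures quoted above; the function name is my own, not the study's. With those rounded inputs the relative increase comes out at about 183.6%, so the authors' slightly higher figure presumably rests on unrounded numbers.

```python
def relative_increase(new: float, baseline: float) -> float:
    """Percentage by which `new` exceeds `baseline`."""
    return (new - baseline) / baseline * 100


# Rounded average IRRs as quoted from the HBR study (in percent).
algorithm_irr = 7.26
angel_irr = 2.56

increase = relative_increase(algorithm_irr, angel_irr)
print(f"Relative increase: {increase:.1f}%")
```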
That sounds pretty good, but the problem is that not all investors are created equal. The HBR study also found that “[Angel] investors with lower signs of irrational behavior in their portfolios performed significantly better than their rather irrational counterparts: the less biased novice group averaged 3.51%, whereas the novice group with higher biases, on average, lost money at -20.52% IRR.”
The authors concluded: “According to our research, novice investors are easily outperformed by the algorithm; with their limited investment experience, they showed much higher signs of cognitive biases in their decision making. Experienced investors, however, fared far better.”
So, what does this mean for machine learning and investment?
Well, any data project is only as good as the data it has, and that data is often vulnerable to misinterpretation. The journalistic organisation ProPublica discovered this more than five years ago when it ran an article, ‘Machine Bias’, about the models used to predict future criminal behaviour in the US. Not surprisingly, ProPublica found that the models showed bias against black Americans, drawing a correlation between skin colour and propensity to commit crime where none existed. (The story and reporting are excellent, and everyone should read it.)
Data is also limited in that it is often contained within a single (occasionally multiple) Excel worksheet. While some machine learning programs are being developed that can look beyond these limitations, most data-driven projects are still reliant on single or limited data sources. They have no ability to see anything that those sources do not contain.
Here is a practical example of how this may work. A machine could take raw data and produce a single story about a basketball game. It could tell you the final score, who scored the most points, which team won, who was substituted, and how many warnings and sendings-off were given. But the most salient explanations as to why a team won or lost (the sacking of a manager two weeks before, injuries before the game, prior performance, etc.) are unlikely to be available. And it is those details that give a narrative its proper context. And what judgements worth anything can be made without context?
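To make that concrete, here is a rough sketch of the template approach in Python. It is not Retresco's system or any real product: the `GameData` fields, the team names and the sentence template are all hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class GameData:
    """Structured data for one game; every field name here is hypothetical."""
    home_team: str
    away_team: str
    home_score: int
    away_score: int
    top_scorer: str
    top_scorer_points: int


def write_story(game: GameData) -> str:
    """Fill a fixed sentence template from the structured fields."""
    if game.home_score == game.away_score:
        raise ValueError("this sketch assumes the game has a winner")
    if game.home_score > game.away_score:
        winner, loser = game.home_team, game.away_team
        high, low = game.home_score, game.away_score
    else:
        winner, loser = game.away_team, game.home_team
        high, low = game.away_score, game.home_score
    return (
        f"{winner} beat {loser} {high}-{low}. "
        f"{game.top_scorer} led all scorers with {game.top_scorer_points} points."
    )


# Made-up example data, not a real game.
example = GameData("Home City", "Away Town", 88, 79, "A. Example", 27)
story = write_story(example)
print(story)
```

Nothing in `GameData` records why the game went the way it did, so the template has no way of saying so. The generated story can only ever be as rich as the fields it is given.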
So where does this leave AI and investing?
There is no doubt about the promise of AI and the processing of data into insights, but the field is still at an early stage. We are beyond the first computers that sent a man into space, but still a way behind Skynet. It is people such as data engineers and architects, and machine learning experts, and their experience, that still make all the difference.
To paraphrase Bruce Springsteen, we still need a little of that human touch.