At the beginning of my professional career, I used to work as a Data Scientist, and one of my early projects was the analysis of raw human genome data from patients with Alzheimers Disease. Many things were painful back then; we had to recruit participants 1-by-1 to enroll in our study, sequencing the genomes to get the data cost us >1m from a research grant, we had to set up a costly compute cluster ourselves, and even simple regression analyses took days to finish (per iteration). I especially remember working for weeks on engineering our data structures, optimizing database settings, and manually re-writing the analysis algorithms (because we exceeded our RAM limits), first just for the analysis to compute at all, and then to finish in days instead of months. A lot has changed since then.
The three currently most prominent enterprise technologies are without doubt AI, blockchain, and IoT, and the driving factor behind them is data; people even go so far to proclaim that data is the new oil.New technologies enablecollection, sharing, analysis of data, and automation of decisionsbased on them in ways that havent been possible beforein what is essentially a data value chain.
- 1st generation projects have been focussing oncreating the data infrastructureto connect and integrate data, e.g.IOTA,IoT Chain, orIoTexfor data from connected IoT devices, orStreamrfor streaming data.
- 2nd generation projects have been working oncreating data marketplaces, e.g.Ocean Protocol,SingularityNet, orFysicaland crowd data annotation platforms, e.g.GemsorDbrain.
- With solutions covering the first steps on this data value chain maturing, my friends@sherm8nand Rahul started to work onRaven Protocol, a first 3rd generation project that will close an important gap at the analyze stage:Compute resources for AI training.
A recentOpenAI reportshowed that the amount of compute used in the largest AI training runs has been increasing exponentially with a 3.5 month-doubling time, this is a300.000x increase since 2012.
OpenAI Report: AI andCompute
The immediate consequences from this are:
- Higher costs, as used compute is increasing faster than supply
- Longer lead times for new solution, as model training takes longer
- Increased market entry barriersbased on access to funding & resources
These consequences can proofdire for smaller firms and researchers, limiting their ability to create competitive models without significant funding. And even with funding, they might be blacklisted from resources if theyre considered competition by the providers.
So Gavin Belson can just pick up his phone and make us radioactive to every single web hosting service?
Butbig corporations will feel the costs as wellconsidering both the growth rate of resources and the growth rate of their AI efforts multiplied. I spoke to a few Chief Data Officers of Fortune 500 firms over the past months, and while they dont consider it an issue yet, even they have to agree that they can invest their money in better ways than in bought-in HPC resources.
The beauty ofblockchain ecosystemsis that they allow to tap into otherwise unused resources, trade resources that would not have been tradable, and allow people to participate in a market that otherwise could not participate. From an economic perspective, it improves the leverage on existing resources.
Where 1st & 2nd generation data blockchain solutions used this to lower the barrier for access to annotated quality data,Raven Protocol is going to solve the training cost challenge. It is closing the gap that would have prevented the proverbial chain to hold, that is only as strong as its weakest link (hint: its the data valuechain).
Together, the solutions in this blockchain data ecosystem create new opportunity and they lower cost. Especially the second is key asit lowers the entry barriers for new innovation, enabling more people to contribute and therefore potentially accelerating our progress as a society.
If this all sounds a bit abstract, just have a look atone area where AI can help: Healthcare. Our global healthcare system is in serious distress. Costs are exploding, they areexpected to increase by 117% over the next 10 yearsdespite already being atup to 18% of a countrys GDP. At the same time, research for new drugs ishitting a cliff.
Our healthcare systems needs lots of innovation to keep healthcare affordable, and there aremany opportunities where AI solution can help; for this reason,healthcare is the industry with the highest investments in AI,and has been for years.