“Alexa, start a conversation,” she said.
We were immediately drawn into an experience with a new bot or, as the technologists would say, a “conversational user interface” (CUI). It was, we were told, the recent winner of an Amazon AI competition, built by students from the University of Washington.
At first, the experience was fun, but when we chose to explore a technology topic, the bot responded, “Have you heard of net neutrality?” What we experienced thereafter was slightly discomforting. The bot, seemingly innocuously, cited a number of articles that she “had read on the web” about the FCC, Ajit Pai, and the issue of net neutrality. But here’s the thing: All four articles she recommended had a distinct and clear anti-Ajit Pai bias.
But the experience of the Alexa CUI should give you pause, as it did me. To someone with limited familiarity with the topic of net neutrality, the voice seemed soothing and the information unbiased. But if you have a familiarity with the topic, you might start to wonder, “wait … am I being manipulated on this topic by an Amazon-owned AI engine to help the company achieve its own policy objectives?”
The experience highlights some of the risks of the AI-powered future into which we are hurtling at warp speed.
And it’s a reminder that big companies, such as Amazon, have traditionally had big advantages when it comes to big data and AI.
The trust problem with centralized big data
According to Trent McConaghy, CTO of BigchainDB, AI took a huge evolutionary step forward in 2001. That was when two Microsoft researchers, Banko and Brill, demonstrated something that now seems obvious to all of us: as the data set you’re analyzing grows by orders of magnitude, error rates fall.
The era of Big Data was officially upon us and the race was on.
But if the race is about gathering, storing, and analyzing as much data as possible, then who is in the pole position to win? That’s right, the FANGs in the U.S. (Facebook, Amazon, Netflix, Google), the BATs in China (Baidu, Alibaba, Tencent), and the wealthy Fortune 1000 multinational corporations.
They are the only ones with the reach and capital to get more data, store it, analyze it, and build AI models on top of it. What’s more, they are the only ones who can offer starting salaries in the $300,000 to $500,000 range and top-tier salaries that extend into seven and eight figures. Your son or daughter may not make it to the NBA or NFL, but become a top AI scientist and you’re doing great.
The net effect of all of this is that the rich become even richer and more powerful and the barriers to innovation become even higher.
It is not only innovation that suffers, however. The closed nature of big-company AI means society must put its trust in “black boxes.”
Let’s look at how AI works to help make this clear. There are three essential layers:
- The data repository
- The algorithm/machine learning engine
- The AI interface
If you are going to trust your decision-making to a centralized AI source, you need to have 100 percent confidence in:
- The integrity and security of the data (are the inputs accurate and reliable, and can they be manipulated or stolen?)
- The machine learning algorithms that inform the AI (are they prone to excessive error or bias, and can they be inspected?)
- The AI’s interface (does it reliably represent the output of the AI and effectively capture new data?)
In a centralized, closed model of AI, you are asked to implicitly trust in each layer without knowing what is going on behind the curtains.
For a simple conversation with a nine-year-old, this may not be the end of the world. But for certain African-American criminal defendants, the implications can be life-altering: According to both the New York Times and Wired, a proprietary machine-learning system called COMPAS, used by courts in many parts of the U.S., recommends longer prison sentences for black defendants than for white ones, all other data points being equal.
In effect, the AI makes racially biased decisions, but no one can inspect it, and the company that makes it will not explain it. It is closed, it is hidden, and models like these are in the hands of big, powerful companies that have no incentive to share them or reveal how they work.
How blockchains level the playing field and add trust
Over time, more and more data will flow into blockchains, and that will reduce the big data advantage that the FANGs, BATs, and Fortune 1000 have over the little guys.
As Deepak Dutt, CEO of AI-based identity proofing company Zighra says, “When data is commoditized, AI algorithms become the most valuable part of the ecosystem.” In other words, we’ll see a power shift from those who own big sets of data to those who build smart, useful algorithms.
That’s great, but if we’re moving data to blockchains, some big, thorny questions still exist. For example:
- Where does the data go?
- How is it discovered and utilized?
- Why would people put their data in there?
- And don’t the “big guys” still have a huge advantage in terms of building powerful AI?
Welcome to the world of Blockchain+AI.
3 blockchain projects tackling decentralized data and AI
A number of projects have popped up to reward people through cryptographic tokens for making their data available through a decentralized marketplace. The result could be ever-more accurate AI models and the ability to create valuable conversational user interfaces, all with the trust and transparency that blockchains offer.
We are going to look at three of them.
1. Ocean Protocol. On the repository level, the Ocean Protocol aims to create a “decentralized data exchange protocol and network that incentivizes the publishing of data for use in the training of artificial intelligence models.” Put more simply, if you upload valuable data to the Ocean network and your data is used by someone else to train an AI model, you are compensated.
Let’s take one of my favorite examples, my Nest thermostat. Right now, data is uploaded constantly from my thermostat to Google. With data from me and all other Nest owners, Google has a really strong data set it can use to build AI services that could, for example, know when to send an offer of insulation or new windows for my house.
That data, which is mine (and yours), has value, but Google currently gets it for free.
What if, however, an enterprising home automation AI scientist (let’s call her Alice) believes she can build a better model than Google can?
In the Ocean model, Alice would license your data (and the millions of other data points out there) and compensate you with some amount of Ocean tokens.
Now think even bigger …
All of that data you are giving away for free (Nest, Fitbit, Hue lights, Ring doorbell, and every other IoT device out there) now has:
- data integrity (everyone knows the source of the data)
- clear ownership (you)
- and, thanks to cryptocurrencies and blockchains, a cost-effective way to buy and/or lease it.
You’re happy, since you’ll be getting compensated for something you’re currently giving away for free. Alice is happy, since she (eventually) will have access to the same dataset that Google has. Boom — playing field leveled, thanks to an open data marketplace. And we’re all safer from bias and error because the AI built on this data comes with more transparency, since the data sets that inform the models are known.
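The flow described above can be sketched in a few lines of code. This is a minimal, purely illustrative model of an Ocean-style data marketplace; the class and method names (`DataMarketplace`, `publish`, `license`) are my own inventions, not the real Ocean Protocol API, and the token amounts are arbitrary.

```python
# Hypothetical sketch of an Ocean-style data marketplace ledger.
# All names and numbers are illustrative, not the real Ocean Protocol API.

class DataMarketplace:
    def __init__(self):
        self.datasets = {}   # dataset_id -> owner (clear, verifiable ownership)
        self.balances = {}   # owner -> token balance

    def publish(self, dataset_id, owner):
        """Register a dataset with a clear owner (data integrity + ownership)."""
        self.datasets[dataset_id] = owner
        self.balances.setdefault(owner, 0)

    def license(self, dataset_id, price_in_tokens):
        """A model builder licenses a dataset; the owner is paid in tokens."""
        owner = self.datasets[dataset_id]
        self.balances[owner] += price_in_tokens
        return dataset_id  # the licensee can now train against this data


market = DataMarketplace()
market.publish("nest-thermostat-readings", owner="you")

# Alice licenses your data to train her home-automation model.
market.license("nest-thermostat-readings", price_in_tokens=10)
print(market.balances["you"])  # 10 -- tokens earned for data you used to give away
```

The point of the sketch is the incentive structure, not the mechanics: because ownership is recorded at publish time, every downstream use of the data can route compensation back to you.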
Another notable player in this space is IOTA, which has already launched its data marketplace.
2. SingularityNet. Now, let’s say Alice has really cracked the code on a powerful AI algorithm that could help marketers, government officials, or environmentalists understand how weather patterns affect energy consumption. That’s where SingularityNet comes in, focusing on the AI level.
SingularityNet, whose strong leadership team includes AI pioneers Ben Goertzel and David Hanson, aims to be the first blockchain-based AI-as-a-service (AIaaS) marketplace. In its world, Alice offers up her model (for sale or rent) to others for use against their own data sets. Thanks to a standardized AI taxonomy, a search engine helps users discover Alice’s model and rapidly integrate it with complementary models, creating even more powerful and better-trained models.
Coming back to our Nest example, let’s say that Alice’s model is built to study the home energy market in New York City. Combine that with models for Newark, Stamford, and Long Island, and you can start getting even better insights about tri-state area consumption.
Since ownership of the model is clear (it belongs to Alice), her intellectual property is protected. Every time her model is used, she is compensated in SingularityNet’s AGI tokens (AGI being the acronym for Artificial General Intelligence). Now you have the data sets that the big guys have AND access to the AI models they have as well.
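The compensation mechanics can be made concrete with a small sketch. Everything here is hypothetical: the `register` wrapper, the fee amounts, and the toy forecasting functions are illustrative stand-ins, not SingularityNet’s actual protocol.

```python
# Hypothetical sketch of SingularityNet-style metered model calls, with
# per-use royalties paid in AGI tokens. All names and fees are illustrative.

agi_balances = {}  # model owner -> AGI tokens earned

def register(owner, fee, fn):
    """Wrap a model so every call credits its owner a fee in AGI tokens."""
    def metered(*args):
        agi_balances[owner] = agi_balances.get(owner, 0) + fee
        return fn(*args)
    return metered

# Alice's NYC model, composed with a complementary regional model;
# each owner is compensated every time their model is invoked.
nyc_model = register("alice", fee=2, fn=lambda kwh: kwh * 1.10)
newark_model = register("bob", fee=1, fn=lambda kwh: kwh * 0.95)

def tristate_forecast(kwh):
    """A composite model built from the two regional models."""
    return (nyc_model(kwh) + newark_model(kwh)) / 2

tristate_forecast(100)
print(agi_balances)  # {'alice': 2, 'bob': 1}
```

The design choice worth noticing: because payment happens at the call site, composite models automatically compensate every contributor in the chain, which is what protects Alice’s intellectual property without locking her model away.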
For those of you familiar with the crypto space, the project will sound a lot like Numerai, albeit with a broader focus than Numerai’s hedge-fund disintermediation objective.
A successful rollout of the more broadly focused SingularityNet could have dramatic implications for every industry. It should lead to an arms race in AI models among industry competitors and will likely reshape the skill sets required for the jobs of the future.
3. SEED. Finally, at the interface level comes SEED, a project that is looking to give us all confidence that we can actually trust the bots in our lives.
According to SEED, “The bot market is estimated to grow from $3 billion to $20 billion by 2021,” a projection that means interactions like the one my daughter and I had with Alexa will become much more common and potentially more risky. After all, even if you completely trust Amazon, there’s still the possibility the bot you are interfacing with has been hijacked.
The solution for this is the combination of the SEED Network, the SEED Network Marketplace, and the Seed Token.
The SEED Network is an open-source, decentralized network where any and all bot interactions can be managed, viewed, and verified. It is also the framework for ensuring that the data fed into the AI via the conversational user interface, aka the “bot,” can be assigned a data owner who can be compensated for it.
The Marketplace is the way aspiring bot creators, like AI model creators, can sell and license the various components they have built to others who need the services. While the University of Washington students who built the winning AI for Amazon were probably thrilled with their $500,000 check, they would probably be more thrilled to get a small royalty on every interaction their CUI has with Alexa’s users in perpetuity.
Finally, the SEED token is the mechanism through which bot creators and data owners (you and I) are compensated for the value created inside the network.
To round it out, let’s come back to Alice. She has not only built an AI for home energy use, she has built a bot that will periodically ask you, “Hey, are you feeling hot or cold in your house right now?” When you answer, you are feeding data into the AI and into the AI repository. That’s your data. Why shouldn’t you be compensated for it? After all, it makes the AI better and enriches the data repository. SEED says you should, and it secures your asset rights in the blockchain.
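Alice’s bot loop can be sketched the same way. This is a purely illustrative model of a SEED-style flow; the function names, the repository structure, and the one-token reward are my own assumptions, not the SEED API.

```python
# Hypothetical sketch of a SEED-style flow: a bot's question feeds the
# owner's answer into the data repository with provenance attached, and
# the owner is credited in tokens. All names are illustrative, not SEED's API.

repository = []      # the AI's training data, each record with a clear owner
seed_balances = {}   # data owner -> SEED tokens earned

def record_answer(owner, question, answer, reward=1):
    """Store the answer with its owner attached, and credit the owner."""
    repository.append({"owner": owner, "question": question, "answer": answer})
    seed_balances[owner] = seed_balances.get(owner, 0) + reward

# Alice's bot asks; your answer enriches her model -- and pays you.
record_answer("you", "Are you feeling hot or cold in your house right now?", "cold")
print(seed_balances["you"])    # 1 -- a token for improving Alice's model
print(repository[0]["owner"])  # you -- the data's provenance is preserved
```

Because ownership travels with each record into the repository, the same data that makes the AI better is also an asset you can keep being paid for.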
When all is said and done, SEED will offer you better protection for the data you offer and greater confidence in the authenticity and reputation of the bot with which you are interacting.
The promise of blockchain-based AI
Blockchain-based AI projects are still in very early development, and the big data kings have a huge advantage, but so did the Atlanta Falcons at halftime of last year’s Super Bowl.
As blockchains drive into the mainstream, we will see more and more data hitting decentralized marketplaces and exchanges. As people realize the value their personal data has, along with the opportunities to monetize it, and as networks like SEED, SingularityNet, and Ocean mature, we will see a tipping point in the evolution of big data, moving from a closed, siloed phenomenon to open systems where the creators of data are more fairly rewarded for their contributions.
It is too early to tell which protocols will be the winners and whether these three first movers I’ve pointed to will remain in the lead or lose out to the next wave of fast-followers.
The only clear thing is that the winners will be the developers and consumers whose data and intellectual property will be rewarded and whose experiences will be protected from bad or manipulative actors by open, transparent systems.