DEV Community

Manmit Singh

Accountable Privacy in Web3 (1/4)

Today, over 5 billion people around the world have access to the internet. While most online activity takes place on search engines and social media applications, other applications related to banking, health, and productivity also see considerable usage.

It has been estimated that users on popular apps and websites create around 2.5 quintillion bytes of data every day. This treasure trove of information, ranging from someone’s address to their bank balance and fashion sense, is then largely owned and managed by the apps themselves.

Data Privacy 101

This pervasive collection of personal data, and its subsequent monetization by businesses, has been recognized as one of the main challenges of information technology in the 21st century. In fact, governing bodies like the European Union (EU) have even enacted laws like GDPR to give users some semblance of data ownership.

But you’ve probably heard this before. And not cared too much about it. That’s fine.

Believe it or not, apps storing and monetizing user data isn’t the end of the world. This is for a few important reasons:

  • The average internet user is neither equipped nor inclined to manage the data they create, as doing so would introduce friction into their seamless digital experience.
  • The individual data point isn’t worth much by itself, since statistical algorithms can only derive inferences from larger datasets. Users are able to enjoy free online applications by letting companies aggregate, analyze and monetize their data.
  • Companies are able to keep user data safe and confidential by operating at economies of scale (their marginal cost of setting up hardware, maintaining it, and developing safety protocols for each additional byte of data is negligible).

But while it may be somewhat acceptable to take a laissez-faire approach to personal data in Web2, that certainly isn’t the case in Web3. Why?

The Case For Web3

The core innovation behind blockchains like Bitcoin and Ethereum is that they solve the issue of ‘double spend’ in trustless digital networks. To learn more about this, take a look here (or read the Bitcoin White Paper). Blockchains do so by posting all the information regarding any cryptocurrency transaction on a public, append-only ledger. This includes things like -

  • The sender’s address in a transaction
  • The recipient’s address in a transaction
  • The amount of the transaction, and its metadata
  • All balances and past activities of both the sender and receiver
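
The bullets above can be made concrete with a minimal sketch of an append-only public ledger. This is purely illustrative: the field names and `post_transaction` helper are invented for this example and are not any real chain's schema.

```python
# Minimal sketch of an append-only public ledger; field names are
# illustrative, not any real blockchain's data model.

ledger = []  # anyone can read this list; entries are never mutated or removed

def post_transaction(sender, recipient, amount, memo=""):
    ledger.append({
        "sender": sender,        # sender's address: public
        "recipient": recipient,  # recipient's address: public
        "amount": amount,        # amount and metadata: public
        "memo": memo,
    })

post_transaction("0xAlice", "0xBob", 5, memo="ice cream")

# Any observer can replay a party's entire history from the ledger:
alice_history = [tx for tx in ledger
                 if "0xAlice" in (tx["sender"], tx["recipient"])]
print(alice_history)
```

The point is that "reading the ledger" requires no permission at all; every balance and past interaction is reconstructible by anyone who cares to look.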

Moreover, recent studies have shown that it is possible to link a blockchain address to an IP address, even if it is behind a firewall. It is also possible to link multiple blockchain addresses to each other based on common ownership.
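
One well-known linking technique is the common-input-ownership heuristic: addresses that co-sign inputs of the same transaction are assumed to share an owner. Below is a simplified sketch of that idea using union-find; real chain-analysis tools combine many more heuristics, and the function names here are illustrative.

```python
# Simplified sketch of the common-input-ownership heuristic.
# Each transaction is modeled as just a list of its input addresses.

def cluster_addresses(transactions):
    """Group addresses that ever co-spend inputs in the same transaction."""
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    def union(a, b):
        parent[find(a)] = find(b)

    for inputs in transactions:
        if not inputs:
            continue
        find(inputs[0])  # register even single-input transactions
        for addr in inputs[1:]:
            union(inputs[0], addr)  # co-spent inputs share an owner

    clusters = {}
    for addr in parent:
        clusters.setdefault(find(addr), set()).add(addr)
    return list(clusters.values())

txs = [["addr1", "addr2"], ["addr2", "addr3"], ["addr4"]]
print(cluster_addresses(txs))  # addr1–addr3 collapse into one cluster
```

Note how transitivity does the damage: addr1 and addr3 never appear in the same transaction, yet they end up in one cluster via addr2.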

Still don’t see the problem? Try this.

We’ve all bought an ice cream or hot dog with cash. Imagine handing a $5 bill to an ice cream vendor at a packed stadium full of strangers and loudly announcing that you, W (known pseudonymously as X), holding Y dollars in your account at Z bank, are paying 5 dollars for a chocolate ice cream. The vendor then announces everything about his own personal and financial life to the crowd in return. Yeah … people don’t do that for a reason.

Without corporate paywalls and data privacy mechanisms, blockchains like Bitcoin can expose their users (individuals/businesses) to a host of issues, including the following -

  • Leaking sensitive information: While it’s perfectly valid for governments to access user data to identify malicious actors, this argument cuts both ways in Web3. It is just as likely for a malicious private actor to surveil a user, link their real-world identity to their on-chain activity, and gain access to their sensitive personal data.
  • Revealing business strategies: Finance is built on the advantages of information asymmetry. Businesses executing strategies, particularly in DeFi, may wish to prevent their competitors from analyzing their wallets and recognizing their strategies so easily.
  • Opening up value-extraction attacks: Since most blockchain transactions (especially individual user transactions) are collected through a public procedure, a malicious actor may be able to front-run a user’s transactions, censor them, or force them to pay higher gas fees to get their transaction included in a block (i.e. an MEV attack).
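
The front-running bullet can be sketched with a toy model. Assume, for illustration only, that a block builder simply orders pending transactions by gas price; then anyone watching the public mempool can jump ahead of a victim's trade by paying slightly more. Real MEV extraction is far more sophisticated, and the names below are invented for this example.

```python
# Toy model of gas-price front-running, not any real client's logic.
# The "block builder" here just orders pending transactions by fee.

def build_block(mempool):
    """Order pending transactions by gas price, highest first."""
    return sorted(mempool, key=lambda tx: tx["gas_price"], reverse=True)

victim = {"sender": "user", "action": "buy TOKEN", "gas_price": 20}
mempool = [victim]

# The attacker watches the public mempool, copies the trade, and pays more gas
attacker = {"sender": "bot", "action": "buy TOKEN", "gas_price": 21}
mempool.append(attacker)

block = build_block(mempool)
print([tx["sender"] for tx in block])  # ['bot', 'user'] – the bot buys first
```

Because the victim's intent is visible before it executes, the attacker buys first and can profit from the price impact of the victim's own trade.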

These potential issues have contributed, at least in part, to the lack of institutional adoption of Web3. They have also deprived the space of richer applications in areas like healthcare and identity. So one thing’s clear: blockchains need privacy.

Figure 1: Overview of the Privacy Spectrum

Ideally, the ‘sweet spots’ (there may be more than one) of Web3 privacy lie somewhere between no privacy and complete anonymity. These design decisions can be informed by technological feasibility, user behaviour and very importantly, governmental compliance. Let’s name this Accountable Privacy.

AI x Web3

Recent advances in large language models like OpenAI’s ChatGPT, AI-powered search engines like Perplexity, and intelligent agents (e.g. Eliza) have completely revolutionized the way people use the web, and blockchains are no different. The last 5 years have seen a Cambrian explosion in AI-driven crypto, with AI agents providing DeFi recommendations, helping users manage accounts, and more.

And if you thought humans reading your blockchain data was bad, AI agents make it worse. Blockchain protocols are, by design, permissionless and open to “anyone” with a set of public and private keys. This means AI agents can participate in blockchain protocols alongside their human counterparts.

The growing pervasiveness of AI agents on-chain creates an entirely new set of concerns, including but not limited to the following:

  • Data Training and Leakage: AI agents can scrape and analyze on-chain data, learning from transaction histories, wallet activity, and user behaviors. This means sensitive on-chain actions can be used to train models without a user’s consent. Worse, models have been known to leak snippets of their training data, creating privacy risks.
  • Enterprise AI and Privacy: Advanced AI models interacting with blockchain protocols risk exposing their logic and proprietary data. While users need to trust that these models execute correctly, full transparency could compromise intellectual property. This calls for off-chain Zero-Knowledge (ZK) verification, allowing AI models to prove correct execution without revealing their inner workings.
  • Private Inferences, Public AI: The most advanced AI models will likely remain off-chain, operated by private companies due to high costs and complexity. Yet users may still want to query these models for insights without disclosing personal or financial data. This creates demand for privacy-preserving AI inference, where users can leverage powerful models without compromising their own privacy.
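
A full ZK proof system is well beyond a blog snippet, but the simpler building block of "hide now, verify later" can be shown with a hash commitment. This toy sketch, with invented function names, only illustrates committing to a value (say, a model's output) without disclosing it upfront; real ZK systems go much further and let a prover verify statements without ever revealing the committed data.

```python
# Toy hash commitment illustrating the "hide now, verify later" idea.
# This is NOT a zero-knowledge proof: opening the commitment reveals
# the value. Real ZK systems avoid that reveal entirely.
import hashlib
import secrets

def commit(value: str):
    """Publish a digest binding us to `value` without revealing it."""
    nonce = secrets.token_bytes(16)  # blinds low-entropy values from guessing
    digest = hashlib.sha256(nonce + value.encode()).hexdigest()
    return digest, nonce  # digest goes public; nonce stays private

def verify(digest: str, value: str, nonce: bytes) -> bool:
    """Check that an opened (value, nonce) pair matches the public digest."""
    return hashlib.sha256(nonce + value.encode()).hexdigest() == digest

digest, nonce = commit("model v2 predicted: APPROVE")
# ...later, the model operator opens the commitment to an auditor:
assert verify(digest, "model v2 predicted: APPROVE", nonce)
assert not verify(digest, "model v2 predicted: DENY", nonce)
```

The commitment is binding (the operator cannot later swap in a different output) and hiding (observers learn nothing until it is opened), which is the flavor of guarantee the bullets above are reaching for.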

As AI agents integrate into blockchain ecosystems, privacy is no longer optional. Users interacting with AI-driven systems shouldn’t have to trade privacy for access, or gamble with sensitive personal information. They should still be able to share their on-chain activity with someone training an AI model if they choose, but never without their express consent.

Conclusion

As we discovered, designing accountable privacy for blockchains is an important hurdle to cross in an increasingly AI-driven Web3 industry. Clearing it would allow individuals, organizations, and even AI agents to engage with these networks securely without sacrificing trust, compliance, or usability.

In the next article, we’ll explore the solution space, from zero-knowledge proofs to selective disclosure mechanisms. The future of Web3 depends on doing this right.
