In GDPR vs Blockchains - The Conflict, we talked about the key properties putting GDPR and blockchains at odds. We also talked about the limits
of GDPR enforceability and highlighted how Bitcoin never set out to undermine privacy. However, this is not the end of the story.
Attacking Privacy
The paper by Ferenc Beres et al., Blockchain is Watching You: Profiling and
Deanonymizing Ethereum Users (2020), helps us better appreciate the complexity of the privacy challenges we are dealing with.
Their study focuses on the deanonymization of Ethereum users, highlighting how account-based blockchains like Ethereum provide even weaker
privacy protection than Bitcoin-like blockchains. The fundamental difference is that in account-based blockchains users are often obliged
to reuse the same addresses, making them easier to track.
In their study, user information was collected from Twitter, the Ethereum Name Service (ENS) and the
defunct HumanityDAO. Addresses were also collected
from the transactions interacting with the Tornado Cash coin mixer.
ENS is a set of smart contracts allowing users to map a name to an Ethereum address. Just by looking up Twitter account names in ENS,
the study was already able to deanonymize a set of addresses.
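The lookup described above can be sketched in a few lines. This is only an illustration and not the method used in the paper: the registry, the profiles and the function `link_profiles_to_addresses` are hypothetical, hard-coded stand-ins for the ENS contracts and a Twitter crawl.

```python
# Hypothetical stand-ins for real data sources (assumptions for illustration).
# A real analysis would query the ENS contracts and crawl Twitter instead.
ENS_REGISTRY = {
    "alice.eth": "0x1111111111111111111111111111111111111111",
    "bob.eth": "0x2222222222222222222222222222222222222222",
}

TWITTER_PROFILES = [
    {"handle": "@carol", "display_name": "Carol"},
    {"handle": "@alice", "display_name": "alice.eth"},
    {"handle": "@bob_trades", "display_name": "Bob | bob.eth"},
]


def link_profiles_to_addresses(profiles, registry):
    """Return {handle: address} for profiles that advertise a registered ENS name."""
    linked = {}
    for profile in profiles:
        # Split the display name into words and look each one up in the registry.
        for token in profile["display_name"].lower().split():
            if token in registry:
                linked[profile["handle"]] = registry[token]
    return linked


links = link_profiles_to_addresses(TWITTER_PROFILES, ENS_REGISTRY)
for handle, address in links.items():
    print(handle, "->", address)
```

Neither data source reveals much on its own. It is the join between a public profile and a public name registry that ties a person to an address.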
For the target set of users, the research also collected the addresses of smart contracts with which they interacted. In turn, using the
Etherscan Label Word Cloud, they were able to identify the type of services
these users were accessing. Among other services, the research exposed access to gambling and adult services.
This shows that anyone attacking privacy can count on a wide array of data sources, both off-chain and on-chain. Some of
these services may well be GDPR compliant. Users give away bits of information that on their own may seem harmless. However, combined
with other data sources, the privacy-eroding effect increases significantly.
To top it off, these combinations blur the responsibilities of individual Data Controllers, creating further enforceability hurdles.
Effectively, even without blockchains, as soon as some degree of decentralization is introduced, we start testing the enforceability
limits of GDPR.
GDPR does cater for joint Data Controllers. However, this applies only when the controllers jointly determine "the purposes
and means of the processing of personal data". Data Controllers may be acting independently of each other and still open up opportunities
for obtaining private information by combining their data.
One can argue that in this scenario, anyone combining the data sources becomes an enforcement target. However, as already discussed,
applications can themselves be developed and distributed in a decentralized manner.
Blockchain as a Privacy Solution
Although the original privacy-preserving features proposed in Bitcoin turned out to be insufficient, the pursuit of its
privacy vision never stopped.
The paper Decentralizing Privacy: Using Blockchain to Protect
Personal Data (2015) is one example of a privacy-preserving solution. It protects user data by moving it off-chain. On the blockchain,
only a one-way hash is stored, from which no user data can be obtained.
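A minimal sketch of this idea follows. It does not reproduce the paper's protocol: `off_chain_store`, `on_chain_ledger` and the three helper functions are hypothetical names, with a dictionary and a list standing in for a private database and an append-only blockchain.

```python
import hashlib
import json

off_chain_store = {}  # stands in for a private database that can erase data
on_chain_ledger = []  # stands in for an append-only blockchain


def store_record(record):
    """Keep the record off-chain and publish only its SHA-256 digest."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    off_chain_store[digest] = payload
    on_chain_ledger.append(digest)
    return digest


def verify_record(digest):
    """Check that the off-chain data still matches the digest anchored on-chain."""
    payload = off_chain_store.get(digest)
    if payload is None:
        return False
    return digest in on_chain_ledger and hashlib.sha256(payload).hexdigest() == digest


def erase_record(digest):
    """Erase the off-chain data; the on-chain digest cannot be removed."""
    off_chain_store.pop(digest, None)


pointer = store_record({"name": "Alice", "email": "alice@example.org"})
print(verify_record(pointer))  # data present and matching
erase_record(pointer)
print(verify_record(pointer))  # data gone, digest remains on the ledger
```

After erasure the ledger still holds the digest, but the data it pointed to is gone.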
This application goes a step further, putting users in control of their data. They can choose which services are to be granted access
and can withdraw access whenever they want. In GDPR terms, users become the Data Controllers for their own data.
This solution does provide a different perspective from the one pitting blockchains and GDPR against each other. However, it also raises
important considerations. Here we are making a key trade-off. Moving data off-chain shifts it to storage that is easier to make
GDPR compliant. On the other hand, the move means that such data is no longer available to the smart contracts running on
the blockchain.
In this particular case, this is not an issue. The data could be moved since the solution did not really need it on-chain. However, any smart
contract application that requires such data access cannot apply the same approach. In a sense this solution leaves us none the wiser since it
mostly avoids the problem rather than solving it.
Another limitation of this solution is that service providers have the opportunity to store the data they retrieve whilst access
is granted. Just because users can revoke data access, it does not mean they truly have full control.
A mitigation is proposed that would limit service providers to submitting queries about the stored data, rather than directly accessing the
raw data. However, it still leaves the opportunity for service providers to store the responses to those queries. Plus, limiting the richness
of the information that can be retrieved also limits the functionality applications can provide.
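The following sketch illustrates both the mitigation and its limitation. The `PersonalDataVault` class, its `age_at_least` query and the service name `"shop"` are assumptions made for illustration and do not come from the paper.

```python
class PersonalDataVault:
    """Hypothetical off-chain vault; the owner decides which services may query it."""

    def __init__(self, owner_data):
        self._data = dict(owner_data)
        self._granted = set()

    def grant(self, service):
        self._granted.add(service)

    def revoke(self, service):
        self._granted.discard(service)

    def age_at_least(self, service, min_age, current_year):
        """Answer a narrow yes/no query without revealing the birth year."""
        if service not in self._granted:
            raise PermissionError(f"{service} has no access")
        return current_year - self._data["birth_year"] >= min_age


vault = PersonalDataVault({"birth_year": 1990})
vault.grant("shop")

# The service keeps its own copy of the answer it received.
shop_cache = {"age_ok": vault.age_at_least("shop", 18, current_year=2024)}

vault.revoke("shop")
# New queries now fail, but the cached answer stays with the shop.
print(shop_cache)
```

The shop never sees the birth year, so the query interface does reduce exposure. Revocation, however, only stops future queries. Whatever was answered before remains in the shop's hands.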
The bottom line here is that trade-offs are being made. Safeguarding privacy can be as elusive to the technical community as it is
to those enforcing GDPR.
On-Chain Privacy
In their paper, Qi Feng et al. (2019) provide a survey of the privacy-preserving technologies most in use. Here the technologies are grouped into
two categories: identity privacy and transaction privacy.
Identity privacy is largely concerned with protecting the identity of the sending and receiving parties. It also aims to obfuscate the link between
transacting parties, making it harder to apply graph analysis.
Transaction privacy is concerned with hiding the actual transaction content. In a cryptocurrency transfer this would be the amount being transacted
and possibly also metadata such as the transaction date. When smart contracts are involved, the dataset is extended to include the smart contract data.
The solution just discussed was concerned with transaction privacy. Its focus was to hide the user data that would have otherwise been stored within a
smart contract.
For identity privacy, the study identifies mixing services, ring signatures and non-interactive zero-knowledge (NIZK) proofs as the most frequently used.
For transaction privacy, the two main approaches identified are NIZK proofs and homomorphic cryptosystems.
Unlike the hashing approach, these technologies aim to achieve on-chain privacy. Applications are not denied data access outright. Thus, privacy
preserving applications are able to provide the same functionality as those lacking privacy. Data is still stored on-chain and public access is still
granted for everyone to verify. However, the data is stored in such a way that an observer cannot learn much beyond the fact that the underlying protocols
are being properly followed.
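To make the idea of verifiable yet hidden data more concrete, here is a toy sketch using an additively homomorphic, Pedersen-style commitment. This technique was picked purely for illustration and is not taken from the survey. The parameters `P`, `Q`, `G`, `H` and all the amounts are assumptions, and the group is far too small to be secure. Real systems use large groups and add further proofs, such as range proofs on the amounts.

```python
# Toy parameters (assumptions, NOT secure): P = 2*Q + 1 with P and Q prime.
# G and H are squares modulo P, so both generate the subgroup of order Q.
P = 1019
Q = 509
G = 4
H = 9


def commit(value, blinding):
    """Commit to `value`, hidden by the secret random `blinding` factor."""
    return (pow(G, value, P) * pow(H, blinding, P)) % P


def combine(commitments):
    """Multiplying commitments yields a commitment to the sum of their values."""
    result = 1
    for c in commitments:
        result = (result * c) % P
    return result


# The sender spends 30 + 12 and creates outputs of 40 + 2.
# Blinding factors are chosen so that both sides add up to the same total.
inputs = [commit(30, 101), commit(12, 202)]
outputs = [commit(40, 250), commit(2, 53)]

# An observer sees only the commitments, yet can check that the amounts balance.
balanced = combine(inputs) == combine(outputs)

# Outputs that create one extra coin fail the same check.
forged_outputs = [commit(41, 250), commit(2, 53)]
forged_balanced = combine(inputs) == combine(forged_outputs)

print(balanced, forged_balanced)
```

The observer never learns the individual amounts, only that the inputs and outputs add up.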
Today we already have blockchains that are widely considered to be very effective in protecting privacy. Public blockchains like Monero and Zcash allow
for transacting their cryptocurrency privately, hiding the transacting parties' identities and the transacted amounts.
In the DLT space, Ethereum is providing some primitives for supporting NIZK proofs. These are especially interesting because of their broad applicability.
This is unlike other technologies that are more focused on protecting specific classes of information. Today the biggest challenges for NIZK adoption are
the high computational requirements and, for some schemes, the need for a trusted setup.
Projects like the AZTEC protocol and zk-Rollups are maturing their transaction privacy solutions for Ethereum, providing
building blocks for other privacy-preserving solutions. Other interesting projects are those from Ernst & Young, Nightfall and Starlight. Even though
these are not production-ready, the involvement of a Big Four accounting firm is especially significant.
So here we see that there is also a very strong push for blockchain privacy.
From all the research and development in this space, it is easy to conclude that protecting privacy is a shared priority.
Summing it up
Today we focused on how blockchains are reacting and advancing their privacy-preserving safeguards. Indeed, we presented GDPR
lawmakers and blockchain technologists as united in their intent. However, this unity of intent does not hold for everyone. This and other challenges
will be discussed in the article concluding this series.