Data Privacy Blindspots in Software Development: A 2022 Guide
2022 brought an abundance of fines for violations of data privacy regulations. However, for privacy and security teams, and for organizations as a whole, neither the number nor the size of the fines should be the focus of attention. The themes behind the fines discussed below highlight the data privacy blind spots organizations most often miss.
Data Mapping is Still a Challenge
Data Mapping came to the forefront in 2016 with GDPR and in 2018 with CCPA. Armed with Excel sheets, surveys, and privacy tools, companies built data maps. With all the people, processes, and money invested, you would expect the problem to be solved. But six years in, companies like Meta & Twitter still call it an unsolved problem.
When Twitter whistleblower Peiter "Mudge" Zatko testified before Congress, he explained that no one at the company understood how much data Twitter collects or how it should be used.
A leaked document from Meta went into detail explaining the pains of incomplete data mapping:
"We do not have an adequate level of control and explainability over how our systems use data, and thus we can't confidently make controlled policy changes or external commitments such as 'we will not use X data for Y purpose.'"
Create Dynamic Data Maps with Privacy Code Scanning
The solution to this problem lies in the code of the products & apps developers are building. Code is the source of truth for what data is collected, where it is stored, and who it is shared with. Running a privacy code scan across all products & apps creates an accurate data map that changes as the code changes and gives companies governance over the use of data.
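To make the idea concrete, here is a minimal sketch of privacy code scanning, assuming a Node.js codebase. The pattern table and data element names are illustrative, not an exhaustive taxonomy, and a production scanner would parse the syntax tree rather than pattern-match the raw source:

```typescript
// Minimal sketch: scan source files for known data-collection patterns
// and build a data map of personal data element -> files that touch it.
import * as fs from "fs";
import * as path from "path";

// Hypothetical mapping of code patterns to the personal data they signal.
const DATA_PATTERNS: Record<string, RegExp> = {
  location: /getCurrentPosition|watchPosition/,
  email: /\bemail\b/i,
  phone_number: /\bphone(Number)?\b/i,
};

// Walk the repo and record which files reference which data elements.
function scan(dir: string, dataMap: Map<string, string[]> = new Map()): Map<string, string[]> {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      scan(fullPath, dataMap);
    } else if (/\.(ts|js)$/.test(entry.name)) {
      const source = fs.readFileSync(fullPath, "utf8");
      for (const [element, pattern] of Object.entries(DATA_PATTERNS)) {
        if (pattern.test(source)) {
          dataMap.set(element, [...(dataMap.get(element) ?? []), fullPath]);
        }
      }
    }
  }
  return dataMap;
}

console.log(scan("./src"));
```

Because the scan reads the code itself, re-running it on every commit keeps the data map current without surveys or spreadsheets.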
Location Data Violations
Regulators across the US & EU tightened the rules around the collection & sharing of location data, classifying & treating it as sensitive personal data.
In the US, the Federal Trade Commission (FTC) provided guidance on the illegal use & sharing of location data. The guidance focused on preventing companies from sharing location data with ad brokers, where it could lead to consumer harm.
In the EU, France's CNIL fined a vehicle rental company for continuously collecting users' precise locations. Although location data is not special category data under GDPR, it is still classified as sensitive personal data that could harm individuals and therefore requires special protection.
Some high-profile fines for this violation include:
- Google reached the largest privacy settlement to date for collecting location data even when users had paused Location History.
- UBEEQO, a vehicle rental company, was fined for the continuous collection of individuals' precise locations.
- The FTC sued Kochava for selling user location data to advertisers and third parties.
Govern the Use & Sharing of Location Data
- Data Collection: Audit your mobile applications and web apps to check whether you are collecting location data. Minimize data collection by requesting coarse location permission instead of precise location permission (see the sketch after this list).
- Data Sharing: Once you know the collection points of location data, follow where that data goes. Be extra careful with location data flows if you monetize your mobile apps and embed third-party SDKs for ads or data brokers.
- Use of Data or Data Processing: Once location data lands in your company's data stores, it can be used for purposes incompatible with the purposes it was collected for. A simple case: a micro-service built for personalized ads could take that data set and share it with ad partners.
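As an example of the first point, here is a sketch for a browser-based web app; the rounding granularity is an illustrative choice, and native mobile apps would make the equivalent choice via their platform's coarse-location permission:

```typescript
// Sketch: preferring coarse location in a browser app.
// enableHighAccuracy: false asks the platform for a cheaper, less precise fix
// (e.g. Wi-Fi/cell based) instead of GPS-level coordinates.
function requestCoarseLocation(onFix: (lat: number, lon: number) => void): void {
  navigator.geolocation.getCurrentPosition(
    (position) => {
      // Further reduce precision before storing: round to roughly 1 km granularity.
      const lat = Math.round(position.coords.latitude * 100) / 100;
      const lon = Math.round(position.coords.longitude * 100) / 100;
      onFix(lat, lon);
    },
    (error) => console.warn("Location unavailable:", error.message),
    { enableHighAccuracy: false, maximumAge: 600_000 } // accept a cached fix up to 10 min old
  );
}
```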
You can automate all of these checks with a privacy code scanner that gives you full visibility into location data collection, sharing & use.
Tracking Pixel Data Breaches
The use of tracking pixels in client portals and other web properties that handle patient or customer data has led dozens of health organizations to disclose leaks of patient information, and it has resulted in new guidance from HHS. This type of leak is expected to affect at least one-third of the top 100 hospitals in the US, which have been found sharing data with the Meta pixel.
The issue also affects financial organizations and tax preparers. Despite how recently the issue came to light, new guidance has already been issued and lawsuits have been filed.
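One way to catch this class of leak is to audit rendered pages for known tracker hosts before they ship. This is a minimal sketch; the tracker host list and the route patterns for sensitive pages are illustrative assumptions:

```typescript
// Sketch: flag known tracking-pixel hosts on pages that handle
// patient or customer data.
const TRACKER_HOSTS = ["connect.facebook.net", "www.facebook.com/tr"];
const SENSITIVE_ROUTES = [/\/portal\//, /\/appointments\//, /\/billing\//];

function auditPage(route: string, html: string): string[] {
  if (!SENSITIVE_ROUTES.some((r) => r.test(route))) return [];
  return TRACKER_HOSTS.filter((host) => html.includes(host))
    .map((host) => `Tracking pixel (${host}) found on sensitive route ${route}`);
}

// Example: a patient portal page that embeds the Meta pixel loader script.
console.log(auditPage(
  "/portal/appointments",
  '<script src="https://connect.facebook.net/en_US/fbevents.js"></script>'
));
```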
Purpose Limitation
While collecting personal data, companies inform users of the purposes for which the data will be used. However, it is easy to drift away from the original purpose and use data in ways users never imagined. Privacy laws like GDPR impose specific purpose-limitation obligations on companies to check this drift. In the US, the FTC holds companies to the promises about purposes and data use made in their privacy notices. The FTC's recent $150 million fine against Twitter, for taking phone numbers collected for account security and using them instead for targeted ads, is a perfect example of an enforcement crackdown on misuse of data.
Purpose Limitation Guardrails
Companies usually land at one of two extremes with purpose limitation: do nothing, which leads to a hefty fine, or be overly cautious with a tight approval process, in which case you never get the advantages of your data. In our experience, even companies that try to be overly cautious can't reliably cover all possible cases of purpose drift.
Purposes are tied to data processing, and in most companies a large part of that processing happens in micro-services. You can implement purpose limitation by first mapping each micro-service's purposes along with the data it processes. Once you have that mapping, you have a purpose graph for each data type that can be used to remove any incompatible use of data.
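Here is a minimal sketch of that purpose graph, assuming each service declares its purpose and the data types it processes; the service names, purposes, and data types are illustrative:

```typescript
// Sketch of a purpose graph: services declare the data types they process and
// the purpose they serve; collection purposes bound what is allowed.
type DataType = "phone_number" | "location" | "email";

interface ServicePurpose {
  service: string;
  purpose: string;
  dataTypes: DataType[];
}

// Purposes users consented to at collection time (illustrative).
const COLLECTION_PURPOSES: Record<DataType, string[]> = {
  phone_number: ["account_security"],
  location: ["ride_dispatch"],
  email: ["account_security", "transactional_email"],
};

const services: ServicePurpose[] = [
  { service: "auth-service", purpose: "account_security", dataTypes: ["phone_number", "email"] },
  { service: "ads-service", purpose: "personalized_ads", dataTypes: ["phone_number"] }, // drift!
];

// Flag any service using a data type for a purpose it was not collected for.
for (const s of services) {
  for (const dt of s.dataTypes) {
    if (!COLLECTION_PURPOSES[dt].includes(s.purpose)) {
      console.warn(`${s.service} uses ${dt} for "${s.purpose}", not a collection purpose`);
    }
  }
}
```

Note that the flagged ads-service mirrors the Twitter case above: phone numbers collected for account security flowing into an ads purpose.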
For continuous compliance, detect new micro-services and the use of new personal data types in existing services. Privacy Code Scanning automatically builds the data map and purpose graph for your micro-services & monitors new code changes to prevent purpose drift.
Privacy by Design
Data Protection by Design & Default is one of the key obligations under GDPR. In 2022, we saw fines focused on features built without Privacy by Design:
- Discord: Clicking the cross button in a Discord voice room minimized the room instead of exiting it. Default privacy controls & expectations suggest that clicking a cross button should disconnect the user, so the feature violated Privacy by Design.
- Clubhouse: The Italian regulator fined the Clubhouse app for GDPR violations covering transparency, consent, and DPIA requirements. These violations trace back to missing Privacy by Design in software development.
- Meta [Facebook]: The Irish regulator handed Meta one of the biggest GDPR fines for a data breach caused by the scraping of Meta users' data. Interestingly, the breach and fine pertain to the 'Add Member' feature, which was built in a way that allowed data scraping.
Failing to address privacy at the design stage of product development is a common theme across these and other fines. Even with the best intentions, this is a huge challenge because the surface area of software development is vast and continuously changing. Privacy & security teams can't be in every room, and that leads to violations of Privacy by Design.
Continuous Privacy by Design
A good starting point for Privacy by Design is to run privacy reviews at the design stage of every new feature. Nishant Bhajaria, in his book Data Privacy: A Runbook for Engineers, explains how to run a Technical Privacy Review alongside a Legal Privacy Review for new features. CNIL also publishes a great guide for developers.
Privacy reviews are a great starting point, but they cannot keep up with the speed of development. In the agile world, developers are expected to ship great products at lightning speed; combine that with distributed architectures (micro-services, data pipelines, etc.), and it becomes impossible to cover every change with privacy reviews alone.
To scale Privacy by Design & make it continuous, you can use a privacy code scanner to detect, among thousands of changes, the ones that break privacy. This frees both development & privacy teams from reviewing every change and lets them focus on the ones that matter. Another solution is to use Policy as Code to describe the rules for allowed uses of data and then have Privacy Code Scanners enforce those rules in the software development lifecycle.
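A minimal sketch of what that Policy as Code enforcement could look like in CI, assuming the scanner emits detected data flows per pull request; the rule format, sink names, and flow shape are illustrative assumptions rather than any particular scanner's API:

```typescript
// Sketch of Policy as Code: rules describe disallowed data flows, and a CI step
// checks them against the flows a privacy code scanner reports for each PR.
interface PolicyRule {
  id: string;
  dataType: string;
  disallowedSink: string; // e.g. a third-party SDK or log sink
}

interface DetectedFlow {
  dataType: string;
  sink: string;
  file: string;
}

const policies: PolicyRule[] = [
  { id: "no-location-to-ad-sdks", dataType: "location", disallowedSink: "ad_sdk" },
  { id: "no-email-in-logs", dataType: "email", disallowedSink: "logger" },
];

// Fail the build if any reported flow violates a rule.
function enforce(flows: DetectedFlow[]): boolean {
  const violations = flows.filter((f) =>
    policies.some((p) => p.dataType === f.dataType && p.disallowedSink === f.sink)
  );
  violations.forEach((v) =>
    console.error(`Policy violation: ${v.dataType} -> ${v.sink} in ${v.file}`)
  );
  return violations.length === 0;
}

// Example scanner finding for a pull request (illustrative).
process.exit(enforce([{ dataType: "location", sink: "ad_sdk", file: "src/ads.ts" }]) ? 0 : 1);
```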
Crystal Gazing into 2023
As we go into 2023, the trend of privacy eating into software development will only accelerate. Five new state privacy laws take effect in the US, India's PDP Bill is in progress, and the FTC is conducting rulemaking on privacy & data security. Sensitive data sharing is under the spotlight. These privacy tailwinds will force companies to embed privacy into the software development lifecycle while still shipping at the same velocity as before.
Privacy Code Scanners eliminate this trade-off between speed and privacy, enabling companies to ship products fast without breaking privacy laws or leaking customer data.
Anuj Agrawal is a Developer Relations Engineer at Privado