The European privacy law GDPR has been in effect since May 25th, 2018. Although everybody knows it is all about privacy and respecting the customer’s data, it may not always be clear what is expected of you as a developer. The GDPR sets out 6 principles for processing personal data that we need to follow. I’ll try to shed some light on these principles with practical examples, so you get a better idea of what to think about during your daily work.
First, it’s good to know whether you (or your company) are the data processor or the data controller. The controller decides why and how personal data is processed, and is therefore responsible for the privacy of that data. The processor ‘just’ processes data on behalf of the controller, but should give the controller sufficient guarantees that this happens safely. As a developer you should give the controller the means to comply with the law. I’ll explicitly mention this in each of the principles when applicable. Now let’s look at the principles.
Lawfulness, fairness and transparency
“personal data shall be processed lawfully, fairly and in a transparent manner in relation to the data subject;”
The first principle basically says that you need to be honest. Don’t use personal data in a way that you wouldn’t approve of if it were your own data. Don’t be vague in your descriptions of how data is used. Although there may be users who don’t care, most people would like to know, at least to some extent, what you’re collecting about them. We should give those people the feeling that they can trust us with their data.
Be transparent and tell the end-user exactly what data you’re going to collect and process. Perhaps you already did this once, when you wrote a new privacy statement at the time the GDPR became effective. It is important, however, to keep it up-to-date. Every time you introduce a new feature that needs a new type of data, you need to explain this to the user. If you are doing research for a new feature and keep track of contact details in an Excel file, that is relevant too. You can try to make your explanations a bit more generic, e.g. by stating that you collect usage statistics. That way you won’t need to ask for new permissions when adding another statistic, because it still fits the description.
The ‘lawfully’ part is mostly about the way you obtained the data: you need a lawful basis for processing it. Clear permission from the user and a legal obligation are two common ones, and the GDPR lists a few others, such as the performance of a contract or a legitimate interest. As long as one of these applies and you can show it, you should be fine.
Purpose limitation
“personal data shall be collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes;”
This kinda makes sense. If you tell a user you will use their email address for resetting a password and then suddenly decide to use it to send them a newsletter, the user is not going to be happy. Even if you send the newsletter with the best intentions, it’s not allowed until they have agreed to that purpose.
So, what do you tell the user about the data you collect? If you are very specific, you cannot innovate and build new features without asking for permission all the time. If you describe your purpose too broadly and generically, you may lose the trust of your customers, because you are vague and could do basically anything. So be sure to be honest, and find the right balance.
Developers often get scared to use personal data because of the GDPR. It seems as if nothing is allowed anymore. Don’t be fooled, because this is certainly not true! There’s not much difference with before, except that now it’s more important than ever to be transparent towards the users and to ask permission if you want to use their data for a certain purpose. If you want ‘real’ customer data to test a new feature, ask the customer and explain how they will benefit from the new feature. Explain which people will get access to the data, in what way, and that you will destroy the data after finishing the tests. If the customer agrees, you can continue. If not, you can either ask another customer, or accept that customers don’t want to give their data for your purpose and find a different way to do your tests. Either way, you’ll have to accept the customer’s choice.
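The newsletter example can be sketched in code as a purpose check that runs before the data is used. This is only a minimal illustration: the `ConsentRegistry` class, the purpose names and the `may_send_newsletter` function are hypothetical, and a real system would persist the permissions and record when and how they were given.

```python
class ConsentRegistry:
    """Hypothetical in-memory record of the purposes each user agreed to."""

    def __init__(self):
        self._granted = {}  # user id -> set of purpose names

    def grant(self, user_id, purpose):
        self._granted.setdefault(user_id, set()).add(purpose)

    def withdraw(self, user_id, purpose):
        self._granted.get(user_id, set()).discard(purpose)

    def allows(self, user_id, purpose):
        return purpose in self._granted.get(user_id, set())


def may_send_newsletter(registry, user_id):
    """The email address may only be used for purposes the user agreed to."""
    return registry.allows(user_id, "newsletter")


registry = ConsentRegistry()
registry.grant("user-1", "password_reset")
print(may_send_newsletter(registry, "user-1"))  # False: only password resets were agreed to

registry.grant("user-1", "newsletter")
print(may_send_newsletter(registry, "user-1"))  # True: the user has now agreed
```

Keeping the purpose as an explicit argument makes it harder to reuse data for something else by accident.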
Data minimization
“personal data shall be adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed”
In the past it seemed as if the opposite was happening. Knowledge is power, so the more data you could gather, the better. That’s why this principle was added to the GDPR. If you don’t realistically need the data for the purpose you explained to the user, don’t collect or process it. Sometimes this is obvious: don’t ask for a user’s home address if you only need their email address to contact them. It may also be less obvious. For example, if your website needs to know whether someone is of legal drinking age, you could store their country of residence and date of birth. Data minimization means that if a boolean value suffices, you should stick with that. The fact that you asked the question and the user said “yes” or “no” can already be enough.
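As a small sketch of the drinking-age example: if you do need to check a date of birth, you can derive the boolean and throw the date away instead of storing it. The function names and the profile layout below are made up for illustration, and the minimum age of 18 is just an assumed default.

```python
from datetime import date


def is_of_legal_age(date_of_birth, minimum_age, today):
    """Return True if the person has reached minimum_age on the given day."""
    years = today.year - date_of_birth.year
    # Subtract a year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years >= minimum_age


def build_profile(username, date_of_birth, minimum_age=18, today=None):
    """Store only the yes/no answer; the date of birth is not kept."""
    today = today or date.today()
    return {
        "username": username,
        "of_legal_age": is_of_legal_age(date_of_birth, minimum_age, today),
    }


profile = build_profile("alice", date(2000, 6, 15), today=date(2018, 6, 15))
print(profile)  # {'username': 'alice', 'of_legal_age': True}
```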
Data minimization also means, for example, that in general you shouldn’t use customer data in a development environment, because for developing and testing you rarely need the actual data. Of course realistic data is still needed, but there are many ways to obtain realistic data without exposing customer data. It might be worth investing time in scripts that generate realistic random data. You could also try to anonymize the customer data. True anonymization, however, is very difficult to get right, because it is often still possible to re-identify people by combining the remaining fields or through statistical analysis.
A good practice for production data is pseudonymization. The goal of this technique is non-attribution to an identified or identifiable person. In other words, you should try to store the data in such a way that even if it leaks, the form in which it is stored prevents it from being linked to a natural person without additional information that is kept separately. This can often be done with techniques like encryption, masking, hashing or aggregation, or by using pointers to the real data instead of including the data itself.
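A script that generates realistic random data does not need to be complicated. The sketch below uses only the Python standard library; the name and street lists are invented sample values, and the `example.com` domain is reserved for documentation, so no generated address can belong to a real person.

```python
import random

# Invented sample values; extend these lists to get more variety.
FIRST_NAMES = ["Anna", "Bram", "Chloe", "Daan", "Emma", "Finn"]
LAST_NAMES = ["Jansen", "Smith", "Peeters", "Miller", "Visser"]
STREETS = ["Main Street", "Church Lane", "Station Road", "Mill Avenue"]


def fake_customer(rng):
    """Build one made-up customer record from the sample lists."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    return {
        "name": f"{first} {last}",
        "email": f"{first}.{last}{rng.randint(1, 999)}@example.com".lower(),
        "address": f"{rng.randint(1, 200)} {rng.choice(STREETS)}",
    }


def fake_customers(count, seed=42):
    """A fixed seed makes the test data reproducible between runs."""
    rng = random.Random(seed)
    return [fake_customer(rng) for _ in range(count)]


for customer in fake_customers(3):
    print(customer)
```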
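To make two of these techniques concrete, here is a minimal sketch of keyed hashing and masking with the Python standard library. The function names and the example key are assumptions for illustration only. Note that a keyed hash is pseudonymization, not anonymization: whoever holds the key can recompute the link, so the key has to be stored separately from the data.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Replace an identifier with a keyed hash (HMAC-SHA256).

    The same input and key always give the same pseudonym, so records
    can still be joined, but the original value is not stored.
    """
    digest = hmac.new(secret_key, value.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def mask_email(email):
    """Keep only the first character and the domain, e.g. for support screens."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


# Hypothetical key; in practice load it from a secret store, never from source code.
key = b"example-key-do-not-use"
print(pseudonymize("alice@example.com", key))
print(mask_email("alice@example.com"))  # a***@example.com
```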
Accuracy
“personal data shall be accurate and, where necessary, kept up to date; every reasonable step must be taken to ensure that personal data that are inaccurate, having regard to the purposes for which they are processed, are erased or rectified without delay;”
As a developer you must make sure that your software has enough functionality to keep personal data up-to-date. What if Amazon couldn’t deliver your package to the right address because you had no way to update your home address after you moved? Whenever you add a new feature that stores a new type of data, you should also think about how the user can update this data. Note that in theory it would be enough if the controller manually updates the data in the database after a phone call from the end-user. Of course that’s not practical or efficient, but it satisfies the accuracy principle. However, I would advise you to make the process of updating personal data a little easier, because it might cost you customers otherwise.
Storage limitation
“personal data shall be kept in a form which permits identification of data subjects for no longer than is necessary for the purposes for which the personal data are processed”
This principle says you shouldn’t store personal data for many years if you only use it in the first week. Suppose you order a gift for a friend from some website on the Internet and forget about it. If nine years later that website gets hacked and your name, address and private phone number are suddenly public, then you might think: why the **** did they keep the data for so long?
The storage limitation principle means that you should delete data that you no longer need for the original purpose. Did you keep an Excel file with contact details for some research? Delete the file after you are done! Are you building software for which you are the processor and the customer is the controller? Make functionality available so they can delete data that is no longer relevant. Did you get a database with customer data for testing purposes? Delete it after your testing is done! You can always ask for the data again later if needed.
At the very least, you should be able to explain why you keep data as long as you do. If there’s a good reason, it’s fine. Often there are laws that prescribe how long certain data must be kept, in which case the explanation is easy. Otherwise, you should carefully think about your reasons, and document them. Keeping data indefinitely without a reason is definitely not good.
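One way to keep such reasons documented is to write the retention period per purpose down in one place and let a scheduled job delete whatever has expired. The sketch below assumes records are plain dictionaries with a `purpose` and a `created_at` field; the purposes and periods are made-up examples, not legal advice.

```python
from datetime import datetime, timedelta

# Hypothetical retention periods; note the reason next to each one.
RETENTION = {
    "order": timedelta(days=7 * 365),        # assumed bookkeeping obligation
    "research_contact": timedelta(days=90),  # assumed duration of the research
}


def is_expired(record, now):
    """A record expires once it is older than the period for its purpose."""
    return now - record["created_at"] > RETENTION[record["purpose"]]


def purge(records, now):
    """Return only the records that may still be kept."""
    return [record for record in records if not is_expired(record, now)]


now = datetime(2018, 5, 25)
records = [
    {"id": 1, "purpose": "order", "created_at": datetime(2017, 1, 10)},
    {"id": 2, "purpose": "research_contact", "created_at": datetime(2017, 1, 10)},
]
print([record["id"] for record in purge(records, now)])  # [1]
```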
Also, don’t make unnecessary copies of the data. It is okay to create backups in case disaster strikes. However, if you have 7 backups in different locations, it is hard to explain why that was necessary.
Integrity and confidentiality
“personal data shall be processed in a manner that ensures appropriate security of the personal data, including protection against unauthorised or unlawful processing and against accidental loss, destruction or damage, using appropriate technical or organizational measures”
There can be no privacy without security. This principle says you need to take appropriate measures to prevent data leaks, whether they are caused by technical vulnerabilities or by organizational problems. For example, if your team is using data from a customer to test a new feature, make sure you store the data in a place that people outside the team, such as a junior consultant on another project, cannot reach.
Similarly, it is bad practice to put sensitive data in log files. Usually, log files are visible to too many people in the company, or sometimes even outside the company. It might seem very useful to a developer to put details of specific processing steps in the log file for debugging purposes, but be careful not to leak data this way.
If you have data from a customer that your team is using for some tests, make sure that this data is only available to your team and not to the rest of the company.
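One possible safety net is a filter that scrubs obvious personal data before a message reaches the log file. The sketch below uses Python’s standard `logging` module; the `RedactEmailFilter` class and the regular expression are assumptions for illustration, they only catch email addresses, and they do not replace reviewing what you log in the first place.

```python
import logging
import re

# Deliberately simple pattern; it will not match every valid address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmailFilter(logging.Filter):
    """Replace email addresses in log messages with a placeholder."""

    def filter(self, record):
        # Format the message first so addresses passed as arguments are covered too.
        record.msg = EMAIL_PATTERN.sub("[redacted-email]", record.getMessage())
        record.args = ()
        return True  # keep the (now scrubbed) record


logger = logging.getLogger("orders")
logger.addFilter(RedactEmailFilter())

# The logged message reads: Password reset requested by [redacted-email]
logger.warning("Password reset requested by %s", "alice@example.com")
```

A filter on a logger only sees records created through that logger, so attaching it to the handler that writes the file may be the better choice if several loggers share it.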
There are countless more examples of practical situations that show you what to think of when dealing with the GDPR. Just remember that it is still possible to do almost anything you could do before, as long as you ask the user for permission. If the user agrees, it’s all fine. This means that it is more important now than ever to gain the user’s trust.