The EU General Data Protection Regulation (GDPR) also governs the processing of personal data in theses.
The research data itself may contain personal data, but personal data can also be present in the documents necessary for data collection, such as consent forms from participants. It is important to note that even an anonymous survey may generate personal data if the survey is conducted via an online form that records, for instance, the respondent's IP address or if the survey includes open-ended response options.
The student is responsible for ensuring data protection while carrying out the thesis work. The thesis advisor's duty is to advise the student on data protection matters.
Data anonymization refers to the process of handling data in such a way that it no longer contains any identifying information. In terms of personal data, this means that individuals can no longer be identified from the data through reasonable means. Information about organizations and other confidential data can also be anonymized.
Even if you do not directly collect the personal information of participants, it may still be possible to identify them from the data. For instance, an anonymous survey may not be truly anonymous if respondents can disclose information about themselves in open-ended responses, or if the survey form records the respondent's IP address (note that this does not occur if Webropol is used according to the guidelines). Such data is not anonymous and is subject to data protection laws.
Techniques for anonymization include:
Anonymization and Personal Data (Finnish Social Science Data Archive)
Guidelines for Data Anonymization from the Data Archive.
The Finnish Social Science Data Archive’s guideline includes, for example, instructions for anonymizing both quantitative and qualitative research data.
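As a rough illustration of what anonymizing quantitative data can involve in practice, the sketch below removes direct identifiers from survey rows and coarsens an exact age into a ten-year band. It is a minimal example under stated assumptions, not a method taken from the Data Archive's guideline: the field names (`name`, `email`, `ip_address`, `age`, `answer`), the sample rows, and the ten-year band width are all hypothetical.

```python
# Hypothetical field names: adjust to match your own dataset.
DIRECT_IDENTIFIERS = {"name", "email", "ip_address"}


def age_band(age, width=10):
    """Coarsen an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymize_row(row):
    """Return a copy of the row without direct identifiers and with age coarsened."""
    cleaned = {key: value for key, value in row.items()
               if key not in DIRECT_IDENTIFIERS}
    if "age" in cleaned:
        cleaned["age_band"] = age_band(cleaned.pop("age"))
    return cleaned


# Made-up sample data for illustration only.
rows = [
    {"name": "Respondent A", "email": "a@example.org",
     "ip_address": "192.0.2.10", "age": 34, "answer": 4},
    {"name": "Respondent B", "email": "b@example.org",
     "ip_address": "192.0.2.11", "age": 58, "answer": 2},
]

anonymized = [anonymize_row(row) for row in rows]
```

Removing and coarsening fields like this does not by itself make a dataset anonymous. Open-ended responses and combinations of background variables still need to be reviewed separately.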
Personal data encompasses any information that can be used to identify an individual, either directly or indirectly. Research data may also include information about people close to the study participant or about other third parties. Information that can identify them is also considered personal data.
Direct identifiers include a person’s full name, personal identification number, and various biometric identifiers such as fingerprints, facial images, voice samples, and handwritten signatures.
Strong indirect identifiers are individual pieces of information that can be used to identify a person with reasonable ease. Examples include an address, phone number, uncommon job title, rare medical conditions, and unique identifiers such as an IP address, student ID, or bank account number.
Indirect identifiers are any data points that, when combined, can lead to the identification of an individual. These may include gender, age, place of residence, job title, household composition, income, marital status, language, nationality, ethnic background, workplace, or educational institution. When the target population of a study is already relatively small and well-defined, combining indirect background information can make it reasonably easy to identify an individual.
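One way to see how combined background variables can single someone out is to count how many respondents share each combination of values. The sketch below does this with a simple group-size count (the idea behind k-anonymity); it is an illustrative check under stated assumptions, not a requirement stated in this guidance. The field names, the sample rows, and the threshold `k=3` are hypothetical.

```python
from collections import Counter

# Hypothetical indirect identifiers to check in combination.
QUASI_IDENTIFIERS = ("gender", "age_band", "job_title")


def small_groups(rows, keys=QUASI_IDENTIFIERS, k=3):
    """Return each value combination shared by fewer than k respondents."""
    counts = Counter(tuple(row[key] for key in keys) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up sample data for illustration only.
rows = [
    {"gender": "F", "age_band": "30-39", "job_title": "nurse"},
    {"gender": "F", "age_band": "30-39", "job_title": "nurse"},
    {"gender": "F", "age_band": "30-39", "job_title": "nurse"},
    {"gender": "M", "age_band": "50-59", "job_title": "head nurse"},
]

risky = small_groups(rows)
```

Here `risky` contains only the single respondent with the combination `("M", "50-59", "head nurse")`. Combinations flagged this way are candidates for coarser categories or removal, and an appropriate threshold depends on how small and well-defined the target population is.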
Sensitive personal data refers to specific categories of personal information as defined by data protection regulations, such as the General Data Protection Regulation (GDPR). These data types reveal critical aspects of an individual's identity, including:
Sensitive personal data must be protected with heightened security measures because its processing may pose risks to an individual's fundamental rights. Consequently, the processing of such data is generally prohibited. However, there are exceptions to this prohibition, one of which is the individual's explicit consent to the processing of their sensitive personal data.
The need for an ethical review must always be assessed in advance for research that involves:
Further information on the processing of sensitive personal data: Processing of special categories of personal data (Office of the Data Protection Ombudsman)
When a preliminary ethical review is required: Ethical review (Finnish National Board on Research Integrity TENK)
The principle of data minimization means not collecting unnecessary personal data. Follow this principle from the planning stage of your thesis research onward. Here are key considerations for implementing it: