Data Vault Deployment

Data Vault Deployment

Docker is required to install and run OwnYourData. The data vault is installed on the commandline and requires the following 5 docker images:

  • Database: postgres:9.6.4
  • Message Broker: rabbitmq:3-management
  • OwnYourData Data Vault: oydeu/oyd-pia2
  • Schedulder: oydeu/srv-scheduler
  • Running periodic tasks: oydeu/srv-worker

A script on Github is available for configuring and starting Docker containers. Run the following command in an empty directory to download and start the script:

bash <(wget -qO- https://raw.githubusercontent.com/OwnYourData/oyd-pia2/master/local/oyd_local_setup.sh)

You have to provide the following four pieces of information:

  • IP address of the computer on which you want to run the data vault
  • Port number under which the data vault is to be reached
    (default value: 3000)
  • Gmail address for sending emails – necessary to create an account in the data vault and to send the weekly data summaries (eg: my.email@gmail.com)
  • the associated password to the Gmail address

The data vault is then available under the respective IP address and user accounts can be set up. The following blog posts describe how to use OwnYourData on your own NAS (such as Synology) and on an Asus VivoMini.

Managing Personal Data in Emergency Calls

In the DECTS project – funded by NGI TRUST Grant Agreement No. 825618 – OwnYourData and DEC112 implemented a Proof-of-Concept to demonstrate sharing personal data between an emergency caller and a control room.

Problem Description

The DEC112 App for deaf emergency chats allows users to store personal information (profile data) at the phone to be automatically shared with an operator in a control room when an emergency chat is initiated. But storing this profile data has a few disadvantages like inability to migrate this data when switching phones and also security concerns (anyone gaining access to this phone can read and edit the data). It therefore makes sense to also provide an option to store this emergency information securely in the cloud which in turn generates several challenges:

  • referencing and accessing emergency information in the cloud
  • migrating between cloud storage providers (GDPR Article 20 – Right to Data Portability)
  • guaranteeing secure storage of profile data

This blog posts describes the implementation of managing profile data in a Personal Data Store addressing the above-mentioned challenges.

Self Sovereign Identity

Decentralized Identifiers (DID) provide an elegant and self-determined way of managing access to personal data. Using cryptographic methods and blockchain technology a user can generate a DID (i.e., a unique token) that references a DID Document. In this document a service endpoint can be specified that provides a certain type of service. Since only the user is in possession of the cryptographic key to manage the DID Document it can be edited only by this user.

For the given use case of deaf emergency chats the DEC112 app automatically generates a DID at user registration and to reference an account in the OwnYourData Data Vault (source available here: https://github.com/OwnYourData/oyd-pia2). The user has the option to choose between storing the data on the phone or in a Personal Data Store and when choosing Personal Data Store the respective DID is shown on the Profile page.

 

Shamir’s Secret Sharing

The OwnYourData Data Vault is a Personal Data Store that recently received the MyData Operator Status and is used as the default cloud storage for personal information associated with an DEC112 user account. Since it stores personal data End-to-End encrypted it was necessary to develop a solution to exchange this encrypted data with a control room. The component that manages data provisioning in case of an emergency chat is called PI2 (Personal Identifiable Information) – source code available here: https://github.com/OwnYourData/service-pi2.

Upon creating an account in the OwnYourData Data Vault two key parts a created using the Shamir Secret Sharing scheme. One key part remains in the Data Vault while the other key part is sent to all participating PI2 services located in the respective control rooms. The profile data can only be decrypted when both key parts come together. Additionally, the Personal Data Stores logs any access to the encrypted profile data to guarantee complete documentation.

The data flow for accessing profile data during an emergency chat is displayed in the following sequence diagram.

The basic steps in providing personal information upon initiating an emergency chat include:

  1. The Viewer (client-side application used by the Call Taker) requests from PI2 additional information about the Emergency Call based on the DID delivered along the emergency chat.
  2. PI2 resolves the DID (i.e., retrieves the DID document) and gets the service endpoint of the PDS holding emergency information
  3. PDS sends encrypted emergency information together with one key part for decryption
  4. PI2 combines stored and provided key parts to decrypt emergency information and responds to Call Taker

Conclusions

This blog post described the infrastructure to share data in a secure and self-determined way. Users leverage capabilities of DIDs to manage a trusted service endpoint and use Shamir’s Secret Sharing scheme for a purpose-based data provisioning.

 

NGI DAPSI Funding for Digital Immunization Passport

Digital Immunization Passport has been selected as one of the brightest data portability projects in Europe

NGI Innovators

OwnYourData and the Human Colossus Foundation have been selected to take part in the Data Portability & Services Incubator (DAPSI), a 3-year EU funded project that empowers internet innovators to develop new solutions in the data portability field.

What is DAPSI?

The Data Portability and Services Incubator (DAPSI) is a EU funded project, under the European Commission’s Next Generation Internet (NGI) initiative, to empower top internet innovators to develop human-centric solutions, addressing the challenge of personal data portability on the internet, as foreseen under the GDPR and make it significantly easier for citizens to have any data which is stored with one service provider transmitted directly to another provider.

DAPSI will support top-notch projects through a 9-month incubation programme where experts in diverse fields will provide a successful working methodology, access to top infrastructure, training in business and data related topics, coaching, mentoring, and a vibrant ecosystem. On top of that, each DAPSI project can receive up to €150k equity-free funding.

Digital Immunization Passport (DIP) is taking part in DAPSI to create a state-of-the-art digital certificate of vaccination. Currently, vaccination and immunization information are spread over different organizations like labs and hospitals as well as pharmaceutical companies together with government agencies. A patient usually only has a paper certificate that provides vaccination treatments with often difficult to read handwritten additional information. In the proposed solution OwnYourData and Human Colossus Foundation will develop an end-to-end data flow that:

  • connects to various immunization information providers,
  • aggregates, harmonizes, and semantically annotates the data,
  • makes this data-set available in an open format accessible for Personal Data Stores (PDS), compliant to the MyData Operator guidelines,
  • provide a human-centric data management platform for health information,
  • allows to prove immunization status through Verifiable Credentials, and
  • packages selected health data for processing by 3rd parties together with a well-defined usage policy and a provenance trail.

The main focus of this project is on Data Interoperability & Compatibility through establishing interfaces between health industry and individuals as well as pushing forward on standardized interfaces for PDSs. Additionally, we address Data Transparency (Usage Policies and Data Provenance in Semantic Containers) and Security & Privacy (by applying blockchain technology and digital watermarking on data sharing).

 

How does it work?

In the two-phase supporting programme, the projects will develop advanced solutions in the Data Portability field. From September 2020 to February 2021 (Kick-Start phase) they will develop a solution related to a specific use case. The best projects will progress to the second phase (Booster) where the use cases will be fostered to evolve into solid projects to gain enough traction for deployment and get ready for the market.

 

Follow our journey through DAPSI!

Take a look at the DAPSI project portfolio to see more information about the selected innovators. The Digital Immunization Passport page is available here.

To read more about DAPSI, please visit the website: dapsi.ngi.eu

 

OwnYourData is MyData Operator 2020

A bold new initiative to shape the new normal of data has arrived – MyData Operator 2020 status awarded to OwnYourData.

In July 2020, OwnYourData was awarded the inaugural status of MyData Operator 2020. The award was given to 16 world-leading organisations from 12 countries, all working for humans-centric approaches to personal data.

A MyData operator, as described in the white paper Understanding MyData Operators, is a provider of infrastructure for personal data management and a key element in creating sustainable ecosystems for fair and ethical use of personal data.  MyData operators provide interoperability at the technical, informational and governance levels to support the flow of personal data across services. They are examples of a human-centric approach to what the recent EU data strategy calls “novel data intermediaries” and which are set to play a critical role in the provision of the strategy’s vision of data spaces.

Radical collaboration for human-centric data control

The awarded organisations are working together with their competitors on areas of mutual interest to create a successful, interoperable personal data ecosystem. “MyData Global congratulates awardees for their integrity, dedication, and openness in this unprecedented sharing of information that goes far beyond the requirements of any regulation. Ultimately, it is individuals who will benefit from this shared understanding of MyData operators – with increased transparency, better choices, and more innovative, responsible services,” explains Joss Langford, co-lead of the MyData Operators Thematic Group and the lead editor for Understanding MyData Operators white paper.

More information

The organisations awarded the MyData Operator 2020 status are listed here: mydata.org/operators/

The MyData Operator 2020 Award was created by the internationally recognised nonprofit MyData Global. The award acknowledges organisations that place the individual at the centre of personal data about them, provide tools to help them manage personal data, and have the individual as the primary beneficiary of this data.

 

Certificate of Award

Diabetes Data Processing

In the NGI funded MyPCH project OwnYourData developed several technologies for secure and traceable data exchange of diabetes data: Digital Watermarking, Semantic Annotation, and Data Traceability. In addition, we also participated in a MyData Health initiative to write a feature article for the European Medical Writers Association (EMWA) and we are proud to announce that the article written by 14 individuals from 9 countries around the 6 MyData principles is already published and public available via the EMWA journal website – see section “Data Interoperability” and “Establishing trust between stakeholders for health data use” for example use cases of Semantic Containers.

In this blog post we cover the successful integration of diabetes data into the OwnYourData Data Vault. In this data flow, persons with diabetes (Pwd) can not only transfer their data to a Personal Data Store but also perform SPARQL queries to combine their diabetes data with public information – more information and examples are available here.

A special feature in the OwnYourData Data Vault is the Personal Knowledge Graph shown on the left side of the main screen. It compiles available data from the respective user and presents the information in a clear form. The screenshot below shows information about recent GPS Data, blood sugar levels over the course of a few days, as well as record numbers overall.

Beyond information in the Personal Knowledge Graph plugins allow further data exploration. In the course of the MyPCH project however, we decided to use existing tools like R or Jupyter Notebooks to provide more sophisticated visualization and analysis mechanisms. The R-Notebook available on Github is an example how to retrieve and decrypt information in the Data Vault and compile a report.

If you have any questions using Diabetes data with Semantic Containers or within the OwnYourData Data Vault don’t hesitate to contact us as support@ownyourdata.eu.

NGI Funding for DECTS

The project DECTS (Deaf Emergency Chat and Training System) aims to provide deaf emergency calls and a training environment in several languages. With the help of a chatbot, deaf people can learn how to use the app and at the same time generate test data for training the control center personnel. The users can determine whether the entries are used as test data and there is documentation about origin, GDPR-compliant provision, and use of the data. The consent to the use of the data can also be changed and revoked later.

The teams from OwnYourData and DEC112 work together to implement this. In previous projects, an infrastructure was already set up in Austria for deaf emergency calls and the challenge now lies in international operations – for example when an Austrian tourist is on vacation in Copenhagen: in this case an emergency call is made via the DEC112 App registered in Austria and conveyed to the control center in Copenhagen.

A chatbot is developed for the training environment that simulates a control center. In a test chat, structured information is queried and further questions arise when using certain keywords. In cooperation with emergency call centers, typical conversations were analyzed and so-called decision trees were created, which the chatbot automatically processes.

If a user consents to the further use of the chat protocol, this consent can be managed in the OwnYourData Data Vault. There the consent of the transfer of the data is documented and it is possible to query when the data was accessed. In particular, however, access can also be restricted or subsequently prohibited. Semantic containers are used as the technology platform, which ensure data access is transparent and traceable.

Finally, personal data (emergency contacts, medical data and other information) can also be stored in the OwnYourData Data Vault, which is automatically provided to the control center in the event of an emergency chat. This personal data is referenced using a DID (Decentralized ID) and the data itself is stored encrypted. The Shamir’s Secret Sharing scheme ensures that the data can only be read by the user and the control center, but cannot be accessed by OwnYourData or DEC112.

The system architecture for the project is shown in the graphic below, together with the data flows between the individual components. All parts are now at least available as prototypes and the first end-to-end tests were carried out in May.

Semantic Technologies for Semantic Container

The vast amount of data being produced everyday requires semantics to provide meanings to the data. The recent advances of Semantic Web technologies provides us with standard data formats (i.e., Resource Description Framework – RDF), vocabularies (e.g., PROV-O for provenance) and tools (e.g., SPARQL for querying data) to add semantics to data and structuring the metadata. This use of semantics would in turn allows advances data analysis and processing to the data and its metadata.

Semantics for Diabetes Data

The MyPCH project aims to develop methods and tools to allow diabetes patients to share their data using Semantic Container. As part of the process, we provide a mean to transform the raw JSON data of diabetes patients’ observation data into RDF. The transformation is conducted in two steps: (i) Ontology definition, and (ii) Raw data transformation

We define a MyPCH ontology for the patient’s observation data based on the HL7-FHIR ontology for patient observation. The ontology is designed to maintain a high-degree of compliance with the original FHIR model while reducing the needs for data duplications required in the original FHIR-RDF model. An excerpt of the adapted ontology is shown in the figure below.

Figure: the FHIR-based ontology for the MyPCH project (click to enlarge)

As the next step, we utilize the RML mapping and CaRML engine to transform the raw JSON data into its RDF. The current version of RML mapping is available in Semantic Container. The resulted RDF representation of patients’ data would allow further analysis of the data using SPARQL queries (see also the tutorial for using these technologies in the semcon/sc-sparql container and the following dataflow with concrete diabetes examples). Furthermore, it would allow inferring additional knowledge based on the underlying knowledge with logical reasoning as well as integration of additional background knowledge (e.g., information about devices used for diabetes monitoring) and external knowledge, e.g., WikiData and DBPedia.

Extending SPECIAL Vocabularies for MyPCH user consent

Semantic Containers utilizes RDF-based vocabularies to represent its metadata. It adopts the standard Provenance Ontology (W3C PROV-O) to represent its provenance information and SPECIAL vocabularies to represent user consent, in addition to a custom RDF vocabularies to represent container’s metadata. We have extended the user consent vocabularies (SPECIAL vocabularies) with a set of additional classes specific for Semantic Container use cases using a specific namespace scp: <http://w3id.org/semcon/ns/policy#>. We utilize the vocabulary to automatically check the compliance between user consent of their data and the possible usage of the data by a data processor.

In the MyPCH project, we extend it further with three additional classes for diabetes patients’ data. The classes are assigned as specific types of SPECIAL health data category (svd:Health). These classes are for now scp:Diabetes (i.e., the generic diabetes patients’ data), scp: DiabetesSensor (i.e., insulin data observation from sensors), and scp:InsulinPump (i.e., insulin data observation provided by insulin pumps).

Personium Collaboration

Personium and OwnYourData are proud to announce a partnership for collaborating in Personal Data Store interoperability.

During the MyData 2019 conference (September 2019) in Helsinki, several PDS (Personal Data Store) innovators had come together and discussed openly regarding the interoperability between different PDSs. After the conference, Christoph Fabianek (OwnYourData) and Salman Farmanfarmaian (Freezr) contributed a first draft of the implementation document: CEPS – Common Endpoint for Personal data Stores.

The goal of the initiative is to allow users of different PDSs (right now: Freezr, Personium, Datafund, Data Vault) to use the same app (i.e., data processing capabilities) and switch between data stores. In a first step a simple app (Tally Zoo: maintain a tally chart for frequent tasks and collect your everyday data) was implemented that can connect natively to the individual PDSs. Subsequently, common interfaces for authorization and common operations like read/write should be harmonized across participating organizations to decouple data store development and app development.

Feel free to contact Dixon or Christoph if you have any questions or want to join the effort. We expect to showcase results in the upcoming MyData 2020 events.

Milestone 3/3 – Summer

Semantic Containers for Data Mobility has reached all project goals by June 2019 and is now finalized with the final report. The three prototypes have developed well and Semantic Containers will continue to be used as infrastructure, especially at ZAMG.

As in the previous posts we summarize here our accomplished goals for this last milestone:

Dates, dates, appointments!

  • SemCon was presented at the PyDays 2019 on May 3rd and we submitted a scientific publication at the Semantics 2019 in Karlsruhe.
  • Our final project presentation took place on 17 June 2019 at the MyData Meetup at the WU Vienna (slides are available here). We were happy about a lively participation and lively discussions on the topic “Data as Propery or Human right”. That’s what MeetUps should be like!

  • Another presentation is planned on September 27, where Christoph Fabianek will give a talk at the MyData 2019 with Diabetes.services from Denmark. We will continue to work on the topic of Semantic Containers even beyond the end of the funded project period!

In summary, we can look back on 9 successful project months. SemCon was presented around 10 times at events, we had about 60 internal project meetings, we were able to meet our milestones in time and there are 3 prototypes that show how the exchange with data can work simply and safely: by packaging data with usage rights and provenance information together with the necessary processing logic to enable reliable and easy management.