Updates from the OwnYourData team.
Updates from the OwnYourData team.
In the NGI funded MyPCH project OwnYourData developed several technologies for secure and traceable data exchange of diabetes data: Digital Watermarking, Semantic Annotation, and Data Traceability. In addition, we also participated in a MyData Health initiative to write a feature article for the European Medical Writers Association (EMWA) and we are proud to announce that the article written by 14 individuals from 9 countries around the 6 MyData principles is already published and public available via the EMWA journal website – see section “Data Interoperability” and “Establishing trust between stakeholders for health data use” for example use cases of Semantic Containers.
In this blog post we cover the successful integration of diabetes data into the OwnYourData Data Vault. In this data flow, persons with diabetes (Pwd) can not only transfer their data to a Personal Data Store but also perform SPARQL queries to combine their diabetes data with public information – more information and examples are available here.
A special feature in the OwnYourData Data Vault is the Personal Knowledge Graph shown on the left side of the main screen. It compiles available data from the respective user and presents the information in a clear form. The screenshot below shows information about recent GPS Data, blood sugar levels over the course of a few days, as well as record numbers overall.
Beyond information in the Personal Knowledge Graph plugins allow further data exploration. In the course of the MyPCH project however, we decided to use existing tools like R or Jupyter Notebooks to provide more sophisticated visualization and analysis mechanisms. The R-Notebook available on Github is an example how to retrieve and decrypt information in the Data Vault and compile a report.
If you have any questions using Diabetes data with Semantic Containers or within the OwnYourData Data Vault don’t hesitate to contact us as firstname.lastname@example.org.
The project DECTS (Deaf Emergency Chat and Training System) aims to provide deaf emergency calls and a training environment in several languages. With the help of a chatbot, deaf people can learn how to use the app and at the same time generate test data for training the control center personnel. The users can determine whether the entries are used as test data and there is documentation about origin, GDPR-compliant provision, and use of the data. The consent to the use of the data can also be changed and revoked later.
The teams from OwnYourData and DEC112 work together to implement this. In previous projects, an infrastructure was already set up in Austria for deaf emergency calls and the challenge now lies in international operations – for example when an Austrian tourist is on vacation in Copenhagen: in this case an emergency call is made via the DEC112 App registered in Austria and conveyed to the control center in Copenhagen.
A chatbot is developed for the training environment that simulates a control center. In a test chat, structured information is queried and further questions arise when using certain keywords. In cooperation with emergency call centers, typical conversations were analyzed and so-called decision trees were created, which the chatbot automatically processes.
If a user consents to the further use of the chat protocol, this consent can be managed in the OwnYourData Data Vault. There the consent of the transfer of the data is documented and it is possible to query when the data was accessed. In particular, however, access can also be restricted or subsequently prohibited. Semantic containers are used as the technology platform, which ensure data access is transparent and traceable.
Finally, personal data (emergency contacts, medical data and other information) can also be stored in the OwnYourData Data Vault, which is automatically provided to the control center in the event of an emergency chat. This personal data is referenced using a DID (Decentralized ID) and the data itself is stored encrypted. The Shamir’s Secret Sharing scheme ensures that the data can only be read by the user and the control center, but cannot be accessed by OwnYourData or DEC112.
The system architecture for the project is shown in the graphic below, together with the data flows between the individual components. All parts are now at least available as prototypes and the first end-to-end tests were carried out in May.
The vast amount of data being produced everyday requires semantics to provide meanings to the data. The recent advances of Semantic Web technologies provides us with standard data formats (i.e., Resource Description Framework – RDF), vocabularies (e.g., PROV-O for provenance) and tools (e.g., SPARQL for querying data) to add semantics to data and structuring the metadata. This use of semantics would in turn allows advances data analysis and processing to the data and its metadata.
The MyPCH project aims to develop methods and tools to allow diabetes patients to share their data using Semantic Container. As part of the process, we provide a mean to transform the raw JSON data of diabetes patients’ observation data into RDF. The transformation is conducted in two steps: (i) Ontology definition, and (ii) Raw data transformation
We define a MyPCH ontology for the patient’s observation data based on the HL7-FHIR ontology for patient observation. The ontology is designed to maintain a high-degree of compliance with the original FHIR model while reducing the needs for data duplications required in the original FHIR-RDF model. An excerpt of the adapted ontology is shown in the figure below.
Figure: the FHIR-based ontology for the MyPCH project (click to enlarge)
As the next step, we utilize the RML mapping and CaRML engine to transform the raw JSON data into its RDF. The current version of RML mapping is available in Semantic Container. The resulted RDF representation of patients’ data would allow further analysis of the data using SPARQL queries (see also the tutorial for using these technologies in the semcon/sc-sparql container and the following dataflow with concrete diabetes examples). Furthermore, it would allow inferring additional knowledge based on the underlying knowledge with logical reasoning as well as integration of additional background knowledge (e.g., information about devices used for diabetes monitoring) and external knowledge, e.g., WikiData and DBPedia.
Semantic Containers utilizes RDF-based vocabularies to represent its metadata. It adopts the standard Provenance Ontology (W3C PROV-O) to represent its provenance information and SPECIAL vocabularies to represent user consent, in addition to a custom RDF vocabularies to represent container’s metadata. We have extended the user consent vocabularies (SPECIAL vocabularies) with a set of additional classes specific for Semantic Container use cases using a specific namespace
scp: <http://w3id.org/semcon/ns/policy#>. We utilize the vocabulary to automatically check the compliance between user consent of their data and the possible usage of the data by a data processor.
In the MyPCH project, we extend it further with three additional classes for diabetes patients’ data. The classes are assigned as specific types of SPECIAL health data category (
svd:Health). These classes are for now
scp:Diabetes (i.e., the generic diabetes patients’ data),
scp: DiabetesSensor (i.e., insulin data observation from sensors), and
scp:InsulinPump (i.e., insulin data observation provided by insulin pumps).
Personium and OwnYourData are proud to announce a partnership for collaborating in Personal Data Store interoperability.
During the MyData 2019 conference (September 2019) in Helsinki, several PDS (Personal Data Store) innovators had come together and discussed openly regarding the interoperability between different PDSs. After the conference, Christoph Fabianek (OwnYourData) and Salman Farmanfarmaian (Freezr) contributed a first draft of the implementation document: CEPS – Common Endpoint for Personal data Stores.
The goal of the initiative is to allow users of different PDSs (right now: Freezr, Personium, Datafund, Data Vault) to use the same app (i.e., data processing capabilities) and switch between data stores. In a first step a simple app (Tally Zoo: maintain a tally chart for frequent tasks and collect your everyday data) was implemented that can connect natively to the individual PDSs. Subsequently, common interfaces for authorization and common operations like read/write should be harmonized across participating organizations to decouple data store development and app development.
Feel free to contact Dixon or Christoph if you have any questions or want to join the effort. We expect to showcase results in the upcoming MyData 2020 events.
A digital watermark is a kind of marker covertly embedded in data and is also sometimes referred to as “the practice of imperceptibly altering a work to embed a message about that work”. For Semantic Container a digital watermark is a unique digital fingerprint that is applied to data provided by a Semantic Container, i.e., any data request results in a dataset with insignificant errors that uniquely identifies the recipient of the data set. In case such a dataset is leaked and appears in an unintended location, the person who originally requested and leaked the dataset can be identified. This blog post describes the design of the digital watermarking that will be implemented in the course of the currently ongoing MyPCH project.
To embed a watermark into a dataset the following two steps are performed:
To detect a watermark in a suspicious dataset the following two steps are performed and require the original data to be available:
The above process including various test cases for attacks will be implemented in the next weeks and will soon be available in the Semantic Container base package. Feel free to reach out to us with any questions or comments!
Semantic Containers for Data Mobility has reached all project goals by June 2019 and is now finalized with the final report. The three prototypes have developed well and Semantic Containers will continue to be used as infrastructure, especially at ZAMG.
As in the previous posts we summarize here our accomplished goals for this last milestone:
Dates, dates, appointments!
In summary, we can look back on 9 successful project months. SemCon was presented around 10 times at events, we had about 60 internal project meetings, we were able to meet our milestones in time and there are 3 prototypes that show how the exchange with data can work simply and safely: by packaging data with usage rights and provenance information together with the necessary processing logic to enable reliable and easy management.