Shannon Kroes
Scientist Innovator
I’m Shannon Kroes, and I currently work as a Scientist Innovator at TNO, where I’m all about building privacy-preserving data solutions.

Education
  • Utrecht University
    Master of Science, Methodology and Statistics
    2012 - 2016
  • Utrecht University
    Bachelor of Science, Psychology
    2016 - 2017
Experience
  • TNO
    Scientist Innovator
    2022 - Present
  • Sanquin
    PhD Candidate
    2017 - 2022
Publications
Evaluating Cluster-Based Synthetic Data Generation for Blood-Transfusion Analysis

Shannon K. S. Kroes, Matthijs van Leeuwen, Rolf H. H. Groenwold, Mart P. Janssen

Journal of Cybersecurity and Privacy 2023

Cluster-based synthetic data generation (CBSDG) offers an explainable, privacy-preserving way to share sensitive data. We applied CBSDG to 250,729 real blood-transfusion records and trained SVMs to predict donor hemoglobin levels, matching original precision (male 0.997, female 0.987) and showing comparable recall and feature-impact patterns. Most attributes became harder to infer, with only deferral status and sex remaining detectable, which demonstrates CBSDG’s promise for practical use.

Generating synthetic mixed discrete-continuous health records with mixed sum-product networks

Shannon K S Kroes, Matthijs van Leeuwen, Rolf H H Groenwold, Mart P Janssen

Journal of the American Medical Informatics Association (JAMIA) 2022

Mixed sum-product networks (MSPNs) instantiate private data representations from which synthetic patient records are drawn, enabling secure exchange for downstream statistical analyses. Rigorous evaluation against privacy and information-loss metrics demonstrates the approach’s capacity to uphold confidentiality while preserving analytical utility.

Evaluating privacy of individuals in medical data

Shannon K S Kroes, Mart P. Janssen, Rolf H.H. Groenwold, Matthijs van Leeuwen

Health Informatics Journal 2021

A novel, variable-centric paradigm merges rigorous information-theoretic metrics with dynamic visualizations to elevate the understanding of individual privacy risk within medical datasets. By contextualizing each feature’s real-world exploitability, the methodology fulfills regulatory mandates and informs the creation of nuanced anonymization schemes, thereby illuminating the delicate balance between data utility and confidentiality.
