Re-identification from microbiomes

A (now not-so-recent) paper by Homer et al. made a splash by showing that one could take a DNA sample from a person and detect whether they were part of the Human Genome Project (HGP) based on looking at the SNP variations from that individual together with the reported allele variations in the HGP data. More recently, a paper in PNAS by Franzosa et al. showed reidentification of individuals in the Human Microbiome Project.

Color me unsurprised. Given the richness of the data, from a purely informational point of view it seems pretty clear that people should be identifiable. As with many machine learning problems, however, the secret is in the feature encoding. Many approaches to comparing metagenomes, especially for bacterial ecologies, try to assess the variability in the population of bacteria, perhaps through mapping the to known strains. As mentioned in the Methods section, “reads were additionally mapped to a database of 649 microbial reference genomes using the Burrows-Wheeler aligner.” However, in addition to these mapping statistics, they used a few other more complicated features to help gain some additional robustness in their identification procedure.

Somehow being able to be identified by your microbiome seems less scary than being able to be identified by your genome, perhaps because we have a sense that genes are more “determining” than microbiomes. After all, you could get a fecal transplant and change your gut flora significantly. Is it the same as burning off your fingerprints? Probably not. But perhaps in the future, perpetrators of certain campus shenanigans may be easier to catch.

	Zonghong Liu on A story about Canvas
	anonymousskimmer on “The needs of the many,…
	Chanterelle Recipes… on Broiled shrimp with chanterell…
	kvarsh on ICML 2019 encouraged code subm…
	Pulkit Grover on gender inclusivity in communic…

S	M	T	W	T	F	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30
31

An Ergodic Walk

a process whose average over time converges to the true average

Re-identification from microbiomes

Leave a comment Cancel reply

Share this:

Related

Leave a comment Cancel reply