Last year, Italy’s then prime minister, Matteo Renzi, announced that IBM would invest $150 million in building a new research center in Milan for its Watson Health division, which applies “cognitive computing” techniques to healthcare. As usual, much was made of what was presented as a big win for Italy and its citizens, before the story rapidly disappeared from public view. A year later, the Italian journalist Gianni Barbacetto obtained the relevant memorandum of understanding signed by IBM and the Italian government, which revealed the high price the latter would pay.
In return for that $150 million investment, IBM will receive what seem to be the complete medical records of 61 million Italians. According to Barbacetto (original in Italian), the information provided will include: demographic data; all medical conditions, diagnoses, and their treatment; emergency and other hospital visits, including dates and times; prescriptions and their costs; genomic data and information about any cancers; and much else besides.
This information will be supplied in a supposedly anonymous form, with obvious personal indicators removed. However, it has been known for decades that detailed medical records can never be considered truly anonymous. Here’s what Ross Anderson, Professor of Security Engineering at the Computer Laboratory, University of Cambridge, wrote in 1998 on the topic of de-identifying medical data:
although it is not too difficult to de-identify data that provide only a time-limited snapshot of a population’s health – such as the data which health services use to compile monthly management statistics of numbers of operations, consumption of drugs and the like – it is effectively impossible to de-identify longitudinal records, that is, records which link together all (or even many) of the health care encounters in a patient’s life.
You only need a few reasonably specific medical facts