This is the first post in a short series about those dimensions of privacy that the insurance sector should pay most attention to. In these posts, I’ll be looking at why the public have concerns about these dimensions of privacy and only in passing at how the law treats them. This is in part because the law invariably lags public opinion, but mainly because insurers need to tune in to their customers just as much as to their lawyers.
I’m starting with aggregation, which involves gathering and bringing together data about a person from various sources. The logic behind aggregation is that while a single piece of data here or there is not very informative, several such pieces brought together begin to form more of a portrait of that person.
Why does the whole become greater than the sum of its parts in this way? It’s largely down to data synergies. Aggregating data can reveal new facets of a person that she may not have expected to become known when she originally divulged each item of isolated information.
One might think that if all this information has already been disclosed in one place or another, bringing it together should not raise privacy concerns. Not quite. The public distinguish between the scattered disclosure of pieces of information at various points in their lives and the aggregation of it all in one place. We know that to live in an organised society we need to disclose information along the way, but we often feel uncomfortable when it starts being bundled together.
Why do we feel uncomfortable about this? The problem is that aggregated data can be informative, but it can also be misleading. The picture you put together about someone from scattered pieces of data may bear little resemblance to the real person. If you disconnect a piece of data from the context in which it was originally disclosed, there is a danger that it will be simplified, perhaps even distorted. That original disclosure could have been made in a complex situation with quite different obligations and incentives from those of an insurance contract. Stripping away that context and attaching that piece of data to someone’s record runs the risk of contaminating its meaning.
Yet how does this differ from the situation up until now, in which policyholders have been obligated to disclose a wide range of data about themselves in utmost good faith? It differs in two ways. Firstly, that disclosure takes place in a single, clear context (the insurance contract), as opposed to a multiplicity of differing contexts. Secondly, data aggregation is invariably undertaken by computer according to a prescribed formula. This creates a rigid and unyielding process in which computerised information is accorded greater weight and human intervention and interpretation become minimal. Data that might actually require substantial evaluation is instead reduced to discrete entries in preassigned categories.
To sum up: data is not neutral. Its meaning relies heavily on the context in which it is disclosed (think of how a common phrase can mean different things in different settings). Ignoring that context can distort the meaning you try to draw from its aggregation. I’ll explore the implications this can have for insurers in another post tomorrow.