Via Mashable
-----
The world of Big Data is one of pervasive data collection and
aggressive analytics. Some see the future and cheer it on; others rebel.
Behind it all lurks a question most of us are asking — does it really
matter? I had a chance to find out recently, as I got to see what
Acxiom, a large-scale commercial data aggregator, had collected about
me.
At least in theory large-scale data collection matters quite a bit. Large data sets can be used to create social network maps
and can form the seeds for link analysis of connections between
individuals. Some see this as a good thing; others as a bad one — but
whatever your viewpoint, we live in a world which sees increasing power
and utility in Big Data’s large-scale data sets.
Of course, much of the concern is about government collection. But
it’s difficult to assess just how useful this sort of data collection by
the government is because, of course, most governmental data collection
projects are classified. The good news, however, is that we can begin
to test the utility of the program in the private sector arena. A useful
analog in the private sector just became publicly available and it’s
both moderately amusing and instructive to use it as a lens for thinking
about Big Data.
Acxiom is one of the
largest commercial, private sector data aggregators around. It collects
and sells large data sets about consumers — sometimes even to the
government. And for years it did so quietly, behind the scenes, as one
writer put it, “mapping the consumer genome.” Some saw this as rather ominous; others as just curious. But it was, for all of us, mysterious. Until now.
In September, the data giant made available to the public a portion of its data set. They created a new website, AboutTheData.com,
where a consumer could go to see what data the company had collected
about them. Of course, in order to access the data about yourself you
had to first verify your own identity (I had to send in a photocopy of
my driver’s license), but once you had done so, it would be possible to
see, in broad terms, what the company thought it knew about you — and
how close that knowledge was to reality.
I was curious, so I thought I would go see for myself what
they knew and how accurate they were. The results were at times
interesting, illuminating and mundane. Here are a few observations:
To begin with, the fundamental purpose of the data collection is to
sell me things — that’s what potential sellers want to know about
potential buyers and what, say, Amazon might want to know about me. So I
first went and looked at a category called “Household Purchase Data” —
in other words what I had recently bought.
It turns out that I buy … well … everything. I buy food, beverages,
art, computing equipment, magazines, men’s clothing, stationery, health
products, electronic products, sports and leisure products, and so
forth. In other words, my purchasing habits were, to Acxiom, just an
undifferentiated mass. Save for the notation that I had bought an
antique in the past and that I have purchased “High Ticket Merchandise,”
it seems that almost everything I bought was something that most any
moderately well-to-do consumer would buy.
I do suppose that the wide variety of purchases I made is, itself,
the point — by purchasing so widely I self-identify as a “good”
consumer. But if that’s the point then the data set seems to miss the
mark on “how good” I really am. Under the category of “total dollars
spent,” for example, it said that I had spent just $1,898 in the past
two years. Without disclosing too much about my spending habits in this
public forum, I think it is fair to say that this is a significant
underestimate of my purchasing activity.
The next data category of “Household Interests” was equally
unilluminating. Acxiom correctly said I was interested in computers,
arts, cooking, reading and the like. It noted that I was interested in
children’s items (for my grandkids) and beauty items and gardening (both
my wife’s interest, probably confused with mine). Here, as well, there
was little differentiation, and I assume the breadth of my interests is
what matters rather than the details. So, as a consumer, examining what
was collected about me seemed to disclose only a fairly anodyne level of
detail.
[Though I must object to the suggestion that I am an Apple user :).
Anyone who knows me knows I prefer the Windows OS. I assume this was
also the result of confusion within the household and a reflection of my
wife’s Apple use. As an aside, I was invited throughout to correct any
data that was in error. This I chose not to do, as I did not want to
validate data for Acxiom (that’s their job, not mine) and I had no real
interest in enhancing their ability to sell me to other marketers. On
the other hand, I also did not take the opportunity they offered to
opt out of their data system completely, on the theory that a moderate
amount of data in the world about me may actually lead to being offered
some things I want to purchase.]
Things became a bit more intrusive (and interesting) when I started
to look at my “Characteristic Data,” that is, data about who I am. Some
of the mistakes were a bit laughable — they pegged me as of German
ethnicity (because of my last name, naturally) when, with all due
respect to my many German friends, that isn’t something I’d ever say
about myself. And they got my birthday wrong — lord knows why.
But some of their insights were at least moderately invasive of my
privacy, and highly accurate. Acxiom “inferred” for example, that I’m
married. They identified me accurately as a Republican (but notably not
necessarily based on voter registration — instead it was the party I was
“associated with by voter registration or as a supporter”). They knew
there were no children in my household (all grown up) and that I run a
small business and frequently work from home. And they knew which sorts
of charities we supported (from surveys, online registrations and
purchasing activity). Pretty accurate, I’d say.
Finally, it was completely unsurprising that the most accurate data
about me was closely related to the most easily measurable and widely
reported aspect of my life (at least in the digital world) — namely, my
willingness to dive into the digital financial marketplace.
Acxiom knew that I had several credit cards and used them regularly. It had a broadly accurate understanding of my household total income range [I’m not saying!].
They also knew all about my house — which makes sense since real
estate and liens are all matters of public record. They knew I was a
homeowner and what the assessed value was. The data showed, accurately,
that I had a single-family dwelling and that I’d lived there longer
than 14 years. It disclosed how old my house was (though with the rather
imprecise range of having been built between 1900 and 1940). And, of
course, they knew what my mortgage was, and thus had a good estimate of
the equity I had in my home.
So what did I learn from this exercise?
In some ways, very little. Nothing in the database surprised me, and
the level of detail was only somewhat discomfiting. Indeed, I was more
struck by how uninformative the database was than how detailed it was —
what, after all, does anyone learn by knowing that I like to read?
Perhaps Amazon will push me book ads, but they already know I like to
read because I buy directly from them. If they had asserted that I like
science fiction novels or romantic comedy movies, that level of detail
might have demonstrated a deeper grasp of who I am — but that I read at
all seems pretty trivial information about me.
I do, of course, understand that Acxiom has not completely lifted the
curtains on its data holdings. All we see at About The Data is summary
information. You don’t get to look at the underlying data elements. But
even so, if that’s the best they can do ….
In fact, what struck me most forcefully was (to borrow a phrase from
Hannah Arendt) the banality of it all. Some, like me, see great promise
in big data analytics as a way of identifying terrorists or tracking
disease. Others, with greater privacy concerns, look at big data and see
Big Brother. But when I dove into one big data set (albeit only
partially), held by one of the largest data aggregators in the world,
all I really became was a bit bored.
Maybe that’s what they wanted as a way of reassuring me. If so, Acxiom succeeded, in spades.