Via Mashable
 
-----
 
 
 

 
The world of Big Data is one of pervasive data collection and 
aggressive analytics. Some see the future and cheer it on; others rebel.
 Behind it all lurks a question most of us are asking — does it really 
matter? I had a chance to find out recently, as I got to see what 
Acxiom, a large-scale commercial data aggregator, had collected about 
me.
 
At least in theory, large-scale data collection matters quite a bit. Large data sets can be used to create social network maps
 and can form the seeds for link analysis of connections between 
individuals. Some see this as a good thing; others as a bad one — but 
whatever your viewpoint, we live in a world which sees increasing power 
and utility in Big Data’s large-scale data sets.
 
Of course, much of the concern is about government collection. But 
it’s difficult to assess just how useful this sort of data collection by
 the government is because most governmental data collection
 projects are classified. The good news, however, is that we can begin 
to test the utility of such collection in the private sector. A useful
 analog there just became publicly available, and it’s 
both moderately amusing and instructive to use it as a lens for thinking
 about Big Data.
 
Acxiom is one of the
 largest commercial, private sector data aggregators around. It collects
 and sells large data sets about consumers — sometimes even to the 
government. And for years it did so quietly, behind the scenes, or as one 
writer put it, “mapping the consumer genome.” Some saw this as rather ominous; others as just curious. But it was, for all of us, mysterious. Until now. 
 In September, the data giant made available to the public a portion of its data set. They created a new website, AboutTheData.com,
 where a consumer could go to see what data the company had collected 
about them. Of course, in order to access the data about yourself you 
had to first verify your own identity (I had to send in a photocopy of 
my driver’s license), but once you had done so, it would be possible to 
see, in broad terms, what the company thought it knew about you — and 
how close that knowledge was to reality. 
I was curious, so I thought I would look myself up and see what
 they knew and how accurate they were. The results were at times 
interesting, illuminating and mundane. Here are a few observations:
 
To begin with, the fundamental purpose of the data collection is to 
sell me things — that’s what potential sellers want to know about 
potential buyers and what, say, Amazon might want to know about me. So I
 first went and looked at a category called “Household Purchase Data” — 
in other words what I had recently bought.
 
It turns out that I buy … well … everything. I buy food, beverages, 
art, computing equipment, magazines, men’s clothing, stationery, health 
products, electronic products, sports and leisure products, and so 
forth. In other words, my purchasing habits were, to Acxiom, just an 
undifferentiated mass. Save for the notation that I had bought an 
antique in the past and that I have purchased “High Ticket Merchandise,”
 it seems that almost everything I bought was something that most any 
moderately well-to-do consumer would buy.
 
I do suppose that the wide variety of purchases I made is, itself, 
the point — by purchasing so widely I self-identify as a “good” 
consumer. But if that’s the point then the data set seems to miss the 
mark on “how good” I really am. Under the category of “total dollars 
spent,” for example, it said that I had spent just $1,898 in the past 
two years. Without disclosing too much about my spending habits in this 
public forum, I think it is fair to say that this is a significant 
underestimate of my purchasing activity.
 
The next data category of “Household Interests” was equally 
unilluminating. Acxiom correctly said I was interested in computers, 
arts, cooking, reading and the like. It noted that I was interested in 
children’s items (for my grandkids) and beauty items and gardening (both
 my wife’s interest, probably confused with mine). Here, as well, there 
was little differentiation, and I assume the breadth of my interests is 
what matters rather than the details. So, as a consumer, examining what 
was collected about me seemed to disclose only a fairly anodyne level of
 detail.
 
[Though I must object to the suggestion that I am an Apple user :-). 
Anyone who knows me knows I prefer the Windows OS. I assume this was 
also the result of confusion within the household and a reflection of my
 wife’s Apple use. As an aside, I was invited throughout to correct any 
data that was in error. This I chose not to do, as I did not want to 
validate data for Acxiom (that’s their job, not mine) and I had no real 
interest in enhancing their ability to sell me to other marketers. On 
the other hand I also did not take the opportunity they offered to 
opt out of their data system completely, on the theory that a moderate 
amount of data in the world about me may actually lead to being offered 
some things I want to purchase.]
 
Things became a bit more intrusive (and interesting) when I started 
to look at my “Characteristic Data” — that is data about who I am. Some 
of the mistakes were a bit laughable — they pegged me as of German 
ethnicity (because of my last name, naturally) when, with all due 
respect to my many German friends, that isn’t something I’d ever say 
about myself. And they got my birthday wrong — lord knows why.
 
But some of their insights were at least moderately invasive of my 
privacy, and highly accurate. Acxiom “inferred,” for example, that I’m 
married. They identified me accurately as a Republican (but notably not 
necessarily based on voter registration — instead it was the party I was
 “associated with by voter registration or as a supporter”). They knew 
there were no children in my household (all grown up) and that I run a 
small business and frequently work from home. And they knew which sorts 
of charities we supported (from surveys, online registrations and 
purchasing activity). Pretty accurate, I’d say.
 
Finally, it was completely unsurprising that the most accurate data 
about me was closely related to the most easily measurable and widely 
reported aspect of my life (at least in the digital world) — namely, my 
willingness to dive into the digital financial marketplace.  
 
 
 
Acxiom knew that I had several credit cards and used them regularly. It had a broadly accurate understanding of my household total income range [I’m not saying!]. 
They also knew all about my house — which makes sense since real 
estate and liens are all matters of public record. They knew I was a 
home owner and what the assessed value was. The data showed, accurately,
 that I had a single family dwelling and that I’d lived there longer 
than 14 years. It disclosed how old my house was (though with the rather
 imprecise range of having been built between 1900 and 1940). And, of 
course, they knew what my mortgage was, and thus had a good estimate of 
the equity I had in my home.
 
So what did I learn from this exercise?
 
In some ways, very little.  Nothing in the database surprised me, and
 the level of detail was only somewhat discomfiting. Indeed, I was more 
struck by how uninformative the database was than how detailed it was — 
what, after all, does anyone learn by knowing that I like to read? 
Perhaps Amazon will push me book ads, but they already know I like to 
read because I buy directly from them. If they had asserted that I like 
science fiction novels or romantic comedy movies, that level of detail 
might have demonstrated a deeper grasp of who I am — but that I read at 
all seems pretty trivial information about me.
 
I do, of course, understand that Acxiom has not completely lifted the
 curtains on its data holdings. All we see at About The Data is summary 
information. You don’t get to look at the underlying data elements. But 
even so, if that’s the best they can do ….
 
In fact, what struck me most forcefully was (to borrow a phrase from 
Hannah Arendt) the banality of it all. Some, like me, see great promise 
in big data analytics as a way of identifying terrorists or tracking 
disease. Others, with greater privacy concerns, look at big data and see
 Big Brother. But when I dove into one big data set (albeit only 
partially), held by one of the largest data aggregators in the world, 
all I really became was a bit bored.
 
Maybe that’s what they wanted as a way of reassuring me. If so, Acxiom succeeded, in spades.