Consumer Profiling and Credit Card Data Mining

I’ve always loved reading and learning about data mining and its applications in various fields. Because of this, Charles Duhigg’s comprehensive look at the consumer profiling practices of credit card companies was my favourite read over the weekend.

[Researchers] emphasized that the biggest profits didn’t come from people who always paid off their bills but rather from less-responsible clients who never paid their entire balance. […]

But giving credit cards to riskier customers posed a problem: How do you know which cardholders will pay something each month, providing fat profits, and which will simply run up a huge tab and then disappear?

[One] solution was learning to predict how different types of customers would behave. Card companies began running tens of thousands of experiments each year, testing the emotions elicited by various card colors and the appeal of different envelope sizes, for instance, or whether new immigrants were more responsible than cardholders born in this country. By understanding customers’ psyches, the companies hoped, they could tell who was a bad risk and either deny their application or, for those who were already cardholders, start shrinking their available credit and increasing minimum payments to squeeze out as much cash as possible before they defaulted.

There are some fascinating insights in the article, and throughout I was reminded of this Marissa Mayer quote (from her Charlie Rose appearance), taken from Super Crunchers—a book on number analysis and data mining:

Credit-card companies can tell whether a couple is going to get divorced two years beforehand, with 98% likelihood.

The validity of that statement seems slightly dubious, but I love it nonetheless.
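One way to see why a figure like that deserves scepticism is a quick base-rate check. The sketch below is purely illustrative: it assumes "98%" means the prediction is right 98% of the time both for couples who will divorce and for couples who won't, and it assumes a made-up 2% of couples divorce within the two-year window. Neither number comes from the article or the book.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Share of flagged couples who actually go on to divorce (Bayes' rule)."""
    true_positives = sensitivity * base_rate
    false_positives = (1 - specificity) * (1 - base_rate)
    return true_positives / (true_positives + false_positives)

# Hypothetical inputs, chosen only for illustration.
ppv = positive_predictive_value(sensitivity=0.98, specificity=0.98, base_rate=0.02)
print(f"Flagged couples who actually divorce: {ppv:.0%}")
```

Under those assumed numbers, only about half of the couples flagged as "going to divorce" actually would, so the headline 98% says little without knowing what it measures and how common the outcome is.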



4 responses to “Consumer Profiling and Credit Card Data Mining”

  1. Paul

    I know from my own data assurance work with the UK Information Commissioner that credit card companies (and more precisely the credit reference agencies they have come to rely upon heavily) are not the beacons of rigorous scientific analysis they would like ordinary citizens to think.

    For a start, according to a 1992 survey of mined data from one large credit reference agency (“CRA”) undertaken by a privacy technology writer for her book, over 30% of the individuals’ data was incorrect “enough to negatively influence a credit decision”. That’s an alarmingly high error rate.

    On the other hand, you won’t hear that study mentioned by the CRAs’ marketing material boasting of highly accurate reporting results for individuals.

    Another aspect which is often forgotten is that credit information is frequently made accurate *reactively*. For example, by default one UK CRA’s Electoral Register data is updated only once a year. So if you moved into your new address less than a year ago, chances are you’ll be turned down for credit even though you are a registered and tax-paying voter. After being turned down for credit by your bank, the CRA will likely receive a disgruntled call from you when you ask for (and pay for) your full credit record. At this point they’ll update your record, and send it to you. You’ll probably see that the data is there correctly, and take it back to your bank. And when you get there, hey presto! the bank’s records from the CRA will now show you’re registered correctly. Do you see what they did there?

    In the UK at least, this reactive rolling of credit records is unacceptable under the Data Protection Act 1998, but it is quietly tolerated by the Information Commissioner. To do it any other way would mean that the CRA would have to be proactive, and that means spending money updating your data and chasing for updates – and they really don’t want that hassle when there’s gold to be had in them thar … inaccurate records.

    So next time your bank does a credit check on you, take the results with a big pinch of salt and tell the bank that they really shouldn’t rely on the CRA’s marketing materials to gauge the accuracy of the data.

  2. Paul

    And one other thing about Marissa Mayer’s quote there.

    “Credit-card companies can tell whether a couple is going to get divorced two years beforehand, with 98% likelihood.”

    So when, why and how did the credit card company perform this groundbreaking and expensive research exactly? I say expensive because they will have needed control samples for false positives and false negatives and a very large sample in the first place to factor out any confounding effects (demographics, age, racial background, household income, etc.).

    Which leads me to conclude that to give this ‘research’ any credence, either Marissa Mayer has a questionable grasp of statistical methods or she simply hasn’t mentally scoped the problem. Or both.

    98% of statistics are not made up on the spot, but the remaining 2% give statistics a very bad name.

  3. Insightful comments about the CRAs and I thank you for that; I’ll definitely be more dubious over future credit check results, no matter the outcome.

    I particularly agree with your comments about the Marissa Mayer quote.

    I personally like the quote as it goes some way to helping people realise the amount of personal information (or at least profile-able information) we give away to companies purely by purchasing goods.

    As you say, to take this statistic at face value and/or to give the research credence without doubt shows at least a “questionable grasp of statistical methods”. I hadn’t thought about it, but by repeating this on a show such as Charlie Rose she has seemingly done exactly that.

  4. […] (via Lone Gunman) […]