Monday, December 24, 2007

Privacy vs. Data-Driven Tools

There was an interesting post at ACRLog a few days ago about online privacy and libraries.

Librarians' attitudes toward privacy are something I've been thinking about quite a bit lately, because they have a huge bearing on the potential success of a project I'm working on. A friend and I decided to enter the Netflix Prize competition, which involves creating an algorithm that predicts which movies a person will like, based on their ratings of previous movies, and that does so better than the algorithm Netflix currently uses to recommend movies to its users. Since we started working on this, I've been considering, in the back of my mind, how to apply what we're learning to building a similar recommender tool for libraries.

The problem with this, from a librarian's perspective, is that it takes unholy amounts of more-or-less personally identifiable data to make something like this work. You can set it up to be opt-in on a user-by-user or book-by-book basis, so that every user has consented to share the information linking them to each specific book, but at the end of the day you're still going to have a huge database linking specific books with specific people.

Are librarians on the whole ever going to be okay with something like that, even if it allows them to provide a valuable service to their patrons? I think a lot of the voracious-reader-type library patrons would really appreciate a service that found them good new authors to read. Or are they going to stand on their privacy principles and lose some of their core customers to some sort of Bookswim/Netflix-esque company that offers these types of services?
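To make the data requirement concrete, here is a minimal sketch of what an opt-in library recommender might look like. It is not the Netflix Prize approach or anything my friend and I have built; it is a simple "patrons who borrowed this also borrowed that" counter, and the loan records, patron IDs, and function names are all made up for illustration. Note that even with opt-in honored on a book-by-book basis, the `shelves` mapping is exactly the patron-to-book database described above.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical loan records: (patron_id, book_title, opted_in).
# opted_in is per loan, so a patron can share some books and not others.
LOANS = [
    ("p1", "Dune", True),
    ("p1", "Hyperion", True),
    ("p1", "Emma", False),  # p1 chose not to share this loan
    ("p2", "Dune", True),
    ("p2", "Hyperion", True),
    ("p2", "Neuromancer", True),
    ("p3", "Dune", True),
    ("p3", "Neuromancer", True),
    ("p4", "Emma", True),
    ("p4", "Persuasion", True),
]


def build_cooccurrence(loans):
    """Return (shelves, counts) built only from loans the patron opted in."""
    shelves = defaultdict(set)  # patron -> set of shared books
    for patron, book, opted_in in loans:
        if opted_in:
            shelves[patron].add(book)

    # counts[a][b] = number of patrons who shared both a and b
    counts = defaultdict(lambda: defaultdict(int))
    for books in shelves.values():
        for a, b in combinations(sorted(books), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return shelves, counts


def recommend(patron, shelves, counts, top_n=3):
    """Rank unseen books by how often they co-occur with the patron's books."""
    seen = shelves.get(patron, set())
    scores = defaultdict(int)
    for book in seen:
        for other, n in counts[book].items():
            if other not in seen:
                scores[other] += n
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


shelves, counts = build_cooccurrence(LOANS)
print(recommend("p1", shelves, counts))
```

Because p1 did not opt in the loan of "Emma", that loan never enters the tables, so it cannot pull in "Persuasion" as a suggestion. The trade-off is visible even in a toy like this: every loan a patron withholds protects their privacy and also makes their recommendations a little worse.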
