Friday, December 28, 2007

Nothing New Under the Sun, Part 2

Same game as last time—I reproduce a quotation that sounds like it was written today, with a little identifying information removed, and you guess when it was actually written.

The rate of increase of journal prices over the last 10 years has far outstripped overall inflation. The result is, for instance, that “a large academic library which subscribed to a broad range of journals and whose acquisitions budget has increased between [year] and [year 12 years later] at the general rate of inflation would be able to purchase only 62% of the journals it could have purchased 12 years earlier.”

Answer: the years are 1967 and 1979.

Source: Wood, D. N. (1983). Reprography and copyright with particular reference to inter-library lending activities—a view from the BLLD. Aslib Proceedings, 35, 457-467.
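
For a sense of scale, here is a back-of-the-envelope sketch of what the 62% figure implies. It assumes that journal prices rose at a constant compound rate over the 12 years, which the quotation does not actually say; the function name and its parameters are mine, not Wood's.

```python
def real_price_growth(purchasing_fraction: float, years: int) -> tuple[float, float]:
    """Return (total price factor relative to inflation, implied annual rate above inflation).

    Assumes the budget tracked general inflation exactly and that journal
    prices compounded at a constant yearly rate (an assumption for illustration).
    """
    # If the same real budget buys only 62% as many journals,
    # prices rose by a factor of 1 / 0.62 relative to inflation.
    total_factor = 1.0 / purchasing_fraction
    # Spread that factor evenly over the period as a compound annual rate.
    annual_rate = total_factor ** (1.0 / years) - 1.0
    return total_factor, annual_rate


total_factor, annual_rate = real_price_growth(0.62, 12)
print(f"Journal prices rose {total_factor:.2f}x relative to general inflation")
print(f"That works out to about {annual_rate:.1%} per year above inflation")
```

Under those assumptions, journal prices outran inflation by a factor of roughly 1.6 over the period, or about 4% a year.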

Monday, December 24, 2007

Privacy vs. Data-Driven Tools

There was an interesting post at ACRLog a few days ago about online privacy and libraries.

Librarians' attitudes towards privacy are something I've been thinking about quite a bit lately, because they have a huge bearing on the potential success of a project I'm working on. A friend of mine and I decided to enter the Netflix Prize competition, which involves creating an algorithm that predicts which movies a person will like, based on their ratings of previous movies, and that does so better than the algorithm Netflix currently uses to recommend movies to its users. Since we started working on this, I've been considering in the back of my mind how to apply what we're learning to building a similar recommender tool for libraries. The problem with this, from a librarian's perspective, is that it takes unholy amounts of more-or-less personally identifiable data to make something like this work. You can set it up to be opt-in on a user-by-user or book-by-book basis, so that every user has consented to share the information linking them to each specific book, but at the end of the day you're still going to have a huge database linking specific books with specific people. Are librarians on the whole ever going to be okay with something like that, even if it allows them to provide a valuable service to their patrons (and I think a lot of the voracious-reader-type library patrons would really appreciate a service that found them good new authors to read)? Or are they going to stand on their privacy principles and lose some of their core customers to some sort of Bookswim/Netflix-esque company that offers these types of services?
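
To make the privacy trade-off concrete, here is a toy sketch of a ratings-based recommender. It is not Netflix's algorithm and not our contest entry, just a minimal nearest-neighbour illustration; the patron IDs, book titles, ratings, and function names are all invented. Note that the `ratings` dictionary is exactly the kind of person-to-book database described above, and the tool cannot work without it.

```python
# Hypothetical opt-in data: patron ID -> {book title: rating on a 1-5 scale}.
ratings = {
    "patron_a": {"Book 1": 5, "Book 2": 4, "Book 3": 1},
    "patron_b": {"Book 1": 4, "Book 2": 5, "Book 4": 5},
    "patron_c": {"Book 3": 5, "Book 4": 1, "Book 5": 4},
}


def similarity(a: dict, b: dict) -> float:
    """Score two patrons by how closely they agree on books both have rated.

    Uses 1 / (1 + mean squared rating difference); returns 0.0 when the
    two patrons have no rated books in common.
    """
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    msd = sum((a[t] - b[t]) ** 2 for t in shared) / len(shared)
    return 1.0 / (1.0 + msd)


def recommend(patron: str, data: dict) -> list:
    """Rank books the patron has not rated by similarity-weighted average rating."""
    mine = data[patron]
    totals, weights = {}, {}
    for other, theirs in data.items():
        if other == patron:
            continue
        sim = similarity(mine, theirs)
        if sim == 0.0:
            continue  # nothing in common, so this patron tells us nothing
        for title, rating in theirs.items():
            if title not in mine:
                totals[title] = totals.get(title, 0.0) + sim * rating
                weights[title] = weights.get(title, 0.0) + sim
    scored = {title: totals[title] / weights[title] for title in totals}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


recs = recommend("patron_a", ratings)
print(recs)
```

With this made-up data, "Book 4" ranks first for `patron_a`, because the patron whose tastes most resemble theirs loved it. Every step of that inference depends on knowing who rated what.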

Link Roundup

Cataloging and Classification Quarterly has a whole issue devoted to the Semantic Web. I haven't read any of it yet, but it's been added to my "to read" list.

An interview with Peter Norvig, director of research at Google. (Hat tip: Slashdot.)

How to turn off various search filters in Google.

More about the Amazon Kindle.

Friday, December 21, 2007

There Is Nothing New Under the Sun

I'm a big fan of the study of history. I firmly believe that you can't really know where you are if you don't know how you got there, whether the “you” is an institution, a profession, a country, etc. But the more history (broadly conceived) I study, the more I come to realize that it's true that there is nothing new under the sun.

Today's illustration of that fact comes courtesy of the Authors League of America, which is the source of the following passage. I have changed the words in brackets; your challenge is to guess what the quote was originally about and the year in which it was published. (Answer below.)

[Scanning] is not an extension of the normal library function. It is obviously a publishing function; it is publishing. Libraries managed to operate for many hundreds of years before the discovery of the camera; there is no reason why they cannot continue to function without [scanning]—at least of copyrighted books. Nor is there any reason why any reader who does not wish to buy a book (or cannot find a copy available) should not do what readers have done throughout library history, go to the library and read the book there.

Got your guesses? Scroll down for the answer.

The quote is from 1963 (as quoted on page 72 of this book) and both of the replaced words are "photocopying."

Tuesday, December 18, 2007

Google and the Knols

Ever since I first heard about Google's new “knol” project, I've been going round and round in my head about whether it is brilliant or utterly daft. (Background here and here and here.) I think I've settled on “not brilliant, but not entirely daft either.”

The key is something that Google isn't sharing—exactly how much the authors of knols can expect to make from the ads on their pages. Not knowing that makes it hard to evaluate whether Google is going to attract decent writers. But the ads really wouldn't have to pay all that well to be competitive—this sort of writing is not lucrative anywhere, and yet somehow people are still lining up to be writers. Plus, writing knols has the added benefit of getting your name out there and allowing you to build a reputation. So yeah, if I were a couple of years younger and still trying to get by as a full-time freelancer, I'd be awfully tempted to write a few of these and see if they wound up paying roughly as well as the work I've been doing. Heck, I'm half-tempted to throw up a few of them now (well, sometime in the spring when I have free time again), just to see what happens.

Monday, December 17, 2007

The Future of Bibliographic Control

I'm finally getting around to reading the Library of Congress Working Group on the Future of Bibliographic Control's report, and it's really much more interesting than I was expecting. I am very much not a cataloger—one of my goals in getting into librarianship was getting away from a job that entailed sitting in front of a computer by myself doing repetitive work all day, and the arcane little details of MARC and AACR2 make my head spin—but I am invested in the idea of getting bibliographic data into a format that can be set free on the Web and used by Web services much more easily than MARC data. Hence my desire to dance when I got to Section 3, "Position Our Technology for the Future," which is all about moving away from the MARC-centric world towards something Webbier. The section on standards for doing this is exceptionally vague, unfortunately, but it's still nice to see that high-level people with power to make changes understand that this is the way that libraries need to be going with their data.

And then Section 4.1 just about knocked me off my chair. The working group is recommending not just opening up the library data to let people use it in Web applications outside the catalog, but actually letting user-generated information into the catalog! I've been on record as being in favor of this before, but I understand the controversy behind that position and I was surprised to see a blue-ribbon group like this recommend it so straightforwardly and universally. (Even I tend to say that this is likely to work better in some types of libraries than others!)

Saturday, December 15, 2007

And This Is Supposed to Be a GOOD Thing?

Tim O'Reilly has an interesting editorial in the New York Times today about the benefits to both users and corporations of opening up cellphone networks. For 75% of the editorial I was completely with him . . . and then I got to the last few paragraphs:

Imagine, for a moment, that Verizon were to think like Google or Amazon. It could give you access to your entire call history, every phone call you have sent or received, not just your last 10 phone calls. It might build an address book for you based on everyone you had ever talked to, with top results for the numbers you call most often.

And what if this phone company opened up its databases to developers of software applications? We could soon see mash-ups of your call history with the address books from your personal computer, your telephone and your social network. Now imagine a user community turned loose to annotate that data.

...Who would switch carriers when so much knowledge about your social network resided on your phone company’s servers? [emphasis mine]

Now, I am less fanatical about privacy than many librarians are. I think that people are perfectly capable of making rational decisions to give up some of their privacy to get certain benefits. I have a listed number in the phone book, profiles on LinkedIn and Facebook, and I blog under my real name because I think that the personal and professional benefits of people knowing who I am and how to find me outweigh the risks. I am not overwhelmingly concerned about Amazon tracking my information there and using it to recommend books to me (although the fact that I use Amazon to research books that I'm interested in only because some editor is currently paying me to care about the subject tends to make a hash out of its recommendations for me, but that's another story). But something about this proposal creeps me out for reasons that I can't entirely pin down.

I think that my biggest problem with this vision is that it would be unavoidable. When people put personal information about themselves online in places like Facebook, they are in control of what they're revealing. Even with Amazon and other online tracking and information-gathering, you are still in control—if you really don't want there to be a record of you having purchased a particular book, you're still free to go to a bookstore in person and pay for it in cash (or get it at the library!), and there are ways to anonymize or at least mask your Web searches if you really want to. But I don't see a realistic way to opt out of O'Reilly's vision—even if you decided not to participate yourself, if your friends and colleagues call you, keep you in their address books, and annotate their address book listings for you, there will still be a whole lot of information about you and your place in the social network out there.

Friday, December 14, 2007

Link Roundup

Isabelle Fetherston says something that I've been saying for a while—e-books are great for older adults, because they can adjust the font size upwards as far as they need to. (Hat tip: Michael Habib.)

Virginia Postrel, a great blogger on issues of design and style, rounds up and critiques some of the commentary on Amazon's Kindle.

An interesting project to watch out of George Mason University's Center for History and New Media. I haven't read anything about this other than the linked article at Inside Higher Ed, so I'm not sure how fully baked this idea is, but it's intriguing at the least.

The Economist talks about “citizen science”—distributing scientific scut-work that involves the kinds of visual processing that computers can't yet do well out to volunteers. (Hat tip: Slashdot.) And somehow neither the Economist nor Slashdot mentioned Amazon's Mechanical Turk.

More people are getting behind the idea that bibliographic records need to be more freely available.

Wednesday, December 12, 2007

The Coolest Use I've Seen of XML in a While

This is absolutely, utterly brilliant—an XML schema for knitting patterns. (Hat tip: Kat with a K.) Seriously, if you don't knit/crochet/etc. you probably can't even begin to understand the brilliance of the idea of expressing patterns in a standardized, machine-readable markup language—although the linked page does a nice job of explaining it—but trust me, it is brilliant. I'm so tempted to create a schema along these lines for crocheting patterns now, since I can't knit to save my life but can crochet a mean pair of gloves....
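
For the non-crafters, here is a rough sketch of why machine-readable patterns are so appealing. The element and attribute names below are made up for illustration and are not taken from the linked schema; the point is only that once a pattern is structured data, a few lines of code can check it or total it up.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these element and attribute names are invented for
# this example and do NOT come from the linked knitting schema.
PATTERN_XML = """
<pattern name="Sample swatch">
  <row number="1">
    <stitch type="chain" count="12"/>
  </row>
  <row number="2">
    <stitch type="single-crochet" count="10"/>
    <stitch type="chain" count="1"/>
  </row>
</pattern>
"""


def stitches_per_row(xml_text: str) -> dict:
    """Map each row number to the total number of stitches worked in that row."""
    root = ET.fromstring(xml_text)
    return {
        int(row.get("number")): sum(
            int(stitch.get("count")) for stitch in row.findall("stitch")
        )
        for row in root.findall("row")
    }


counts = stitches_per_row(PATTERN_XML)
print(counts)
```

The same structure could just as easily drive a row counter, a yardage estimate, or a converter between notation styles, none of which is possible with a pattern that exists only as a paragraph of abbreviations.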

Sunday, December 9, 2007

Something to Keep an Eye On

I doubt that this new law, which would allow the government to seize computers that are used to facilitate intellectual property theft, is going to pass in its present form, or be upheld as constitutional if it is . . . but I didn't believe that the courts were going to uphold the laws that allow the government to seize property allegedly used in the drug trade without anyone actually being convicted of anything, either. But things could get very, very interesting for libraries that have public access computers if it does pass. This is definitely something that's worth paying attention to.

Another Nifty Tool Built on Amazon's Open Information

Don't you hate it when you're $1 or so short of having a big enough order on Amazon to qualify for free shipping? You feel like you ought to find something cheap to buy to fill out the order, but then you have to wander all over the site looking for something.... Well, now somebody has built a site that will do the searching for you.

Hat tip: Marginal Revolution.

By the way, I'm so happy to see that OCLC is moving towards making the WorldCat data ever more open. I can't wait to see spiffy tools like this being built for library users and not just book-buyers!

Other Industries Confront the Web

I don't generally blog about the problems that the music and film/TV industries face in digitizing their content and making it available online, because 1) I just don't care about the music industry at all and 2) I don't have a particularly good grasp of the film/TV business model. But I've come upon a few interesting articles lately on the topic, so I'm posting them here as a link round-up.

The New York Times has a nice article on Radiohead's name-your-price album and related changes in the music industry.

Nikki Finke's Deadline Hollywood Daily has become my go-to source for news about the WGA strike.

And, one with direct library relevance: You can now do keyword searches of [some] online videos.

Friday, December 7, 2007

Interesting story in Slate, titled “A Librarian's Worst Nightmare,” about Yahoo! Answers. The story is interesting because it doesn't wail that Yahoo! Answers users are idiots who don't know that they're getting unreliable information; instead, it looks at what value users get out of a service like Yahoo! Answers.

Monday, December 3, 2007

Some Thoughts on DRM

As I've been reading more and more people's reactions to the Kindle and its DRM issues, something occurred to me: why are publishers so concerned about wrapping up e-books in DRM, when they're willing to let e-journals and e-magazines float around online with no DRM at all?

This occurred to me because I know that, were I to get an e-book reader, I would very rarely pay for content for it—partially because I'm a Pennsylvania Dutch cheapskate, and partially because I can get digital versions of most content I'm interested in for free. Why? Because most of what I read is magazines/journals rather than books.

(Digression: Do not even start with me, a la the periodic NEA reports on the decline of reading for pleasure, on how reading magazines and/or online content isn't "real" reading and only books count. And really REALLY don't start with me on the idea that nonfiction books don't count either and only fiction reading is "real" reading. 1) In terms of any benefit of reading you could possibly name, I'll put one of the Atlantic's 15,000-word essays up against any of the formulaic crime/romance/etc. novels that most people read any day. 2) I do read nonfiction books when I have the time to indulge in them, and normally my #1 complaint about them is that they take what could have been a nice lean 15,000-word essay—in fact, these days, what often started off as a nice lean 15,000-word essay—and expand it with a lot of filler that doesn't really enhance their point. End digression.)

So, why is so much magazine content so freely available? I'm not just talking about the content that is online free and ad-supported (although there really is a surprising amount of that going on amongst the major magazines), but also the magazines that aren't free online but that are aggregated in half-a-dozen different databases with no significant embargo period and no DRM and that most people can get free access to online without trying too hard. I mean, just using InfoTrac's General OneFile—which absolutely everyone in Michigan has free access to through the State Library—I could load up an e-book reader with PDFs of the latest issues of Atlantic Monthly, The New Yorker and The Economist and easily keep myself amused for two plane flights and a weeklong vacation, say. And when interesting articles caught my eye I could e-mail them to friends who I thought might be interested in them, print them out and mark them up for typesetting if I thought I could use them in a Greenhaven anthology . . . all the things that the book publishers would be dead set against me doing with any of their content. So why are the magazine/journal publishers so much less concerned about this than the book publishers are?

I don't think it has to do with popularity—The New Yorker, for example, has a circulation of over a million, which handily beats all but the most popular best-selling books. But I really don't know what it is, either. Theories?