Friday, December 28, 2007

Nothing New Under the Sun, Part 2

Same game as last time—I reproduce a quotation that sounds like it was written today, with a little identifying information removed, and you guess when it was actually written.

The rate of increase of journal prices over the last 10 years has far outstripped overall inflation. The result is, for instance, that “a large academic library which subscribed to a broad range of journals and whose acquisitions budget has increased between [year] and [year 12 years later] at the general rate of inflation would be able to purchase only 62% of the journals it could have purchased 12 years earlier.”




Answer: the years are 1967 and 1979.

Source: Wood, D. N. (1983). Reprography and copyright with particular reference to inter-library lending activities—a view from the BLLD. Aslib Proceedings, 35, 457-467.
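
A quick back-of-the-envelope check on what that 62% figure implies, as a sketch only. If the budget exactly kept pace with inflation and now buys 62% of what it used to, then journal prices must have grown by a factor of 1/0.62 relative to general prices. The function name is made up, and the constant annual rate is a simplifying assumption that the source does not state.

```python
def real_price_growth(purchasing_share, years):
    """Return (total, annual) price growth relative to general inflation.

    Assumes the budget tracked inflation exactly and that prices rose at a
    constant annual rate (a simplification for illustration).
    """
    total = 1.0 / purchasing_share      # e.g. 1 / 0.62
    annual = total ** (1.0 / years)     # constant yearly factor that compounds to total
    return total, annual


total, annual = real_price_growth(0.62, 12)
print(f"Prices rose {total:.2f}x relative to inflation")
print(f"That is about {(annual - 1) * 100:.1f}% per year above inflation")
```

Under those assumptions, journal prices rose roughly 1.61 times faster than inflation over the 12 years, or about 4.1% a year on top of it.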

Monday, December 24, 2007

Privacy vs. Data-Driven Tools

There was an interesting post at ACRLog a few days ago about online privacy and libraries.

Librarians' attitudes towards privacy are something I've been thinking about quite a bit lately, because they have a huge bearing on the potential success of a project I'm working on. A friend of mine and I decided to enter the Netflix Prize competition, which involves creating an algorithm that predicts what movies a person will like, based on their ratings of previous movies, that works better than the algorithm Netflix is currently using to recommend movies to its users. Since we've started working on this, I've been considering in the back of my mind how to apply what we're learning to building a similar recommender tool for libraries. The problem with this, from a librarian's perspective, is that it takes unholy amounts of more-or-less personally identifiable data to make something like this work. You can set it up to be opt-in on a user-by-user or book-by-book basis, so that every single user has consented to make available the information linking them to every single specific book to which they are linked, but still, at the end of the day you're going to have a huge database linking specific books with specific people. Are librarians on the whole ever going to be okay with something like that, even if it allows them to provide a valuable service to their patrons (and I think a lot of the voracious-reader-type library patrons would really appreciate a service that found them good new authors to read), or are they going to stand on their privacy principles and lose some of their core customers to some sort of Bookswim/Netflix-esque company that offers these types of services?
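
To make the data problem concrete, here is a minimal sketch of one common approach, user-based collaborative filtering. It is not the Netflix Prize method or anything we have actually built; the patron IDs, titles, ratings, and function names are all invented for illustration.

```python
from math import sqrt

# Hypothetical opt-in data: patron ID -> {book title: rating from 1 to 5}.
ratings = {
    "patron_a": {"Dune": 5, "Emma": 2, "Neuromancer": 4},
    "patron_b": {"Dune": 4, "Neuromancer": 5, "Snow Crash": 5},
    "patron_c": {"Emma": 5, "Persuasion": 4, "Dune": 1},
}


def cosine(u, v):
    """Cosine similarity computed over the books both patrons rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[b] * v[b] for b in shared)
    norm_u = sqrt(sum(u[b] ** 2 for b in shared))
    norm_v = sqrt(sum(v[b] ** 2 for b in shared))
    return dot / (norm_u * norm_v)


def recommend(patron, ratings, top_n=3):
    """Rank unread books by the similarity-weighted average rating of other patrons."""
    mine = ratings[patron]
    weighted, weights = {}, {}
    for other, theirs in ratings.items():
        if other == patron:
            continue
        sim = cosine(mine, theirs)
        if sim <= 0:
            continue
        for book, score in theirs.items():
            if book not in mine:
                weighted[book] = weighted.get(book, 0.0) + sim * score
                weights[book] = weights.get(book, 0.0) + sim
    predicted = {b: weighted[b] / weights[b] for b in weighted}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


print(recommend("patron_a", ratings))
```

With this toy data, "Snow Crash" ranks ahead of "Persuasion" for patron_a. Note that the `ratings` dictionary is exactly the thing the privacy worry is about: the technique cannot work without a table linking specific patrons to specific books.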

Link Roundup

Cataloging and Classification Quarterly has a whole issue devoted to the Semantic Web. I haven't read any of it yet, but it's been added to my "to read" list.

An interview with Peter Norvig, director of research at Google. (Hat tip: Slashdot.)

How to turn off various search filters in Google.

More about the Amazon Kindle.

Friday, December 21, 2007

There Is Nothing New Under the Sun

I'm a big fan of the study of history. I firmly believe that you can't really know where you are if you don't know how you got there, whether the “you” is an institution, a profession, a country, etc. But the more history (broadly conceived) I study, the more I come to realize that it's true that there is nothing new under the sun.

Today's illustration of that fact comes courtesy of the Authors League of America, which is the source of the following passage. I have changed the words in brackets; your challenge is to guess what the quote was originally about and the year in which it was published. (Answer below.)

[Scanning] is not an extension of the normal library function. It is obviously a publishing function; it is publishing. Libraries managed to operate for many hundreds of years before the discovery of the camera; there is no reason why they cannot continue to function without [scanning]—at least of copyrighted books. Nor is there any reason why any reader who does not wish to buy a book (or cannot find a copy available) should not do what readers have done throughout library history, go to the library and read the book there.


Got your guesses? Scroll down for the answer.


The quote is from 1963 (as quoted on page 72 of this book) and both of the replaced words are "photocopying."

Tuesday, December 18, 2007

Google and the Knols

I've been going round and round in my head about whether Google's new “knol” project is brilliant or utterly daft since I first heard about it. (Background here and here and here.) I think I've settled on “not brilliant, but not entirely daft either.”

The key is something that Google isn't sharing: exactly how much the authors of knols can expect to make from the ads on their pages. Without that number, it's hard to evaluate whether Google is going to attract decent writers or not. But the ads really wouldn't have to pay all that well to be competitive. This sort of writing is not lucrative anywhere, and yet somehow people are still lining up to be writers. Plus, writing knols has the added benefit of getting your name out there and allowing you to build a reputation. So yeah, if I were a couple of years younger and still trying to get by as a full-time freelancer, I'd be awfully tempted to write a few of these and see if they wound up paying roughly as well as the work I've been doing. Heck, I'm half-tempted to throw up a few of them now (well, sometime in the spring when I have free time again), just to see what happens.

Monday, December 17, 2007

The Future of Bibliographic Control

I'm finally getting around to reading the Library of Congress Working Group on the Future of Bibliographic Control's report, and it's really much more interesting than I was expecting. I am very much not a cataloger—one of my goals in getting into librarianship was getting away from a job that entailed sitting in front of a computer by myself doing repetitive work all day, and the arcane little details of MARC and AACR2 make my head spin—but I am invested in the idea of getting bibliographic data into a format that can be set free on the Web and used by Web services much more easily than MARC data. Hence my desire to dance when I got to Section 3, "Position Our Technology for the Future," which is all about moving away from the MARC-centric world towards something Webbier. The section on standards for doing this is exceptionally vague, unfortunately, but it's still nice to see that high-level people with power to make changes understand that this is the way that libraries need to be going with their data.

And then Section 4.1 just about knocked me off my chair. The working group is recommending not just opening up the library data to let people use it in Web applications outside the catalog, but actually letting user-generated information into the catalog! I've been on record as being in favor of this before, but I understand the controversy behind that position and I was surprised to see a blue-ribbon group like this recommend it so straightforwardly and universally. (Even I tend to say that this is likely to work better in some types of libraries than others!)

Saturday, December 15, 2007

And This Is Supposed to Be a GOOD Thing?

Tim O'Reilly has an interesting editorial in the New York Times today about the benefits to both users and corporations of opening up cellphone networks. For 75% of the editorial I was completely with him . . . and then I got to the last few paragraphs:

Imagine, for a moment, that Verizon were to think like Google or Amazon. It could give you access to your entire call history, every phone call you have sent or received, not just your last 10 phone calls. It might build an address book for you based on everyone you had ever talked to, with top results for the numbers you call most often.

And what if this phone company opened up its databases to developers of software applications? We could soon see mash-ups of your call history with the address books from your personal computer, your telephone and your social network. Now imagine a user community turned loose to annotate that data.

...Who would switch carriers when so much knowledge about your social network resided on your phone company’s servers? [emphasis mine]


Now, I am less fanatical about privacy than many librarians are. I think that people are perfectly capable of making rational decisions to give up some of their privacy to get certain benefits. I have a listed number in the phone book, profiles on LinkedIn and Facebook, and I blog under my real name because I think that the personal and professional benefits of people knowing who I am and how to find me outweigh the risks. I am not overwhelmingly concerned about Amazon tracking my information there and using it to recommend books to me (although the fact that I use Amazon to research books that I'm interested in only because some editor is currently paying me to care about the subject tends to make a hash out of its recommendations for me, but that's another story). But something about this proposal creeps me out for reasons that I can't entirely pin down.

I think that my biggest problem with this vision is that it would be unavoidable. When people put personal information about themselves online in places like Facebook, they are in control of what they're revealing. Even with Amazon and other online tracking and information-gathering, you are still in control—if you really don't want there to be a record of you having purchased a particular book, you're still free to go to a bookstore in person and pay for it in cash (or get it at the library!), and there are ways to anonymize or at least mask your Web searches if you really want to. But I don't see a realistic way to opt out of O'Reilly's vision—even if you decided not to participate yourself, if your friends and colleagues call you, keep you in their address books, and annotate their address book listings for you, there will still be a whole lot of information about you and your place in the social network out there.

Friday, December 14, 2007

Link Roundup

Isabelle Fetherston says something that I've been saying for a while: e-books are great for older adults, because they can adjust the font size upwards as far as they need to. (Hat tip: Michael Habib)

Virginia Postrel, a great blogger on issues of design and style, rounds up and critiques some of the commentary on Amazon's Kindle.

An interesting project to watch out of George Mason University's Center for History and New Media. I haven't read anything about this other than the linked article at Inside Higher Ed, so I'm not sure how fully-baked this idea is, but it's intriguing at the least.

The Economist talks about “citizen science”—distributing scientific scut-work that involves the kinds of visual processing that computers can't yet do well out to volunteers. (Hat tip: Slashdot.) And somehow neither the Economist nor Slashdot mentioned Amazon's Mechanical Turk.

More people are getting behind the idea that bibliographic records need to be more freely available.

Wednesday, December 12, 2007

The Coolest Use I've Seen of XML in a While

This is absolutely, utterly brilliant—an XML schema for knitting patterns. (Hat tip: Kat with a K.) Seriously, if you don't knit/crochet/etc. you probably can't even begin to understand the brilliance of the idea of expressing patterns in a standardized, machine-readable markup language—although the linked page does a nice job of explaining it—but trust me, it is brilliant. I'm so tempted to create a schema along these lines for crocheting patterns now, since I can't knit to save my life but can crochet a mean pair of gloves....

Sunday, December 9, 2007

Something to Keep an Eye On

I doubt that this new law, which would allow the government to seize computers that are used to facilitate intellectual property theft, is going to pass in its present form, or be upheld as constitutional if it is . . . but I didn't believe that the courts were going to uphold the laws that allow the government to seize property allegedly used in the drug trade without anyone actually being convicted of anything, either. But things could get very, very interesting for libraries that have public access computers if it does pass. This is definitely something that's worth paying attention to.

Another Nifty Tool Built on Amazon's Open Information

Don't you hate it when you're $1 or so short of having a big enough order on Amazon to qualify for free shipping? You feel like you ought to find something cheap to buy to fill out the order, but then you have to wander all over the site looking for something.... Well, now somebody has built a site that will do the searching for you.

Hat tip: Marginal Revolution.

By the way, I'm so happy to see that OCLC is moving towards making the WorldCat data ever more open. I can't wait to see spiffy tools like this being built for library users and not just book-buyers!

Other Industries Confront the Web

I don't generally blog about the problems that the music and film/TV industries face in digitizing their content and making it available online, because 1) I just don't care about the music industry at all and 2) I don't have a particularly good grasp of the film/TV business model. But I've come upon a few interesting articles lately on the topic, so I'm posting them here as a link round-up.

The New York Times has a nice article on Radiohead's name-your-price album and related changes in the music industry.

Nikki Finke's Deadline Hollywood Daily has become my go-to source for news about the WGA strike.

And, one with direct library relevance: You can now do keyword searches of [some] online videos.

Friday, December 7, 2007

Interesting story...

...in Slate, titled “A Librarian's Worst Nightmare”, about Yahoo! Answers. The story is interesting because it doesn't wail that Yahoo! Answers users are idiots who don't know that they're getting unreliable information; instead, it looks at what value users get out of a service like Yahoo! Answers.

Monday, December 3, 2007

Some Thoughts on DRM

As I've been reading more and more people's reactions to the Kindle and its DRM issues, something occurred to me: why are publishers so concerned about wrapping up e-books in DRM, when they're willing to let e-journals and e-magazines float around online with no DRM at all?

This occurred to me because I know that, were I to get an e-book reader, I would very rarely pay for content for it—partially because I'm a Pennsylvania Dutch cheapskate, and partially because I can get digital versions of most content I'm interested in for free. Why? Because most of what I read is magazines/journals rather than books.

(Digression: Do not even start with me, a la the periodic NEA reports on the decline of reading for pleasure, on how reading magazines and/or online content isn't "real" reading and only books count. And really REALLY don't start with me on the idea that nonfiction books don't count either and only fiction reading is "real" reading. 1) In terms of any benefit of reading you could possibly name, I'll put one of the Atlantic's 15,000-word essays up against any of the formulaic crime/romance/etc. novels that most people read any day. 2) I do read nonfiction books when I have the time to indulge in them, and normally my #1 complaint about them is that they take what could have been a nice lean 15,000-word essay (in fact, these days, what often started off as a nice lean 15,000-word essay) and expand it with a lot of filler that doesn't really enhance the point. End digression.)

So, why is so much magazine content so freely available? I'm not just talking about the content that is online free and ad-supported (although there really is a surprising amount of that going on amongst the major magazines), but also the magazines that aren't free online but that are aggregated in half-a-dozen different databases with no significant embargo period and no DRM and that most people can get free access to online without trying too hard. I mean, just using InfoTrac's General OneFile—which absolutely everyone in Michigan has free access to through the State Library—I could load up an e-book reader with PDFs of the latest issues of Atlantic Monthly, The New Yorker and The Economist and easily keep myself amused for two plane flights and a weeklong vacation, say. And when interesting articles caught my eye I could e-mail them to friends who I thought might be interested in them, print them out and mark them up for typesetting if I thought I could use them in a Greenhaven anthology . . . all the things that the book publishers would be dead set against me doing with any of their content. So why are the magazine/journal publishers so much less concerned about this than the book publishers are?

I don't think it has to do with popularity—The New Yorker, for example, has a circulation of over a million, which handily beats all but the most popular best-selling books. But I really don't know what it is, either. Theories?

Tuesday, November 27, 2007

Bleg

Has anyone seen a comparison of e-books to paper books from an ecological perspective? Pollution, energy use, whatever. My gut feeling is that e-books have to be more environmentally friendly, if for no other reason than transporting the paper from forest to paper plant to printing press to warehouse to bookstore/library/home/wherever has to use a whole lot of fossil fuels, but I'd be curious to see an actual analysis of this.

Tuesday, November 20, 2007

So Close, and Yet So Far Away

As you've probably already heard, unless this is the first blog you've read today, Amazon has released its new e-book reader, called Kindle. And it's very, very nice. But still not quite there....


The good:



  • E-ink.

  • Built-in "free" EVDO.

  • The battery life rocks: it goes up to a week without recharges.

  • It handles newspapers, magazines, and blogs as well as books.

  • It has a keyboard and the ability to write margin notes, bookmark pages, etc., etc.

  • Adjustable text size.


The bad:



  • The EVDO is limited. It's free to browse Amazon's Kindle e-book store and to have books, magazines, and blogs delivered to your Kindle, but you can't just browse the Internet (except, bizarrely, Wikipedia) on it, and you have to subscribe to and pay for blogs.

  • The newspaper subscription prices are a little ridiculous, considering that you can get the entire content of most (and once the Wall Street Journal goes free, I think all) of these newspapers online for free.

  • It supports Word documents and pictures, but not PDFs. (Although, according to the comments on Amazon, you can get around this by converting PDFs to Mobi files.)

  • $400. Considering that I have my eye on an entire laptop that costs $400, boots in 15-20 seconds, and does a whole lot more, I can't see spending $400 on something just to read e-books. My goal in buying new gadgets nowadays is to cut down on the number of things I'm carrying with me, not add a new one that won't replace any of the old ones to the pile.



If it were less than $400 I might still be tempted. I really would like an e-ink device with long battery life for reading PDFs on the go. Even once I get a job and can justify buying one of those Asus Eees, it still only has about 3 hours of battery life. (Although I will be amazed if Asus doesn't eventually put an e-ink display in one of the Eee models, which should help on that front quite a bit.) But it's going to have to be a lot less than $400. Like, $50 if I can only use it to read e-books. Maybe up to $200 if I could use it to browse the Web or if it could replace my PDA and/or laptop in some circumstances.



Of course, if the Kindle bombs maybe I'll be able to pick one up on Ebay in a couple of months for $50.... Hmm.... Should I root against the Kindle so I can get one cheap, or should I root for it so e-books will take off and there will be more competition between vendors to improve their e-book interfaces and offer more/better/cheaper e-books? Dilemmas, dilemmas.

Friday, November 16, 2007

An Advertising-Related Link Roundup

I've had a post percolating in my head for a couple of weeks about the economics of content production in the digital world—basically, where is stuff going to come from, and who's going to pay for it, and what's the future of ad revenue in that picture? There have been a lot of interesting posts and stories about that topic lately, and one of my freelance gigs is currently paying me to pay attention to the Hollywood writers' strike, which hinges on some of the same issues. But the post just isn't coming together in my head (probably because I haven't had more than 30 seconds in a row to sit and think about it, thanks to interning + job-hunting + trying to do enough paying work to cover the rent). So I'm giving up, admitting defeat, and just posting all of the links here for your pondering pleasure.

The Wall Street Journal, as has been expected for a while now, has finally decided to make its online edition free to users and to fund this via advertising revenue.

Marginal Revolution links to a study showing that relying on advertising revenue for funding actually seems to make a given media outlet less biased.

A British library has started putting advertising inserts into its books; the company that's handling the inserts is apparently paying them around 3 cents per insert.

And Meredith Farkas thinks about advertising in relation to ALA and its funding streams.

Tuesday, November 6, 2007

Your Educational Time-Waster of the Day

Somebody has built a shiny interactive interface for Princeton's WordNet ontology. It looks a lot like the AquaBrowser word cloud, but with a whole bunch more semantic richness. I'm tempted to start using this as a thesaurus instead of Thesaurus.com, but I suspect that I'd get way too distracted by all of the pretty colorful moving things and fail to return to whatever I was supposed to be writing.

Hat tip: Marginal Revolution.

Monday, November 5, 2007

Using Scanned Books to Develop Ontologies

This is a brilliant idea. Go read it. (Because, the previous post notwithstanding, I'm not going to violate copyright by cutting and pasting the whole thing in here, and it's too short to summarize or excerpt productively.)

Cory Doctorow Talks About Online Publishing

Cory Doctorow gives a wonderful interview. (Hat tip: Marginal Revolution.)

A few choice quotes (but you should really go read the whole thing):

"[I]t's just hubris that makes us think that this particular change—the computer change—is the one that's going to destroy publishing and that it must be prevented at all costs. We'll adapt."

"It's the 21st century, there's not going to be a year in which it's harder to copy than this year.... And so, if your business model and your aesthetic effect in your literature and your work is intended not to be copied, you're fundamentally not making art for the 21st century."

"If things that schoolchildren do in the course of being schoolchildren violate copyright, the problem is with copyright—not with the schoolchildren."

Friday, November 2, 2007

Wednesday, October 31, 2007

Semantic Web Technologies Making My Life Easier Soon!

I want this. NOW. (Yes, I registered for the beta, but the site was Slashdotted two days ago and I only registered today, so I'm guessing it's going to be a while until I get my invitation.)

More about Twine.

Tuesday, October 30, 2007

The Economics of Blogging, Continued

More and more people ponder the economics of blogging. Since they all know way more about economics than I do, I'm going to refrain from elaborating and just send you off to read them.

Friday, October 26, 2007

Link Roundup

Using captchas to proofread digitized texts. This is sheer brilliance. Everybody wins by harnessing the time that people are expending filling out captchas, which up until now had been largely wasted time, to do something productive that would be cost-prohibitive to do in any other way.
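
One way a scheme like this could work, as a toy sketch: pair a word whose answer is already known with a word the scanning software could not read, and only count someone's reading of the unknown word if they got the known one right. The class name, the vote threshold, and the word IDs are all made up here, and the real system is surely more involved.

```python
from collections import Counter, defaultdict


class ProofreadingCaptcha:
    """Toy model: a known control word vouches for a guess at an unknown word."""

    def __init__(self, votes_needed=3):
        self.votes_needed = votes_needed    # assumed threshold, for illustration
        self.votes = defaultdict(Counter)   # unknown word ID -> Counter of readings
        self.accepted = {}                  # unknown word ID -> agreed transcription

    def submit(self, control_answer, control_typed, unknown_id, unknown_typed):
        """Return True if the person passes; count their reading if they do."""
        if control_typed.strip().lower() != control_answer.lower():
            return False  # failed the known word, so ignore the other reading
        reading = unknown_typed.strip().lower()
        self.votes[unknown_id][reading] += 1
        if self.votes[unknown_id][reading] >= self.votes_needed:
            self.accepted[unknown_id] = reading
        return True


captcha = ProofreadingCaptcha(votes_needed=2)
captcha.submit("morning", "morning", "scan-17", "upon")
captcha.submit("morning", "mourning", "scan-17", "open")  # fails control, not counted
captcha.submit("overlooks", "Overlooks", "scan-17", "Upon")
print(captcha.accepted)  # {'scan-17': 'upon'}
```

The design choice that matters is the agreement threshold: no single person's reading is trusted, so a transcription is only accepted once several people who passed the control word independently typed the same thing.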

This article about re-imagining electronic publishing not just as recreating the words of the book on a screen, but also recreating the social, “coffeehouse” aspects of reading and discussing texts online, is pretty brilliant too. (Hat tip: ACRLog.) I cannot wait for publishers to take full advantage of the benefits of e-books, and even though this is a benefit that I hadn't really spent much time considering before I read this article, it might just become my favorite new feature of online books. And I bet it will work fairly well for scholarly books, where the limited number of people reading any given book (and the fact that most of those people will have professional reputations to maintain) ought to keep the quality of the comments pretty high.

Apropos my Cult of the Amateur rant: Tyler Cowen talks about the economics of blogging, and a response by FP Passport.

The down side of open information: People may use freely available online information to do evil things.

Apparently Google Book Search is now making use of Library of Congress Subject Headings data for the books it has digitized. (Hat tip: Cataloging Futures.)

Somebody else is having fun with Amazon's data.

Wednesday, October 24, 2007

Slashdot to the Rescue

I knew before I read Cult of the Amateur that most cases of identity theft did not involve personal information stolen off of the Internet or other computer networks, but I didn't have the time to go dig up the actual number and argue with Keen on that point, too. (Especially since, like I said, I think that his arguments about identity theft on the Internet are irrelevant to his main argument about the Cult of the Amateur). But since Slashdot so nicely sent the actual figure right to my Google Reader this morning, I'm going to pass it on: this study shows that only 20 percent of identity thefts are conducted over the Internet.

Sunday, October 21, 2007

The Cult of the Amateur

Yes, I'm finally getting around to reviewing this book, two weeks after I finished it.

Remember how I said a couple of weeks ago that I had not been tempted to throw the book across the room while reading the first 20 pages? Well, the book-throwing impulse kicked in between pages 27 and 69, which I read while stuffed into a center airplane seat between two people who, I suspect, were not interested in hearing me rant about the book. So instead I sat there and scribbled angry things in a little notebook so I could blog them when I got back.

Surprisingly, the part of the book that aggravated me the most was that Keen just fundamentally does not understand economics. (The part that I had expected to find the most aggravating, Keen's insistence that people who are not trained journalists/columnists/critics cannot possibly produce good and useful work, ran a distant second.) Despite his protestations that he is not anti-progress, Keen is indeed a Luddite in the historical sense of that term: someone who recognizes that economic progress is about to make his job obsolete and who wishes to halt that particular aspect of economic progress. Keen recognizes, correctly, that mainstream media's old economic model isn't working in the Web 2.0 world, because the mainstream media has been undercut on price by bloggers who are willing to produce a comparable product (note that I said comparable, not equivalent) for less money (in this case, generally free). “[P]erhaps the biggest casualties of the Web 2.0 revolution are real businesses with real products, real employees, and real shareholders,” Keen writes on page 27. “Every defunct record label, or laid-off newspaper reporter, or bankrupt independent bookstore is a consequence of 'free' user-generated Internet content.” On page 62 he discusses a contest that Frito-Lay ran in which amateurs created commercials for Doritos. He calculates that this contest cost Frito-Lay $331,000 less than it would have cost them to pay for a professionally-created spot. “That's $331,000 that wasn't paid to professional filmmakers, scriptwriters, actors, and marketing companies—$331,000 sucked out of the economy.”

Except that that money wasn't sucked out of the economy. Frito-Lay did not take that money out back and burn it; they spent it on something else—something else that, in their professional judgment, was more valuable than a professionally-created advertisement. Listen carefully, because this is important: it is pretty much always and everywhere a good thing when people and corporations can meet the same needs while spending fewer resources, because that frees up resources that can be used to meet needs that weren't getting met before. One of my favorite econ blogs tipped me off to a great statistic today: There are more World of Warcraft players than farmers in the U.S. today. Something like 2% of the U.S. population today are farmers, compared to around 33% 100 years ago and 90% 200 years ago. Yes, it must have been scary to be a farmer during the early days of the Industrial Revolution, or to be an automotive assembly line worker or a journalist today—but the net result of jobs in these industries being destroyed isn't eternal suffering, it's freeing people up to do things that weren't being done before. If it still took 90% of the population just to grow the crops we needed to feed and clothe ourselves, there wouldn't be nearly enough people left over to produce all of the luxuries that we take for granted today. Anecdotally, living in Detroit, I've heard that a lot of the auto workers who are taking buyouts from the Big Three are going back to school to train for jobs in health care. Who would have wound up taking care of all of the aging baby boomers if progress hadn't made so many automotive jobs obsolete? I don't know what the journalists, columnists and critics who are being displaced by bloggers are going to wind up doing in the future, but I'm sure that somebody will come up with something more productive for them to do, and afterwards we'll wonder how we ever could have lived without having people doing that thing.

Look, nobody is holding any guns to anybody's heads anywhere in this process. I'm sure that the amateurs who submitted commercials to the Frito-Lay contest had a blast making them. People write blogs because they enjoy writing them; people read the blogs that they read because they enjoy reading them. If people want to pay Keen (or anybody else) to keep writing, nobody is saying that they can't. If newspapers can come up with a business model that allows them to bring in enough advertising revenue to continue publishing, more power to them. But if people feel that their information/entertainment needs are being met by free user-created content, then more power to them, too—I'm sure that they can find other things to spend their hard-earned money on that will bring them more utility than paying for content.

After the third chapter, the book began to suffer from the malady that is so common to popular non-fiction books: the author has enough material to write an excellent long-form essay (think a New York Times magazine cover story, or the extended pieces carried by Atlantic Monthly and the New Yorker), but prefers the larger paycheck that comes with writing a book and not a magazine article. So, they pad what would have been a very excellent essay (and as much as I'm shredding some of Keen's arguments here, he is an excellent writer with some good points and I do think that there's an excellent essay to be made out of the first three chapters) with a lot of not-entirely-relevant filling to make it book-length. Identity theft (Chapter 7) and online gambling (Chapter 6) are both very bad things that have been facilitated by the Internet, but their connection to the "cult of the amateur" that Keen is decrying is awfully tenuous. Similarly the problem of music piracy (Chapters 4 and 5). Keen's cult of the amateur theory—in a nutshell, that people are turning away from traditionally produced and vetted culture in favor of unregulated dross—is actually disproved in a certain sense by the music industry. Despite the fact that there are thousands of unsigned bands running around posting their music free for the taking on MySpace, people by and large still prefer music created by professional musicians acting within the traditional label system. The fact that they prefer to steal it rather than buy it is a problem, but it's not a problem that's particularly tightly tied to the Web 2.0/Cult of the Amateur phenomenon.

Thursday, October 18, 2007

The Semantic Web, redefined

This is four years old and probably everyone has seen it but me . . . but it's hysterical, so it's getting linked to again. (It will also take you all of 10 seconds to read.)

Tuesday, October 16, 2007

You know you're a librarian when...

...you're wandering around the world's largest Christmas store and you start to wonder, "How do they decide on their categorization system for all of these ornaments? What kind of metadata do they use in their internal tracking system? I mean, if I asked one of the clerks if they had any ornaments with skiing penguins, could she look that up in their computer by doing a Boolean keyword search on penguin AND ski*?"
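For the curious, a search like penguin AND ski* is easy to sketch. This is only an illustration, not a guess at what the store's real system does: the catalog entries, the idea that each ornament carries a free-text description, and the function name boolean_and_search are all made up for the example. A trailing * is treated as a wildcard.

```python
from fnmatch import fnmatchcase

# Hypothetical ornament catalog: item ID -> free-text description.
ORNAMENTS = {
    101: "penguin skiing down a hill",
    102: "penguin on skis with scarf",
    103: "snowman skier",
    104: "penguin with fish",
}


def boolean_and_search(catalog, *terms):
    """Return the IDs whose description contains a word matching EVERY term.

    Terms may use shell-style wildcards, so "ski*" matches
    "ski", "skis", "skiing", and so on.
    """
    patterns = [term.lower() for term in terms]
    hits = []
    for item_id, description in catalog.items():
        words = description.lower().split()
        # AND semantics: every pattern must match at least one word.
        if all(any(fnmatchcase(word, pattern) for word in words)
               for pattern in patterns):
            hits.append(item_id)
    return sorted(hits)


print(boolean_and_search(ORNAMENTS, "penguin", "ski*"))
```

With this toy data, only the two ornaments that mention both a penguin and some form of "ski" come back; the snowman skier and the penguin with a fish each fail one half of the AND.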

Still technically on vacation. Back with real blogging (including my review of The Cult of the Amateur, which I finished on the way home from Denver) probably this weekend.

Thursday, October 4, 2007

Link Roundup

Here goes—a valiant effort to clean off some of my desktop before leaving for Denver for the LITA National Forum tomorrow. I'll be blogging at the LITA Blog for the weekend, so check it out. I have no idea what sessions I'll be blogging yet (I'm going as one of the student volunteers, so I go where they send me), but I'm sure that wherever I wind up, it'll be interesting.

And now, onto the links:

Much of this study falls into the, "Well, duh," category of research. College students prefer search engines over libraries because search engines are faster, more convenient, and easier to use? I mean, it's nice to have some survey data to put with the common knowledge, but this isn't exactly groundbreaking. There are a couple of interesting points, though. Three-quarters of college students realize that, on balance, the information in libraries is more likely to be credible and accurate than information on the Web. I hope that those students aren't blindly trusting everything they read in the library and dismissing out-of-hand everything that they read on the Internet, but it's still not a bad finding. And, finally, the item that really interested me was about students' perceptions of the relative effectiveness of reference librarians versus search engines. One-third of students think that librarians are better than search engines; two-thirds think that they're the same or worse. (I'm actually surprised that these findings are as positive for reference librarians as they are.)

Terje Hillesund writes about Reading Books in the Digital Age Subsequent to Amazon, Google and the Long Tail in First Monday. (Which, apropos of nothing, is the most confusingly-named journal I know. I was a political wonk before I was a library techno-geek, and "first Monday" already has a very specific meaning in political wonkery: the new Supreme Court term starts on the first Monday of October, which is sort of like Christmas for serious wonks. As much as I love First Monday, it's going to be a long, long time until it's the first thing I think of when I hear the words "first Monday.") I haven't had time to read the whole thing yet, but from the abstract it looks like another really interesting perspective on the future economics of publishing.

Some love for librarians in Semantic Web-land, and more about the Semantic Web and libraries (Hat tip for both: jackflaps.net.)

Reading this interview with Richard Ackerman (hat tip: LISNews.com) in the midst of reading Andrew Keen's The Cult of the Amateur is a very interesting experience, because they're talking about some of the exact same things (right down to O'Reilly's SciFoo unconference) and yet coming to totally different conclusions about them.

Yes, I've finally gotten around to starting The Cult of the Amateur. So far I don't have much to say about it. Keen writes very well, and I have not yet been tempted to throw the book across the room, but in the first 20-odd pages he really isn't saying much. As far as I can tell so far his big complaint about the Web is that it allows the hoi polloi to read what they're interested in, not what cultural gatekeepers like Keen think that they should be interested in.

I've recently discovered another great blog, called Cataloging Futures. Another librarian who's interested in the Semantic Web!

LazyLibrary

Who knows what people will do with your data when you put it out there for them to play with formatted in a way that allows for easy playing?

Some folks have created the LazyLibrary, which will let you search Amazon for books on a given topic that clock in at 200 pages or fewer.

(Dear database vendors: please take a page from the LazyLibrary folks and allow me to limit my searches in your databases to articles within a given word count range. This would make it much easier for me to find pieces that are the right length to use in Greenhaven anthologies. Thank you.)

Tuesday, October 2, 2007

The Economics of Digital Publishing

Tons of pixels have been spilled recently by people trying to figure out how publishing is going to adapt its profit model to the Web. The New York Times has finally abandoned its ill-conceived plan to charge people to access its opinion columnists online; the Wall Street Journal (newly purchased by Rupert Murdoch)—pretty much the only major popular-press publication out there that's managed to succeed with a subscription-only model for online access—is also rumored to be considering putting its content online for free. Why? In both cases, because they think that there is more profit to be made in attracting more eyeballs and in charging advertisers for access to those eyeballs, than in charging the owners of the eyeballs directly for access. People in the Web age have gotten used to the idea that information is supposed to be free, and freely linkable and Googleable, so that free discussions can take place around that information. (See: most political blogs.) And information that is not freely linkable and Googleable will be ignored, as both the New York Times and the Wall Street Journal have discovered.

So . . . what are the implications of that for libraries? For the amount of information that will be produced by organizations resembling traditional publishers? For the ratio of information-on-the-Web (that can be searched, indexed, mashed up, Semantic-Webbed, etc.) to information-only-on-dead-trees (that can't be)? I'm not sure that anybody really knows yet. But the people who write at the following links are thinking about it.

Representatives from the California Digital Library, Google, and Microsoft discuss publication and academic libraries in the digital age.

Dani Rodrik (professor of political economy at Harvard) asks, “Why publish in a journal if you can disseminate online?”

The Ithaka Report on University Publishing in a Digital Age. (Yes, it's several months old now. But I've been a little busy of late, so it's still sitting in a Firefox tab waiting for me to find time to read it.)

How Google Killed Web Subscriptions, which links to lots of other information about the death of TimesSelect and the rumored death of the Wall Street Journal paywall.

What will AdBlock Plus do to the advertising-supported free online content model?

Saturday, September 29, 2007

Trust and Folksonomies

In a job interview two weeks ago, I was asked a question about the potential for malicious tagging if libraries opened up their catalogs and allowed users to tag books as they pleased. At the time, I answered by saying that most tagging systems handle malicious tags by ignoring them: if only one person tags an item with a malicious tag in a system like LibraryThing that has thousands of taggers, then that tag will sink to the bottom and the more common (and therefore more useful) tags will rise to the top. But after chewing on the question for a while, I think I want to take that answer back.

Oh, it's a perfectly accurate answer, don't get me wrong, but I want to contest the very premise of the question. Why don't we as librarians trust our users to act responsibly in the library catalog? Consider all of the trust that we already put in our patrons. We let them wander around largely unsupervised in buildings containing millions of dollars' worth of books that people have spent hundreds of hours painstakingly shelving in the proper order. We let them check out hundreds of dollars' worth of books and DVDs at a time and take those books out into the world, where we have no control whatsoever over what they do with those items. And, really, how often do our users do anything truly catastrophic when given this trust? Yes, there are accidents sometimes, and coffee gets spilled on books or dogs chew them up, but I'm talking about really large-scale, intentional attempts to be destructive or malicious. Have you ever had a group of users, say, decide to re-shelve a section of books by color? (If I were a bored freshman looking for some sort of mildly amusing prank to pull, that's the kind of thing I'd consider. Think how pretty a rainbow of books would be!) How often do bored students pull the keys off the keyboards on the library computers and put them back on in alphabetical order? (Personally, I don't understand the amusement value of that one, but it was popular with kids in my computer programming classes in high school.) Often enough that you would call it a problem and consider taking some sort of step to cut down on it? If not, then why do you think your users would be any more likely to vandalize the catalog than they are to vandalize your physical facilities?
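As an aside, the "ignore the rare tags" answer from the interview can be sketched in a few lines. This is a minimal illustration and not a description of how LibraryThing or any real catalog works: the function name visible_tags, the min_count threshold, and the sample data are all assumptions made for the example.

```python
from collections import Counter


def visible_tags(tag_events, min_count=2):
    """Rank a book's tags by how many distinct users applied them.

    tag_events is a list of (user, tag) pairs. Each user counts only
    once per tag, so one person repeating a tag cannot inflate it.
    Tags applied by fewer than min_count users are hidden.
    """
    unique_pairs = {(user, tag.lower()) for user, tag in tag_events}
    counts = Counter(tag for _, tag in unique_pairs)
    # Most popular first; ties are broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(tag, n) for tag, n in ranked if n >= min_count]


events = [
    ("ann", "fantasy"), ("bob", "fantasy"), ("cat", "fantasy"),
    ("ann", "dragons"), ("bob", "dragons"),
    ("troll", "this book stinks"), ("troll", "this book stinks"),
]
print(visible_tags(events))
```

In this toy example the lone vandal's tag never reaches the (hypothetical) two-user threshold, so it simply never shows up, while the tags that several readers agree on float to the top.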

Friday, September 28, 2007

A New Use for Wikis

I haven't decided yet if this is sheer brilliance or absolutely insane. On the one hand, bringing more transparency to the process of drafting laws, and increasing the ability of non-lobbyists to have an influence on that process, are unquestionably good things. Heck, forget for a minute about the ability of common people to suggest edits to a draft law—can you just imagine all of the voters being able to look at the "edit history" of a law and see what parts were inserted by which people when?

On the other hand, a big part of what makes Wikipedia work is that it has a dedicated core of editors who are interested in truth and balance over any partisan viewpoint, and those editors out-number and out-clout the partisans. This situation is much less likely to hold in a sphere that is partisan by definition, such as drafting laws. As Michael Mussa (a former economist with the International Monetary Fund at a level where even the economists have to be politicians) once commented, "In Washington, truth is just another special interest, and one that is not particularly well financed."

New and Notable Digital Collections

Three new initiatives to digitize information and put it online have been in the news lately.

1) The Boston Library Consortium, which includes most of the major New England colleges and universities (Brandeis, Brown, MIT, Tufts, and quite a few more), is working with the Open Content Alliance to digitize public domain materials in their libraries.

2) The papers of former Supreme Court Justice Harry Blackmun are being digitized and put online. (Hat tip: Volokh Conspiracy.)

3) Robert Heinlein's papers are being digitized and put online (although, unfortunately, you have to pay to get access to them). (Hat tip: Slashdot.)

Sunday, September 23, 2007

Google Strikes Again!

I can only assume that they're planning to launch a U.S. version of this sometime before the campaigning for 2008 really gears up.

Friday, September 21, 2007

Another Player in Semantic Search

A startup called Powerset debuted a new system of semantic natural-language Web searching on September 17. It's currently in a closed alpha, so I haven't been able to poke at it, but it sounds both ambitious and wonderful. Plus, it's built by PARC, the same folks who invented computer mice and GUIs. Here is the page for PARC's natural language processing research project, which is the technology used by Powerset.

Saturday, September 15, 2007

One more reason to be skeptical about information published in journals

I really need to be preparing for my job interview on Monday (everyone wish me good luck and pray that Northwest Airlines manages not to make a hash of my flights!), but this story is so fascinating that I had to blog it right away.

Dr. John Ioannidis, an epidemiologist, has compiled evidence indicating that the results of the majority of published studies in the sciences are incorrect. He has also analyzed some of the reasons for this, stemming from the incentive structures of publishing and academia. Ioannidis doesn't seem to use the term confirmation bias (one of my hobbyhorses, er, areas of research interest), but I think he'd agree that confirmation bias probably plays a pretty big role here too.

Tuesday, September 4, 2007

More about e-book sales

Looks like I may have spoken too soon about e-books not catching on in the U.S. According to the International Digital Publishing Forum, U.S. trade e-book sales broke $8 million in the second quarter of 2007, up from under $2 million as recently as the fourth quarter of 2002. And that "does not include library, educational or professional electronic sales." It's still peanuts compared to print publishing revenues, but it's a sharper upward trend than I thought.

This is why one should always check the numbers and not just believe whatever the conventional wisdom is on a subject.

What was that about e-books being doomed?

In Japan, e-books designed for cellphones outsold print books in the first six months of 2007. (Hat tip: LISNews.)

I wish this sort of thing would catch on in the U.S. In general I prefer reading on a screen to reading on paper—I appreciate having the ability to search the full text for words and to control the font, the type size, etc. And, since it's not particularly uncommon for me to spend 12 hours a day at the computer, I've invested in a nice setup—a big high-quality LCD screen, ergonomic keyboard and trackball, and a good desk chair—so I'm actually more comfortable sitting at the computer than I am sitting on the sofa or in bed or all of those other places that people say they prefer to read. But I very rarely read true e-books (of the type carried by NetLibrary or ebrary) because the interfaces on them are so awful. Actually, for the longest time I couldn't use ebrary books even if I wanted to, because their proprietary reader didn't work on Linux and I wasn't about to boot into the other side of my dual-boot setup to access their books. (I'm a messy-desktop person—I usually have several dozen Firefox tabs, a dozen or so Thunderbird windows, and half a dozen word processing documents open at once. Closing them all, booting up into Windows, and then going back to Linux and trying to remember what I had open and why and re-opening them all is a pain that I'm only willing to go through for a very small number of things.) I've tried NetLibrary, but I got frustrated at my session timing out and losing my place. I was trying to use a NetLibrary book to write a paper, so I wanted to be able to refer to the book, refer to other stuff, write for awhile, pace around for awhile, and then go back and refer to the book again. No dice—every time I went back I had timed out and I had to start from the beginning to find the book and my page again. 
Also, at this point the majority of the time when I'm reading a book I'm doing so with an eye towards using an excerpt from the book in one or another anthology that I'm editing for Greenhaven, which means I need to be able to print a copy of the chapter I want to use. Except (at least the last time I tried this) NetLibrary is understandably not so keen about people printing out entire chapters because of the potential for copyright infringement.

But my point is, these aren't problems with e-books per se; they're problems with the current e-book interfaces. Unfortunately, I don't know what it's going to take to convince the e-book vendors to either improve their proprietary readers or to serve books in plain old let-me-do-what-I-want-with-it HTML. *sigh* Maybe I should just move to Japan.

Sunday, September 2, 2007

More Holiday Weekend Reading

I've had the June 2007 issue of Webology open in a Firefox tab since, oh, June or so, because it had a couple of articles on folksonomies and ontologies that I intended to read and blog about. Today I finally got around to reading them . . . and I have nothing to say about them except, read them. There's nothing too surprising in either of them, but they're still worth a look.

Saturday, September 1, 2007

Speaking of Economists and Libraries....

This article on how patients would get more accurate diagnoses if doctors used computer algorithms rather than their own “clinical judgment” has been getting some attention on econ blogs.

What does this have to do with libraries? Read the debate that's been raging on the NGC4LIB listserv about whether a well-coded algorithm using Bayesian inference could do authority control work as well as, if not better than, human catalogers, and ponder how much the catalogers who decry this possibility might have in common with the doctors discussed in the article.

Link Roundup

An interview with the man behind Google Scholar. (Hat tip: LISNews.org.)

How many books should you start?, by Tyler Cowen, who is on the economics faculty at George Mason University. More LIS folks should read econ blogs like Marginal Revolution. One of the big movements in economics right now is behavioral economics—drawing on both psychology and economics to understand why people do what they do with the resources they control, including their time. I think that if more LIS folks understood the incentives that people respond to, they'd have an easier time designing services that get used and selling their services to their funding authorities.

Thursday, August 23, 2007

Open-Source Legal Documents

I'm sure that it will surprise no one who reads this blog that I think that this is a great idea. (More from the New York Times.)

Another Fun Toy from Google...

...brought to you by structured data and a little imagination: a book layer for Google Earth. Visual representations of the mentions of geographic locations in books! I haven't had a chance to poke around in this and see how it's put together, but from what I've read about it, it's quite spiffy.

Saturday, August 18, 2007

Link Roundup

See who is editing entries on Wikipedia. (More discussion.)

The Pew Internet & American Life Project finds that 28% of Internet users have ever tagged or categorized online content, and 7% of Internet users tag or categorize online content basically every day. Seven percent! That's a lot of time and effort that could be harnessed if somebody could figure out a good way to harness it....

More evidence that much of the trust that people put in traditional print media is misplaced. (I'm not convinced that the methodology he's using is 100% reliable . . . but even if he's only half-right, it's still quite a finding.)

Wednesday, August 8, 2007

HTML 5

Slashdot pointed me today to some information about the new HTML 5 standard. Specifically, that said standard is going to include a bunch of new semantic tags! You'll be able to mark up times in a machine-readable format; do all sorts of fancy stuff with numeric data; and indicate which parts of a page are navigation, which are the article, which are figures, etc.

Thinking of all of the spiffy search engine options that could be built to draw on these new tags is left as an exercise to the reader. Personally, I'm looking forward to being able to search for text just inside of a [figure] tag. I'm forever half-remembering some really spiffy chart or other graphic that I saw somewhere and not being able to find it again. (I'm both a visual person and a data/statistics geek. What can I say?)
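To give the exercise a head start, here is a rough sketch of a figure-only search. It uses the HTMLParser class from Python's standard library and assumes that pages mark up their charts with a figure element, as the draft describes. The class name FigureTextExtractor, the helper search_figures, and the sample page are all invented for illustration.

```python
from html.parser import HTMLParser


class FigureTextExtractor(HTMLParser):
    """Collect the text found inside <figure> elements, one string per figure."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <figure> elements we are currently inside
        self.current = []   # text pieces of the figure being read
        self.figures = []   # text of each completed top-level figure

    def handle_starttag(self, tag, attrs):
        if tag == "figure":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "figure" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.figures.append(" ".join(self.current))
                self.current = []

    def handle_data(self, data):
        # Keep text only while inside a figure; skip pure whitespace.
        if self.depth > 0 and data.strip():
            self.current.append(data.strip())


def search_figures(html, keyword):
    """Return the text of every figure that mentions keyword (case-insensitive)."""
    parser = FigureTextExtractor()
    parser.feed(html)
    parser.close()
    return [text for text in parser.figures if keyword.lower() in text.lower()]


page = """
<article>
  <p>E-book sales are rising.</p>
  <figure><img src="chart.png" alt=""><figcaption>Quarterly e-book sales</figcaption></figure>
  <figure><figcaption>Library visits per year</figcaption></figure>
</article>
"""
print(search_figures(page, "sales"))
```

The paragraph that also mentions sales is ignored because it sits outside any figure, which is exactly the kind of narrowing a figure-only search would offer.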

8/23 -- edited to fix the fact that the word "[figure]" didn't display.

Wikiing the News

Google News announced yesterday that they're trying a new, experimental feature that will allow "comments from a special subset of readers: those people or organizations who were actual participants in the story in question."

I suspect that corporations will avail themselves of this feature more than individuals, but for the individuals who choose to use it, it will be a powerful tool. If corporations don't like how something that they said or did was spun in a news story, they can put out a press release that will generally be published, in whole or in part, in quite a few newspapers. Individuals, on the other hand, are just as frequently quoted out of context or otherwise mis-spun as corporations, but they have much less clout to get their side of the story out. Good for Google News for increasing readers' access to all sides of the story and for letting participants in the news fact-check stories about themselves! It would be nice if they had a mechanism to let everyone with knowledge of a topic fact-check stories on that topic, rather than just the participants in a single story, but this is still a step in the right direction.

More on the Open Library...

...in Inside Higher Ed.

Saturday, August 4, 2007

Link Roundup

I've had a bunch of articles open in Firefox tabs for weeks that I've been meaning to blog about, and I've decided to accept the fact that I'm not going to blog about them and just post the links for your reading pleasure.

A nice interview with Tim Berners-Lee about the Semantic Web.

Geotagging and potential uses thereof.

A LibraryThing for Libraries success story, or why the flexibility and democracy of tags often beat the rigid, top-down Library of Congress subject headings.

Martha Yee's "Will the Response of the Library Profession to the Internet Be Self-Immolation?" (WARNING: link is a .doc) and the discussion thereof on the JESSE listserv.

An automated method for assessing the trustworthiness of Wikipedia edits based on the trustworthiness ratings of the contributors who made the edits. This is one of the coolest things I've seen in a while. It's a little jarring to see chunks of text highlighted in orange while you're reading, and the articles I looked at had things flagged as untrustworthy based on the contributors' reputation that I know were perfectly accurate, but it's still a really neat idea.

"Are Tags Vannevar Bush's Trails?"

Hakia, which is billing itself as a Semantic search engine. I'm not sure it's clearly better than Google at this point, but it's got them beat in certain categories. Google won hands-down in a search for "Who is the president of Slovenia?"--Hakia took me to a subject page all about Slovenia, which is well-organized and spiffy in itself but wasn't quite what I was looking for, while Google gave me the answer right at the top of the page above its search results. On the other hand, when searching for "What is the molecular weight of carbon monoxide?" and "How many symphonies did Beethoven compose?," Hakia highlighted the answer in the search results(!) and Google made me actually skim the page. But, regardless, it warms the cockles of my cynical little heart to see people out there who take semantic search seriously. And if the Hakia folks manage to pull off everything that they say that they want to pull off, this could be great.

Sunday, July 29, 2007

A Crowd-Powered Search Engine

Jimmy Wales, founder of Wikipedia, has gone and done it—he's launching a wiki search engine, which, according to the story "will combine computer-driven algorithms and human-assisted editing.... Human editors would help untangle terms with multiple meanings, such as palm, which can refer to location like Palm Beach, or generic topics like trees or handheld computers."

You can bet I'll be watching this one closely.

Saturday, July 28, 2007

"Satisficing" Is Not a Dirty Word

Another one of my pet peeves is people who complain about students and other information-seekers “satisficing”—looking for information that is just good enough, rather than for the best information. A related pet peeve recently came up on PUBLIB (don't ask why I lurk on PUBLIB; there was a reason at the time I signed up for it, but now it's mostly for amusement value), when someone called easily searchable digital information systems a “prop” that hinders the development of thinking and reasoning skills.

While these complaints have a certain degree of merit to them, they ignore a couple of important economic principles. (Humor me here; my undergrad background is in the social sciences.) We all have a limited amount of time, money, energy, etc., to get through our days, and we have to make rational decisions about how to “economize” those things—how to use them most efficiently to achieve the most we can based on our constraints. This means we can't have it all—things that are time-consuming might not be expensive monetarily, but they're “expensive” in terms of another scarce quantity: time. Home-cooked “slow food” meals might not cost more than take-out, but an hour spent preparing a slow food meal is an hour that you can't spend, say, mowing the grass or sleeping or doing other things that you need to accomplish. Information is no different: an hour spent digging through a pile of poorly organized information trying to find the piece that is needed is an hour that a student can't spend writing the paper he needs to write, or doing homework for his other classes, or having a life outside of school. Yes, sometimes it's important for students to take the time and effort to really dig in and learn the structure of the literature in an area, to see who the big names are and what they're arguing, to learn the contours of the discourse . . . and sometimes they just need to find a piece of information quickly and get on with the rest of their lives. I suspect that this is doubly true of public library patrons, who generally don't feel the need to engage with a broad swathe of human knowledge the way students should. So give the people what they want already and don't make them feel guilty for having other things in their lives that are more important to them than conducting the best information search possible! 
Unless you live up to every other field's standards of perfection: if you eat only home-cooked healthy meals, exercise for the recommended 30 minutes per day, sleep for the recommended 8 hours per night, maintain your home in a state of Martha Stewart-like perfection, check the air pressure in your tires every time you gas up your car....

Thursday, July 26, 2007

More about Eyeballs and Errors

Sorry for the disappearance; I'm finishing up my last semester of actual classroom classes for the MLIS right now, so things have been a bit crazy of late. (I still have one more semester before the degree, but I'll just be interning in an academic library in the fall semester—no actual classes.) I hope to be back with a real, meaty post soon.

In the meantime, take a gander at this list of errors in the Encyclopedia Britannica that have been corrected in Wikipedia.

Monday, July 9, 2007

A Brain-Flash about Finding Libraries

I had one of my brain-flashes this morning, brought about by this Stephen Bell post at ACRLog and my initial response to it, which is posted in the comments over there. (Go read them. I'll wait. The rest of this post won't make much sense if you don't.)

The brain-flash was, this sounds like a project for the hive-mind! The hardest part of creating a "Find Your Library" tool is gathering all of the information about all of the however many thousands of libraries there are in the U.S. (and Canada, if we want to be inclusive). If you have to pay people to gather all of that information, it gets expensive, but create a site where people can contribute information about the libraries that they work at/patronize/know about, and sooner or later you'll get all of your data for free.

And you can get really rich data and do fun stuff with it when you're letting the public contribute. Let people rate libraries and leave comments about them! Let people tag the libraries and allow tag-based searches! Let people create structured folksonomies to organize the libraries into hierarchical categories to allow for even more powerful searches! This would be a really fun test-bed for some of the ideas I've been kicking around about people-powered ontologies....

So, what do you say? Is anybody interested in helping me launch this site?

Sunday, July 8, 2007

Given Enough Eyeballs, All Errors Are Obvious

One of my biggest pet peeves in life is people who proclaim that Wikipedia can't be trusted because anyone can edit it, while at the same time placing their full trust in "professionally-created" encyclopedias—preferably ones printed on paper—put out by the major publishers.

This pet peeve has been on my mind more than usual of late because I've spent a whole lot of the past month up to my eyeballs in reference works about the history of Eastern Europe, and I've found errors in quite a few of them. Two errors stick out for me, because I didn't immediately recognize them as errors and they sent me off on wild-goose chases. Error #1: A book on the history of Poland put out by one of the major library-focused publishers had some of the vital dates for Poland's most famous medieval queen off by about 50 years. (I've returned the book already and I don't remember if it was her birth date or death date or the date she assumed the throne or what, but it was something important like that, and it was WAY off.) Error #2: An entry in one of the major, reputable online encyclopedias listed one of Czechoslovakia's prime ministers as an ethnic Slovak when he was actually an ethnic Czech.

Had these entries been in Wikipedia, I probably would have taken the time to fix them. Had they been put out by a company I freelance for, I would have e-mailed one of my contacts there and asked them to get it fixed. (Well, the online one anyway; there's really no fixing a book that's already been published.) But since neither of those things was true, those errors are going to persist and mislead more people, some of whom will never know that they've been misinformed.

Some errors are unavoidable. People are human and they make mistakes. But the odds of someone recognizing and fixing a mistake go up dramatically as the number of people who have the opportunity to notice and fix the mistake increases. Having worked in reference publishing for 6 years now, I'd estimate that around 4 people really seriously evaluate most things before they're published in reference books. How many people read and edit the average Wikipedia article? I don't know off the top of my head, but I'm guessing that it's a lot more than 4.
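A back-of-the-envelope sketch of that intuition: suppose, purely for illustration, that each person who looks at an entry independently has the same small chance of catching a given error. Then the chance that at least one of n readers catches it is 1 - (1 - p)^n. The 5% figure and the reader counts below are made-up numbers, not measurements, and real readers are certainly not independent of one another.

```python
def chance_error_caught(readers, p_catch):
    """Probability that at least one reader catches a given error.

    Assumes each of `readers` people independently catches the error
    with probability `p_catch` (a simplifying, hypothetical assumption).
    """
    return 1 - (1 - p_catch) ** readers


# Hypothetical: every reader has a 5% chance of spotting the mistake.
for readers in (4, 40, 400):
    print(readers, round(chance_error_caught(readers, 0.05), 3))
```

Under these assumptions, four careful reviewers will miss the error most of the time, while a few dozen casual readers will usually catch it between them.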

That doesn't mean you should uncritically accept anything you see on Wikipedia. Of course there are plenty of errors in Wikipedia as well, both of the "innocent mistakes" variety and the "malicious vandalism" variety. And of course professionally-produced encyclopedias don't have to worry about malicious vandalism in the same way, and that's an important factor to consider when discussing the reliability of Wikipedia. All I'm saying is that the Linux folk who proclaim that "given enough eyeballs, all bugs are shallow" are on to something, and not just in software development.

If you're interested in a scholarly, data-heavy examination of fact-checking in Wikipedia, check out this article.

Saturday, June 30, 2007

"The Cult of the Amateur"

I feel like I need to engage with this new book, The Cult of the Amateur, by Andrew Keen, but I have a feeling that it's going to be one of those books that I can't finish because I have to stop reading and rant every few pages.

But, from the reviews I've read of the book, it seems like Keen makes a few points that I ought to be able to counter if I'm going to propose that a people-powered ontology/folksonomy can underlie the Semantic Web. And I have a blog to rant on now! So I think I'm going to grab it next time I'm at the library.

Wednesday, June 27, 2007

Semantic Web 101

If you were one of the folks who read the first post and thought, "What on earth are you talking about?," this article is for you.

Monday, June 25, 2007

Welcome to The Folks' Web

Hello, world! Welcome to my blog. I'm Julia Bauder, a student in the Library and Information Science Program at Wayne State University.

One of my interests in information science is the Semantic Web (and the organization and retrieval of online information in general). This blog is called The Folks' Web because I've been kicking around some ideas about how to harness folksonomies to run the Semantic Web. I'm sure that right about now you're thinking, "That is the dumbest idea I have ever heard." (Unless you're thinking, "Folksonomies? Semantic Web? What on earth are you talking about?") For a while I thought it was a dumb idea too. But I wrote a 25-page paper last semester that convinced an old-school cataloging professor not only that it could work, but that it would work better than the rigid ontologies that most Semantic Web people think should be used. So stick around and hear me out. Or stick around and argue with me. Whichever you prefer.

I'll be updating this blog regularly with more about my ideas and with links to other stuff that discusses related topics, starting soon.

------------------------------

I have to give some credit for the existence of this blog to all of the great LITA folks at ALA Annual 2007 (especially Michael Habib) who have convinced me to get off my duff and get these ideas out there into the LIS world.