I just put up my latest pipe to find similar postings. The goal is simple: it takes the first few posts of an RSS feed, runs them through content analysis, and for each of those posts finds the two most recent search results from Technorati and publishes them. I wanted to play around with Yahoo Pipes’ content analysis module. When I first started it was returning some weird results, but it seems to have gotten better. I’m not sure if anything changed on their end, but it’s looking pretty reasonable.

The pipe itself is pretty straightforward. It takes a feed URL, the number of posts to use from that feed (default 3), and the number of Technorati search results to use for each post (default 2). So by default it returns 6 search results, all ordered by published date. It also takes an optional filter field that drops any feed posts whose title contains the given text - for my use, I filter out “Breakfast” because those posts are all over the place and seem to give content analysis fits.
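
For anyone who thinks better in code than in pipe diagrams, here is a rough Python sketch of the same flow. It is not how Pipes does it internally, just my reading of the steps above. The names `similar_postings`, `extract_terms` and `search` are made up for illustration: the last two are stand-ins you would supply for content analysis and the Technorati search. I am also assuming the title filter runs before the post count is applied, and that "ordered by published date" means newest first.

```python
def similar_postings(posts, extract_terms, search,
                     num_posts=3, num_results=2, title_filter=None):
    """Sketch of the pipe: feed posts in, related search results out.

    posts         -- list of dicts with 'title' and 'content', in feed order
    extract_terms -- hypothetical stand-in for the content analysis step
    search        -- hypothetical stand-in for the search step; returns a
                     list of dicts that each carry a sortable 'published'
    """
    # Assumption: filtering happens before we take the first num_posts.
    if title_filter:
        needle = title_filter.lower()
        posts = [p for p in posts if needle not in p["title"].lower()]

    results = []
    for post in posts[:num_posts]:
        terms = extract_terms(post["content"])
        # Keep only the most recent num_results hits for this post.
        hits = sorted(search(terms), key=lambda h: h["published"], reverse=True)
        results.extend(hits[:num_results])

    # Assumption: the merged output is ordered newest first.
    return sorted(results, key=lambda h: h["published"], reverse=True)
```

With the defaults that is 3 posts times 2 results each, which is where the 6 search results come from.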

I used PipeJax to stick this on my site. I tweaked it a little because it wasn’t dealing with nested items properly; now everything works just peachy. You can see it in action at the bottom of the right nav on my homepage.

I’ll post a live example right here, too. It may take a second to fill in (if it doesn’t, try reloading; sometimes Pipes and/or Technorati don’t want to behave):

It is interesting to see the difference between this result and my “who else is writing about this” results. Both perform a Technorati search: “who else is writing about this” searches on a few words that I select by hand for each post, whereas this one uses Yahoo’s content analysis. My hand-selected search is obviously a bit more targeted, but content analysis does a decent job, I think. I like having both around, as they surface a bunch of postings that have a reasonable chance of being interesting to me (and hopefully to you) that I wouldn’t have seen otherwise.

