Sonic Spectrum, Yahoo Pipes & Content Liberation
As an inspirational influence to the previous iteration of Patchchord.com, Robert Moore’s weekly “Sonic Spectrum” radio show on 89.3 KCUR was one of my few “must hear” ties to old media. It was always a remarkably eclectic mix of music and musings with a healthy dose of local flavor. When Moore left KCUR, it felt like there was a definite void when it came to coverage of the local music scene.
When “Sonic Spectrum” returned, this time as an ongoing feature over at PresentMagazine.com, I was thrilled. Since it was labeled as a “podcast,” I was excited to subscribe and get my weekly dose of aural oddities and goodness delivered directly to my iPod. To my disappointment, however, there was no RSS feed to be found.
Well, that sucked…and didn’t make sense either. Via Wikipedia…
A podcast is a series of digital-media files which are distributed over the Internet using syndication feeds for playback on portable media players and computers. [emphasis mine]
Now, nearly a year later, I finally have a way to do something about that. Enter Yahoo Pipes. Yahoo Pipes is “a powerful composition tool to aggregate, manipulate, and mashup content from around the web.” I’m a visual learner by nature, so Pipes’ fantastic “plug and play” visual interface was very easy to work with. After some casual messing around, I realized that the makings of a screen scraper were sitting right here in front of me. Time to roll up my sleeves and get to work.
WARNING: MASSIVE GEEK OUT
First things first, I employed the Fetch Page module to execute the screen scrape, setting it to pull out the main chunk of HTML content that I needed from the PresentMagazine.com “Sonic Spectrum” page. Thankfully, this summary page appeared to have nearly everything I could need to create a complete feed, though I would still have to sift out some code to extract only exactly what I needed.
Here is where I quickly hit my first learning point: coming face to face with having to use regular expressions (regex) to filter the extraneous junk out of the screen scrape. As I’ve said before, my kung-fu is weak compared to the sort of talented developers I work with every day. Yet by deconstructing/reverse engineering a few other feeds, including this one in particular, I was able to get most of the way there. (Big ups to Chris over at the816.com for helping me take the last few steps in this department.)
After the initial Regex module, I was able to divide and assign the content to 4 of the 5 primary RSS feed items I would need in the finished product: title, description, link and a published date.
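For anyone who thinks better in code than in Pipes modules, here is a rough Python sketch of what that first Regex step amounts to. To be clear, the sample markup, the pattern, and the names SAMPLE_HTML, ITEM_PATTERN and split_items are all made up for illustration; the real PresentMagazine.com HTML and the actual expressions in the Pipe are different.

```python
import re

# Hypothetical markup standing in for the scraped summary page.
# The real page structure is not reproduced here.
SAMPLE_HTML = """
<div class="item"><a href="http://example.com/podcast-40">Podcast 40</a>
<span class="date">March 3, 2008</span><p>Forty episodes of <b>aural oddities</b>.</p></div>
<div class="item"><a href="http://example.com/jillys">Sonic Spectrum at Jilly's</a>
<span class="date">March 1, 2008</span><p>A live event announcement.</p></div>
"""

# One named group per feed field: link, title, pubDate, description.
ITEM_PATTERN = re.compile(
    r'<a href="(?P<link>[^"]+)">(?P<title>[^<]+)</a>\s*'
    r'<span class="date">(?P<pubDate>[^<]+)</span>\s*'
    r'<p>(?P<description>.*?)</p>',
    re.DOTALL,
)


def split_items(html):
    """Return one dict of feed fields per matched post."""
    return [match.groupdict() for match in ITEM_PATTERN.finditer(html)]


items = split_items(SAMPLE_HTML)
for item in items:
    print(item["title"], "|", item["link"], "|", item["pubDate"])
```

Note that the description still carries leftover HTML at this point, which is why more cleanup is needed afterward.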
I still had some more refining to do to the content, so another Regex module helped me clear up the last vestiges of HTML in the feed items. After that, I employed a looped Date Builder module to transform the scraped date info into an RSS-friendly format.
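In code terms, the cleanup and date steps might look something like the sketch below. It assumes the scraped date reads like "March 3, 2008" and that midnight UTC is an acceptable time of day, neither of which comes from the actual page; strip_tags and to_rss_date are hypothetical helper names, and the tag-stripping regex is only good enough for simple markup.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime

# Naive tag stripper: removes anything that looks like <...>.
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_tags(text):
    """Remove leftover HTML tags and surrounding whitespace."""
    return TAG_PATTERN.sub("", text).strip()


def to_rss_date(scraped, fmt="%B %d, %Y"):
    """Convert a scraped date string to the RFC 822 style RSS expects.

    The input format and the UTC timezone are assumptions.
    """
    parsed = datetime.strptime(scraped, fmt).replace(tzinfo=timezone.utc)
    return format_datetime(parsed)


print(strip_tags("Forty episodes of <b>aural oddities</b>."))
print(to_rss_date("March 3, 2008"))  # Mon, 03 Mar 2008 00:00:00 +0000
```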
The first time around constructing the pipe, I didn’t install the Filter module that comes next, but later realized it was needed when I saw non-“podcast” posts publishing to the feed: news items about Sonic Spectrum, etc. This filter ensures that only items with a number in the title pass any further through the Pipe (e.g. “Podcast 40” vs. “Sonic Spectrum at Jilly’s”).
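The filter rule itself is about as simple as regex gets. A minimal Python equivalent, with is_episode as a made-up name and a sample list of titles, could be:

```python
import re


def is_episode(title):
    """Keep only titles that contain at least one digit."""
    return re.search(r"\d", title) is not None


titles = ["Podcast 40", "Sonic Spectrum at Jilly's", "Podcast 41"]
episodes = [title for title in titles if is_episode(title)]
print(episodes)  # ['Podcast 40', 'Podcast 41']
```

The obvious weakness is that a news post with a number in its title would slip through too, so the rule only holds up as long as the titles stay this predictable.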
The final few steps are all related to getting at the actual MP3 of the show. I dropped in a looped Fetch Page module that’s designed to follow the link item to the individual article page for each podcast and scoop out the HTML surrounding the MP3 to create an enclosure item. Another Regex module pulls out the exact URL for the MP3 and redefines the content of the enclosure item. Finally, a looped Item Builder module assigns the MP3 file to a URL attribute and hardcodes the length and type attributes. A quick Rename module cleans up the enclosure item before the final Pipe output.
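Sketched in Python, the enclosure steps come out roughly as follows. The article snippet, the MP3 URL, and the hardcoded values (a length of "0" and a type of "audio/mpeg") are placeholders of mine rather than what the Pipe actually uses, and find_mp3_url and build_enclosure are hypothetical names. The three attributes themselves (url, length, type) are what an RSS 2.0 enclosure expects.

```python
import re
import xml.etree.ElementTree as ET

# Grab the first link that ends in .mp3 (assumes a plain double-quoted href).
MP3_PATTERN = re.compile(r'href="([^"]+\.mp3)"', re.IGNORECASE)


def find_mp3_url(article_html):
    """Return the first MP3 URL found in the article HTML, or None."""
    match = MP3_PATTERN.search(article_html)
    return match.group(1) if match else None


def build_enclosure(mp3_url, length="0", mime_type="audio/mpeg"):
    """Build an RSS <enclosure> element; length and type are hardcoded guesses."""
    return ET.Element(
        "enclosure",
        {"url": mp3_url, "length": length, "type": mime_type},
    )


# Hypothetical article markup.
article = '<p>Listen: <a href="http://example.com/audio/podcast40.mp3">MP3</a></p>'
enclosure = build_enclosure(find_mp3_url(article))
print(ET.tostring(enclosure, encoding="unicode"))
```

Hardcoding the length means the feed never reports the true file size, but it saves having to look up the size of every MP3.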
[If that recap wasn’t far enough down in the weeds for you, feel free to poke around the Pipe for yourself.]
So the next step was to take this puppy over to Feedburner and pass it through there. It’s not required; it’s more a way to track the feed’s performance and number of subscribers, to appease my sense of curiosity if nothing else. However, it also gives me a nice, clean URL for the final feed. (Mmmm, clean feed URLs.) I also played around with their SmartCast setup as a way of making the feed more compatible with iTunes. Unfortunately, my submission of the feed to the iTunes podcast directory was rejected. The note I got back from iTunes provided no detail as to why the feed was kicked back. Since I would have to change the feed title and URL to avoid Apple’s duplicate feed autofilter, resubmitting would mean playing a guessing game about why it was rejected, plus a lot of other work that I just don’t have time for. C’est la vie.
/END MASSIVE GEEK OUT
Regardless of the final fail on achieving a directory listing in iTunes, the feed is viable for use in iTunes. Just go to the “Advanced” menu in iTunes, select “Subscribe to Podcast…” and drop in http://feeds.feedburner.com/SonicSpectrum. That should also work in your friendly neighborhood feed reader of choice.
So, welcome to the PresentMagazine.com “Sonic Spectrum” Podcast hosted by Robert Moore. Enjoy!