April 12th, 2008
This past weekend there’s been a conversation about Shyftr, a new RSS service that allows people to read and comment on full text stories on the Shyftr site, rather than making the reader click through to the originating blog to comment. The thought is that folks who care about pageviews for advertising will lose out in such a scenario.
So, in the spirit of helping out the wider blogging community, ruffled feathers and all, I’ve pasted the Shyftr RSS bot info below. The good news is that you can block the Shyftr IP address from accessing your blog (if you already have that capability through your blog hosting solution, etc.). At present, the IP address is 66.234.234.34.
Unlike with other annoying bots, I would not block the user agent in your .htaccess file, as the RSS bot software the Shyftr folks are using is the generic MagpieRSS toolset, which is used by other RSS services. Hopefully, the people at Shyftr will rename the user agent to something more uniquely identifiable in the future so you can block via .htaccess.
(Note: Blocking a future unique Shyftr user agent via robots.txt probably won’t work, as the crawler would need to fetch the robots.txt file before fetching your feed, and I didn’t see that behavior tonight.)
Host: 66.234.234.34
*
/feed
Http Code: 200 Date: Apr 12 19:48:28 Http Version: HTTP/1.0 Size in Bytes: 6244
Referer: -
Agent: MagpieRSS/0.72 (+http://magpierss.sf.net)
*
/favicon.ico
Http Code: 200 Date: Apr 12 19:48:28 Http Version: HTTP/1.0 Size in Bytes: 1406
Referer: -
Agent: -
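For those who filter requests in application code rather than through their host, the IP-based block described above could be sketched roughly like this in Python. The function name `should_block` and the `BLOCKED_ADDRESSES` set are made up for illustration, and the single address is simply the one seen in the log above, which may well change. The user agent is deliberately ignored, since MagpieRSS is shared with other services.

```python
import ipaddress

# Assumption: the one address seen in the log excerpt above; it may change.
BLOCKED_ADDRESSES = {ipaddress.ip_address("66.234.234.34")}


def should_block(remote_addr: str) -> bool:
    """Return True when the client IP is on the block list.

    The user agent is not consulted, because the generic MagpieRSS
    agent string is shared with other RSS services.
    """
    try:
        return ipaddress.ip_address(remote_addr) in BLOCKED_ADDRESSES
    except ValueError:
        # Unparseable address: this rule alone does not block it.
        return False
```

A request handler would call `should_block()` with the client address and return a 403 when it comes back true.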
Technorati Tags: Shyftr, RSS, RSS toolset, robots, crawlers, bots, lost pageviews, comments, blogger brouhaha
Posted in General, Internet | No Comments »
January 20th, 2008
On the MSNBC developer blog, the question was posed: “How do you share?” Not in the grade school way, but in the newfangled Web 2.0 way.
Overall, the comments from MSNBC readers were pretty… negative. Aside from the “I’ll just paste the link I want to share in an email” or the “I’ll just add the page to my browser bookmarks” or the “they’re tracking your habits for nefarious purposes” comments, other commenters cited just one or two social bookmarking sites (the most popular seeming to be either del.icio.us or digg.com). And a few other commenters wondered, “Hey, MSNBC, don’t you own Newsvine?”
It appears that the zen habits of social bookmarking haven’t been widely accepted by the Internet populace at large.
Technorati Tags: social bookmarking, bookmarking, Web 2.0, del.icio.us, digg, newsvine, putting it in my favorites
Posted in Tech, Internet | 4 Comments »
January 18th, 2008
For those of you with Apple TV, do you like it?
I’m thinking of springing for it, seeing as the idea of downloading movies and watching them on my (nearly outdated, last-of-the-Mohicans) CRT TV does appeal to me. I don’t watch broadcast TV, I don’t have on-demand anything, nor do I Netflix.
On the other hand, the iMac is in the family room too and I could, I suppose, hook that up to the TV, negating the need for another product from Apple.
Thoughts?
Technorati Tags: Apple TV
Posted in Tech | 3 Comments »
January 6th, 2008
Building upon a discussion elsewhere on the Web, here’s some brute force SEO for you.
Apparently, the NY Times is inserting keyword tags into the page title tag in instances where the article headline seems to lack sufficient keywords. Normally, the Times just carries the article’s headline into the page title tag.
For example, in the article headlined The Falling-Down Professions, the page title tag reads as “Economic Conditions-Economic trends-legal profession-lawyers-prestige-doctors - New York Times”.
You see, the page title tag is important for SEO as Google in particular lends much weight to the text contained within the title tag.
All in all, the NY Times approach is definitely an interesting methodology for organizations that deploy content management systems and wish to build traffic from search engines.
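For anyone wanting to try something similar in their own CMS, here is a rough Python sketch of the title tag substitution described above. To be clear, this is a guess at the general idea and not how the Times actually does it: the function name `build_title`, the "no keyword appears in the headline" test and the " - Site Name" suffix handling are all assumptions made for illustration.

```python
def build_title(headline: str, keywords: list[str], site: str) -> str:
    """Build page title text, falling back to keyword tags.

    Assumed heuristic: if none of the article's keywords appear in the
    headline, the hyphen-joined keywords replace the headline.
    """
    lowered = headline.lower()
    headline_has_keyword = any(k.lower() in lowered for k in keywords)
    if keywords and not headline_has_keyword:
        lead = "-".join(keywords)
    else:
        lead = headline
    return f"{lead} - {site}"


title = build_title(
    "The Falling-Down Professions",
    ["Economic Conditions", "Economic trends", "legal profession",
     "lawyers", "prestige", "doctors"],
    "New York Times",
)
print(title)
```

With the headline and keywords from the example above, this reproduces the title text quoted in this post. A headline that already contains one of its keywords is carried through unchanged.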
Technorati Tags: SEO, tagging, title tag, keywords, NY Times, CMS, search engines, search engine traffic
Posted in General | 2 Comments »
December 29th, 2007
Posted in Internet | 1 Comment »
December 28th, 2007
Let’s all take a moment and remember the good old days of the Internet in the 1990s … the Netscape Web browser is being end-of-lifed as of Feb 2008.
If you didn’t catch Code Rush, a documentary on Netscape which was shown on PBS in 2000, I highly recommend you do so.
Technorati Tags: Netscape, Netscape Navigator, Netscape Communicator, Netscape Web Browser, AOL, Mosaic, Mozilla, Mozilla Foundation
Posted in Internet | No Comments »
December 23rd, 2007
This particular crawler is being deployed from the Semantic Web Search Engine (SWSE) project, which is attempting to crawl the nascent Semantic Web, including RSS and FOAF data.
This is yet another reason why deploying RSS is a good idea for any Web presence.
Here’s a link to the SWSE search demo.
Host: 140.203.154.196
/wp-rdf.php
Http Code: 304 Date: Dec 18 14:56:27 Http Version: HTTP/1.1 Size in Bytes: -
Referer: -
Agent: multicrawler (+http://sw.deri.org/2006/04/multicrawler/robots.html)
Technorati Tags: Semantic Web, SWSE, RSS, FOAF, User Agents, Crawlers, Bots, Robots
Posted in Tech, Internet | 2 Comments »
December 17th, 2007
Has anyone else seen some different activity coming from MSN? What I mean is that I’m seeing the following entries in my logs, but they don’t look like traditional MSNBot crawler behavior.
Why this activity is different:
1) The originating IP address is from the MSN netblock.
2) There is an alleged referrer that looks like it is from an MSN search: http://search.live.com/results.aspx?q=keyword&mrt=en-us&FORM=LIVSOP
3) The user agent is showing as a browser.
4) This activity is showing very close to when I see MSNBot entries in my logs.
And no, the behavior does not appear to come from a real-life user.
Host: 65.55.165.38
*
/2006/06/17/live-blogging-from-the-philly-blogger-meeting/
Http Code: 200 Date: Dec 17 02:59:16 Http Version: HTTP/1.0 Size in Bytes: 40839
Referer: http://search.live.com/results.aspx?q=podcasts&mrt=en-us&FORM=LIVSOP
Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)
Host: 65.55.165.42
*
/2006/06/20/how-e-commerce-will-be-affected-by-ie-7/
Http Code: 200 Date: Dec 17 03:13:02 Http Version: HTTP/1.0 Size in Bytes: 43238
Referer: http://search.live.com/results.aspx?q=podcasts&mrt=en-us&FORM=LIVSOP
Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)
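If you want to check your own logs for the same pattern, the first three criteria listed above could be sketched like this in Python. The `LogEntry` class and the function name are hypothetical, and the network range is only a placeholder that covers the two addresses in my excerpt, not the authoritative MSN netblock. The fourth criterion (timing relative to MSNBot visits) is left out to keep the sketch short.

```python
import ipaddress
from dataclasses import dataclass

# Assumption: a placeholder range covering the two addresses in the log
# excerpt above; it is not the authoritative MSN netblock.
SUSPECT_NETWORK = ipaddress.ip_network("65.55.165.0/24")
LIVE_SEARCH_PREFIX = "http://search.live.com/results.aspx"


@dataclass
class LogEntry:
    host: str
    referer: str
    agent: str


def looks_like_disguised_crawler(entry: LogEntry) -> bool:
    """Apply criteria 1 to 3: netblock, search referrer, browser agent."""
    from_network = ipaddress.ip_address(entry.host) in SUSPECT_NETWORK
    search_referer = entry.referer.startswith(LIVE_SEARCH_PREFIX)
    browser_agent = (
        entry.agent.startswith("Mozilla/")
        and "msnbot" not in entry.agent.lower()
    )
    return from_network and search_referer and browser_agent
```

Note that `ipaddress.ip_address()` raises `ValueError` on a malformed host, so log lines with a hostname instead of an IP would need handling before this check.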
Technorati Tags: MSN, MSNBot, Live, Live search, User Agent, crawlers, bots
Posted in Tech, Internet | 2 Comments »
December 16th, 2007
New crawler in my logs from an outfit called Radian6. Judging from their Web site, they look to be a social media monitoring service for the Google Alerts-challenged, I guess much in the same way as those other pre-existing social media monitoring services.
Host: 142.166.3.125
*
/feed/
Http Code: 200 Date: Dec 16 16:52:32 Http Version: HTTP/1.1 Size in Bytes: 7365
Referer: -
Agent: R6FeedFetcher(www.radian6.com/crawler)
Technorati Tags: user agents, Radian6, social media, Radian6 monitors you!
Posted in Internet | No Comments »
December 10th, 2007
Heh. If you happen to Google “hack yahoo fantasy football”, you get a SERP where, well, Google isn’t quite sure what to display. Cleverhack is currently showing as #3 on the SERP, but with no related text. I’ve never seen a SERP quite like this one before.
You see, I’ve been getting a bunch of hits for this particular keyword combination, but what is most interesting is that up until now, I’ve never used the keywords in a sequential phrase. Sure, the word hack is in the URL, I list Yahoo IM info on my sidebar and I’ve written about fantasy football before (and I’m in the playoffs for two leagues this year!), but never sequentially, until now.
Kind of interesting, and yes I’m ruining the results because I’m blogging about it. And for you guys looking for how to hack Yahoo fantasy football, I don’t have that information on this blog…all I can say is that Randy Moss helped me out and LT has been mediocre up until recently.

Technorati Tags: Google, SERP, related text, Hack Yahoo Fantasy Football
Posted in Internet | 1 Comment »
December 10th, 2007
Feed Each Other is yet another online feed reader released in late September. According to one of the developers behind it, the difference between Feed Each Other and other online feed services is that Feed Each Other
lets you harness the power of your network of friends and colleagues to help you filter and explore the web in an fun, enlightening, efficient way.
It seems like the idea is more of a feature than a standalone feed reading service.
However, I will admit that I do like the strict XHTML they’re using…quite nice.
Host: 75.126.131.34
*
/feed/
Http Code: 200 Date: Dec 09 19:31:02 Http Version: HTTP/1.1 Size in Bytes: 6916
Referer: -
Agent: FeedEachOther :) +http://feedeachother.com/
Technorati Tags: RSS, feed readers, social feed readers, Feed Each Other, user agents, what if I don’t have any friends?
Posted in Internet | 1 Comment »
December 8th, 2007
Two trends I’ve run across recently in the email deliverability world.
First, and this is for you designers who have to work on HTML email campaigns, the Email Standards Project. Because, let’s face it, the need to use old school HTML 4 for compatibility with current email clients makes baby jesus cry.
It’s been kind of quiet on the spam filtering front, aside from the proof of concept .ogg and .mp3 spam. In real deliverability terms, from what I’ve been seeing, there seems to be an increased reliance on URL filtering and on sending IP reputation.
And with that in mind, I was sort of amused to get the following text-based email body in my inbox last night. It seems that the spammers are giving up on live URLs and are hoping you’ll be intrigued enough to open a Web browser, find a stock trading site and buy a penny stock.
Hi from Christian . Hope your Friday is cool and happy holidays. Something big for [SOME STOCK] over next few weeks. Check otc boards. Keep an eye out for it and get in early.
Hey, I’m just impressed the guy wished me happy holidays.
Technorati Tags: Email Deliverability, Email Standards Project, HTML Email, Email Clients, SPAM, Text based email
Posted in Tech, Internet | No Comments »