So there’s a new web metric in town according to Nielsen/NetRatings - page views are so ’96, the new sheriff is time spent on site. The rationale seems reasonable: in this new web2.0 world of AJAX and Flash, page views are not a good measure of actual usage for many sites. Some sites are more applicationy, remaining on one page for a while as new content comes in.

Long ago, I wrote the web stats system that the NYTimes website used to get its usage stats. Obviously things have progressed a fair bit since then, but a lot of issues remain. Here’s the main one. If you come to a site and then don’t close your browser, perhaps because it’s in a tab or perhaps because you read it and then walked away from your computer (yes, I know, you do that all the time, so do I), how should we count that in time on site? I discovered that there were a couple of ways to handle it. You could cap a “session” at a certain amount of time (the standard then was 30 minutes!), in which case 30 minutes would basically be added to your time on site. Or you could simply ignore the final trailing page, which presents a problem if the visitor - as many do - leaves your site after viewing only one page. Using exactly the same data, I would find that this one simple decision could turn the average session from 1 minute into 25 minutes. Essentially, the deciding factor for time per session was not how much time you actually spent on the site, but how we handled your trailing pageview. And there’s no “correct” way to handle the trailing pageview, because you have no way of knowing whether the user is viewing it or not.
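To make the arithmetic concrete, here is a minimal Python sketch of the two policies. It is not the code from the old stats system; the function names, the policy labels and the four sample sessions are all made up for illustration, and each session is just a list of pageview timestamps in seconds.

```python
from statistics import mean

# The 30-minute session cap mentioned above, in seconds.
SESSION_CAP_SECONDS = 30 * 60


def session_seconds(pageview_times, trailing_policy):
    """Time on site for one session (hypothetical helper).

    pageview_times: sorted pageview timestamps in seconds.
    trailing_policy: "ignore" drops the time spent on the last page,
    "cap" credits the last page with the full session cap.
    """
    # Time between the first and last pageview is the only part we can measure.
    measured = pageview_times[-1] - pageview_times[0]
    if trailing_policy == "ignore":
        return measured
    if trailing_policy == "cap":
        return measured + SESSION_CAP_SECONDS
    raise ValueError(f"unknown trailing policy: {trailing_policy}")


def average_session_seconds(sessions, trailing_policy):
    """Average time on site across sessions under one policy."""
    return mean(session_seconds(s, trailing_policy) for s in sessions)


# Made-up data: two single-page visits and two short multi-page visits.
sessions = [[0], [0, 60], [0, 30, 90], [0]]

print(average_session_seconds(sessions, "ignore"))  # 37.5 seconds
print(average_session_seconds(sessions, "cap"))     # 1837.5 seconds
```

Same pageviews, and the average goes from well under a minute to over half an hour depending only on what you decide about the last page. Single-page visits count as zero under one policy and as a full 30 minutes under the other.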

Back in the day of log processing, you would actually have this same problem if you closed the browser or went to a different site - we had no way of knowing. With JavaScript you can probably get this information much of the time, but not every time. It’s a lesser problem now, but it remains a significant hurdle.

When I raised this issue to the powers that be, it didn’t matter. They understood the issue, but didn’t care. It was simply that if everyone believed the same lies, then they were all working from the same base and no one was getting an advantage over the other, which was the real key. I presume that to be the case now.

There are other problems with this. Switching to a time on site metric, as they say, makes sense for some web2.0 sites because many of them compress what would normally be many page views into a single page view. However, the vast majority of the web remains page view driven, and it does not make sense to move to a time on site metric for those sites. Of course YouTube will benefit from having a long video on a page - but why should the NYTimes be punished for having text content that lets you read as fast as you will?

There are all sorts of problems with the time spent on site number. People visit a single page and leave it in a tab while they check other things. I routinely have dozens of tabs open and I know most other people do as well. Browsers wouldn’t have invented tabbed browsing if people didn’t have many pages open at the same time. The number they produce is interesting and possibly usable for basic understanding, but it is a fiction. And it is a fiction that only makes sense for part of the web at that. Granted, it is a very hyped part of the web, but it is still a tiny portion of it. Time on site remains inappropriate as the main metric for almost all of the web.

All this is not to say that page views and visits are without their problems. But I strongly believe that it is possible to count page views reasonably accurately (although not everyone does it), and that smart counting brings you relatively close to the truth. That is something I think is not currently possible for time on site.
