
StatCounter’s global search engine referrer data indicated that on Thursday, June 4th, Bing.com brought in more search traffic than Yahoo Search… at least to the sites they provide analytics for.
After the initial buzz - and astonishment - quite a few bloggers, commenters and twitterers jumped in and basically said “But who the heck is StatCounter? Let’s see what Hitwise, Comscore, Compete etc. say about it.”
Although that may be a perfectly reasonable response, it bugged me a little because I’m not convinced that web panels (like Comscore, Hitwise, Alexa, Compete, etc.) are the best way of getting at THAT type of data. Different data requires different approaches and, in this case, I think I prefer StatCounter’s approach.
A panel of web surfers is at its best when gauging “broad-spectrum” results, like the “100,000 most popular websites” for example. But a web panel is NOT a superior method for obtaining narrow-spectrum results, like “most popular monitor resolution” and the like.
Here’s why:
If your user panel size is 2 million (like Comscore’s, the gold standard), you only have 2 million “sources” of data. For broad-spectrum work, where you obtain multiple datapoints per user, that’s great. In other words, if each user views an average of 100 web pages/day, that’s 200 million datapoints/day, and that’s a nice big number.
But for narrow-spectrum work, i.e., things that yield fewer datapoints per user (usually just one), like operating system or geographic location, panels are not necessarily superior, and there’s no inherent advantage in using that method over StatCounter’s inbound analytics approach.
The question is: is search engine referrer data a broad-spectrum or narrow-spectrum dataset? Honestly, I’m not sure, but I would argue that it’s fairly narrow:
While a panel member may visit 100 different websites/day, many will use just ONE search engine. (Maybe two for the more adventurous types.) So even with an enormous panel of 2,000,000 users, you’ll still get just 2,000,000 datapoints on the matter of search engine preference, regardless of how many pages those users actually view.
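To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the illustrative figures from this post (a 2,000,000-member panel, 100 pages/day per user) plus my assumption that each panelist sticks to a single search engine. The function and variable names are made up for the example.

```python
# Illustrative figures from the post, not measured data.
PANEL_SIZE = 2_000_000          # panel members
PAGES_PER_USER_PER_DAY = 100    # assumed average pageviews per member per day
ENGINES_PER_USER = 1            # assumption: one preferred search engine per member


def datapoints(panel_size: int, observations_per_user: int) -> int:
    """Total datapoints a panel yields for a given kind of observation."""
    return panel_size * observations_per_user


# Broad-spectrum: every pageview is a separate datapoint.
broad = datapoints(PANEL_SIZE, PAGES_PER_USER_PER_DAY)

# Narrow-spectrum: search engine preference yields about one datapoint per member.
narrow = datapoints(PANEL_SIZE, ENGINES_PER_USER)

print(f"broad-spectrum (pageviews):        {broad:,}")
print(f"narrow-spectrum (engine of choice): {narrow:,}")
print(f"ratio: {broad // narrow}x")
```

Under these assumptions the same panel produces 100 times fewer datapoints on search engine preference than it does on pageviews, which is the whole point: panel size, not pageview volume, is the ceiling for narrow-spectrum questions.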
And if it is a narrow-spectrum dataset, then, given the same sample size, I personally would prefer to observe a group of sites rather than a group of users. In other words, instead of using the panel model of tracking the surfing activity of a set of known users, I suspect you’d get superior results from using the analytics approach and observing the incoming activity of a dispersed, unknown group of RANDOM users. Again, this assumes the user sample size is the same and that we’re tracking something sufficiently narrow.
As for sample size, StatCounter says they track over 10 billion pageloads per month, representing over 2,000,000 unique visitors to the 3,000,000 websites they track. If so, that’s roughly the size of the well-known web panels, and, again, a bit “more random” than web panels since the visitors to StatCounter’s client sites haven’t self-selected to have their online behavior tracked.