Guys, the data mining process is very complicated. There are several organizations that have access to sites, databases, surveys, forums, registered fan clubs, etc. Their rule engines and crawlers use very sophisticated algorithms to calculate and estimate the data. Every calculation has probably been validated against previous years' collected data (at different scales and time spans). The validation process before publishing the data is another huge process. The final statistics might not be absolutely correct, and the ranges might differ from time to time, but they're accurate enough. This is just my own guess, based on experience in my field of work, to add to the discussion.
Based on the number of fans.