I spotted a nice tool from Google last week that I hadn’t heard of before. It’s called Google Correlate, and it shows you two fun little pieces of information, one of which is a way to ‘prove’ to people (i.e. trick them into thinking) that patterns are present when there are none.
Firstly, you can see how closely the search frequencies of different terms track each other over time. This is not very interesting, as a particular search term is most likely to be correlated with variants of itself. For example, if you input ‘dvd’ as a search term, the top correlations are ‘dvd burner’ and ‘dvd recorder’. Maybe this would be good for picking under-utilized AdWords keywords.
The second aspect is more interesting. You can see how correlated Google search terms are with real-world data, either time series or information about US states. The time series is cool, but being able to correlate any state-level data against every Google search term means you are almost guaranteed to generate something that a tabloid paper can run with. Here are some interesting correlations I’ve found while playing with the tool:
Annual Rainfall of US state (in.) <-> Google search disney vacation package: 0.9093 (link).
Literacy Rates in each US state (% of population) <-> Google search olympics: 0.8897 (link).
Southern US states (as defined by US Census Bureau) <-> Google search crape myrtle: 0.8982 (link).
Populations of US States (inverted) <-> Google search gatorade player of the year: 0.8963 (link).
US States that I have visited <-> Google search virgin america airline: 0.8617 (link).
Do illiterate people hate the Olympics? Does rain make you want to go to Disneyland? Do I fly with Virgin America everywhere in the US? The answer to all of these is, of course, no. Even so, these kinds of correlations are often presented in the media as ‘evidence’ of a subjective opinion that has no basis in fact. I feel the best way to fight this attitude is with education and access to the data cited. Until then, I’m sure more than a few newspaper headlines will be generated by correlations such as the ones above.
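To see why a high score is so easy to find, here is a minimal Python sketch of the effect. It is not how Google Correlate works internally. The function names, the use of pure random noise as stand-in ‘search terms’, and the default of 5000 candidates are all my own assumptions for illustration. It takes one made-up value per state, compares it against many unrelated random series, and keeps the strongest Pearson correlation it finds.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_spurious_match(n_states=50, n_terms=5000, seed=0):
    """Strongest |correlation| between one random per-state series and
    n_terms other random series that are unrelated to it by construction.

    Hypothetical helper: the random series stand in for search terms.
    """
    rng = random.Random(seed)
    # The 'real-world' data set: one made-up number per state.
    target = [rng.gauss(0, 1) for _ in range(n_states)]
    best = 0.0
    for _ in range(n_terms):
        # A fake search term: pure noise, one value per state.
        candidate = [rng.gauss(0, 1) for _ in range(n_states)]
        best = max(best, abs(pearson(target, candidate)))
    return best


if __name__ == "__main__":
    print(f"1 candidate:     {best_spurious_match(n_terms=1):.3f}")
    print(f"5000 candidates: {best_spurious_match(n_terms=5000):.3f}")
```

Every series here is noise, so any correlation reported is meaningless, yet the best match out of thousands comes out far stronger than a typical single comparison. Real search data is not independent noise (neighbouring states tend to search alike), and the tool has vastly more than 5000 terms to pick from, so I would expect the real thing to turn up even stronger coincidences than this sketch does.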