Friday, 27 January 2012

Meanwhile, Vole discovers a new toy

I've been told (thanks to an academic contact on Twitter) about Google's Ngrams: they use their database of pretty much every book written to extract raw data. For instance, I'm wondering about writing a piece on the use of 'banditti' in literature and the media, to see whether, as I suspect, it's used in English with an anti-Catholic subtext. With Ngrams, I get a really good chart showing me how much and when it was used:

So the term wasn't very common at all, but usage started fairly suddenly around 1730, went through four peaks (probably connected to Gothic literature like Ann Radcliffe's The Romance of the Forest, 1791) and then slowly declined. Google links to the texts which use the term (not newspapers, sadly) and allows you to crunch the raw data yourself.

Endless fun. What a great tool. Who knew, for instance, that the word 'git' reached its peak in 1940? Annoyingly, it doesn't distinguish between 'git' as in 'git on up' and 'git' as in idiot. I'm certainly looking forward to reading The Magic Git-Flip. (Neal's favourite word, 'gitwizard', appears not to have been taken up in literary circles as yet.) 'Plashing' peaked in 1860, used in a very poor poem in Dickens' All The Year Round magazine. Interestingly, my names bump along as a choice in fiction until about 1980, since when usage has increased massively. Which makes me cool. Doesn't it?


neal said...

I've never used that word in my life.

Emma said...

Was just playing with your new toy. Try running a search on terrorism for 1800 to 2010.
Was going to tweet you but can't access the site from work.