Everything happens somewhere. Along with time, place is therefore a logical lens through which to catalogue and understand history. In the study of written history, it is often easy enough to say what was written where (that is, to situate a text or discourse within geographic space). But how do you situate a place within the discursive space defined by a collection of texts? What might a given newspaper, for example, say about a particular place? Or in more general terms, when people talk about a place, what do they talk about?
I will present one possible answer to these questions in the form of an effort to geoparse and geovisualise around 50,000 articles published in the Brisbane Courier between 1885 and 1890. This undertaking has confronted several challenges, not least those posed by working at scale with extraordinarily ‘dirty’ textual data. My solutions to these challenges, most of which are only partial, include novel methods for correcting, geoparsing, and geovisualising text. Among other techniques, these methods employ no fewer than three different applications of the topic modelling algorithm latent Dirichlet allocation (LDA).
The output of this effort is an interactive Google Maps layer through which the user can explore the textual associations of individual places. In essence, the map serves as a geo-thematic index to the Brisbane Courier, linking directly to relevant newspaper content hosted on Trove. In addition to presenting this output, I will describe some of the methods used to achieve it, and discuss possible future developments.
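To make the idea of a geo-thematic index concrete, the following is a minimal sketch in Python rather than the method actually used for the map. It assumes that each article already carries a single topic label (for example, from a topic model) and that place names are found by naive matching against a small gazetteer. The article records, topic labels, approximate coordinates, and the function name `build_geo_thematic_index` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: article IDs, an assumed dominant topic label, and text.
ARTICLES = [
    {"id": "a1", "topic": "shipping", "text": "The steamer arrived at Ipswich from Brisbane."},
    {"id": "a2", "topic": "floods", "text": "Flood waters rose at Ipswich overnight."},
    {"id": "a3", "topic": "shipping", "text": "Cargo was landed at Brisbane wharves."},
]

# Hypothetical gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "Brisbane": (-27.47, 153.03),
    "Ipswich": (-27.61, 152.76),
}


def build_geo_thematic_index(articles, gazetteer):
    """Map each place to the IDs of articles that mention it, grouped by topic."""
    index = defaultdict(lambda: defaultdict(list))
    for article in articles:
        for place in gazetteer:
            # Naive substring match (an assumption for illustration only);
            # real geoparsing must cope with OCR errors and ambiguous names.
            if place in article["text"]:
                index[place][article["topic"]].append(article["id"])
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {place: dict(topics) for place, topics in index.items()}


index = build_geo_thematic_index(ARTICLES, GAZETTEER)
```

Looking up `index["Ipswich"]` then returns the article IDs for that place grouped by topic, which is the kind of lookup a map marker could expose as links to the underlying articles.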
Author bio: I am currently a PhD candidate at the University of Queensland. I expect to submit my dissertation, on the use of topic modelling in social science, in late 2018. Prior to commencing my PhD, I developed oncewasacreek.org, an independent, web-based research project exploring the environmental history of suburbs along the Milton Reach of the Brisbane River. After completing my PhD, I plan to expand on my historical studies using the text mining and geoparsing techniques developed during my candidature.