Trying out some topic modelling

My first presentation of my dissertation project is coming up next week. In the abstract that I submitted to the conference ”Idéhistoria på gång” I heavily emphasized what spatial historical perspectives might add to an already quite extensive historical field like Cold War culture, or Cold War studies. As preparation I decided to look into some of the topic modelling tools available online, to give an easy and comprehensible entry point to my presentation. Whether I succeeded with this blog article is open for discussion.

The deal here is that I want to use the bomb shelter in Swedish urban environments as an example of how global ideological currents manifest themselves in the urban fabric, without turning to a classic architectural history. The famous architectural projects of the mid-20th century are of course still interesting, but they are in a sense too obvious, and they have already been very well researched and written about. By looking into the more primitive forms of architecture of the Cold War era, situated in the basements of ordinary people’s dwellings, I believe it is possible to visualize a vivid relationship between ideological currents and concepts such as ‘Modernity’, down to the individual citizen.

More specifically, the purpose is to uncover the ideology that lies behind the Civil Defence Board’s love affair with shelters. What are the perceived effects of a future war? What is it in the city that will be destroyed, both on a material level and on the level of cultural identity? Why are shelters proposed as a potential solution? I’m using this method to uncover what lies behind cliché concepts like third world war, total war, atomic bomb and fallout shelters. This will also address more specifically how the scientific culture of the state in practice enforces its ideological regime onto the individual citizen.

[Figure: att acceptera grottillvaron ("accepting the cave existence")]

The figure above is an attempt to visualize the problems I’m trying to investigate. What we’re talking about here is a process (from left to right) from global influence to the individual. It is a process where the global concept of modernity addresses the nation state, which in turn chews on the problem, states principles and hands those principles over to state departments (the institutional middle layer). The state departments use contemporary ‘modern’ technology, scientific methods and statistics to arrive at a solution. The solution (in this case the shelter) is then integrated into the urban fabric of the modern city and thus inevitably handled in various ways by the citizen.

So, to get back on track: I want to start by explaining what sort of material I am using in this case. This little text analysis exercise is based on three different government reports from the 1940s and 1950s. I have written about shelters before in an earlier thesis, so I’m fairly well briefed on what they contain. Their topics are obvious. They mainly concern civil defence in general, and all of them contain the Civil Defence Board’s views on how shelters should be built and to what extent. It is important to know that the first of them was published in 1944 and resulted in a civil defence act that required every building of a certain size constructed from 1944 onwards to include some sort of shelter in case of war. The other two government reports were published in 1947 and 1950, that is, after the Bomb with a big B. Both of them were a response to the significant change that nuclear weaponry brought to the military scene, and the authors therefore discussed what should be done to reduce the risk of total annihilation in case of war.

However, what I am looking for today, as I once again turn to these sources, is a connection between shelters and evacuation on the one hand and, on the other, concepts that coincide with or relate to shelters. I’m especially interested in 1. any spatial reference, like city planning and distances, 2. concepts that refer to what is ‘modern’, which includes things that are considered either old or new and ideas of the future in any form, 3. references to technological processes or artefacts, like materials, techniques or the effects of technological breakthroughs, and 4. scientific language and the use of immutable mobiles to state their case.

The tool I am using in this case is called Voyant Tools, a text processing tool that is free to use. The program gives you a fantastic overview of the documents processed. After adding the three texts I can get information about the frequency of specific words, either per document or in total. Voyant Tools can generate a word cloud that visualizes the findings. To sharpen the analysis I can of course edit the list of stop words.
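
For readers who want to see what such a frequency count amounts to, here is a minimal sketch in Python. It is not how Voyant Tools works internally, just the same idea by hand: split the text into words, drop the stop words, and count the rest. The stop word list and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, very short stop word list; a real Swedish list is much longer.
STOP_WORDS = {"och", "att", "i", "som", "för", "av", "den", "det"}

def tokenize(text):
    """Lowercase the text and split it into runs of letters (keeps å, ä, ö)."""
    return re.findall(r"[^\W\d_]+", text.lower())

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count how often each word that is not a stop word occurs."""
    return Counter(t for t in tokenize(text) if t not in stop_words)

# Made-up sample sentence, not taken from the reports.
sample = "Civilförsvaret och skyddsrum. Skyddsrum för civilförsvaret i staden."
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

Editing the stop word set changes which words survive into the count, which is the same lever as editing the stop words in the word cloud.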

[Figure: cirrus + summary]

As you can see here, not surprisingly the word ”Civilförsvaret” [Civil defence] comes first in line with a frequency of 841, followed by ”Skyddsrum” [Shelter] with a frequency of 789.

[Figure: corpus + statistics]

In this section I can see the frequency of the word ‘shelter’ divided between the three different documents, neatly displayed as a curve in the upper left. The word ‘shelter’ is most common in the last of the three reports, with a relative frequency of 83 occurrences per 10,000 words. The colorful bar to the left allows me to find the specific passage where the word appears. In the box in the lower right I can read examples of how the word is being used in context, and by clicking any of the examples I can get the full paragraph displayed.
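
The "per 10,000" figure is a relative frequency: the raw count divided by the length of the document, scaled up so that documents of different length can be compared. A rough sketch of that calculation, assuming simple whitespace-and-punctuation tokenization (Voyant's own tokenizer may differ), with two made-up stand-in texts:

```python
import re

def relative_frequency(text, word, per=10_000):
    """Occurrences of `word` per `per` words of `text`."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return 0.0
    return tokens.count(word.lower()) / len(tokens) * per

# Hypothetical stand-ins for two of the reports, not real excerpts.
corpus = {
    "1944": "skyddsrum skall anordnas i varje byggnad",
    "1950": "skyddsrum och åter skyddsrum i berg",
}

for year, text in corpus.items():
    print(year, round(relative_frequency(text, "skyddsrum")))
```

With real reports of tens of thousands of words the numbers land in the range shown in the figure; the tiny samples here only demonstrate the arithmetic.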

So this is the general idea, and well, it is no real surprise that I found shelters and civil defence in these documents. But looking closer at the word cloud, there are some interesting findings to consider if we go back to the original inquiry. For example, there is actually quite a number of spatial references here that can be looked into further. I found 154 uses of ‘building’ and an equal number of ‘buildings’, 165 uses of ‘area’ and 161 uses of ‘property’. There are also words that, on closer inspection, turn out to be closely related to spatial references.
The word for ‘general’ [allmänna] is also used in the sense of ‘public’. Clicking this word brought up a range of cases that referred to public spaces, public shelters, public health institutions and public schools. These are, in other words, topics that the authors touch upon in terms of civil defence.
Around the word ‘area’ (sometimes used as ‘field’) [område] there are a number of interesting uses, especially if I split the search between two of the documents. In the report from 1947 I find concerns about the ‘field’ of technology, its rapid development and new keywords such as aerial warfare, flight technology and atomic bombs. The same word in 1950 gives results that concern weaponry, radioactivity, weapon technology, shelters, rock solid shelters, evacuation, underground and below surface. Thus there is a difference in usage here, which reveals that the authors were occupied with what they thought of as different problems depending on the time of publication.
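
Looking at a word together with its neighbours like this is usually called keyword in context (KWIC). A small sketch of the idea in Python, again only an approximation of what the tool shows; the function name, the window size and the sample sentence are my own assumptions for the example:

```python
import re

def keyword_in_context(text, keyword, window=3):
    """Return (left, keyword, right) for every occurrence of `keyword`,
    with up to `window` words of context on each side."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits

# Made-up sentence for illustration, not a quote from the reports.
sample = "Utvecklingen på teknikens område går snabbt och varje område berörs."
for left, word, right in keyword_in_context(sample, "område", window=2):
    print(f"{left} [{word}] {right}")
```

Running the same search on each report separately is what makes the shift in the company the word keeps between 1947 and 1950 visible.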

It is possible to play with this indefinitely, but what exactly can it contribute? Voyant Tools not only displays the results as a word cloud but also lets you explore in what context and meaning a word occurs. In this case it is also obvious that there are things in these texts that occur more often than, for example, the atomic bomb, which is often thought of as the main impulse behind the concept of shelters.

[Figure: cirrus]

Looking closer at the cirrus cloud tool again, I find that many of the words are of an economic and organisational character. On the economic side, cost, property, reimbursement and crowns (the currency) occur frequently. Organisation and control seem to be another important topic: organisation, municipality, regulations, extent, tasks & data, function & activity, action and personnel. There are also three different state-controlled institutions mentioned: the municipality, the provincial administration and the Civil Defence Board. Perhaps one of the most interesting words is ‘måste’, which means ‘must’ and which emphasizes the urgency of the matter.

The word ‘måste’ actually reveals an array of interesting problems:

[Figure: maste-voyant]

The graph shows that the frequency differs depending on which document we look at. The peak in 1944 is probably due to the regulatory character of that government report. It contained a complete set of regulations for more or less all aspects of civil defence, so it is not very surprising that ‘must’ is quite common. The report in the middle is, however, much more vague and is an assessment of the situation just after the introduction of atomic weaponry. By the time of the last government report, released in 1950, the effects of nuclear weaponry had been researched much more thoroughly. At this point the Civil Defence Board probably felt that it was quite obvious what the effects of the new age were and what they had to do, and they were therefore much more confident in their approach. Hence the word ‘must’ was much more commonly used.

So what can be derived from this little analytical experiment? It is fairly obvious that the economic side of urban destruction is a major topic in this material, and that I might have to consider this when doing close readings. Mainly it seems to concern the destruction of owned property, either privately or state owned. The destruction in general also concerns public institutions to some extent. Furthermore, the organisational side of the problem is solved by using a decentralised form of organisation. It seems to be taken for granted that monetary and welfare institutions will be affected in a war situation but will still function.

The small exploration of the word ‘must’ also suggests that I might expect to find an extended knowledge base as the Civil Defence Board enters the new decade. Perhaps I can identify a number of immutable mobiles on the topic of atomic bombs that were introduced to the scientific community from abroad or from national research? It might also be interesting to compare this with public opinion, to see what the cloud of contemporary discursive references contains, for example political speeches, articles, photographs, movies or anything similar.

The little example displayed in the last figure above must be said to underscore the spatial aspects of threat and might be exactly what I am looking for:

 grund av bostadsbyggandets lokalisering till samhällenas ytterområden ha normalskyddsrum väsentligen kommit att anordnas inom dessa områden. De inre delarna ha blivit eftersatta, ehuru anfallsriskerna och förutsedda skadeverkningar där äro störst och behovet av skydd för den där arbetande och bosatta civilbefolkningen följaktligen viktigt. Även om en viss sanering av tätorternas innerområden kommer att äga rum, måste man räkna med att samma tendens kommer att göra sig gällande även i fortsättningen. Ett fortsatt skyddsrumsbyggande efter hittillsvarande riktlinjer, frånsett skyddsrummens skyddsförmåga i och för sig, kan därför icke anses ändamålsenligt. Skyddsrum måste under alla förhållanden anordnas för de tättbefolkade innerområdena.

Freely translated into English, with added emphasis on topics:
Because housing construction has been located to the outskirts of the communities [spatial reference], normal-type shelters have mainly come to be arranged within these areas [existing technique]. The inner parts have been neglected, although the risk of attack [scientific knowledge] and the expected damage [perceived value] [technological problem] are greatest there, and protection of the civilian population working and living there is consequently crucial. Even though a certain amount of clearance of the urban centres will take place [spatial planning], one has to take into account that the same tendency will persist in the future [ideas of modernity and future]. A continuation of shelter building along present guidelines, apart from the protective capacity of the shelters as such, can therefore not be considered appropriate [reinterpretation and adaptation to technological progression]. Shelters must under all circumstances be arranged for the densely populated urban centres.

Many of these conclusions might seem trivial. The use of this approach is however obvious if it is seen as part of the research process rather than as an end product. As I turn to these texts for another round of close reading, the topics that digital text analysis brings out can both guide me and help clear away assumptions about what I might find in the texts.