The Internet, Policy & Politics Conferences

Oxford Internet Institute, University of Oxford

Scott Hale: Government on the Web: Using Crawlers and Web Archives to Map Government Presence

This presentation reports work in progress on analysing the UK government’s web presence through web crawls and Internet Archive data. Increasingly, the primary interactions citizens have with their governments are online. Link analysis of government web presence can reveal how content is distributed and interconnected. The position of government relative to other institutions, companies, and organisations may be analysed through links from the broader web to government websites. These links indicate the extent to which government plays a watchtower or authority role online, or risks becoming marginalised in online social and information networks with a net loss of nodality.
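
As a rough illustration of the kind of link analysis described above, the following minimal sketch counts links arriving at government hosts from the rest of the web. It is not the project's method: the edge list is invented, the function names are hypothetical, and treating any host under `.gov.uk` as "government" is a simplifying assumption.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumption: a host counts as "government" if it sits under gov.uk.
GOV_SUFFIX = ".gov.uk"


def host(url):
    """Return the lower-cased hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_government(hostname):
    """True for gov.uk itself and for any host ending in .gov.uk."""
    return hostname == GOV_SUFFIX.lstrip(".") or hostname.endswith(GOV_SUFFIX)


def inbound_from_outside(edges):
    """Count links into each government host from non-government hosts.

    `edges` is an iterable of (source_url, target_url) pairs, such as a
    crawler or an archive link extractor might produce.
    """
    counts = Counter()
    for source, target in edges:
        source_host, target_host = host(source), host(target)
        if is_government(target_host) and not is_government(source_host):
            counts[target_host] += 1
    return counts


# Invented example data; the third edge is government-to-government
# and is therefore not counted.
edges = [
    ("https://news.example.com/a", "https://www.gov.uk/vat"),
    ("https://blog.example.org/b", "https://www.gov.uk/tax"),
    ("https://www.gov.uk/x", "https://www.hmrc.gov.uk/y"),
    ("https://shop.example.net/c", "https://www.hmrc.gov.uk/y"),
]

inbound = inbound_from_outside(edges)
```

A count of inbound links from outside hosts is only one crude proxy for the authority role discussed above; a fuller analysis would work on the whole link graph rather than on per-host totals.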

Snapshot web crawls put the researcher in charge of defining the boundaries and depth of the crawl, and can capture a complete picture at a single point in time. These crawls, however, give no temporal context. Web archive data has the potential to address this shortcoming, but it has issues of its own: its timeliness and completeness vary, and it is usually collected by a third party rather than by the researcher.

The presentation will report initial findings from a snapshot crawl of UK government websites and introduce custom tools the team is developing to visualise government networks online. It will also discuss work on preparing the 30TB UK Web Domain Web Archive for link extraction. On completion, the broader project will offer in-depth reflection on the use of web archives for link analysis and a comparison of the findings from the snapshot and web archive approaches.