Facts Are Sacred – a review

Facts Are Sacred is the new book by Simon Rogers, the award-winning editor of guardian.co.uk/data and a news editor on the Guardian, working with the graphics team to bring figures to life on the page. He was closely involved with the Guardian’s ground-breaking decision to crowdsource 450,000 MP expenses records, as well as the organisation’s coverage of the Afghanistan and Iraq ‘Wikileaks’ war logs.

The book, which is available in hardcover and a very inexpensive Kindle edition, describes the changing world of data journalism, touching on big data, open data and citizen hacktivism.

Simon describes the methods and approaches taken by his colleagues on the Guardian, and shows how everyone can get involved in creating, analysing and visualising data.

It is clear that in the last four years things have changed dramatically. Governments, their agencies and local authorities have all started to provide open data, with varying levels of commitment, standards and approaches. That they have made even these limited inroads is positive and significant, and we need to keep pressing to make open data and transparent government the norm. But it does not mean that the job is done. The challenge remains for journalists and citizens alike to put the data in context, analyse it, check its accuracy and uncover what it tells us.

That exposes a need for more, not less, citizen involvement, which itself necessitates better skills in data analysis, understanding statistics and being able to paint a picture (or at least create an infographic) with the data exposed.

Using examples from the 2011 riots, Hurricane Sandy, MPs’ expenses and more, Rogers tells an engaging story of how the Guardian, in particular, interacted with its readers, challenged the government, used open source tools, and broke new ground, not only in using data to source new stories but also in starting to lay the foundations for live data reporting and analysis. This is something that could not have been imagined only 10 years ago, nor when CP Scott, celebrating 50 years as the Guardian’s editor, wrote in 1921 “Comment is free, but facts are sacred”, from which the title is drawn (let alone when the Guardian’s own first edition in 1821 carried some tabular data about Manchester schools).

This is an excellent book, which takes the pulse of data journalism as it stands in this early phase of open data. It offers us all a chance to develop our skills in data analysis and citizen journalism, and reminds us that we can all hold authorities to account and collaborate to develop further tools and crowd-sourced analysis.

Highly recommended!

Ian Watt, 6 May 2013

Webteam Survey 2011 – The Graphs – Part 2 of 3

This is the second part of an article, which begins here.

The scatterplot diagram below shows the same data as in part one of this article but with the data for Essex removed for ease of plotting. Doing so means the spread of data is more easily visualised.

Webteam Survey 2011 – Essex eliminated

One might have expected a fairly linear distribution of data, sloping upward from the bottom left to somewhere in the top right, with bubble sizes (representing web team sizes) starting small at the bottom left and getting bigger as we move right and up. However, this is not the case. One of the largest bubbles is for Fylde council, down in the bottom left.

If you now move to the third part of this article, I’ll plot only the data for the comparator group that I selected for Aberdeen City Council.

Webteam Survey 2011 – The Graphs – Part 3 of 3

This is the third part of an article that starts here, and continues here.

The scatterplot diagram below shows only the data for Aberdeen City Council and its six comparator councils. These were chosen because both their local population and their council FTE size are within +/- 25% of Aberdeen’s.

So the clustering is understandable, but the variation in team sizes, reflected in the bubble size, is remarkable. Watch out for Falkirk and South Tyneside, whose bubbles are almost exactly aligned. If you can’t see one or the other, try refreshing the page (F5), as a jitter function moves the dots around a little.

Webteam Survey 2011 – Aberdeen Comparator group only
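The comparator rule described above (population and FTE size both within +/- 25% of the target council) can be sketched as a simple filter. This is a minimal illustration, not the code behind the graph: the function names, the `population` and `fte` field names, and the figures in the sample are all assumptions made up for the example.

```javascript
// Returns true when value lies within +/- tolerance (as a fraction)
// of the reference value, e.g. tolerance 0.25 means +/- 25%.
function withinTolerance(value, reference, tolerance) {
  return Math.abs(value - reference) <= tolerance * reference;
}

// Picks the councils whose population AND FTE size are both within
// the tolerance of the target council. Field names are hypothetical.
function selectComparators(councils, target, tolerance = 0.25) {
  return councils.filter(
    (c) =>
      c.name !== target.name &&
      withinTolerance(c.population, target.population, tolerance) &&
      withinTolerance(c.fte, target.fte, tolerance)
  );
}

// Made-up figures, for illustration only (not survey data).
const target = { name: "Target", population: 200000, fte: 8000 };
const sample = [
  { name: "A", population: 160000, fte: 9500 }, // both within 25%
  { name: "B", population: 260000, fte: 8000 }, // population 30% above
  { name: "C", population: 210000, fte: 5000 }, // FTE 37.5% below
];

console.log(selectComparators(sample, target).map((c) => c.name));
```

Requiring both measures to match, rather than either one, is what keeps the comparator group small and tightly clustered on the plot.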

Webteam Survey 2011 – The Graphs – Part 1 of 3

In late summer 2011 I carried out a survey of webteams in local authorities in the UK. This survey looked, amongst other things, at the size and composition of the webteam, their location, the size of the organisation, the size of the local population and how many visits the sites get each year.

I posted the survey on one of the Communities of Practice (COP) run by the IDEA and The Improvement Service in Scotland. The survey was taken up and promoted by SOCITM.

The reason for my running the survey was to benchmark Aberdeen City Council’s web team against those of other authorities. I promised all participants that they would get a copy of the data and analysis when I finished.

SOCITM thought that there was merit in what I did and invited me to present at the Website Takeup and Improvement workshop in Birmingham. When I gave my presentation I used both MS PowerPoint and a set of webpages which I’d created using the Protovis JavaScript framework.

I posted both my MS PowerPoint presentation and the Excel spreadsheet of original data back on the COP. I promised to share the graphs via my own blog.

When making the promise I hadn’t foreseen the difficulty in getting Protovis to work with WordPress. I’ve now solved that by using the Protovis Loader Plug-in for WordPress.

Once I had done that, I had to save the JavaScript for each graph to a separate JS file and upload that to a site directory. Lastly, I had to move each of the three graphs to a separate page to get the mouse-over functions to work. If I’d had time I would have worked out a JS function to add commas to the numeric text that loads beneath each graph – but I might return to this later.
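For what it’s worth, a function along the following lines would probably do the job. This is a minimal sketch rather than anything used on this site: the name `addCommas` is hypothetical, and it assumes the values arrive as plain numbers or numeric strings with no existing separators.

```javascript
// Inserts a comma between each group of three digits in the whole-number
// part of a value, leaving any decimal part untouched.
function addCommas(value) {
  const [whole, fraction] = String(value).split(".");
  // Match each position that is followed by one or more complete groups
  // of three digits running to the end of the whole-number part.
  const grouped = whole.replace(/\B(?=(\d{3})+(?!\d))/g, ",");
  return fraction === undefined ? grouped : grouped + "." + fraction;
}

console.log(addCommas(1234567)); // 1,234,567
console.log(addCommas(999));     // 999
```

In browsers that support it, the built-in `Number.prototype.toLocaleString` is an alternative that also handles locale-specific separators.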

In each of the three scatterplot diagrams that I’ve created the council webteam (circle) is plotted against the local population size (y-axis) and the size of the council measured in FTEs (x-axis). The colour of the bubble represents the council type, and the size of the bubble represents the web team size.

The first example below shows the data collected for all councils. Mouse over the bubbles to show which council is which. The data loads below the graph.

Webteam Survey 2011 – all councils

If you go to the next page you can see the same data replotted with Essex’s data removed. Removing this outlier makes the rest of the data slightly easier to visualise.