As Colin Jacobs points out, the recent Wikipedia censorship affair provides an excellent example of the kind of obstruction that will become routine once Australia implements a mandatory ISP-level filtering regime.
It provides more than rhetorical ammunition, though. This overzealous act of censorship gives us a very real opportunity to legally gather cold, hard data on the performance impact of the 2nd stage filtering infrastructure of the UK clean feed mechanism.
With a simple script, we get users in the UK, Australia and elsewhere to regularly poll random Wikipedia articles and record the URLs, response times, status codes and response sizes.
We keep running these scripts until after IWF and Wikipedia sort out their differences.
We then do a statistical analysis of the performance difference with 2nd stage filtering switched on and with it switched off. This should provide us with an accurate measure of exactly how much overhead 2nd stage filtering adds to typical response times.
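The core of that comparison could be as simple as the sketch below: the mean difference between the two samples of response times, plus a standard error so we know whether the difference is real. The exact statistical method is still to be decided; this just shows the shape of it:

```python
# Sketch of the planned comparison: response-time samples collected while
# 2nd stage filtering was on vs. after it was switched off. Reports the
# estimated overhead and the standard error of that estimate
# (the Welch two-sample form, using only the standard library).
import math
from statistics import mean, variance

def overhead_estimate(filtered, unfiltered):
    """Return (mean overhead in seconds, standard error of the estimate)."""
    diff = mean(filtered) - mean(unfiltered)
    se = math.sqrt(variance(filtered) / len(filtered) +
                   variance(unfiltered) / len(unfiltered))
    return diff, se
```

An overhead several standard errors above zero would be hard to explain away as noise.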
I will be writing a small bash script that generates the requisite log files and publishing it here. Please follow this blog entry and the #ukaucensorwall topic on Twitter for further updates. I can be contacted via @jon_seymour on Twitter or jon.seymour at gmail.com.
If anyone else can provide technical expertise to assist with this measurement exercise, please post a comment here or to the Twitter topic with contact details.
For now, please:
- follow @ukaucensorwall for updates
- post to #ukaucensorwall to contribute
- poll this page http://tinyurl.com/6rwsec occasionally for future updates
- enlist others to the cause (particularly in the UK, but elsewhere would be good too - it will provide a useful control)
Update (2008-12-08 16:05 UTC): I need to think about this some more, so there won't be a script to run for another 16 hours or so, if at all.
Revised Plan 2008-12-08 22:00:00 UTC
A simple script isn't going to do the job because of the risk of inadvertently DDoS'ing Wikipedia should I get too many volunteers.
That said, I do have a simple bash script available which I will share with a restricted number of users in the UK and Australia if there is interest. This could be used to collect some stats right now, just in case the 2nd stage filtering is switched off before I finish the final implementation. You need Mac OS X, Windows + Cygwin, or Linux + wget to run it. Contact me by e-mail if you are interested in running it - the script MUST NOT be shared with anyone else, because of the DDoS risk if it went viral.
Current thought is to either use BOINC or write a custom Java client that communicates with a governing servlet. This will allow tight control over the load Wikipedia gets subjected to and better control over which URLs are tested (which may help with the analysis). I'll probably go with Java rather than BOINC, since Java is what I know best, unless there is someone who knows how to do BOINC implementations.
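To sketch the governing idea: the server hands out test URLs at a fixed global rate, so the total load on Wikipedia stays bounded no matter how many volunteers sign up. This is a rough Python illustration of the rate control only; the real implementation would be the Java servlet (or BOINC), and the class and parameter names here are my own inventions:

```python
# Rough sketch of the "governing servlet" idea: clients ask for the next URL
# to test, and the governor refuses (returns None) whenever granting the
# request would exceed the global request rate. Names are illustrative.
import threading
import time

class Governor:
    """Hands out at most `rate` test URLs per second across ALL volunteers."""

    def __init__(self, urls, rate=1.0):
        self.urls = list(urls)
        self.interval = 1.0 / rate       # minimum gap between grants
        self.next_ok = time.monotonic()  # earliest time the next grant is allowed
        self.lock = threading.Lock()     # serialise concurrent client requests
        self.i = 0

    def next_url(self):
        """Return the next URL to test, or None if the client must back off."""
        with self.lock:
            now = time.monotonic()
            if now < self.next_ok:
                return None              # too soon: client retries later
            self.next_ok = now + self.interval
            url = self.urls[self.i % len(self.urls)]
            self.i += 1
            return url
```

Because the server chooses the URLs, it can also make sure the same articles get tested from both the UK and the control countries, which should help the analysis.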
I am not going to be able to spend any time on this until later this evening (Sydney time), however if you can lend a hand before that, please let me know and I can do the cyber equivalent of waving my hands in the air to sketch the plan in more detail.
If anyone can assist with any of these tasks or issues, please let me know:
- BOINC implementations
- Java programming
- Tomcat + Apache hosting
- statistical analysis - particularly to help plan the test method
- legal and ethical considerations
- enlisting UK volunteers - we really need them :-)!
- enlisting other Australian + international volunteers - useful controls
Revised Plan 2008-12-09 00:41:00 UTC
The IWF has backed down, so we can't measure the 2nd stage filter this way anymore. However, if we get prepared now, we will be in a better position to do it should a similar issue occur in the future.