pseudomonas ([personal profile] pseudomonas) wrote 2013-12-21 11:09 pm

O2 vs Wikipedia - a quick look inside the minds of the folks who build the blockers.

O2, in common with just about all mobile companies, blocks some web content. Unlike most, they helpfully provide a URL checker, http://urlchecker.o2.co.uk/, where anyone can check whether a URL is blocked. Update: that page has been taken down "to ensure it's fit for purpose and provides transparent info to [O2's] customers".

There are three levels of blocking:

Open Access - what people who've asked for "no filtering at all" see.
Default Safety - what people who've signed up without expressing preferences see.
Parental Control - what people who've actively asked for a child-friendly device see.

Now, O2's Parental Control is a funny old thing. It allows http://www.mcdonalds.com but blocks http://www.childline.org. To be honest, it blocks most of the internet apart from a tiny number of mostly corporate sites. It allows amazon.co.uk but blocks amazon.com. We may never know why - this is all done by their unspecified third-party partner (rumour has it that this is probably Symantec).

Wikipedia seems to be an interesting case - it's allowed, but certain pages are blacklisted. This is all done very shoddily, if the URL checker is to be believed. So https://en.wikipedia.org/wiki/Penis is blocked but https://en.wikipedia.org/wiki/Penises is allowed, even though they both go to the same damn page.[1] Also, they block Penis but fail to block Clitoris.[2]

The choice of which pages to block on Wikipedia is interesting. A bit of playing around revealed that there wasn't much consistency; it looked like, rather than applying a classifier to every page, someone had made a list of a few pages with titles that seemed dodgy to them, and had called it a day. This seemed an ideal opportunity to find out what the spirit was behind the blocking, especially since they kindly tell us what the category of nastiness is.
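To make that hypothesis concrete, here is a minimal Python sketch of a filter that works from a hand-made list of exact page titles. The names `BLOCKED_TITLES` and `is_blocked` are made up for illustration, and nothing is known here about how O2's partner actually implements the filter. It only shows why an exact-match list would block Penis while letting the Penises redirect through.

```python
from urllib.parse import urlsplit, unquote

# Hypothetical hand-picked list of titles; not O2's real data.
BLOCKED_TITLES = {"Penis", "Vagina", "Vulva"}


def is_blocked(url: str) -> bool:
    """Return True if the URL's Wikipedia page title is on the list (exact match only)."""
    path = urlsplit(url).path  # e.g. "/wiki/Penis"
    prefix = "/wiki/"
    if not path.startswith(prefix):
        return False
    title = unquote(path[len(prefix):])
    # No redirect resolution and no look at the page content: only the title is compared.
    return title in BLOCKED_TITLES


print(is_blocked("https://en.wikipedia.org/wiki/Penis"))    # True
print(is_blocked("https://en.wikipedia.org/wiki/Penises"))  # False
```

A filter that classified the page actually served, rather than the string in the URL, would treat both addresses the same way.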

Wikipedia has a nice list of the 5000 most visited pages. I ran them through the checker[3] and made a list of the aberrations, sorted by category. Pages in more than one category will appear twice; if they're blocked in one and not in the other, they're still blocked to the user.
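Turning page titles into URLs is where the encoding slip mentioned in footnote 3 is easy to make. Below is a minimal sketch using Python's standard `urllib.parse.quote`. The function name `wikipedia_url` and the sample titles are illustrative only, and this is not the script that produced the table.

```python
from urllib.parse import quote


def wikipedia_url(title: str) -> str:
    """Build an English Wikipedia URL, percent-encoding the title as UTF-8."""
    # Wikipedia uses underscores in place of spaces in page URLs.
    slug = title.replace(" ", "_")
    # quote() leaves letters, digits and "_.-~/" alone and encodes everything else.
    return "https://en.wikipedia.org/wiki/" + quote(slug)


for title in ["Beer", "Beyoncé", "Schindler's List", "C (programming language)"]:
    print(wikipedia_url(title))
# https://en.wikipedia.org/wiki/Beer
# https://en.wikipedia.org/wiki/Beyonc%C3%A9
# https://en.wikipedia.org/wiki/Schindler%27s_List
# https://en.wikipedia.org/wiki/C_%28programming_language%29
```

Whether a given checker expects the encoded or the raw form is something to test against the checker itself; the sketch only shows what consistent encoding looks like.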


| Page title | Category | Open Access | Default Safety | Parental Control |
|---|---|---|---|---|
| Alcoholic_beverage | Alcohol | Allowed | Allowed | Blocked |
| Beer | Alcohol | Allowed | Allowed | Blocked |
| Bourbon_whiskey | Alcohol | Allowed | Allowed | Blocked |
| Brandy | Alcohol | Allowed | Allowed | Blocked |
| Scotch_whisky | Alcohol | Allowed | Allowed | Blocked |
| Vodka | Alcohol | Allowed | Allowed | Blocked |
| Whisky | Alcohol | Allowed | Allowed | Blocked |
| Gay | Lifestyles | Allowed | Allowed | Blocked |
| Lesbian | Lifestyles | Allowed | Allowed | Blocked |
| Transgender | Lifestyles | Allowed | Allowed | Blocked |
| Anal_sex | Mature Content | Allowed | Allowed | Blocked |
| Cunnilingus | Mature Content | Allowed | Allowed | Blocked |
| Fellatio | Mature Content | Allowed | Allowed | Blocked |
| G-Spot | Mature Content | Allowed | Allowed | Blocked |
| Human_sexual_activity | Mature Content | Allowed | Allowed | Blocked |
| Kama_Sutra | Mature Content | Allowed | Allowed | Blocked |
| Oral_sex | Mature Content | Allowed | Allowed | Blocked |
| Sadomasochism | Mature Content | Allowed | Allowed | Blocked |
| Sex_position | Mature Content | Allowed | Allowed | Blocked |
| Sexual_arousal | Mature Content | Allowed | Allowed | Blocked |
| Sexual_intercourse | Mature Content | Allowed | Allowed | Blocked |
| Tribadism | Mature Content | Allowed | Allowed | Blocked |
| Anal_sex | Sexual Advice | Allowed | Blocked | Blocked |
| Cunnilingus | Sexual Advice | Allowed | Blocked | Blocked |
| Fellatio | Sexual Advice | Allowed | Blocked | Blocked |
| G-Spot | Sexual Advice | Allowed | Blocked | Blocked |
| Human_sexual_activity | Sexual Advice | Allowed | Blocked | Blocked |
| Oral_sex | Sexual Advice | Allowed | Blocked | Blocked |
| Sadomasochism | Sexual Advice | Allowed | Blocked | Blocked |
| Sex_position | Sexual Advice | Allowed | Blocked | Blocked |
| Sexual_arousal | Sexual Advice | Allowed | Blocked | Blocked |
| Sexual_intercourse | Sexual Advice | Allowed | Blocked | Blocked |
| Tribadism | Sexual Advice | Allowed | Blocked | Blocked |
| Birth_control | Sexual Education | Allowed | Allowed | Blocked |
| Enema | Sexual Education | Allowed | Allowed | Blocked |
| Erection | Sexual Education | Allowed | Allowed | Blocked |
| Gonorrhea | Sexual Education | Allowed | Allowed | Blocked |
| Penis | Sexual Education | Allowed | Allowed | Blocked |
| Pubic_hair | Sexual Education | Allowed | Allowed | Blocked |
| Sexually_transmitted_disease | Sexual Education | Allowed | Allowed | Blocked |
| Vagina | Sexual Education | Allowed | Allowed | Blocked |
| Vulva | Sexual Education | Allowed | Allowed | Blocked |
| Suicide_methods | Suicide | Allowed | Blocked | Blocked |
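
The rule that a page listed under several categories is blocked if any one of them blocks it can be sketched in Python as follows. The rows are a small sample copied from the table; the names `ROWS`, `LEVELS` and `effective_verdicts` are invented for this illustration, and this is not the script used to build the table.

```python
from collections import defaultdict

LEVELS = ("Open Access", "Default Safety", "Parental Control")

# A few rows from the table: (page, category, verdict at each level).
ROWS = [
    ("Beer", "Alcohol", ("Allowed", "Allowed", "Blocked")),
    ("Anal_sex", "Mature Content", ("Allowed", "Allowed", "Blocked")),
    ("Anal_sex", "Sexual Advice", ("Allowed", "Blocked", "Blocked")),
    ("Suicide_methods", "Suicide", ("Allowed", "Blocked", "Blocked")),
]


def effective_verdicts(rows):
    """Combine per-category rows: a page is blocked at a level if any category blocks it."""
    blocked = defaultdict(lambda: [False] * len(LEVELS))
    for page, _category, verdicts in rows:
        for i, verdict in enumerate(verdicts):
            blocked[page][i] = blocked[page][i] or verdict == "Blocked"
    return {
        page: tuple("Blocked" if flag else "Allowed" for flag in flags)
        for page, flags in blocked.items()
    }


for page, verdicts in effective_verdicts(ROWS).items():
    print(page, dict(zip(LEVELS, verdicts)))
```

For Anal_sex, the Mature Content row alone would leave it allowed under Default Safety, but the Sexual Advice row blocks it there, so the combined verdict is blocked.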



Notice that their "lifestyles" category has only three items within the top 5000 Wikipedia pages.[4] What these have in common is left as an exercise for the reader. Whether that falls foul of the Equality Act 2010 is left as an exercise for the reader who knows about English law.

Notice also that the list does not include, for instance, the following top-5000 pages: Asexuality, Celebrity_sex_tape, Child_pornography, Homosexuality, Human_sexuality, List_of_female_porn_stars, List_of_Masters_of_Sex_episodes, List_of_pornographic_actresses_by_decade, Masters_of_Sex, Pansexuality, Pornhub, Pornographic_film_actor, Pornographic_film, Pornography, Revenge_porn, Same-sex_marriage, Same-sex_marriage_in_the_United_States, Sex, Unsimulated_sex, YouPorn ... and that's just in the top 5000 out of 4 million. Anyone who thinks the filter is effective is going to be very disappointed. And those are just some of the sex-related pages - they make no attempt to block pages about war, death, torture, or other potentially distressing subjects. Again, speculation about the mindset behind this is left to the reader.

This all reflects very badly on O2; but I think we should assume that the other ISPs are every bit as incompetent, until they present us with evidence to the contrary.

If anyone would like to help me with a similar but more extensive project for TalkTalk, BT or Sky, please let me know in the comments. You'd need a line with one of those ISPs, a willingness to give me SSH access to something at your end (it probably helps with that bit if you're a wee bit tech-y), and a preparedness to turn the dreaded filters on for a bit.

I've been ranting about this at more length at [twitter.com profile] pseudomonas.


[1] Note as an aside that the URL checker claims to be able to tell the difference between the two httpS URLs. This would be very worrying if true, but my suspicion is that it isn't, and that the URL checker is just shoddily written and assumes they're plain http.
[2] Perhaps because they had problems finding it.
[3] Actually, I misinterpreted how the URL checker dealt with encoding, so titles containing brackets, punctuation, apostrophes, or diacritics got skipped. Sorry.
[4] Since you asked, and to save you a click or two: "Bisexual" isn't in the top-5000 list of pages, but that Wikipedia page is indeed classified as "lifestyle".

[personal profile] tarchannan 2013-12-22 03:11 am (UTC)
Thanks, this is interesting. Good work.

[personal profile] sunflowerinrain 2013-12-22 06:46 pm (UTC)
Excellent piece of work. Thank you. I contented (not really the correct term for how I felt) myself with a succinct, somewhat pejorative, and much less informative snarl on Facebook.


[personal profile] atreic 2013-12-22 08:34 pm (UTC)
Gosh, that's all very interesting.

[personal profile] kaberett 2013-12-23 11:51 pm (UTC)
Thanks lots for writing this up.

[personal profile] sunflowerinrain 2013-12-24 02:38 pm (UTC)
I just followed a link to an article in New Statesman, and wondered if you have suddenly acquired a lot of followers on Twitter.

That article is a bit crude for sharing where my aunt and uptight cousins can see it, but it does make another, potentially useful, point about the option to have the filter active or not: that an abusive partner could block an adult's access to help sites. If this is so, my support for merely boycotting the filters is not sufficient.