From a32442c4cc529707dc13dbe1842d505b47308665 Mon Sep 17 00:00:00 2001
From: Nick White
Date: Mon, 15 Nov 2010 21:55:01 +0000
Subject: Add a couple more html examples for tests

---
 tests/html/guardiangoogle.html        | 6460 +++++++++++++++++++++++++++++++
 tests/html/guardiangoogle.html.simple |   59 +
 tests/html/newyorkerleaks.html        | 6793 +++++++++++++++++++++++++++++++++
 tests/html/newyorkerleaks.html.simple |   58 +
 4 files changed, 13370 insertions(+)
 create mode 100644 tests/html/guardiangoogle.html
 create mode 100644 tests/html/guardiangoogle.html.simple
 create mode 100644 tests/html/newyorkerleaks.html
 create mode 100644 tests/html/newyorkerleaks.html.simple

(limited to 'tests')

diff --git a/tests/html/guardiangoogle.html b/tests/html/guardiangoogle.html
new file mode 100644
index 0000000..094a0f6
--- /dev/null
+++ b/tests/html/guardiangoogle.html
@@ -0,0 +1,6460 @@
Google is polluting the internet | Micah White | Comment is free | guardian.co.uk

Google is polluting the internet

The danger of allowing an advertising company to control the index of human knowledge is too obvious to ignore

[Image: google eye logo]
Google advertising … 'We have public libraries. We need a public search engine.' Photograph: Dominic Lipinski/PA

An advertising agency has monopolised, disorganised, and commercialised the largest library in human history. Without a fundamental rethinking of the way knowledge is organised in the digital era, Google's information coup d'état will have profound existential consequences.

Google was originally conceived to be a commercial-free search engine. Twelve years ago, in the first public documentation of their technology, the inventors of Google warned that advertising corrupts search engines. "[W]e expect that advertising-funded search engines," Larry Page and Sergey Brin wrote, "will be inherently biased towards the advertisers and away from the needs of the consumers." And they condemned as particularly "insidious" the sale of the top spot on search results; a practice Google now champions.

Under the sway of CEO Eric Schmidt, Google currently makes nearly all its money from practices its founders once rightly abhorred. Following its $3.1bn acquisition of DoubleClick in 2007, Google has become the world's largest online advertising company. With ad space on 85% of all internet sites, upwards of 98% of Google's revenue comes solely from polluting online knowledge with commercial messages. In the gleeful words of Schmidt, "We are an advertising company." Google is not a search engine; it is the most powerful commercialising force on the internet.

Every era believes their way of organising knowledge is ideal and dismisses prior systems as nonsensical. Academic libraries in the US use subject categorisation derived from Sir Francis Bacon's 17th-century division of all knowledge into imagination, memory and reason. Yet who today, aside from one or two exceptions, would try to organise the internet using a handful of categories? For a generation trained to use Google, this approach seems outmoded, illogical or impossible. But modern search engines, which operate by indexing instead of categorising, are also fundamentally flawed.

Three hundred years ago, Jonathan Swift foresaw the cultural danger of relying on indexes to organise knowledge. He believed index learning led to superficial thinking. Swift was right and a growing of teachers and public intellectuals are coming to the realisation that search engines encourage skimming, light reading and trifling thoughts. Whereas subject classification creates harmony and encourages serendipity; indexes fracture knowledge into snippets making us stupid. Thanks to Google, the superficiality of index learning is infecting our culture, our society, and our civilisation.

Google did not invent the index. That honour goes to the 500 monks led by Hugh of St Cher who compiled the first concordance of the Bible in 1230. Nor was Google the first to dream of indexing all of human knowledge. Henry Wheatley had the idea in 1902 for a "universal index". And Google was not the first to cynically dump advertisements into the search-engine index. What makes Google unique is the extent to which it has, oblivious to the consequences, made a business out of commercialising the organisation of knowledge.

The vast library that is the internet is flooded with so many advertisements that many people claim not to notice them anymore. Ads line the top and right of the search results page, are displayed next to emails in Gmail, on our favourite blog, and beside reportage of anti-corporate struggles. As evidenced by the tragic reality that most people can't tell the difference between ads and content any more, this commercial barrage is having a cultural impact.

The omnipresence of internet advertising constrains the horizon of our thought. Seneca's exhortations to live a frugal life are surrounded by commercials for eco-holidays. The parables of Jesus are mere fodder for selling bamboo flooring. The juxtaposition of advertisements with wisdom neutralises the latter. The prevalence of commercial messages traps us in the marketplace. No wonder it has become nearly impossible to imagine a world without consumerism. Advertising has become the distorting frame through which we view the world.

There is no system for organising knowledge that does not carry with it social, political and cultural consequences. Nor is an entirely unbiased organising principle possible. The trouble is that too few people realise this today. We've grown complacent as researchers; lazy as thinkers. We place too much trust in one company, a corporate advertising agency, and a single way of organising knowledge, automated keyword indexing.

The danger of allowing an advertising company to control the index of human knowledge is too obvious to ignore. The universal index is the shared heritage of humanity. It ought to be owned by us all. No corporation or nation has the right to privatise the index, commercialise the index, censor what they do not like or auction search ranking to the highest bidder. We have public libraries. We need a public search engine.

In 1998, Larry Page and Sergey Brin made a promise: "We believe the issue of advertising causes enough mixed incentives that it is crucial to have a competitive search engine that is transparent and in the academic realm." Now it is up to us to realise the dream of a non-commercial paradigm for organising the internet. Only then will humanity find the wisdom it needs to deal with the many crises that threaten our shared future.

Comments in chronological order (Total 123 comments)

  • LukeRijnhurt
    30 October 2010 12:06PM

    I don't get it ? isn't it the same with every big company ? create a better search engine or a better model for internet advertisement and see how google falls .

  • norgate
    30 October 2010 12:11PM

    "Google did not invent the index". Wow, I never guessed that.

    I suppose advertising pollutes The Guardian as well does it? Adverts for holidays next to the latest social crisis. Simple answer: don't look at the bloody things.

  • UnevenSurface
    30 October 2010 12:15PM

    Excellent article. When Wikipedia supplants Google (and it's not that far off), we will be free of their financial censorship at last.

  • nufubar
    30 October 2010 12:16PM

    If you think that Google is polluting the internet I guess you haven't heard of spam emails & pornsites.

  • mikeeverest
    30 October 2010 12:18PM

    Am I the only person who has never, ever clicked on an advert on the internet?

    Anyway, the author is right to point out the inherent danger of allowing profit to drive the shaping of our access to information and knowledge, and in his statement that no method of organising knowledge is unbiased.

    And Wikipedia's dominance (and malleability) also bears examination.

    However, serious consideration should be given to the benefits and drawbacks of the alternatives. Allowing librarians and scholars to "organise" knowledge into subjects also carries with it the risk of bias (academics are not notoriously unbiased in their presentation of theories and evidence) AND ossification, in that led by the nose we can simply deepen the ruts of specialisation and miss links that can be explored fruitfully.

    In the end it's up to every human being to take responsibility for their own MIND and the use to which they put it. Perhaps it would be more potent to point your critical gaze at the BBC and Eastenders, or our pub culture, or capitalism as purveyor of modern opiates and distractions from learning, sense and genuinely lived - as opposed to vicarious - experiences?

    Grazing the web (via Google and Wikipedia) I make connections between the Stoics, Buddhism, Jung, Emergence, and many more streams of ancient and modern consciousness that all one way or another describe something I find strange, thrilling and important.

    I could categorise it, but in doing so would colour it with associations and a heritage that belong to other cultures and times and not truly to "the thing in itself", as Kant would call it. Categories obscure as much as enlighten, mislead as well as lead.....

    In the end life is a personal responsibility. Eastenders or Russell.....Google or The British Library.....I could never have got this from a library; not in a million years.....so we should celebrate, as well as refine....

    This article is similar in a way to the threads on cif about Universities having souls....I think they do, but their soul is an emergent property of the complex web of relationships between people and knowledge, and it is migrating....as souls do....onto the internet....People navigate, not all unaided.....when the pupil is ready the teacher will appear.....not necessarily at a time and institution appointed (and anointed) by the Government or academic convenience....the times they truly are a changing....

    Isn't life a miracle, and absolute joy....what will you learn today?

  • MoneoSionaLeto
    30 October 2010 12:19PM

    If advertising keeps interesting sites going ...that's fine by me!!!!!!!
    I just ignore and forget in about 2 nanoseconds

  • Epanastis25Martiou
    30 October 2010 12:23PM

    Errr...fair point - but if you object so much, start using another search engine and also don't click on any adverts on the internet...and if enough people do that, Google's model will fail.

    It will either way - when the next black swan appears

  • Maino
    30 October 2010 12:27PM

    What an almighty load of drivel this is.

    White assumes we're all passive vessels that obediently consume without any form of critical capacities whatsoever.

    Alarmist nonsense.

  • Drottle
    30 October 2010 12:27PM

    It's surprising that there hasn't been any serious attempt (AFAIK) to set up an open-source search engine.

    I suppose the problem is that search engines need vast amounts of storage, processing and connection bandwidth. They don't come cheap. But a benevolent billionaire might be able to fund the project as a hobby or tax loss, as Mark Shuttleworth has done for Ubuntu Linux and Jimmy Wales has done for Wikipedia.

  • MickGJ
    30 October 2010 12:27PM

    Oh for goodness sake. Even if the internet is the "library of human knowledge" (which I doubt), it's unlikely that seekers after the wisdom of Thomas Aquinas will be siphoned off to buy holidays from Thomas Cook on the basis that people "can't tell ads from content".

    If the author wants to make his case he should at least have included some examples of how google has actually distorted a search rather than merely peppered it with adverts. There are other search engines, as well as ad blockers, or you can just use /

  • Burgau205
    30 October 2010 12:27PM

    Well if that's how you feel, you are free to start an organisation in competition with Google which presents your version of the human conditional.

    This is just another version of the tired and boring politics of envy.

  • EMF4EVER
    30 October 2010 12:29PM

    I think wikipedia is probably the answer to this article, when generations yet unborn look back at our time, they can look at our interweb and think to themselves, wow that's quite neat.

  • cocainemidget
    30 October 2010 12:31PM

    eh..

    The vast library that is the internet is flooded with so many advertisements that many people claim not to notice them anymore.

    ..i dont notice 'em any more, i second your observation sir. the guardian and irish times sites (my two main news sources) are 'polluted' with ads but as long as it keeps their content free i can deal with it. wikipedia has no ads, there's a 'public search engine of knowledge' to content yourself with mr white..

  • vakibs
    30 October 2010 12:33PM

    Micah White
    So what is your idea for a non-commercial paradigm for organizing the internet ?

    I have scoured through your essay but you didn't propose anything substantial with respect to an alternative. Probably the Google guys have also spent 10 years wondering how to make money out of their system, and finally settled down on advertising because they could find none else ?

  • cocainemidget
    30 October 2010 12:34PM

    also, the 'library of human knowledge' is mostly comprised of vacuous blogs and porn..

  • AnomieAndBonhomie
    30 October 2010 12:39PM

    Now it is up to us to realise the dream of a non-commercial paradigm for organising the internet.

    Awe, do we gotta? Why don't we all just install FlashBlock and get on with our lives.

  • kdsandeep
    30 October 2010 12:39PM

    maybe he works for television industry or print industry you know. Internet ads are bad, but TV, Print ads are good is his message. I wonder what he has to say about facebook ads.

  • PhilipHuntPPUK
    30 October 2010 12:40PM

    If you don't like adverts on websites there's always Adblock Plus. (Personally, I don't mind, as long as they aren't animated.)

    Regarding Google:

    1. there are no barriers to entry -- anyone can write their own software to crawl the web and allow people to search their index

    2. Google wasn't the first search engine. It got where it is today by being a lot less advert-intensive and short-term "commercial" than its rivals, who put a little search box in the middle of the screen surrounded by ads and other crud.

    3. sponsored links in Google searches are clearly marked as such, and easy to remove (with Adblock Plus).

    4. if you don't like Google, there are plenty of other search engines.

    In short, this article is overblown nonsense.

  • FreedomFromHope
    30 October 2010 12:42PM

    I ignore all advertising
    so do most people

    If you think you're ignoring it it's probably doing its job very well.

  • libertarianSW
    30 October 2010 12:43PM

    An advertising agency has monopolised, disorganised, and commercialised the largest library in human history. Without a fundamental rethinking of the way knowledge is organised in the digital era, Google's information coup d'état will have profound existential consequences.

    My gosh! What a pile of nonsense, first the Internet itself is not a 'library', second knowledge 'organization'? get a manual of HTML and we can start talking about it and third Google is simply an 'index', not the 'internet index' as such.

    What Google did differently 10 years ago was fixing an old problem: Old search engines used to alter the search results orders, favouring commercial sponsored 'links' first rather than by relevance. Google simply kept the 'relevance' untouched, they used 'sponsored' links in the side of the screen. No more to add to it.

    Any webmaster can 'massage' their 'search terms' as they want; search engines will always depend on information sources that cannot be verified, hence the problem.

    Other technologies such as the 'Semantic Web' (RDF) will allow 'multiple dimensions' for searching and publishing information, but there is still a long way to go.

    But more importantly, if you don't like Google...use Bing, Yahoo, etc,etc. Simple...

  • KidProQuo
    30 October 2010 12:47PM

    Vacuous crap for the hand-wringers/conspiracy theorists.

    Ads line the top and right of the search results page, are displayed next to emails in Gmail, on our favourite blog, and beside reportage of anti-corporate struggles.

    Er, no, not on my computer. I simply block the ads.

  • greendragonreprised
    30 October 2010 12:49PM

    Have you not seen the adverts on this page?

    I have been working in internet advertising for over a decade so excuse my opinion that I know what I'm talking about here.

    Firstly, a lot of sites, especially the smaller non-Commercial ones rely on a small but steady stream of advertising revenue in order to pay for their hosting.

    Without these sites, which offer minority interests and minority points of view the web would be nothing but an online shopping arcade. So they use an adserver or have an Adsense banner ? Big deal.

    The implication that somehow Google could manipulate their SERPS in order to promote some sources of knowledge ahead of others is far more serious, but Google already has sponsored (i.e. paid for) listings and they display them clearly.

    The generic set of search returns rely on a complex algorithm which Google keeps secret because if they made it public an 'alternative Google' could be set up overnight stealing their IP rights.

    Webmasters know this algorithm relies on PageRank (which itself is derived from incoming links), context, content, page title, URL, TLD, incoming link anchor text, description and metadata, though what weighting is given to each element is unknown, though when changes are made they can be picked up instantly.

    The next generation of search engines will use semantic search, and Google will have to change if it wishes to keep its dominant position in search.

    An example of this: with Google Instant on, start typing A P P L E S (without the gaps) and see how long it takes before it stops listing Steve Jobs's company. Semantic search will fix this.

  • jimbojohnson
    30 October 2010 12:52PM

    Advertising on the internet is useless, because the ads are read just as superficially as anything else.

    I agree that skimming the internet is concerning, and does indeed lead to a lack of depth and interest. The problem, is the virtual infiniteness of it. An improvising musician, for example; before they play a note, has an infinite number of possibilities to choose from. The sheer difficulty of plucking something out of this, means that most often, what is played is a safe, underwhelming permutation of something the performer has done before. Or a disjointed fleeting grab of half developed ideas. To overcome this, normally a performer will impose limits beforehand. And the same should go for browsing the internet effectively. So in conclusion; if you're enjoying viewing "Big Bootys vol7," ignore the impulse to click vol8 to see if it's any better. It's not. And it's even worse of you skim through vol 1-12 viewing thirty seconds of each, although tempting, it makes the whole experience something of a waste of time

  • greendragonreprised
    30 October 2010 12:53PM

    MoneoSionaLeto

    30 October 2010 12:17PM

    I ignore all advertising
    so do most people

    Indeed they do. The standard expected click rate is 1%, and the standard expected completion rate is also 1%, meaning for 1 click you need 100 views of the ad, and to make 1 sale you need 100 clicks, so by putting an ad on your page you need to generate 10,000 views to expect 1 sale.

    There are a lot of people making small amounts of money with this, enough to cover their costs. Very few are making serious sums.

  • MrJoe
    30 October 2010 12:56PM

    The danger of allowing an advertising company to control the index of human knowledge is too obvious to ignore.

    Google doesn't control the index of human knowledge - it controls an index of human knowledge.

    There are dozens of search engines - if you don't like Google, use another.

  • heavyrail
    30 October 2010 1:02PM

    Twelve years ago, in the first public documentation of their technology, the inventors of Google warned that advertising corrupts search engines. "[W]e expect that advertising-funded search engines," Larry Page and Sergey Brin wrote, "will be inherently biased towards the advertisers and away from the needs of the consumers." And they condemned as particularly "insidious" the sale of the top spot on search results; a practice Google now champions.

    No, there's an enormous difference between the practice they condemned and the practice they now champion. In the practice they condemned, the advertising was disguised as the top result.

    Though Google now puts ads in a fairly prominent position, it doesn't put them in place of the content. And it clearly marks them as sponsored links, so they don't pollute the search results at all.

    Google is still the best search engine, still good at displaying search results, and still keeps results and advertising separate. Your attacks on it are entirely unjustified.

  • EastFinchleyite
    30 October 2010 1:02PM

    Somebody has to pay for it.

    The choice is not between a Google with advertising and a Google without advertising, it is between Google as it is or nothing at all.

    Unless of course you wish to fund the internet and search engines by subscription. Hands up all those prepared to pay for what we get free now.

    Thought not.

  • happybeing
    30 October 2010 1:11PM

    Great article.

    Many of these comments are pathetically unsound because they either miss the issue or address it. Examples: "I don't read/respond to advertising", "the guardian does this", "use another engine" etc.

    There is one ready solution: ad blocking browser ad-ons. though that only works so long as it doesn't catch on, so it will no doubt continue to be great for me :-).

    I use ad blocker and it transforms the web for me. Ironically though, I allow ads in Google because of my professional interest in web search, marketing and SEO. And yes I have clicked on those ads. I've also placed ads there too, though not currently as organic search is working well.

    Contradictions? Of course, respect them!

    PS WTF! Why the hell is Google Chrome highlighting "Google" as a spelling error as I edit this!?

  • SamVega
    30 October 2010 1:17PM

    Let the Local Authorities run search engines. That will fix everything.

  • EMF4EVER
    30 October 2010 1:17PM

    The adverts don't bother me, although I don't like the animations I can't be bothered with the add-on business, I just use internet explorer for flash stuff and mozilla for my main browser, wave of the future huh?

  • clarinette02
    30 October 2010 1:24PM

    This is an excellent article. It can never be said enough how threatening the Googlopoly is, not only to our cultures but to our very identities.
    Google has made himself 'indispensable' such a role should be left in the hands of a private entity acting for 'free' at the benefice of users' privacy. More on Googolopoly and the need to take concious: /

  • HumanWrongs
    30 October 2010 1:25PM

    and a growing (sic) of teachers and public intellectuals are coming to the realisation that search engines encourage skimming,

    ..teachers and public intellectuals (?)..I think I'll stick with Google thanks.

    I'm really not sure why the Guardian features this chap as all his articles are slightly bonkers.

  • bettybugbear
    30 October 2010 1:38PM

    As evidenced by the tragic reality that most people can't tell the difference between ads and content any more, this commercial barrage is having a cultural impact.

    "Tragic"? Do me a favour.
    Given that millions of people use Google as a commercial directory - ie to search for holidays, restaurants, property, products etc - so what?
    And who are all these people that are so bleedin' thick they can't tell the difference between an ad and content? Where is your evidence?
    And why the assumption that all non-ad "content" is so wonderful? It is just as possible for it to be as skewed, distorted, inaccurate and biased as something that is designed to sell.

  • hiphoppopotamus
    30 October 2010 2:02PM

    My, what a load of cobblers.

    Google got where it is today by organising information better than anyone else. The main thing it sells are relevant search results. If advertising encroaches to the degree that it affects relevance and integrity of the search results, people will simply go elsewhere.

    And it doesn't help that the author has to resort to lying:

    And they condemned as particularly "insidious" the sale of the top spot on search results; a practice Google now champions.

    AdWords are not search results. They appear next to search results, but they don't interfere with the results themselves in any way. I find it ironic that all it took was a split-second Google search to research this.

    It's no surprise, though, that the author would be opposed to the split-second verification of information, given that he's disposed to saying things like:

    Without a fundamental rethinking of the way knowledge is organised in the digital era, Google's information coup d'état will have profound existential consequences.

    Thanks to Google, the superficiality of index learning is infecting our culture, our society, and our civilisation.

    The omnipresence of internet advertising constrains the horizon of our thought.

    To which all I can say is 'citation needed'. I mean, it's nice for the author and all that someone pays him to talk like this, but I doubt anyone other than he and his PhD supervisor find it socially useful.

    Google doesn't just organise web pages, but geographical information, art, news, stock quotes, patents, books - all that information for free to anyone who wants it (and wouldn't be able to access it otherwise, especially if they live in a developing country).

    How much time has Google personally saved each of us? Trillions of man hours easily. It would be very nice to sit in an oak-panelled office browsing 'categories' of information at your leisure, but the vast majority of people simply don't work like that.

    What about the thousands of small websites and blogs (some of them important political voices) that depend on AdWords for survival?

    If you think a non-profit can deliver anything remotely like this, you're insane. Google is possibly one of the biggest forces for good of the information age, and we should thank our lucky stars that this responsibility has fallen into the hands of a company which, by and large, seems to have human interests at heart.

    The answer to any perceived problem is media literacy, learning which information to trust and which has vested interests. This problem isn't unique to the internet, and would still exist no matter who organised the world's information. This article is awful.

    + +
    +
  • + +
  • + +
  • + +
+ +
    + +
  • +
    + + + + + + + + +

    + + + + MostUncivilised + +

    +

    30 October 2010 2:02PM

    +
    +
    + +

    Now it is up to us to realise the dream of a non-commercial paradigm for organising the internet. Only then will humanity find the wisdom it needs to deal with the many crises that threaten our shared future.

    I don't think that's worked out too well for China since Google has been heavily censored there.

    + +
    +
  • + +
  • + +
  • + +
+ +
    + +
  • +
    + + + + + + + + +

    + + + + globalgypsy + +

    +

    30 October 2010 2:08PM

    +
    +
    + +

    I have always had a deep dislike for having advertising constantly squirted at me, and in the olden days, would not watch commercial TV or listen to commercial radio, because I could not stand the ads.

    Now, in the day of the internet, I find I can almost totally avoid advertising. Adblock means web pages are cleaned. The content I read now, which formerly I would have obtained from newspapers or magazines and which would have been surrounded by advertising, is clean. Video which was previously saturated is now available via file sharing, completely stripped of ads.

    If you do not like ads, do not expose yourself to them. For me, the internet age means I hardly ever have to see or listen to the damn stuff. Just great!

    + +
    +
  • + +
  • + +
  • + +
+ +
    + +
  • +
    + + + + + + + + +

    + + + + GJJeffreys + +

    +

    30 October 2010 2:10PM

    +
    +
    + +

    Get Firefox, Adblock Plus and NoScript and learn how to use them.

    Kills 99% of browser pollution, even around the rim.

    (No cute furry animals were harmed during the making of this anti-commercial)

    + +
    +
  • + +
  • + +
  • + +
+ +
    + +
  • +
    + + + + + + + + +

    + + + + DaveParker + +

    +

    30 October 2010 2:11PM

    +
    +
    + +

    Whereas subject classification creates harmony and encourages serendipity; indexes fracture knowledge into snippets making us stupid.

    I wouldn't be so sure of that: subject classification is just as much about fracturing knowledge, albeit on a more structured basis. "A place for everything, and everything in its place" - one of classification's goals - doesn't in itself favour harmony or serendipity compared to Google's jumble of search results, and can produce similarly arbitrary results. Electronic media's ability to put a single item in as many locations as we like (unlike a copy of a book) of course means that in theory we should be able to devise the perfect arrangement of knowledge ("the right places for everything, and everything in all the places appropriate to it"), but I haven't been much encouraged by the results I've seen.

    Wikipedia's actually very interesting in its demonstration of our conceptual backwardness in relation to what the technology can deliver. The site's provision of unlimited hyperlinking offers opportunities for economy, hierarchy and depth unavailable to past compilers of knowledge, but instead we get gargantuan entries burdened with redundant detail that could be more usefully spun off into more specific items. That's because we prefer linearity to complexity, with the result that even when we approach something as potentially rich and multi-layered as online classification we tend to approach it just like identifying the right gap to stick a book in on a shelf, which it isn't.

    I have to confess I had to search something bland just to remind myself what Google's "sponsored links" are: if asked I too would have had a hard time identifying them, so I'd probably count as one of the dummies even though I know they're not what I was looking for and accordingly don't click on them: like many others I just don't "see" them, so I probably wouldn't remember what they are. For some people on the other hand they may well be just what they were looking for. There's no shortage of dumb on the internet, but users may be not quite that dumb - even those with rather pedestrian wants.

    + +
    +
  • + +
  • + +
  • + +
+ +
    + +
  • +
    + + + + + + + + +

    + + + + clarinette02 + +

    +

    30 October 2010 2:17PM

    +
    +
    + +

    For those who ask why it matters, who think they never look at the ads anyway, or who say "just don't use Google if you're not happy":
    In return for serving you free indexed search results accompanied by a zest of adverts, Google keeps records on users, sniffs private home wifi, email and password information, and stores information from our web browsing and every single activity it can intercept.
    The right to privacy is a fundamental human right.
    Collecting and storing data is a security risk.
    The right to search the internet as a source of knowledge is fundamental.

    For all these reasons and more, I do agree with the author: we do need a public, impartial, controlled and respected search engine, acting as a public service and not serving advertising-driven search results.
    See for more: /

    + +
    +
  • + +
  • + +
  • + +
+ +
    + +
  • +
    + + + + + + + + +

    + + + + dirkbruere + +

    +

    30 October 2010 2:21PM

    +
    +
    + +

    So how is a search engine that does not do ads supposed to generate the billions required to keep it going? Ah... "The Government"

    + +
    +
  • + +
  • + +
  • + +
+ +
    + +
  • +
    + + + + + + + + +

    + + + + jimsweetman + +

    +

    30 October 2010 2:21PM

    +
    +
    + +

    How can anyone hanker after categorised or schematic knowledge in the 21st century, let alone suggest that it is somehow disinterested and pure? On the contrary, categorisation in knowledge has always been a manifestation of power employed by whoever is powerful. No surprise that there is a lot of it in the Bible then!

    Google is also not an index in any historical sense of the word. It is a mechanism for looking through stuff and semantic searching is just around the corner.

    Finally, knowledge has been reconceptualised through the internet as a whole and not through Google and the writer needs to cope with that.

    Pretty sloppy stuff really, editor. A bit more rigour would help!

    + +
    +
  • + +
  • + +
  • + +
+ + +
+
+ + +
+ + + + + + + + + +

Comments on this page are now closed.

+ + + + + + + + + + + + +
+

+
+ + +
+ +
+ + +
+ +
+ + + + + + + + + + + + + + + + + + + + + + +
+ + + + + +
+ + + + + + + + + + + + + +
+ +
+

On Comment is free

+
+ +
+
+ + +
+ + + + + + + + +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+

+ +
+ +
+ + + + + + + + + + + + + +
+ +
+

Latest posts

+ + + +
+ + +
    +
  • + + + + +
  • +
  • + + + + +
  • +
+ + +
+ + + + + + + + + + + + + + + + +
+
+ + + + + + +
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + +
+
+

+ +

+
+
+
+

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ + + + +
+ + +
+ + + + + + +
+ + + + +
+
+ + + +
+
+ + +
+ + + + + +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/tests/html/guardiangoogle.html.simple b/tests/html/guardiangoogle.html.simple new file mode 100644 index 0000000..0d1de23 --- /dev/null +++ b/tests/html/guardiangoogle.html.simple @@ -0,0 +1,59 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Google is polluting the internet | Micah White | Comment is free | guardian.co.uk

+ +
+ google eye logo +
Google advertising … 'We have public libraries. We need a public search engine.' Photograph: Dominic Lipinski/PA
+
+ +

An advertising agency has monopolised, disorganised, and commercialised the largest library in human history. Without a fundamental rethinking of the way knowledge is organised in the digital era, Google's information coup d'état will have profound existential consequences.

Google was originally conceived to be a commercial-free search engine. Twelve years ago, in the first public documentation of their technology, the inventors of Google warned that advertising corrupts search engines. "[W]e expect that advertising-funded search engines," Larry Page and Sergey Brin wrote, "will be inherently biased towards the advertisers and away from the needs of the consumers." And they condemned as particularly "insidious" the sale of the top spot on search results; a practice Google now champions.

Under the sway of CEO Eric Schmidt, Google currently makes nearly all its money from practices its founders once rightly abhorred. Following its $3.1bn acquisition of DoubleClick in 2007, Google has become the world's largest online advertising company. With ad space on 85% of all internet sites, upwards of 98% of Google's revenue comes solely from polluting online knowledge with commercial messages. In the gleeful words of Schmidt, "We are an advertising company." Google is not a search engine; it is the most powerful commercialising force on the internet.

Every era believes their way of organising knowledge is ideal and dismisses prior systems as nonsensical. Academic libraries in the US use subject categorisation derived from Sir Francis Bacon's 17th-century division of all knowledge into imagination, memory and reason. Yet who today, aside from one or two exceptions, would try to organise the internet using a handful of categories? For a generation trained to use Google, this approach seems outmoded, illogical or impossible. But modern search engines, which operate by indexing instead of categorising, are also fundamentally flawed.

Three hundred years ago, Jonathan Swift foresaw the cultural danger of relying on indexes to organise knowledge. He believed index learning led to superficial thinking. Swift was right and a growing number of teachers and public intellectuals are coming to the realisation that search engines encourage skimming, light reading and trifling thoughts. Whereas subject classification creates harmony and encourages serendipity; indexes fracture knowledge into snippets making us stupid. Thanks to Google, the superficiality of index learning is infecting our culture, our society, and our civilisation.

Google did not invent the index. That honour goes to the 500 monks led by Hugh of St Cher who compiled the first concordance of the bible in 1230. Nor was Google the first to dream of indexing all of human knowledge. Henry Wheately had the idea in 1902 for a "universal index". And Google was not the first to cynically dump advertisements into the search-engine index. What makes Google unique is the extent to which it has, oblivious to the consequences, made a business out of commercialising the organisation of knowledge.

The vast library that is the internet is flooded with so many advertisements that many people claim not to notice them anymore. Ads line the top and right of the search results page, are displayed next to emails in Gmail, on our favourite blog, and beside reportage of anti-corporate struggles. As evidenced by the tragic reality that most people can't tell the difference between ads and content any more, this commercial barrage is having a cultural impact.

The omnipresence of internet advertising constrains the horizon of our thought. Seneca's exhortations to live a frugal life are surrounded by commercials for eco-holidays. The parables of Jesus are mere fodder for selling bamboo flooring. The juxtaposition of advertisements with wisdom neutralises the latter. The prevalence of commercial messages traps us in the marketplace. No wonder it has become nearly impossible to imagine a world without consumerism. Advertising has become the distorting frame through which we view the world.

There is no system for organising knowledge that does not carry with it social, political and cultural consequences. Nor is an entirely unbiased organising principle possible. The trouble is that too few people realise this today. We've grown complacent as researchers; lazy as thinkers. We place too much trust in one company, a corporate advertising agency, and a single way of organising knowledge, automated keyword indexing.

The danger of allowing an advertising company to control the index of human knowledge is too obvious to ignore. The universal index is the shared heritage of humanity. It ought to be owned by us all. No corporation or nation has the right to privatise the index, commercialise the index, censor what they do not like or auction search ranking to the highest bidder. We have public libraries. We need a public search engine.

In 1998, Larry Page and Sergey Brin made a promise: "We believe the issue of advertising causes enough mixed incentives that it is crucial to have a competitive search engine that is transparent and in the academic realm." Now it is up to us to realise the dream of a non-commercial paradigm for organising the internet. Only then will humanity find the wisdom it needs to deal with the many crises that threaten our shared future.

+ + + +
+ diff --git a/tests/html/newyorkerleaks.html b/tests/html/newyorkerleaks.html new file mode 100644 index 0000000..4973f25 --- /dev/null +++ b/tests/html/newyorkerleaks.html @@ -0,0 +1,6793 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +WikiLeaks on the wars in Iraq and Afghanistan : The New Yorker + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ + + + + + + + + + + +
+ + + +
+ + + + + + + + + + + + + + + + +
+
+ +
+ + + + + + + + + + + + + + +
+ + + +
+ + + + + + + +

Comment

+ + + + + + +

Leaks

+ + + + + + + + + +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + by Steve Coll + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + November 8, 2010 + + + + + + + + + +

+ + + + + + + + + + + + + + + + + + + + +
+
+
+
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + +
+ + + + + + +
+ + + + + +
+ + + + + + +
+ +

Julian Assange, the founder of WikiLeaks, had a tumultuous youth in Australia and grew into an autodidact with eclectic skills and a deep distrust of hierarchies and governments. In 2006, as he prepared to launch a digital enterprise devoted to the exposure of secrets, he wrote a sort of manifesto about the structure of official conspiracy and its effects on human welfare. He quoted Shakespeare, Machiavelli, and Lord Halifax; the writing veers between lucidity and opaqueness. Its tone, familiar from science fiction, echoes the purifying language of purges and revolutions: “We must understand the key generative structure of bad government. We must develop a way of thinking about this structure that is strong enough to carry us through the mire of competing political moralities and into a position of clarity.”

In July, WikiLeaks defied the Obama Administration by publishing seventy-six thousand intelligence and military field reports from the Afghan war. In October, it posted nearly four hundred thousand secret documents generated on the front lines of the Iraq conflict. The archives are bracing and valuable. There is a literary quality to their all-caps urgency and secret jargon. They disclose important new facts about civilian casualties, the torture of detainees by our allies, Iran’s exported violence, the disruptions caused by private contractors, and the debilitating patterns of clandestine warfare in two benighted regions.

America’s all-volunteer military has left many in the country at a remove from the debasements of the wars; the WikiLeaks archives offer an authentic transcript of them. All wars are terrible, but some must be fought. A democracy is strengthened when its citizens are confronted with the raw truths that follow from the choices of their elected leaders.

Whether WikiLeaks will prove over time to be a credible publisher of such truths is another question. Assange disclosed the names of informants in some of the war reports, even though doing so might endanger them and possibly cause their death. That action has prompted defections from the organization, as has some of Assange’s recent comportment. Internal messages quoted in the Times portray him as a self-aggrandizing control freak. In Sweden, prosecutors are reportedly investigating sexual-assault allegations against him. No charges have been filed in the case, and last week, on CNN’s “Larry King Live,” Assange dismissed it as a “relatively trivial matter,” adding that King “should be ashamed” for raising the subject. In response, King, a scholar of the communications strategies of accused celebrities, tutored him on his tone-deafness: “Rape is not trivial. To say they”—the allegations—“were false, that’s your answer. ‘They’re false.’ That’s fine. That’s all we wanted to hear.”

Henry David Thoreau, in his founding essay on civil disobedience, wrote that “action from principle . . . divides the individual, separating the diabolical in him from the divine.” He meant that a dissenter’s human frailty should not undermine the righteousness of his message. In the case of the WikiLeaks project, however, the sources of doubt involve more than Assange’s behavior and his editorial calls. They also involve his political conceptions and acuity.

In rolling out the Iraq files, Assange won an endorsement from Daniel Ellsberg, the former RAND Corporation analyst who, in 1971, leaked the Pentagon Papers to the press. Assange has suggested that his organization’s disclosures are similarly important. At a press conference in London, he called the Iraq documents “the most comprehensive and detailed account of any war ever to have entered the public record.” In fact, the archives that WikiLeaks has published are much less significant than the Pentagon Papers were in their day. Ellsberg and his collaborators in the press exposed lies by President Lyndon Johnson and his cabinet about critical decisions in the Vietnam War, such as Johnson’s exaggeration of enemy action in the Gulf of Tonkin incident, which he used as a rationale for escalating combat. The WikiLeaks files contain nothing comparable. Nor are they distinctively comprehensive; there are many open archives in the United States and Europe that chronicle the depredations of wars past, unit by unit, prison camp by prison camp. It is not necessary to promote the value of the WikiLeaks archive by overstating its importance.

If the organization continues to attract sources and vast caches of unfiltered secret documents, it will have to steer through the foggy borderlands between dissent and vandalism, and it will have to defend its investigative journalism against those who perceive it as a crime. Assange is animated by the idea of radical transparency, but WikiLeaks as yet lacks a fixed address. Nor does it offer its audiences any mechanism for its own accountability. If the organization were an insurgency, these characteristics might be in its nature. Assange declares that he is pioneering an improved, daring form of journalism. That profession, however, despite its flaws, has constructed its legitimacy by serving as a check on governmental and corporate power within constitutional arrangements that assume the viability of the rule of law. The Times and the Washington Post, in successfully defending their decision to publish the Pentagon Papers before the Supreme Court, extended considerably the political impact of their revelations.

WikiLeaks has recently been in discussions with lawmakers in Iceland about trying to concoct the world’s most extensive press-freedom regime there. The idea apparently is to transform Iceland, in the aftermath of its recent, disastrous experiments with offshore banking, into the Cayman Islands of First Amendment-inspired subversion. A volcanic-island nation may well find whistle-blowing to be a compatible flagship industry. And it could provide the project with a sustainable basis for legal legitimacy.

It is not clear, however, that such normalcy within a national system would entirely suit Assange’s purposes. In a part of his manifesto titled “State and Terrorist Conspiracies,” he wrote, “To radically shift regime behavior we must think clearly and boldly, for if we have learned anything, it is that regimes do not want to be changed.” If dissenters hacked and published enough secret information harbored by governments, he went on, this might disrupt what he imagined to be the absolute dependency of governments on flows of hidden data. “An authoritarian conspiracy that cannot think efficiently cannot act to preserve itself against the opponents it induces,” Assange concluded. That is, he believed that he could break governments by siphoning the secrets that nourish them.

But something like the opposite may be the case: if WikiLeaks cannot learn to think efficiently about its publishing choices, it will risk failure, not only because of the governmental opponents it has induced but also because so far it lacks an ethical culture that is consonant with the ideals of free media. 

+ +
+ + + + + + + + + + + + + + + + +
+ + + + + +
+
ILLUSTRATION: TOM BACHTELL
+
+ + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + +
+ + + + + + + +
+ + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + +

Related Links

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + +
+ + + + + + + + + +
+ + + + +
+ + + + + + + +
+
+
More In This Section
+ + + + + + + + + + + + +
Comment: Uncomfortable Climate by Elizabeth Kolbert
+ + + + + + + + + + + + + +
Comment: Electoral Dissonance by Hendrik Hertzberg
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Comment: Recession Election by Hendrik Hertzberg
+ + + + + + + + + + + + + +
Comment: Pride and Prejudice by Margaret Talbot
+ + + + + + + + + + + + + +
Comment: Writing And Winning by Adam Gopnik
+ + + + + + + + + + + + + +
Comment: Behind Closed Doors by Steve Coll
+ + + + + + + + + + + + + +
Comment: Bewitched by Rebecca Mead
+ + + + + + + + + + + + + +
Comment: Schoolwork by Nicholas Lemann
+ + + + + + + + + + + + + +
Comment: Intolerance by Lawrence Wright
+ + + + + + + + + + + + + +
Comment: Iraq’s Cost by Hendrik Hertzberg
+ + + +
View All
+ + + + +
+
+ + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + +
+ 11 + 15, + 2010 +
+ + + +
+ + + + +
+ + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + +
Electoral Dissonance: Hendrik Hertzberg on midterm fallout.
+ + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + +
+ + + + + + + + + + + +
+ + + + + + + +
+ + + + + + +
+ + + + + + + + + + + + + +
+ + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + ASK THE AUTHOR + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + +
+ + + + + + + + + + + + + +
+ + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ + + + + + +
+ + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + +
 
+ + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + +
+ + + + + + + + + + + + + +
+ + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + EDITOR'S CHOICE + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + +
+ + + + + + + + + + + + + +
+ + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + NEWS DESK + + + + + + + + + + +
+
+ + + + + + + + +
+
+ + + + + + +
+
+ + + + + + + + + + +
MORE POSTS
+ + + + + + + +
+ + + + + + +
+ + + + + + + + + + + + + +
+ + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + HENDRIK HERTZBERG + + + + + + + + + + +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + +
+ + + + + + + + + + + + + +
+ + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + FINGER PAINTING + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + +
+ + + + + + + + + +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + +
+ + + + + + + + + + + + + +
+ + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + +
+ + + + + + + + + + + + + +
+ + + + +
+ + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + +

 

+
+ + + + + + +
    +
  • + + +

    Links to articles and Web-only features, in your inbox every Monday.

    + + + +
  • +
  • + + +

    A weekly note from the New Yorker's cartoon editor.

    + + + +
  • +
+ + + + + +
+ + + +
+ + + + +
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + +
+
+ + + + + + + + + + + + + +
+
+ + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ powered by + + + + + + + + + + + +
+ + + +
+ + + + + + + + + + + + + +
+ + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + +

THE MAGAZINE: NOVEMBER 22, 2010

+ + + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + +
+ + + + + + + + + + + + + +
+ + + +
+
+
+ + + + + + + + + + + +
+ + + +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + +
+ + + + +
+
+ +
+
+
+
+
Events & Promotions
+
+
+ + + + + + + + + + + +
+ + + +
+
+
+
+
+ + + +
+
+ + + + + + + + + + + +
+ + + +
+
+ + + +
+
+
+
+
+
+
+
+
+
+ + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + THE NEW YORKER FESTIVAL + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + THE NEW YORKER TABLET EDITION + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + DIGITAL EDITION AND ARCHIVE + + + + + + + + + + + + + + + + + +
+
+ + + + + + +
+
+ + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + E-READER EDITION + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + AUDIO EDITION + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + FIND US ON ITUNES + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + JUST IN TIME FOR SUMMER + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + NOW IN PAPERBACK + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + THE NEW YORKER STORE + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + THE CARTOON BANK + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + ON THE TOWN + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ + + + + + +
Follow The New Yorker on Facebook, Twitter, Tumblr, iTunes + + +Facebook +Twitter +Tumblr +iTunes +Foursquare +
+ + + + + + + + +
+
+ + + + + + + + + + + + +
+ + + + + + +
+ + + + + + + + + + + + + +
+ + + + + + + +
+
+
+ + + + + + + + + + + +
+ + + +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + +
+
+ + + + +
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + +
+ + +
diff --git a/tests/html/newyorkerleaks.html.simple b/tests/html/newyorkerleaks.html.simple
new file mode 100644
index 0000000..12309ac
--- /dev/null
+++ b/tests/html/newyorkerleaks.html.simple
@@ -0,0 +1,58 @@

WikiLeaks on the wars in Iraq and Afghanistan : The New Yorker

+ +

Julian Assange, the founder of WikiLeaks, had a tumultuous youth in Australia and grew into an autodidact with eclectic skills and a deep distrust of hierarchies and governments. In 2006, as he prepared to launch a digital enterprise devoted to the exposure of secrets, he wrote a sort of manifesto about the structure of official conspiracy and its effects on human welfare. He quoted Shakespeare, Machiavelli, and Lord Halifax; the writing veers between lucidity and opaqueness. Its tone, familiar from science fiction, echoes the purifying language of purges and revolutions: “We must understand the key generative structure of bad government. We must develop a way of thinking about this structure that is strong enough to carry us through the mire of competing political moralities and into a position of clarity.”

In July, WikiLeaks defied the Obama Administration by publishing seventy-six thousand intelligence and military field reports from the Afghan war. In October, it posted nearly four hundred thousand secret documents generated on the front lines of the Iraq conflict. The archives are bracing and valuable. There is a literary quality to their all-caps urgency and secret jargon. They disclose important new facts about civilian casualties, the torture of detainees by our allies, Iran’s exported violence, the disruptions caused by private contractors, and the debilitating patterns of clandestine warfare in two benighted regions.

America’s all-volunteer military has left many in the country at a remove from the debasements of the wars; the WikiLeaks archives offer an authentic transcript of them. All wars are terrible, but some must be fought. A democracy is strengthened when its citizens are confronted with the raw truths that follow from the choices of their elected leaders.

Whether WikiLeaks will prove over time to be a credible publisher of such truths is another question. Assange disclosed the names of informants in some of the war reports, even though doing so might endanger them and possibly cause their death. That action has prompted defections from the organization, as has some of Assange’s recent comportment. Internal messages quoted in the Times portray him as a self-aggrandizing control freak. In Sweden, prosecutors are reportedly investigating sexual-assault allegations against him. No charges have been filed in the case, and last week, on CNN’s “Larry King Live,” Assange dismissed it as a “relatively trivial matter,” adding that King “should be ashamed” for raising the subject. In response, King, a scholar of the communications strategies of accused celebrities, tutored him on his tone-deafness: “Rape is not trivial. To say they”—the allegations—“were false, that’s your answer. ‘They’re false.’ That’s fine. That’s all we wanted to hear.”

Henry David Thoreau, in his founding essay on civil disobedience, wrote that “action from principle . . . divides the individual, separating the diabolical in him from the divine.” He meant that a dissenter’s human frailty should not undermine the righteousness of his message. In the case of the WikiLeaks project, however, the sources of doubt involve more than Assange’s behavior and his editorial calls. They also involve his political conceptions and acuity.

In rolling out the Iraq files, Assange won an endorsement from Daniel Ellsberg, the former RAND Corporation analyst who, in 1971, leaked the Pentagon Papers to the press. Assange has suggested that his organization’s disclosures are similarly important. At a press conference in London, he called the Iraq documents “the most comprehensive and detailed account of any war ever to have entered the public record.” In fact, the archives that WikiLeaks has published are much less significant than the Pentagon Papers were in their day. Ellsberg and his collaborators in the press exposed lies by President Lyndon Johnson and his cabinet about critical decisions in the Vietnam War, such as Johnson’s exaggeration of enemy action in the Gulf of Tonkin incident, which he used as a rationale for escalating combat. The WikiLeaks files contain nothing comparable. Nor are they distinctively comprehensive; there are many open archives in the United States and Europe that chronicle the depredations of wars past, unit by unit, prison camp by prison camp. It is not necessary to promote the value of the WikiLeaks archive by overstating its importance.

If the organization continues to attract sources and vast caches of unfiltered secret documents, it will have to steer through the foggy borderlands between dissent and vandalism, and it will have to defend its investigative journalism against those who perceive it as a crime. Assange is animated by the idea of radical transparency, but WikiLeaks as yet lacks a fixed address. Nor does it offer its audiences any mechanism for its own accountability. If the organization were an insurgency, these characteristics might be in its nature. Assange declares that he is pioneering an improved, daring form of journalism. That profession, however, despite its flaws, has constructed its legitimacy by serving as a check on governmental and corporate power within constitutional arrangements that assume the viability of the rule of law. The Times and the Washington Post, in successfully defending their decision to publish the Pentagon Papers before the Supreme Court, extended considerably the political impact of their revelations.

WikiLeaks has recently been in discussions with lawmakers in Iceland about trying to concoct the world’s most extensive press-freedom regime there. The idea apparently is to transform Iceland, in the aftermath of its recent, disastrous experiments with offshore banking, into the Cayman Islands of First Amendment-inspired subversion. A volcanic-island nation may well find whistle-blowing to be a compatible flagship industry. And it could provide the project with a sustainable basis for legal legitimacy.

It is not clear, however, that such normalcy within a national system would entirely suit Assange’s purposes. In a part of his manifesto titled “State and Terrorist Conspiracies,” he wrote, “To radically shift regime behavior we must think clearly and boldly, for if we have learned anything, it is that regimes do not want to be changed.” If dissenters hacked and published enough secret information harbored by governments, he went on, this might disrupt what he imagined to be the absolute dependency of governments on flows of hidden data. “An authoritarian conspiracy that cannot think efficiently cannot act to preserve itself against the opponents it induces,” Assange concluded. That is, he believed that he could break governments by siphoning the secrets that nourish them.

But something like the opposite may be the case: if WikiLeaks cannot learn to think efficiently about its publishing choices, it will risk failure, not only because of the governmental opponents it has induced but also because so far it lacks an ethical culture that is consonant with the ideals of free media. 

+ +
+