Michael Nellis 07 Mar 2004
03 Mar 2004
The U.S. Supreme Court heard arguments over COPA in 2003 and ruled that it was unconstitutional and sent it back down to the 3rd Circuit Court of Appeals. On 02 Mar 2004, it was back before the Supreme Court and a group of now 'net savvy justices. That doesn't mean the arguments made any more sense, however. The very first flaw in the arguments to support COPA was about the ubiquity of internet pornography sites. This argument was put forth by Solicitor General Theodore Olson. Olson told the court how he had sat at his computer over the weekend and done a keyword search for "free porn". Olson told the court that his search found 6,230,000 web sites. This, he said, proved the necessity for "Harmful To Minors" laws such as COPA.
This, I say, proves what an ignoramus Olson is. It also proves that despite the 'net savvy of the "Supremes",[01] they too are still woefully ignorant of search engines and the 'net.
It's only to be expected, really. There's so much information on any one topic floating around already that to be a specialist in any field requires the equivalent of a master's degree at least. In Star Trek: The Next Generation, every commissioned officer held a degree in some field; usually engineering. This was a part of the culture, I believe, as a commentary on the current state of the "Information Age".[02] The U.S. Supreme Court justices hold their degrees in law. Not in cybernetics, and certainly not in the field of search engines. The same holds true for Olson, but his argument before the court was so rife with egregious errors, and has such a potential for erosion of civil liberties, that I am less inclined to show him any sympathy.
When I first came across the report on these arguments I posted a story about it at the Library and Information Science News web site. The story excited a comparatively large volume of commentary, in the course of which two vitally pertinent and related aspects of this case came out. These came about because of a test by Seth Finkelstein that tested Olson's assertion about the proliferation of internet porn.[03]
Seth Finkelstein is an anti-censorware activist who has studied and attempted to study internet filters. "Attempted", as he has been forbidden to perform some studies under "copyright" laws. When he read about Olson's argument, he tested Olson's assertion. The article, by Associated Press, did not specify which engine Olson had used. Mr. Finkelstein used Google. A very handy, dandy tool for such a study. Anyone who uses Google extensively knows that particular search engine "bundles" multiple hits. I occasionally do a vanity search for my own web site, using a portion of the URL, just to see who is linking to me. By excluding a portion of prospective hits I cut down the number of returns, as of this writing, to something like 390 or so. However, when I scan the entries listed on the results pages, there are only sixty-three entries listed.
Why is that? Because many of the pages on my site contain a table of contents in which each link to each separate section has the full URL for the introductory page to that section. In the Celebrate Freedom section, for instance, the full site URL is included in at least forty-eight navigation links contained in eight pages.[04] The full URL is also included for each instance where there is a link (active or static), to an image file. This seems to be necessary for inter-directory navigation. As a result, Google advises me that many hits have not been shown because of similarities to those shown on the results pages. Despite the numerous repetitions of the full URL on pages at my site, the title of my site is shown only in the first hit. My full site address is also appended automatically to every comment I post to LISNews, and I do that a lot.
So, Mr. Finkelstein keyed in "free porn", ran through every results page that came up, and hit the wall at the number 875 mark.[05] In his comment at LISNews, he stated that there seemed to be fewer than 900 unique porn sites.[06] Keep in mind, however, that this statement was made tongue in cheek. He made it to illustrate the point that anybody could juggle numbers to give skewed results.
However, Seth's comment primed a neuron, and when someone tritely dismissed Seth's study as only proving Olson's contention, I realized the full scope of Olson's ignorance and the flaw in his argument. Which flaw Seth had pointed out in the comment at his site (and maybe that's where I got it from).
First: Olson's hit count was for web pages, not web sites as he reportedly asserted.
Secondly: when you do a keyword search, the engine should return a result for every hit it finds, regardless of the content of the page on which the keyword resides. That means that Olson, assuming he used Google, shouldn't have gotten hits for pages at porn sites alone; his search should also have returned hits for every web log comment, civil liberties activist site, and even newspaper article that included the phrase "free porn", on every page that had been archived by Google at the time he did his search. Oddly enough, the results of Seth's test seem to have turned up hits for porn sites only. An odd anomaly. I find it hard to believe that no one has used the phrase "free porn" in a comment or newspaper or magazine article since COPA was passed.
Thirdly: many pornography sites overload their front pages with repetitions of keywords. They do this to artificially move their sites up in the search engine standings, a practice that has been a bone of contention with a number of people almost since Google first began ranking pages. This means that each page probably generated one hit for each instance of the phrase "free porn" on that page.[07]
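The difference between counting pages and counting sites can be shown with a short Python sketch. The URLs below are invented for illustration, and grouping by host name is only one assumed way of deciding what counts as a "site"; it is not how Google, Olson, or Mr. Finkelstein counted anything.

```python
from urllib.parse import urlparse

# Hypothetical result URLs, invented purely for illustration.
hits = [
    "http://example-one.test/index.html",
    "http://example-one.test/gallery.html",
    "http://example-one.test/links.html",
    "http://news.example-two.test/2004/03/copa-story.html",
    "http://blog.example-three.test/comments/42",
]


def count_pages_and_sites(urls):
    """Return (distinct pages, distinct host names) for a list of result URLs."""
    pages = set(urls)
    # Assumption: one host name equals one "site".
    sites = {urlparse(u).netloc.lower() for u in urls}
    return len(pages), len(sites)


pages, sites = count_pages_and_sites(hits)
print(f"{pages} pages on {sites} sites")
```

Here three of the five pages live on the same host, so a count of pages overstates the number of sites. Note also that two of the hosts in this made-up list are a news site and a web log, which is the second point above.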
There are other arguments that call Olson's Ubiquity Assumption of Internet Pornography into question. The biggest counter-point is: How many eight year old children sit down at the computer and do a keyword search for "free porn"? What is the number of children who end up at a porn site on a day to day basis and how does it compare to the number of clicks all children send in a given day while online? How does that compare to what one might statistically expect?
Also, let's assume solely for the sake of this commentary that Olson had gotten hits for web sites alone, based on the idea that the "free porn" lure was placed only on the index page. As of 07 Mar 2004, Google's copyright notice claimed 4,285,199,774 archived web pages.[08] Starting with "million", each subsequent "-illion" is an increase of three orders of magnitude, or a one thousand fold increase. Four million, two hundred eighty-five thousand pages would therefore constitute one-tenth of one percent of the number of archived pages. The share represented by Olson's hits can be computed as 6,230,000/4,285,000,000, which comes to roughly 0.001454. That is 1.454*10^-3 in scientific notation or, as a percentage, 0.1454 percent. That's a fairly small percentage, but keep in mind that we are comparing apples and oranges to begin with.
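For anyone who wants to check the arithmetic, here is a minimal Python sketch. It uses the two figures quoted above, with Google's page count rounded down to 4.285 billion; the variable and function names are my own and purely illustrative.

```python
olson_hits = 6_230_000          # hit count Olson reported to the court
google_pages = 4_285_000_000    # Google's stated page count, rounded down


def share_of_index(hits, indexed_pages):
    """Return hits as a fraction of all indexed pages."""
    return hits / indexed_pages


proportion = share_of_index(olson_hits, google_pages)
print(f"{proportion:.3e}")   # scientific notation
print(f"{proportion:.4%}")   # the same value as a percentage
```

The same function can be fed any other pair of numbers, such as the 875 results Mr. Finkelstein reached, though that only compounds the apples and oranges problem.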
Other questions that might arise are: How can we separate the non-porn news and views pages such as web logs from those that do purvey pornography? Also: How many pages does each site have? What is the average number of hits for "free porn" per page per site? And how many of those 875 hits were for domains instead of sites? I would tend to discount such questions, however, as one is almost invariably taken to the front page for porn sites. In most cases, registration is required. Although the percentage of sites that don't require registration is certainly pertinent.
Aside from all this, Olson's argument, and this commentary, deal only with sites/pages that advertise free porn. Pay for view porn sites are another matter. If, as has been stated, as much as two percent of internet content is pornography, valid or self-identified pornography, there are a lot of cheap bastards out there. The Supreme Court does not seem to have addressed the volume of pay for view sites; at least there was very little mention of them in the article. One paragraph of the article said:
Free pornography is easy to find online, placed there as a hook to lure paying customers, the Bush administration and its backers argue. Minors can find that free material as easily as adults, although it would be illegal for a store owner to sell them a paper copy of a magazine that shows the same images.
Hence, the focus on freely accessible pornography. At another level, one argument in favour of striking down COPA is that even some free access sites require proof of age verification, while one argument in favour of implementing COPA is that many pay for view sites have a FREE TOUR feature that shows some pretty raunchy stuff.
The truth of the matter is that there is just no way to get any kind of accurate picture of how many pornography web sites are out there.[09] And while that might seem to support Olson's contention, it only means that he can't do anything more than guess and talk through his hat. In the end, the best solution is to supervise your children when they are on the internet as much as you do when they are crossing the street. And learn how to disable the JavaScript feature in the TOOLS, INTERNET OPTIONS menu. And don't ever leave your credit card or a receipt lying around where your adolescent son can copy the number off it. Or else teach him yourself where to look for his porn so you don't get caught by surprise with funny credit billings.
FOOTNOTES:
[01] You can see the Associated Press report on the arguments, and an analysis of the arguments by Tony Mauro, both at First Amendment Center.
[02] In 1964 Isaac Asimov wrote a complaint about the explosive growth of information:
The other day I was looking through a new textbook on biology [Biological Science: An Inquiry Into Life [...]].
Unfortunately, though, I read the Foreword first (yes, I'm one of that kind) and was instantly plunged into the deepest gloom. Let me quote from the first two paragraphs:
With each new generation our fund of scientific knowledge increases fivefold. . . . At the current rate of scientific advance, there is about four times as much significant biological knowledge today as in 1930, and about sixteen times as much as in 1900. By the year 2000, at this rate of increase, there will be a hundred times as much biology to 'cover' in the introductory course as at the beginning of the century.
Imagine how this affects me. I am a professional "keeper-upper" with science and in my more manic, ebullient, and carefree moments, I even think I succeed fairly well.
Then I read something like the above quoted passage and the world falls about my ears. I don't keep up with science. Worse, I can't keep up with science. Still worse, I'm falling farther behind every day.
--Isaac Asimov, The Magazine of Fantasy and Science Fiction, Mar 1964
[reprinted in the collection of essays Asimov On Numbers, pg 91]
[03] You can see Seth's write up on his test at his web site. Remember: take it with a grain of salt.
[04] http://www.angelfire.com/scifi/dreamweaver/...
When I do a search, I look for: "/scifi/dreamweaver/".
[05] You can configure the number of results to be shown on an individual page by clicking on the PREFERENCES option and selecting 10, 50, or 100 in the pull down menu.
[06] In retrospect, that should be 875 unique sites offering free access to pornography. The question of pay sites is not taken into account for this commentary; nor were they taken into account for Seth's test, but he failed to specify free access sites in the comment he posted to LISNews.
[07] Yes, we have free porn!!!! If you like free porn then surf these links to other sites that offer free porn! Free porn is good!!! Free porn is fun!!! Etc, ad nauseam.
[08] Rounded down to 4.285 billion, or 4.285*10^9. In decimal notation, "1.00" equals one, or one hundred percent; ".50" is one half, or fifty percent, which in exponential notation is 5.0*10^-1.
(And you thought you'd never have to use that stuff once you got out of high school.)
[09] And that's for only those sites that self-identify as pornographic and require some kind of age verification. This commentary ignores everything that the ultra-righteous would call pornographic that does not self-identify as porn: all nude art, sites dedicated to sexuality, naturism, etc, etc, etc.