Monday, February 11, 2008

Specifications (Part I) ... What Exactly are They?

There seems to be much confusion as to what a specification is -- that is, what constitutes CSI? I am no expert on the subject; however, I would like to add my two “sense” based on my understanding of the math and concepts involved. I have started discussing complex specified information here [link]. This post is merely an extended discussion, a clarification, and a somewhat deeper probe.

First, we must begin by defining a specified pattern. What is the basic idea behind a specified pattern? Why must we define a specified pattern? It is because within patterns, there can be something (potentially non-random characteristics such as function) which separates one set of patterns from all other possible patterns.

Dembski begins to discuss a specified pattern by stating: “The crucial difference between (R) [a random pattern] and (ψR) [a pseudo-random pattern -- the Champernowne sequence] is that (ψR) exhibits a simple, easily described pattern whereas (R) does not. To describe (ψR), it is enough to note that this sequence lists binary numbers in increasing order. By contrast, (R) cannot, so far as we can tell, be described any more simply than by repeating the sequence.” I will continue to discuss the difference between random and pseudo random patterns (as it relates to specificity) further on.

Dr. Dembski describes specified patterns as those patterns which can be described and formulated independent of the event (pattern) in question.

Stephen Meyer expounds upon this subject in his paper, “DNA and the Origin of Life: ...”:

“Moreover, given their [proteins’] irregularity, it seemed unlikely that a general chemical law or regularity could explain their assembly. Instead, as Jacques Monod has recalled, molecular biologists began to look for some source of information or “specificity” within the cell that could direct the construction of such highly specific and complex structures. To explain the presence of the specificity and complexity in the protein, as Monod would later insist, “you absolutely needed a code.”21"

... and ...

“In essence, therefore, Shannon’s theory remains silent on the important question of whether a sequence of symbols is functionally specific or meaningful. Nevertheless, in its application to molecular biology, Shannon information theory did succeed in rendering rough quantitative measures of the information-carrying capacity or syntactic information (where those terms correspond to measures of brute complexity).33"

... and ...

“Since the late 1950s, biologists have equated the “precise determination of sequence” with the extra-information-theoretic property of specificity or specification. Biologists have defined specificity tacitly as “necessary to achieve or maintain function.” They have determined that DNA base sequences, for example, are specified not by applying information theory but by making experimental assessments of the function of those sequences within the overall apparatus of gene expression.36 Similar experimental considerations established the functional specificity of proteins. Further, developments in complexity theory have now made possible a fully general theoretical account of specification, one that applies readily to biological systems. In particular, recent work by mathematician William Dembski has employed the notion of a rejection region from statistics to provide a formal complexity-theoretic account of specification. According to Dembski, a specification occurs when an event or object (a) falls within an independently given pattern or domain, (b) “matches” or exemplifies a conditionally independent pattern, or (c) meets a conditionally independent set of functional requirements.37"

This ability to formulate a pattern independent of the event in question can be understood with a couple of examples. But, we must first start with a communication system. Why? So that we can exchange a pattern across that communication system and see if it can indeed be formulated independent of the pattern in question.

Let’s say that we wanted to send the pattern: 111111111111111111111111111111 across our communication system. First, can we compress it? Yes we can. How? By defining or formulating it differently. One way to do that is by compressing the pattern to “print 1 X 30.” Thus we can send this information across the communication channel and the receiving party will receive exactly the same information as if we had sent the original pattern of 1s. Therefore, the original pattern can be defined independently of the pattern itself. So, compressibility is one way to define a specified pattern.
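As a rough illustration of that exchange, here is a minimal sketch in Python. The function names (describe, reconstruct) and the exact wording of the message are my own assumptions for the example; the only point is that the short description is enough for the receiver to rebuild the full pattern.

```python
import re

def describe(pattern: str) -> str:
    """Sender side: describe a pattern made of one repeated symbol."""
    if pattern and pattern == pattern[0] * len(pattern):
        return f"print {pattern[0]} X {len(pattern)}"
    raise ValueError("pattern is not a single repeated symbol")

def reconstruct(description: str) -> str:
    """Receiver side: rebuild the pattern from the description alone."""
    match = re.fullmatch(r"print (.) X (\d+)", description)
    if match is None:
        raise ValueError("unrecognized description")
    symbol, count = match.group(1), int(match.group(2))
    return symbol * count

original = "1" * 30               # the pattern of thirty 1s
message = describe(original)      # what actually crosses the channel
received = reconstruct(message)   # what the receiving party ends up with
```

Here the message is shorter than the pattern it stands for, and the receiver still ends up with the identical thirty 1s.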

Now, let’s look at another pattern: 1 2 3 5 7 11 13 17 19 23 ... . Is this a specified pattern? In order to evaluate this and the next examples of patterns, we need to understand the difference between what Dembski calls a pseudo-random and a random pattern. A random pattern is algorithmically complex and cannot be defined or formulated according to a system of rules. A pseudo-random pattern merely looks random, since it is algorithmically complex (not regular, as the last example of 1s was); however, it is not random, since it can be defined and formulated as a separate pattern according to a system of rules. This usually entails meaning or function.

So, let’s return to our example of the pattern: 1 2 3 5 7 11 13 17 19 23 ... . Can we define or formulate it in a different way yet retain the same information? Sure we can. This pattern conforms to a system of mathematical rules and can be defined or formulated differently than by merely repeating the pattern. We can send: “all numbers, beginning with one, which are divisible only by themselves and one” or its equivalent mathematical notation across the communication channel and the other end would be able to reconstruct the original pattern -- a sequence of prime numbers.
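To make the receiving end concrete, here is a small Python sketch that rebuilds the sequence from nothing but the rule as I worded it. The function names are hypothetical. Note one assumption: the rule as worded admits 1, which matches the pattern above, although the standard modern definition of a prime number excludes 1.

```python
def divisible_only_by_itself_and_one(n: int) -> bool:
    """The rule as worded in the post; it admits 1 as well as the primes."""
    return all(n % d != 0 for d in range(2, n))

def rebuild_sequence(count: int) -> list[int]:
    """Receiver side: regenerate the first `count` terms from the rule."""
    sequence = []
    n = 1
    while len(sequence) < count:
        if divisible_only_by_itself_and_one(n):
            sequence.append(n)
        n += 1
    return sequence

first_ten = rebuild_sequence(10)
```

The rule is a fixed, short description, yet it regenerates as many terms of the pattern as the receiver cares to ask for.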

Now, let’s take a look at proteins. When it comes to measuring specificity, this is exactly like measuring specificity in a meaningful sentence, as I will soon show. Functional specificity merely separates functional pattern “islands” from the sea of random possible patterns. When specific proteins are brought together, you can have a pattern which creates function [link]. That functional pattern itself is formulated by information contained in DNA which is encoded into RNA and decoded into the specific system of functional proteins. The functional pattern as the event in question is defined independently as a pattern of nucleic acids. Thus, the independent formulation for the system of protein function is sent across a communication channel as RNA which is the independent formulation of the function (I have provided a definition of “function” [here “philosophical foundations”] [link]). The RNA is independent of the function itself for which it codes (that is, the information for formulating the function doesn’t come from the protein pattern itself, as per the Central Dogma of Biology [link]). You don’t send the function itself across the communication channel; you send an informational pattern. Again, that is what is referred to as functional specificity.

What about the pattern: “Can you understand this?” This pattern is most definitely specified, since it can be defined according to a pre-set English dictionary and produce a function through meaning/ideas. How do you pass an idea/meaning through a communication channel? You do so by using informational units derived from a pre-set system which are separate from the actual pattern of ideas/meaning and which can be processed according to the rules of that pre-set system at the other end of the communication channel. The fact that you can send the same idea using different patterns in the same language, or even different patterns in another language, shows that the ideas themselves are independent of the pattern which is sent across the communication channel. That is how we know that the idea “contained” in the pattern is defined independently of the pattern itself. We could even state the same meaning in a different way – “Do you have the ability to comprehend what these symbols mean?” Either way, the idea contained in the above pattern (question) can be transferred across a communication channel as an independent pattern of letters. This is referred to as functional semantic specificity – where specific groups of patterns which produce semantic/meaningful function are “islands” of specified patterns within a set of all possible patterns.

What about the event: ajdjf9385786gngspaoi-whwtht0wuetghskmnvs-12? Is that pattern specified? Well, is it compressible? Hardly, if at all. Can it be stated any other way, in terms of definition, function, or formulation? Well, this question can only be answered through cryptographic methods. If there is no function, or formulation (description or formulation which is independent of the pattern) then the only way to deliver it across a communication channel is to actually send the pattern itself. Of course, you could send a phonetic spelling of each unit across the communication channel and this would show that the pattern of each unit is specified, but that doesn’t specify the pattern as a whole – the pattern that emerges from the string of specified units. Therefore, until non-arbitrary cryptographic evidence states otherwise, the above pattern is not specified.

To sum up, as has been shown above, a specified pattern is described, independent of the event in question, by the rules of a system. As such, explanations other than chance are to be posited which can create informational patterns that are described by the rules of a system. However, specificity is still not quite good enough by itself to determine previous intelligent cause.

What is needed is a specification, which is a highly improbable specified pattern. But how do we determine what is highly improbable? We take a look at the available probabilistic resources – that is, how many bit operations were needed or used in order to arrive at the specified pattern. Measuring the specified pattern against how long it took to arrive at that pattern and how many different trials were associated with that pattern will tell us if the specified pattern is also highly improbable – beyond all probabilistic resources necessary to generate the specified pattern by chance.

This measure of complexity, which is added onto specificity to create CSI or a specification, is akin to the complexity (improbability) of drawing a royal flush 5 times in a row. Hmmm ... should we begin looking into causes other than chance?

Now, let’s take a look at measuring for a specification. First, it must be understood that this only applies to patterns which can be measured probabilistically. Since a specification includes, but is not limited to function, I will use an example of specification based on compressibility, since compressibility is a way of independently formulating a certain pattern as I have shown above.

Let’s return to our first example – the long pattern of 1s. Dembski has stated that the higher the compressibility of the pattern, the higher the specificity. Why is that the case? That is the case, since the less compressible the pattern is, the more it becomes algorithmically complex and the more random it becomes. These algorithmically complex patterns are the types of patterns that will be generated by random processes.

For example: It is far more likely for a pattern with the same compressibility as “100101111101100001010001011100” to be generated by a random flip of the coin than for a pattern with the same compressibility as “111111111111111111111111111111” to be generated. Why? Because the only other pattern that can be formulated with the same algorithmic compressibility as the pattern of 30 1s is “000000000000000000000000000000,” which can be compressed to “print 0 X 30.” However, there are many more patterns with the same compressibility as the first, more random pattern (an assortment of 50% 1s and 50% 0s which are sorted in a truly random fashion as per the rules of statistics*). So, it is easier for chance to “find” one of the less compressible patterns, because there are more patterns with the same lower compressibility than there are patterns with the same higher compressibility -- in the above case there are only 2 out of 1,073,741,824 patterns which have the highest compressibility, and those two patterns are the repeating 1s and the repeating 0s shown above.
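To get a feel for the counting, here is a hedged Python sketch. It uses the number of runs (unbroken stretches of the same symbol) as a crude stand-in for compressibility. That proxy is my own simplifying assumption for illustration and is not the algorithmic measure Dembski uses, but it shows how quickly the number of patterns grows as they become less regular.

```python
from math import comb

def patterns_with_runs(length: int, runs: int) -> int:
    """Count binary strings of `length` bits made of exactly `runs` runs.

    Choose which of the length-1 gaps between bits are run boundaries,
    then choose whether the first run is 0s or 1s.
    """
    return 2 * comb(length - 1, runs - 1)

length = 30
total = 2 ** length                            # 1,073,741,824 patterns in all
single_run = patterns_with_runs(length, 1)     # all 1s or all 0s
sixteen_runs = patterns_with_runs(length, 16)  # highly irregular patterns

# Every pattern has between 1 and 30 runs, so the counts must add up.
all_counted = sum(patterns_with_runs(length, r) for r in range(1, length + 1))
```

Under this proxy only two patterns sit in the single-run class, while the sixteen-run class alone holds well over a hundred million.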

Now, let’s calculate the pattern: 111111111111111111111111111111 and see if it is a specification.
First we need to calculate its specificity. That is done by multiplying its probability (1 in 1,073,741,824) by the number of patterns that have that same compressibility (in this case two: the pattern itself and the one other pattern shown above).

So, the specificity of this pattern = 2 * 1/1073741824

Now, in order to move on to finding out if we have a specification here, we must first understand the context in which we actually found this pattern. Let’s say that we are running a search on a string of characters 30 bits long – the size of the pattern above. Let’s say we start at a random point such as: 111100001010011110000110111101. Now, let’s also say that in 30 bit flips (30 operations), we arrive at the above pattern of thirty 1s which we are calculating. Is it reasonable to presume that the pattern was arrived at by pure chance? Let’s make the calculation and find out.

Specification: ?>1 = -log2 [number of bit operations * specificity] = ?
Specification: ?>1 = -log2 [30 * 2/1073741824] = -log2 [60/1073741824]
Specification: ?>1 = approx. 24 bits of CSI = a specification
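The arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation as I have set it up in this post; the function and variable names are my own.

```python
from math import log2

def specification_bits(operations: int, matching_patterns: int,
                       probability: float) -> float:
    """-log2 of (bit operations * number of matching patterns * probability)."""
    return -log2(operations * matching_patterns * probability)

probability = 1 / 2 ** 30   # one particular 30-bit pattern under fair coin flips
csi_bits = specification_bits(operations=30,
                              matching_patterns=2,
                              probability=probability)
is_specification = csi_bits > 1
```

The result comes out at a little over 24, which is where the "approx. 24 bits" figure comes from.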

It is obvious that the discovery of the pattern was not random, but was somehow guided. It can be rejected as being the result of strictly random processes for two reasons:

1. it is not in the nature of random processes to generate specificity -- in this case regularities (high compressibility) -- and
2. because the number of random bit operations falls extremely short of what would be needed to arrive at the end pattern by chance, taking into account the number of trials (probabilistic resources). To say that this pattern was the result of chance would be to resort to a “chance of the gaps” type of argument. In fact, it has been shown that evolutionary algorithms (which are non-random) are a necessity to arrive at specifications such as the one above. Dembski and Marks have discussed such algorithms on their evolutionary informatics site.

Now, let’s compare that example to a pattern that has the same probability as a higher percentage of all possible combinations, such as the pattern: 110101000001101001110001111010, which is highly incompressible (thus more algorithmically complex and more random). If you wanted to “compress” this pattern you might end up with: “print 1 X 2, 0101, 0 X 5, 1 X 2, 01, 0 X 2, 1 X 3, 0 X 3, 1 X 4, 010”
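Here is a quick Python sketch, under my own assumed token format ("symbol X count" for a repeat, bare digits for a literal stretch), that expands the description above and compares its length with the pattern it describes.

```python
def expand(description: str) -> str:
    """Expand tokens such as '1 X 2' (repeat) or '0101' (literal) into bits."""
    bits = []
    for token in description.split(","):
        token = token.strip()
        if " X " in token:
            symbol, count = token.split(" X ")
            bits.append(symbol * int(count))
        else:
            bits.append(token)
    return "".join(bits)

pattern = "110101000001101001110001111010"
description = "1 X 2, 0101, 0 X 5, 1 X 2, 01, 0 X 2, 1 X 3, 0 X 3, 1 X 4, 010"

round_trip_ok = expand(description) == pattern   # description rebuilds the pattern
is_shorter = len(description) < len(pattern)     # but is it any shorter?
```

The description does rebuild the pattern, but it is longer than the thirty characters it stands for, so nothing has been gained by "compressing" it this way.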

Now, let’s get to the calculation.
Specification: ?>1 = -log2 [number of bit flips * number of specified patterns * probability]
Specification: ?>1 = -log2 [30 * X * 1/1,073,741,824]

Now, I haven’t included the number of specified patterns which are as compressible as the above random number, since I am not sure exactly how to calculate that number. However, you could theoretically calculate all other possible compressed patterns which contain the same amount of information and are just as random as the compressed pattern above, and as far as I am aware, that is precisely what algorithmic information theory deals with. Dembski has shown the math involved with algorithmic compressibility in “Specification: The Pattern that Signifies Intelligence” and has concluded: “To sum up, the collection of algorithmically compressible (and therefore nonrandom) sequences has small probability among the totality of sequences, so that observing such a sequence is reason to look for explanations other than chance.”

The corollary to what Dembski has summed up is that algorithmically incompressible sequences make up the rest of the sequence space and thus have a large probability among the totality of sequences. So, we do know that the “X” in the above equation will be a very large number, that the result will be less than one, and that there will therefore be no specification.

For example, even if only 1/32 of all possible patterns are algorithmically random, then the equation would play out as follows:

Specification: ?>1 = -log2 [30 * 33,554,432 * 1/1,073,741,824]
Specification: ?>1 = -log2 [.9375]
Specification: ?>1 = approx. .093 = not > 1 = not a specification
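The same check in Python, again only as a sketch with names of my own choosing. The last line is my own addition to the calculation above: it works out the largest X that would still push the result above 1 for thirty bit flips on a 30-bit string.

```python
from math import log2

TOTAL_PATTERNS = 2 ** 30   # all possible 30-bit patterns
BIT_FLIPS = 30             # probabilistic resources used in the example

def measure(matching_patterns: int) -> float:
    """-log2 [bit flips * number of matching patterns * 1/total patterns]."""
    return -log2(BIT_FLIPS * matching_patterns / TOTAL_PATTERNS)

one_in_32 = TOTAL_PATTERNS // 32   # 33,554,432 patterns
result = measure(one_in_32)        # roughly 0.093, so not greater than 1

# The measure exceeds 1 only while BIT_FLIPS * X is below half of all patterns.
largest_x = (TOTAL_PATTERNS // 2 - 1) // BIT_FLIPS
```

So even the deliberately low 1/32 estimate leaves X too large for the incompressible pattern to count as a specification.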

So far, I’ve only shown an example of a specification that was not algorithmically complex. But now, let’s briefly discuss a specification that is algorithmically complex (non-repetitive) and also pseudo-random.

In this case, pseudo-random patterns are those patterns which are algorithmically complex and thus appear to be random; however, they are specified because of function or because they match some independent pattern as set by a system of rules, i.e., mathematical, linguistic, rules of an information processing system, etc. Basically, as I have stated above, these are the types of patterns which form “islands” of function/pseudo-randomness within a sea of all possible patterns.

When measuring for a functional specification (within a set of functional "islands"), you apply the same equation, however, when measuring the specificity you take into account all other FUNCTIONAL patterns (able to be processed into function *by the system in question*) that have the same probability of appearance as the pattern in question. You do that instead of taking into account all equally probable compressible patterns, since you are now measuring for functional specificity as opposed to compressible specificity. Therefore, you can only measure for functional specificity and then a specification based upon a high understanding of the system and pattern in question.

Furthermore, according to the NFL Theorem, an evolutionary algorithm based on problem specific information is necessary in order to arrive at better than chance performance, which is exactly what a specification is calculating.

The next question: will a random set of laws cause an information processing system and evolutionary algorithm to randomly materialize? According to recent work on Conservation of Information Theorems, ID theorists state that the answer is "NO!" In fact, getting consistently better than chance results without previous guiding, problem-specific information is to information theory what perpetual-motion, free-energy machines are to physics. To continue to say life was a result of chance would be to appeal to a “chance of the gaps” non-explanation. Physicist Brian Greene states (I found this on God3's Blog [link]):

‘If true, the idea of a multiverse would be a Copernican Revolution realized on a cosmic scale. It would be a rich and astounding upheaval, but one with potentially hazardous consequences. Beyond the inherent difficulty in assessing its validity, when should we allow the multiverse framework to be invoked in lieu of a more traditional scientific explanation? Had this idea surfaced a hundred years ago, might researchers have chalked up various mysteries to how things just happen to be in our corner of the multiverse and not pressed on to discover all the wondrous science of the last century? …The danger, if the multiverse idea takes root, is that researchers may too quickly give up the search for underlying explanations. When faced with seemingly inexplicable observations, researchers may invoke the framework of the multiverse prematurely – proclaiming some phenomenon or other to merely reflect conditions in our own bubble universe and thereby failing to discover the deeper understanding that awaits us.’

To invoke multiple universes to explain phenomena within our universe is merely to inflate one’s probabilistic resources beyond reason, thus halting further investigation, since a chance of the gaps “explanation” has already been given.

As Professor Hasofer put it:

“The problem [of falsifiability of a probabilistic statement] has been dealt with in a recent book by G. Matheron, entitled Estimating and Choosing: An Essay on Probability in Practice (Springer-Verlag, 1989). He proposes that a probabilistic model be considered falsifiable if some of its consequences have zero (or in practice very low) probability. If one of these consequences is observed, the model is then rejected.
‘The fatal weakness of the monkey argument, which calculates probabilities of events “somewhere, sometime”, is that all events, no matter how unlikely they are, have probability one as long as they are logically possible, so that the suggested model can never be falsified. Accepting the validity of Huxley’s reasoning puts the whole probability theory outside the realm of verifiable science. In particular, it vitiates the whole of quantum theory and statistical mechanics, including thermodynamics, and therefore destroys the foundations of all modern science. For example, as Bertrand Russell once pointed out, if we put a kettle on a fire and the water in the kettle froze, we should argue, following Huxley, that a very unlikely event of statistical mechanics occurred, as it should “somewhere, sometime”, rather than trying to find out what went wrong with the experiment!’”

So, merely observe an information processing system and evolutionary algorithm self-generate from a truly random set of laws and the foundation of ID Theory is falsified. Or, show how that is even theoretically possible. Science is based on observation and testing hypotheses, and data wins every time. See the discussion on the Conservation of Information Theorem [link] for why Evolutionary Algorithms will not generate themselves out of a random set of laws.


Zachriel said...

This is Dembski's paper, including his definition of specification.

Specification: The Pattern That Signifies Intelligence

σ = –log2[ ϕS(T)·P(T|H)]

T is the pattern.
H is the chance hypothesis.
P(T|H) is the probability of the pattern given the chance hypothesis.

ϕS is the Semiotic Agent.
ϕS(T), called the specificational resources, is "the number of patterns for which S’s semiotic description of them is at least as simple as S’s semiotic description of T."

Zachriel said...

CJYman: Now, I haven’t included the number of specified patterns which are as compressible as the above random number, since I am not sure exactly how to calculate that number.

The vast majority of possible sequences of sufficient length have no discernable pattern to a Semiotic Agent. Hence, the shortest description is close to the length of the sequence.

So if we take an apparently random sequence of sufficiently large L, then the probability of that sequence given a uniform probability distribution is 2^-L and the specificational resource is ϕS(T)=2^L. Specification would then be -log2(1) = zero.

Now, let's take the sequence of prime numbers, in binary for convenience. 10 11 101 111 1011...

The probability of that sequence given a uniform probability distribution is still 2^-L, but the specificational resource is very small (say using the qualitative description "primes"). So ϕS(T)=5 and P(T|H) is 2^-L. For large L, specification would consequently be very large.

Zachriel said...

Though Dembski claims the Specificational Resource is a quantitative measure, unless a description language is specified, we often see it used qualitatively. But let's work around that problem.

The real problem is when working with a sequence such as 00000000000000... Assuming a uniform probability distribution, we get the exact same answer as we do for the highly specified "primes". Perhaps we could use a different Chance Hypothesis, but how do we define the Chance Hypothesis without already knowing the answer, or falling into circular reasoning? Perhaps we could average the digits across the sequence and assume a non-uniform distribution. But then what about the sequence 01010101010101...?

Zachriel said...

Today, William got an incredible deal on an old Victorian house. Highly satisfied with his business acumen, William settled in for a blissful night of sleep in his new home.


William woke with a start. He listened intently. But he didn't hear anything, so he settled back to sleep.


William listened even more closely this time until, after a bit, the creaking noise died away. For some reason, he recalled the seller's maniacal laughter just after William signed the papers to buy the house.


William was trembling and his teeth were rattling. He thought about getting out of bed to investigate. Instead, he pulled the covers over his head.


Hmm, William thought. Being a famous design theoretician, I can use the patented (not really) Dembski Inference to determine if the pattern is being caused by a ghost, er some unspecified intelligent cause.


For our first calculation. Let's assume the pattern is 01010101010101 …

Using Dembski's Inference, what can we infer about the pattern without risking a venture about the house? Be sure to show your math (e.g. Chance Hypothesis). And remember! No peeking from underneath the covers!

cindy said...

can you check the site and give feedback