Reality Cheque: No Free Lunch Theorems (Part I)

Monday, February 11, 2008

No Free Lunch Theorems (Part I)

When discussing the NFL Theorems I do not have the expertise to evaluate the math and its actual relevance to the real world. At the moment, I must take for granted that the authors of the NFL Theorem know what they are talking about. So, taking the NFL Theorems and the authors description of them as granted, I am merely re-stating what the authors already stated about the NFLT.

The quotes from the NFL Theorems Paper are centered and in blue:

“Given our decision to only measure distinct function evaluations even if an algorithm revisits previously searched points, our definition of an algorithm includes all common black-box optimization techniques like simulated annealing and evolutionary algorithms.”

Thus, the algorithms discussed include evolutionary algorithms.

“In this paper we present a formal analysis that contributes towards such an understanding by addressing questions like the following. Given the plethora of black box optimization algorithms and of optimization problems, how can we best match algorithms to problems (i. e. how best can we relax the black box nature of the algorithms and have them exploit some knowledge concerning the optimization problem). In particular, while serious optimization practitioners almost always perform such matching, it is usually on an ad hoc basis; how can such matching be formally analyzed? More generally, what is the underlying mathematical “skeleton” of optimization theory before the flesh of the probability distributions of a particular context and set of optimization problems are imposed.”

Thus, one of the purposes of this paper is to formally analyse how prior knowledge of the optimization problem is exploited and needs to be utilized in order to match it with the proper evolutionary algorithm, which is a feat performed by optimization practitioners (intelligent programmers).

“As emphasized above, the NFL theorems mean that if an algorithm does particularly well on average for one class of problems then it must do worse on average over the remaining problems. In particular, if an algorithm performs better than random search on some class of problems then in must perform worse than random search on the remaining problems. Thus comparisons reporting the performance of a particular algorithm with a particular parameter setting on a few sample problems are of limited utility.”

"In particular, if for some favorite algorithms a certain well behaved f results in better performance than does the random f then that well behaved f gives worse than random behavior on the set all remaining algorithms. In this sense just as there are no universally efficacious search algorithms, there are no universally benign f (optimizations problems) which can be assured of resulting in better than random performance regardless of ones algorithm.”
[parenthesis added]

There is no universal evolutionary algorithm that will solve all optimization problems at better than random performance. Likewise, there is no optimization problem that just any evolutionary algorithm will be able to “solve” at better than random performance. Thus, if an evolutionary algorithm performs better than random search on one class of optimization problems, it will perform worse than random search on average over all remaining classes of optimization problems. Dembski and Marks have reiterated this point in Conservation of Information in Search: Measuring the Cost of Success:: “... there is no magic-bullet search algorithm that successfully resolves all problems [7], [32].”

So, what causes the evolutionary algorithm to produce better than average on a class of problems? The next quotes begin to discuss this ...

“In particular, we show that an algorithm’s average performance is determined by how aligned it is with the underlying probability distribution over optimization problems on which it is run.”

“In any of these cases, P(f) or “p” must match or be aligned with “a” to get desired behavior. This need for matching provides a new perspective on how certain algorithms can perform well in practice on specific kinds of problems.”

“First, if the practitioner has knowledge of problem characteristics but does not incorporate them into the optimization algorithm, then P(f) is effectively uniform. (Recall that P(f) can be viewed as a statement concerning the practitioner’s choice of optimization algorithms.) In such a case, the NFL theorems establish that there are no formal assurances that the algorithm chosen will be at all effective. Second, while most classes of problems will certainly have some structure which, if known, might be exploitable, the simple existence of that structure does not justify choice of a particular algorithm; that structure must be known and reflected directly in the choice of algorithm to serve as such a justification. In other words, the simple existence of structure per se, absent a specification of that structure, cannot provide a basis for preferring one algorithm over another ... The simple fact that the P(f) at hand is non-uniform cannot serve to determine one’s choice of optimization algorithm.”

“Intuitively, the NFL theorem illustrates that even if knowledge of “f” perhaps specified through P(f) is not incorporated into “a” then there are no formal assurances that “a” will be effective. Rather effective optimization relies on a fortuitous matching between “f” [optimization problem] and “a” [evolutionary algorithm].” [brackets added]

The quotes above state that for the algorithm to perform better than random search on a class of problems, the evolutionary algorithm must be correctly matched beforehand, based on knowledge of the problem, to the specific optimization problem. Knowledge of characteristics of the problem must be incorporated into the algorithm. This is similar to how you won’t be able to find a card at consistently better than chance performance unless you are aware of an organization of the deck that would help you find the card and you incorporate this knowledge into the algorithm that you use to find the card. Likewise, the “exploitable structure,” such as hill climbing structures, must be known and incorporated into the choice of algorithm to take any advantage of these useable search structures. As Dr. Dembski has stated in “Active Information in Evolutionary Search,” merely assuming that there does exist an exploitable search structure, without incorporating that knowledge into the choice of algorithm, doesn’t get us anywhere:

“Such assumptions, [of non-uniform search space “links,” “hill climbing optimizations,” etc.] however, are useless when searching to find a sequence of, say, 7 letters from a 26-letter alphabet to form a word that will pass successfully through a spell checker, or choosing a sequence of commands from 26 available commands in an Avida type program to generate a logic operation such as XNOR [16]. With no metric to determine nearness, the search landscape for such searches is binary -- either success or failure. There are no sloped hills to climb. We note that such search problems on a binary landscape are in a different class from those useful for, say, adaptive filters. But, in the context of the NFLT, knowledge of the class in which an optimization problem lies provides active information about the problem.” (Brackets added)

He is basically stating the same thing as the “find a card” example. If you assume that there is a specific organization that will help you find the card without actually knowing for sure, you still have nothing but chance to go on since there may actually be no useful organization at all or out of all possible organizations, the one that you are banking on is such a small possibility that for all practical purposes you still have no better than chance assurance of finding that card.

Dr. Dembski’s quote directly above states the same concept as found within the NFL Theorem as I just quoted a few paras above: “... while most classes of problems will certainly have some structure which, if known, might be exploitable ...; that structure must be known and reflected directly in the choice of algorithm to serve as such a justification [for choice of a particular algorithm]... The simple fact that the P(f) at hand is non-uniform cannot serve to determine one’s choice of optimization algorithm ... Rather effective optimization relies on a fortuitous matching between “f” (optimization problem) and “a” (evolutionary algorithm).” [brackets added]

The above quotes from both the NFL Theorem paper and Dr. Dembski and Marks’ paper state what I have already referred to before – the necessity of using prior knowledge of the search space structure to match search space structure to the correct adaptive algorithm before the search ever begins. The next quote, taken from the conclusion of the paper, states that based on the theorems proved within the paper, it is important that problem-specific knowledge is incorporated into the behavior of the algorithm.

“A framework has been presented in which to compare general purpose optimization algorithms. A number of NFL theorems were derived that demonstrate the danger of comparing algorithms by their performance on a small sample of problems. These same results also indicate the importance of incorporating problem specific knowledge into the behavior of the algorithm.”

Now, let’s tie this all together:

1. The algorithms discussed include evolutionary algorithms.

2. The NFL Theorems formally analyse how prior knowledge of the optimization problem needs to be utilized in order to match it with the proper algorithm.

3. In order for evolutionary algorithms to perform better than average over one class of problems, the characteristics of the problem must be known and incorporated into the search algorithm. The exploitable structure of the search space must be known and incorporated into the choice of algorithm. Furthermore, search algorithm must be matched correctly to the optimization problem by utilizing prior knowledge of the problem.

4. Thus, it is important that problem specific information is incorporated into the behavior of the evolutionary algorithm. Therefore, evolutionary search is not blind search and is guided by incorporating prior knowledge of the problem and knowledge of any exploitable structure [within the search space] into the evolutionary algorithm.

IOW, here is an example of what the NFL Theorem is telling us:

I have a shuffled deck of cards. I lay them face down in a row on a table. Before you turn over any cards can you generate a search algorithm (method of search) that will help you find any card at consistently better than chance performance over many trials with the same shuffled deck?

Also, if by chance there just happens to be some type of exploitable search structure, such as after every heart comes a spade, within the arrangement of cards after random shuffling then, before flipping over any cards can you blindly (without knowledge of that specific arrangement) generate a search algorithm that will help you locate any card at consistently better than chance performance over many trials with the same shuffled deck?

IOW, will a blind set of laws (that is, only law and chance) generate, over many trials, consistently better than chance performance?

According to common sense, testing and observation, and NFL Theorems, the answer to the questions above is simply “no.”

Furthermore, the NFL Theorem applies to all searches which result in better than chance performance over many trials. And now I will leave you with a little gem to think about. Convergent evolution is the observation of many separate biological evolutionary trials converging upon the same extremely highly improbable forms and functions and biochemical systems [link].
Seriously, think about it. Drop any prejudice and use some reason here.

19 comments:

Zachriel said...: CJYman: "There is no universal evolutionary algorithm that will solve all optimization problems at better than random performance."

That is correct, as was proven by Wolpert and Macready. Seeing as we're discussing random searches.

—
EXTRA CREDIT:

Consider a landscape of a hundred billion trillion (10^23) elements with each element of the set having a random real-number value between 0 and 1. How many samples must we take to be confident that we have found an element with a value in the 99th percentile?

Can we optimize our search strategy beyond random sampling?; February 11, 2008 at 4:42 PM
Zachriel said...: CJYman: "There is no universal evolutionary algorithm that will solve all optimization problems at better than random performance."

Correct. A specific algorithm and an arbitary search space. Many, indeed most, search spaces are highly chaotic.

CJYman: "Likewise, there is no optimization problem that just any evolutionary algorithm will be able to 'solve' at better than random performance."

Correct again. An arbitrary algorithm and a specific search space. "Just any evolutionary algorithm" includes all sorts of chaotic behaviors, such as moving away from local optimums, instead of moving towards them, or moving orthogonally, or whatever.

But that is not what we are considering.

We are considering a specific algorithm and a specific search space. We are considering a class of natural algorithms that select for local optimums and whether this is effective (not necessarily the most optimized) at searching for solutions to biological or words or other problems of interest.; February 12, 2008 at 7:55 AM
CJYman said...: You obviously haven't read through my case, since nothing you have posted here negates my summary in the least.; February 13, 2008 at 2:24 PM
CJYman said...: It's up to you to prove that my point is wrong. Just read through and see what I'm saying.

BTW: did you read my most recent post re: Moderation?; February 13, 2008 at 2:25 PM
CJYman said...: Zachriel:
"Can we optimize our search strategy beyond random sampling?"

The answer is in the NFLT. Add problem specific information.

And to the rest of your comment I say ... sure ... and ...?; February 13, 2008 at 2:27 PM
Zachriel said...: Zachriel: How many samples must we take to be confident that we have found an element with a value in the 99th percentile?

I take your silence as indicating that you don't really know about the capabilities and limitations of random search, and for whatever reason it doesn't concern you that you don't know—even though every algorithm is being compared to a random search.

CJYman: You obviously haven't read through my case, since nothing you have posted here negates my summary in the least.

My comments are directly relevant, but you apparently refuse to consider the possibility, as you didn't bother to actually respond.

CJYman: In order for evolutionary algorithms to perform better than average over one class of problems, the characteristics of the problem must be known and incorporated into the search algorithm.

That is incorrect. No one has to "know" anything. The algorithm has to be reasonably matched to the problem. But we have a naturally occurring evolutionary algorithm, and a naturally occurring class of problems. We are talking about a *specific* class of algorithms and a *specific* class of problems.

The question as far as biology is concerned is whether the class of natural algorithms that select for local optimums are effective (not necessarily the most effective) at searching for solutions.; February 13, 2008 at 3:40 PM
Zachriel said...: Zachriel: Can we optimize our search strategy beyond random sampling?

CJYman: The answer is in the NFLT. Add problem specific information.

What problem specific information can you add about a random landscape?; February 13, 2008 at 3:43 PM
CJYman said...: Zachriel:
"How many samples must we take to be confident that we have found an element with a value in the 99th percentile?

...

I take your silence as indicating that you don't really know about the capabilities and limitations of random search, and for whatever reason it doesn't concern you that you don't know—even though every algorithm is being compared to a random search."

Actually, I already answered in the thread that you originally posed the question. My answer was and still is: You merely measure for CSI [with a specificity in the 99th percentile, of course] in one sample.

Or, another answer is that you only need two samples. If, within those two samples, you find the same highly compressible target then measure for CSI in both samples. Average them and you will have an approx. measure of CSI. If you do get CSI, then you can also measure for active information. Then, upon getting CSI and active information, you can, if you wish, continue to run more samples and continue to average the results to get a more precise measurement of the CSI and active information. I hope I have been able to explain this clearly enough.

I've already worked out one example on paper. I can go through it with you in more detail if you wish.

Finding CSI in one sample is useful in the context of our whole universe (as one sample). Finding CSI and active information in two or more samples is useful when dealing with separate converging samples and is seen in convergent evolution of life.

CJYman: You obviously haven't read through my case, since nothing you have posted here negates my summary in the least.

Zachriel:
"My comments are directly relevant, but you apparently refuse to consider the possibility, as you didn't bother to actually respond."

Ahh ... but I did. You glossed over any relevant answer as usual. But, then yes, I didn't provide a detailed explanation the first time ... However, I *did* provide a perfectly relevant answer.

CJYman: In order for evolutionary algorithms to perform better than average over one class of problems, the characteristics of the problem must be known and incorporated into the search algorithm.

Zachriel:
"That is incorrect. No one has to "know" anything."

Evidence please. One information processing system, evolutionary algorithm, and better than chance performance on average
(or CSI in one sample) observed coming about through nothing but random variables and a random set of laws would be a good start. Oh, wait, but then according to the authors of the NFLT, you are operating on nothing but random chance and can't expect better than random performance. And of course, this lines up with what has been published on conservation of information theorems:

" A "learner... that achieves at least mildly than better-than-chance performance, on average, ... is like a perpetual motion machine - conservation of generalization performance precludes it.”

--Cullen Schaffer on the Law of Conservation of Generalization Performance. Cullen Schaffer, "A conservation law for generalization performance," in Proc. Eleventh International Conference on Machine Learning, H. Willian and W. Cohen. San Francisco: Morgan Kaufmann, 1994, pp.295-265."

So, just prove your assertion and then inform the authors of the NFLT. Do I need to re-quote the fact that you have no better than random probabilities at arriving at a better than chance performance if you do not incorporate *characteristic of the problem to be solved* into the behaviour of the algorithm? Now, how do you get this problem specific information? Please provide evidence for your answer. I have already done so for my position.

Did you not read my case or even this post before you responded?

Zachriel:
"The algorithm has to be reasonably matched to the problem. But we have a naturally occurring evolutionary algorithm, and a naturally occurring class of problems. We are talking about a *specific* class of algorithms and a *specific* class of problems.:

Of course they are naturally occurring ... what's your point? You may want to actually read through and attempt to understand my full case before you continue to misrepresent my position.

Zachriel:
"The question as far as biology is concerned is whether the class of natural algorithms that select for local optimums are effective (not necessarily the most effective) at searching for solutions."

Sure and, as per NFLT and COI Theorems, it is because of problem specific information. So the question is ... what causes problem specific information?

Please back up your position with data (observations), a theoretical foundation, and a falsifiable hypothesis.

Zachriel:
"Can we optimize our search strategy beyond random sampling?"

CJYman: The answer is in the NFLT. Add problem specific information.

Zachriel:
"What problem specific information can you add about a random landscape?"

And you seriously say that you've read Dr.s Dembski and Mark's papers?!?!?!? You obviously need to read them again.

Please don't pretend to have read and understood Dr.s Dembski and Marks papers and then ask a question like this. It really betrays your ignorance and dishonesty. I can understand you asking such a question if you were being honest about not really understanding the ID position on active information as put forward by D&M.

One answer:
Any measure of nearness to the target obtained after random trails and then keeping the nearest is one way of adding active, problem specific, information when considering a random landscape (which includes all possible combinations as potentially searchable).; February 14, 2008 at 2:10 PM
Zachriel said...: Zachriel: Consider a landscape of a hundred billion trillion (10^23) elements with each element of the set having a random real-number value between 0 and 1. How many samples must we take to be confident that we have found an element with a value in the 99th percentile?

CJYman: Or, another answer is that you only need two samples.

After taking two samples from the landscape there is a 98% chance of *not* being in the 99th percentile. On the other hand, after 916 samples we can be 99.99% confident that we have found at least one sample in the 99th percentile. That in a landscape of hundred billion trillion. So, it appears that random search is not so ineffective, after all.

CJYman: Oh, wait, but then according to the authors of the NFLT, you are operating on nothing but random chance and can't expect better than random performance.

We're not talking about a random search space, but a highly organized environment. We're not talking about a random search algorithm, but an evolutionary algorithm that seeks local optimums. No Free Lunch Theorems only apply when averaged over either all search spaces or all search algorithms. Biological evolution concerns a specific class of search algorithm over a specific class of search space.

Zachriel: What problem specific information can you add about a random landscape?

CJYman: And you seriously say that you've read Dr.s Dembski and Mark's papers?!?!?!? ...

Any measure of nearness to the target obtained after random trails and then keeping the nearest is one way of adding active, problem specific, information when considering a random landscape (which includes all possible combinations as potentially searchable).

Sorry. There is no way to optimize the search of a random landscape. There is no point analyzing the previous samples because each new sample in uncorrelated with all other samples. That's what we mean by random.; February 14, 2008 at 4:13 PM
CJYman said...: Zachriel, you seem to not understand the causal difference between a random search space and an organized search space, which according to information theory are low information and high information respectively thus confusing yourself. A simple question which would clear this up for you is: "what causes an organized search space and the problem specific information necessary for evolutionary algorithms?" Please provide data (observation/experimental evidence) for your answer. Science deals with data after all.

Zachriel:
"After taking two samples from the landscape there is a 98% chance of *not* being in the 99th percentile. On the other hand, after 916 samples we can be 99.99% confident that we have found at least one sample in the 99th percentile. That in a landscape of hundred billion trillion. So, it appears that random search is not so ineffective, after all."

The efficiency of random search (the monkeys typing on a keyboard) has already been discussed in Conservation of Information in Search ...

From the abstract:

"Conservation of information theorems indicate that any search algorithm performs on average as well as random search without replacement unless it takes advantage of problem-specific information about the search target or the search-space structure. Combinatorics shows that even a moderately sized search requires problem-specific information to be successful.

... and further in the paper (part A. Monkey at a Typewriter):

"A “monkey at a typewriter” is often used to illustrate the
viability of random evolutionary search. It also illustrates the
need for active information in even modestly sized searches." Keep reading the article.

... and has been dealt with in this blog post. I re-quote from NFLT:

"“First, if the practitioner has knowledge of problem characteristics but does not incorporate them into the optimization algorithm, then P(f) is effectively uniform. (Recall that P(f) can be viewed as a statement concerning the practitioner’s choice of optimization algorithms.) In such a case, the NFL theorems establish that there are no formal assurances that the algorithm chosen will be at all effective.

IOW: random choice of algorithm = no formal assurance of effectiveness. Thus, with no formal, mathematical assurance, chances are still no better than random search.

Now you're trying to tell me that random search *is* effective? Care to provide some evidence?

Furthermore, if random search was even partially useful, how come it can't find CSI? That is because random search hovers around statistical randomness (algorithmic incompressibility).

... Then you flip sides and say ... oh, but we aren't really talking about a random landscape here:

Zachriel:
"We're not talking about a random search space, but a highly organized environment."

It *is* obvious that our universe is not caused by random processes.

Yes, it is obvious that our environment and laws are highly organized and extremely far from statistical randomness. In fact, far beyond 99% of potential mathematical relationships between matter and laws in our universe, according to physicists, would either cause our universe to collapse upon itself almost as soon as it appears, or would cause our universe to be composed of nothing but thermal energy which would soon dissipate as the universe expanded at an extremely high rate and never slowed down, or would cause a universe of nothing but a uniform, homogeneous mixture of hydrogen and helium.

Unless you can provide evidence that life can form in cases where the universe collapses upon itself almost as fast as it appears, or in cases where there is nothing but speedily dissipating thermal energy, or in cases where there exists nothing but hydrogen and helium gas, then far beyond 99% of mathematically possible universes, are dead universes.

And then there is the fact that information processing systems don't randomly generate themselves, unless programmed by intelligence.

In relation to the CSI in life and the active information necessary for evolution, check out our three options as I have explained them in my Summary.

Zachriel:
"We're not talking about a random search algorithm, but an evolutionary algorithm that seeks local optimums. No Free Lunch Theorems only apply when averaged over either all search spaces or all search algorithms. Biological evolution concerns a specific class of search algorithm over a specific class of search space."

Correct, and as such, according to the NFLT, as I've already noted problem specific information is needed to provide better than chance performance over a specific class of algorithm operating on a specific class of search space.

Did you not read:

"Second, while most classes of problems will certainly have some structure which, if known, might be exploitable, the simple existence of that structure does not justify choice of a particular algorithm; that structure must be known and reflected directly in the choice of algorithm to serve as such a justification."

... and ...

"Rather effective optimization relies on a fortuitous matching between “f” [optimization problem] and “a” [evolutionary algorithm].” [brackets added]"

What causes that fortuitous matching. Will non-intelligently guided process cause that? Please provide evidence.

... and ...

"“These same results also indicate the importance of incorporating problem specific knowledge into the behavior of the algorithm.”"

What causes problem specific information (problem/target characteristics incorporated into the behavior of the search algorithm)? Will random, unguided processes cause problem specific information? Evidence please.

Zachriel:
"Sorry. There is no way to optimize the search of a random landscape. There is no point analyzing the previous samples because each new sample in uncorrelated with all other samples. That's what we mean by random."

Understood, but you stated "random landscape" not "random search."

Now, how have any of your points above refuted any of my points of conclusion as stated near the end of this blog post on the NFLT? I'm just asking because it seems that, even though I went through it point by point (referencing what the authors state about the NFLT at each point) you seem to be having trouble following along.; February 20, 2008 at 11:15 AM
Zachriel said...: Let's review.

Zachriel: Consider a landscape of a hundred billion trillion (10^23) elements with each element of the set having a random real-number value between 0 and 1. How many samples must we take to be confident that we have found an element with a value in the 99th percentile?

Can we optimize our search strategy beyond random sampling?

CJYman: The answer is in the NFLT. Add problem specific information.

Zachriel: What problem specific information can you add about a random landscape?

CJYman: One answer: Any measure of nearness to the target obtained after random trails and then keeping the nearest is one way of adding active, problem specific, information when considering a random landscape (which includes all possible combinations as potentially searchable).

Zachriel: Sorry. There is no way to optimize the search of a random landscape.

CJYman: Understood, but you stated "random landscape" not "random search."

Not only did I state random landscape, I defined it (and you repeated it). You said we could add "problem specific information" to optimize our search strategy. This is incorrect. No such optimization is available.; February 20, 2008 at 1:38 PM
Zachriel said...: Zachriel: Consider a landscape of a hundred billion trillion (10^23) elements with each element of the set having a random real-number value between 0 and 1. How many samples must we take to be confident that we have found an element with a value in the 99th percentile?

CJYman: Or, another answer is that you only need two samples.

Zachriel: After taking two samples from the landscape there is a 98% chance of *not* being in the 99th percentile. On the other hand, after 916 samples we can be 99.99% confident that we have found at least one sample in the 99th percentile.

CJYman: The efficiency of random search (the monkeys typing on a keyboard) has already been discussed in Conservation of Information in Search ...

You also left this error uncorrected.; February 20, 2008 at 1:43 PM
Zachriel said...: CJYman: Now you're trying to tell me that random search *is* effective? Care to provide some evidence?

A random search will find a sample in the 99th percentile after just 916 samples with 99.99% confidence—this on a random landscape of a hundred billion trillion (10^23) elements. The choice of search algorithm doesn't matter, by the way. Even the most ingenious algorithm is no better than any other.; February 20, 2008 at 1:51 PM
Zachriel said...: I'm averse to moving forward until you correct some of your previous misstatements, but this is of interest.

Zachriel: We're not talking about a random search algorithm, but an evolutionary algorithm that seeks local optimums. No Free Lunch Theorems only apply when averaged over either all search spaces or all search algorithms. Biological evolution concerns a specific class of search algorithm over a specific class of search space.

CJYman: Correct, and as such, according to the NFLT, as I've already noted problem specific information is needed to provide better than chance performance over a specific class of algorithm operating on a specific class of search space...

What causes that fortuitous matching. Will non-intelligently guided process cause that? Please provide evidence.

This is a question we might discuss. But it has nothing to do with the No Free Lunch Theorems, except perhaps to frame the question.

We happen to have a natural evolutionary algorithm; imperfect replicators replicating imperfectly competing for local resources. We also happen to have a landscape highly ordered by proximate relationships.; February 20, 2008 at 2:03 PM
CJYman said...: Zachriel:
"Not only did I state random landscape, I defined it (and you repeated it). You said we could add "problem specific information" to optimize our search strategy. This is incorrect. No such optimization is available."

My mistake. I actually misunderstood what you were getting at somewhere along the way.

The point that I was attempting to get at is that in order to input problem specific information into a random landscape that random landscape must be re-organized in such a way that it will incorporate a search algorithm that can make use of that re-organized landscape. IOW, to input problem specific information into a random landscape and an algorithm, you must re-organize the random landscape and match it with the proper search algorithm to cause non-random, efficient search.

According to COI, the amount of problem specific information necessary to find the organized landscape/proper algorithm match is equal to or greater than the amount of problem specific information (measure of efficiency) that organized landscape/proper algorithm can output. Simply explained, information is conserved and a random set of laws acting upon random variables (no problem specific information) -- a random landscape and random search -- will not produce an efficient search algorithm.

Do you have any evidence to the contrary?; February 29, 2008 at 12:03 PM
CJYman said...: Zachriel:
"A random search will find a sample in the 99th percentile after just 916 samples with 99.99% confidence—this on a random landscape of a hundred billion trillion (10^23) elements. The choice of search algorithm doesn't matter, by the way. Even the most ingenious algorithm is no better than any other."

Oh yes, that is true as far as I can tell ... except for "even the most ingenious algorithm is no better than any other."

But first, the first part of your quote doesn't effect the fact that random search will not find CSI. Case in point:

Random search hovers around statistical randomness and thus it will not find algorithmically compressible CSI.

Let's do some science and create a test. Create a sequence of 30 1s and 0s randomly dispersed. Now, randomly flip those bits and tell me when you get an string of 30 1s or 30 0s. If you do so and the probabilistic resources are calculated and you arrive at CSI, then you are on your way to seriously challenging the foundations of ID Theory.

Now, create an evolutionary algorithm which will aid in finding a string of 30 1s or 30 0s. Oh wait ... a very similar experiment has already been done by Schneider and has been discussed
here.
The end result is that the properly applied algorithm will be more efficient than random search.

Now, let's do some more science. Let's create an experiment to see if it is true that "even the most ingenious algorithm is no better than any other."

Oh, wait, the example above negates your assertion and that concept has already been covered by the NFLT. It states that no algorithm is better than any other *unless* problem specific information is incorporated into the behavior of the search algorithm thus matching non-random/organized landscape with the proper non-random search algorithm.

According to NFLT, problem specific information takes the form of:
1. Characteristics of a usable organized landscape.
2. Characteristics of the problem to be solved or target to be attained.

The above is almost an exact quote from the NFLT.

Seriously now, did you read nothing of what I posted re: NFLT before you started blindly commenting here?

If you are still contending that a random landscape and random search will perform efficient search (either finding CSI [functional 500 bit protein complex] or consistently finding a target/solution at better than chance [convergent evolution]
then please provide evidence.

So, put up or ... just continue to blindly attack good science based on solid theoretical foundation (NFLT and COI), observation, and experimentation with the systems in question (information processing systems and evolutionary algorithms).; February 29, 2008 at 12:41 PM
CJYman said...: CJYman:
"Or, another answer is that you only need two samples."

Zachriel:
"You also left this error uncorrected."

It was based on a misunderstanding of your point which I have clarified in the above 2 comments.

Furthermore, the effects of random search and how inefficient they actually are compared to the size of the probability space has been discussed in the article that I directed you to:

CJYman:
"The efficiency of random search (the monkeys typing on a keyboard) has already been discussed in Conservation of Information in Search ..." (part IV, "Critiquing Evolutionary Search Algorithms," point A) "Monkey at a Typewriter."); February 29, 2008 at 12:48 PM
CJYman said...: CJYman:
"What causes that fortuitous matching. Will non-intelligently guided process cause that? Please provide evidence."

Zachriel:
"This is a question we might discuss. But it has nothing to do with the No Free Lunch Theorems, except perhaps to frame the question."

Actually this has everything to do with the NFLT as it shows that only problem specific information (as defined) will produce efficient search.

And yes, on its own, the NFLT helps to frame the question by showing the conditions that are necessary in order to arrive at efficient search. Then, when the argument from CSI, observations, and the COI are applied to NFLT we arrive at intelligent design as the best explanation. If you continue to read through my case, you will see the whole picture and it will bring you to my 3 choices summary.

We can discuss all available options at my 3 choices summary.

Zachriel:
"We happen to have a natural evolutionary algorithm; imperfect replicators replicating imperfectly competing for local resources. We also happen to have a landscape highly ordered by proximate relationships."

Of course its all "natural." However, just saying its all "natural" doesn't provide a scientific explanation of what causes it. If natural = not caused by intelligence, then that's like saying that because we have a "natural" computer containing a "natural" information processing system which is running a "natural" evolutionary algorithm which is fine tuned to a "natural" organized/non-random landscape which produces "natural" CSI, further "natural" information processing systems, and "natural" convergent evolution -- that is, "natural" consistently better than chance performance --, then we are validated in presuming that all the above "natural" systems are not caused by intelligence.

However, that is patently absurd and completely false since we know that intelligent foresight is necessary to cause each one of those "natural" systems. If you wish to continue to discuss this aspect, please read my posts "Philosophical Foundations for ID" part 1 and 2; then respond accordingly at those blog posts. You can find these 2 posts under "A Case for a Naturalistic Hypothesis of Intelligent Design" at the top of the left side bar.

Do you have any contrary evidence?

I understand what you are getting at here, however, if you will actually put aside all prejudice and read through my case and summary, you will see why I have included 3 possibilities in my summary. Science deals with probabilities, experiments, and data. In my summary, I have shown why, in accordance with my full case, the best scientific hypothesis out of the 3 possible scenarios is ID Theory.

ID Theory also provides the most fruitful explanation since it doesn't rely on chance of the gaps and can also be used in many other scientific areas such as plagiarism detection, SETI, archeology, and forensics.

In fact, if you look at my whole case, you will see that evolution by natural selection is evidence of Intelligent Design.

The next question is to discover which features are targets (result directly from the problem specific information) and which features are the result of randomly mutated targets. I am not getting into it here since it does not effect my overall case, however Mike Gene's "The Design Matrix" is helpful in this area.; February 29, 2008 at 1:24 PM
Zachriel said...: This is the thread you said to post my question, which I did in the first comment. In order to make a valid mathematical argument, each individual step has to be well-defined. We're still stuck on step one.

Zachriel: Consider a landscape of a hundred billion trillion (10^23) elements with each element of the set having a random real-number value between 0 and 1.

CJYman: The point that I was attempting to get at is that in order to input problem specific information into a random landscape that random landscape must be re-organized in such a way that it will incorporate a search algorithm that can make use of that re-organized landscape.

If we "reorganize" a random landscape, it isn't random. The question is *given* a random landscape, how well does a search algorithm work. And you still have no idea.

CJYman: Simply explained, information is conserved and a random set of laws acting upon random variables (no problem specific information) -- a random landscape and random search -- will not produce an efficient search algorithm.

The question is how efficient would a search algorithm be on a random landscape. How many samples—using any search algorithm you choose—would it take to be confident of finding one in the 99th percentile?

CJYman: Oh yes, that is true as far as I can tell ... except for "even the most ingenious algorithm is no better than any other."

That is incorrect. It doesn't matter how ingenious your algorithm. Each new sample is completely uncorrelated with every previous sample.

CJYman: Random search hovers around statistical randomness and thus it will not find algorithmically compressible CSI.

Read the question again. I didn't specify random sampling. How many samples from a random landscape do we need to be confident that we have found an element with a value in the 99th percentile? You might compare a random sampling with the most ingenious methodology you can devise.

CJYman: Furthermore, the effects of random search and how inefficient they actually are compared to the size of the probability space has been discussed in the article that I directed you to:

Perhaps you would be surprised to discover that the size of the probability space (assuming it's sufficient large as to make a Poisson calculation relevant) doesn't matter to answer the question, how many samples from a random landscape do we need to be confident that we have found an element with a value in the 99th percentile?; March 1, 2008 at 5:28 AM

Reality Cheque

New Moderation Policy

A Case for a Naturalistic Hypothesis of Intelligent Design

Clearing the Air of Anti-ID Obfuscations

Welcome

Blog Archive

Links

ID relevant mathematics

Politics

Galileo Galilei

Albert Einstein

Robert Wright

Sir Francis Bacon, "Novum Organum"

J. B. S. Haldane

Max Planck

Louis Pasteur

C. S. Lewis

Cyril Burt (British neurologist)

G. K. Chesterton

Robert Jastrow

Monday, February 11, 2008

No Free Lunch Theorems (Part I)

19 comments:

About Me