How to arrive at a reliable model

Introduction

By inductive reasoning and imagination, many possible explanations can be provided for any series of events. However, most of these explanations will be wrong.

Knowledge, on the other hand, is characterized by the ability to repeatedly predict a particular outcome for a particular set of conditions.

The modern scientific method is a method for discovering knowledge about nature – knowledge about how things work – for discovering the mechanisms which affect the world around us.

But even more importantly, the modern scientific method is a way of revealing untruth. For every corroborated scientific statement there are a thousand ideas which are not corroborated – ideas which are wrong, or not properly tested.

To know something about nature we need to get rid of all the competing ideas. Ideas which are plausible – but still wrong. Ideas come easily – truth is hard to come by.

I used to think that all scientists provided truth and nothing but the truth – that's wrong. Scientists have beliefs too. Scientists have pet theories they defend. Scientists tend to be proponents of their pet ideas – ideas which might turn out to be true, or, more likely, ideas which might turn out to be wrong or defective.

And that is OK! Where would we have been without scientists who defend their ideas? But still, we must not forget all the ideas which didn't survive. All the ideas which seemed like good ideas at the time – but weren't.

And that is what science is about – being both creative and cruel. Good scientists need to be kind of bipolar. Good scientists must do their very best to come up with possible ideas, and then do their very best to get rid of the bad ideas among all the possible ideas.

We have to get rid of bad ideas – because what we are left with, after having disposed of all the bad ideas, are the best ideas – if any at all. Often we just don't know – we have no idea – and we just have to go along without knowing.

However – those ideas which have survived close scrutiny, and still have predictive power, are the ideas which are most valuable to us. Those are the ideas we can use to make things that really work. Those are the ideas we can use to create value – to build wealth and prosperity. That's why the scientific method is so important to humanity. The method makes us capable of creating wealth and prosperity. And that is what the modern scientific method is about – getting rid of bad ideas, and discovering ideas with predictive power – ideas which can be used to create things that are important to us.

Karl Popper was the mastermind who described and explained the modern scientific method to us. The method is commonly known as the hypothetico-deductive method. Karl Popper called it the empirical method in his original work "The logic of scientific discovery". Karl Popper's philosophy is commonly known as critical rationalism. The first 26 pages of The logic of scientific discovery contain the essence. Enjoy some soothing reading from the master himself: The logic of scientific discovery.
(First published 1935; first English edition published 1959.)

The method must have been understood by quite a few before Karl Popper, as indicated by this quote from Lord Acton (1834–1902):
“There is nothing more necessary to the man of science than its history, and the logic of discovery . . . : the way error is detected, the use of hypothesis, of imagination, the mode of testing.”
– Lord Acton

Popper himself mentioned in his autobiography "Unended Quest" that he got his idea from reading Einstein's book on his then-new theory of relativity. Popper notes Einstein writing that if the red-shift due to gravitational potential did not exist, his theory would have to be abandoned. This immediately led Popper to his idea of falsification as the scientific method.

But Karl Popper was the one who described it, explained it, argued for it and was its greatest proponent. Karl Popper was thorough in his writings; he put forward proper arguments for his method and he addressed possible arguments against his view. Popper's work is very well written and truly exhilarating. I encourage everyone to look into his works in general and The logic of scientific discovery in particular.

However, I have missed a shorter methodical guideline. The reason I wrote this post is that, after having searched the internet for a short and yet comprehensive guideline on the so-called hypothetico-deductive method, I could not find one I liked. During my reading of Popper's work I also noted a few quotes which I thought were central and important to the method. Therefore, I decided to write my own guideline on the method and include these quotes by Karl Popper. Before we start, it should be noted that this post is written with quantitative empirical models in mind.

Ok – here we go:

0 The modern scientific method – simply put

The stages in the modern scientific method are, simply put:
1 Imagine
Propose an idea, a hypothesis or a theory
2 Check
Check that the idea is logically and scientifically consistent
3 Deduce
Deduce testable predictions from the idea
4 Test
Expose the idea to severe testing

1 Imagine – guess how things work

The first step in science is about coming up with an idea. Everybody does it – all the time:

“In general we look for a new law by the following process: First we guess it…”
– Richard Feynman

“a new idea, put up tentatively, and not yet justified in any way—an anticipation, a hypothesis, a theoretical system, or what you will”
– Karl Popper

“The true sign of intelligence is not knowledge but imagination.”
– Albert Einstein

However, imagination does not bring us knowledge about nature. Imagination brings us ideas. Our new idea might be right or it might be wrong. An overwhelming share of our ideas are wrong – but we still get along. Most of our wrong ideas don't kill us – at least not at once. And still, many of our ideas seem to fit our observations, even though they are wrong. How can that be?

The reason is the problem of induction.

1.1 The problem with inductive reasoning

The problem of inductive reasoning deserves particular mention at this stage, because at this stage we normally lack data. At the same time we tend to be excited by the possible discovery of a mechanism.

Have you ever thought about how easily plausible ideas come about? The classical example is that after having seen a number of white swans one might be tempted to state that all swans are white. That is inductive reasoning – trying to infer general laws from a limited number of instances. The problem with this is illustrated by the fact that black swans actually exist. Black swans exist and breed in the southeastern and southwestern parts of Australia.

The problem of induction has been well known for a long time. David Hume paid much attention to this problem. "As Hume wrote, induction concerns how things behave when they go "beyond the present testimony of the senses, or the records of our memory". Hume argues that we tend to believe that things behave in a regular manner, meaning that patterns in the behaviour of objects seem to persist into the future, and throughout the unobserved present."

Karl Popper described it this way:
“Believers in inductive logic assume that we arrive at natural laws by generalization from particular observations. If we think of the various results in a series of observations as points plotted in a co-ordinate system, then the graphic representation of the law will be a curve passing through all these points. But through a finite number of points we can always draw an unlimited number of curves of the most diverse form. Since therefore the law is not uniquely determined by the observations, inductive logic is confronted with the problem of deciding which curve, among all these possible curves, is to be chosen.”
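To make Popper's point concrete, here is a minimal sketch in Python (assuming numpy is available; the observation values and the helper names are made up for illustration). It constructs two different curves which both pass exactly through the same five observations, yet disagree completely where no data exists:

```python
import numpy as np

# Five made-up "observations"
x_obs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_obs = np.array([1.0, 2.1, 2.9, 4.2, 4.8])

# Curve A: the unique interpolating polynomial of degree 4
coeff_a = np.polyfit(x_obs, y_obs, deg=4)

def curve_a(x):
    return np.polyval(coeff_a, x)

def curve_b(x, c=0.5):
    # Curve B: curve A plus c*(x - x0)(x - x1)...(x - x4).
    # The added product is zero at every observed point, so curve B also
    # passes exactly through all five observations - for any value of c.
    return curve_a(x) + c * np.prod([x - xi for xi in x_obs], axis=0)

x_new = 5.0  # a condition we have not observed
print(curve_a(x_obs) - y_obs)           # ~0: curve A fits every observation
print(curve_b(x_obs) - y_obs)           # ~0: curve B fits every observation too
print(curve_a(x_new), curve_b(x_new))   # yet the two curves disagree where we have no data
```

And since the constant c can be anything, there is literally an unlimited number of such curves, all equally consistent with the observations.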

A series of observations is also called a time series. By time series analysis, curves of many different forms can be fitted to a series of observations. By Fourier analysis a time series can even be decomposed into individual wave-form components, each having a frequency and an amplitude. But past performance is no guarantee of future results.

That past performance is no guarantee of future results is demonstrated by the fact that it is possible to use Fourier analysis to decompose the time series of a stock price into individual wave-form components, each having a certain amplitude and frequency. The time series can then be accurately reconstructed from these components. The stock price curve can even be continued into the future from these components. But the future development of the stock price still cannot be accurately predicted from them. If it could, we would all be rich.
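Here is a minimal sketch of that argument (Python with numpy; the "price" series is a simulated random walk, not real market data). The Fourier components reconstruct the past essentially perfectly, but continuing them beyond the fitted window tells us nothing about the future of the series:

```python
import numpy as np

rng = np.random.default_rng(0)
# A simulated "stock price": a random walk with no repeatable structure
price = 100 + np.cumsum(rng.normal(0.0, 1.0, size=256))

past, future = price[:200], price[200:]

# Decompose the past into wave-form components (each with amplitude, frequency, phase)
spectrum = np.fft.rfft(past)

# Reconstruct the past from the components: essentially exact
reconstructed = np.fft.irfft(spectrum, n=len(past))
print(np.max(np.abs(reconstructed - past)))     # tiny: the past is matched almost perfectly

# A Fourier series is periodic with the length of the fitted window, so
# "continuing" it simply repeats the fitted data; it carries no information
# about the actual future of the random walk.
continuation = reconstructed[:len(future)]
print(np.mean(np.abs(continuation - future)))   # typically large: no predictive power
```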

By performing time series analysis one might be fooled into believing that nature's own trends, curves or components have been revealed. That fallacy is one of the core problems of induction: the belief that what has been observed is reproducible, or valid for other conditions than those under which the data was collected. Combined with the fallacy of believing that nature behaves orderly – that nature doesn't also behave in stochastic, unrepeatable and chaotic ways. Try dropping a sheet of paper from a height of 10 meters and see if it lands on the same spot every time.

“Prediction is very difficult, especially if it’s about the future.”
– Niels Bohr, Nobel laureate in Physics

“This quote serves as a warning of the importance of validating a forecasting model out-of-sample. It’s often easy to find a model that fits the past data well – perhaps too well! – but quite another matter to find a model that correctly identifies those patterns in the past data that will continue to hold in the future.”
– Robert Nau

Inductive errors are probably the most common of all cognitive errors. Human brains constantly strive to extract patterns from data to gain a predictive advantage. In doing so, we often see patterns that do not actually exist, or place a degree of confidence in perceived patterns that is incommensurate with the actual strength of the relevant evidence.

2 Check

Let us now say that I have succeeded in coming up with an idea, a hypothesis, a theory or whatever. It is then wise to expose the idea to some sound skepticism. I don't yet know if the idea is good or bad, and I find it wise to get rid of bad ideas before they get the best of me – or of those I try to impose them upon.

Karl Popper phrased it this way:
«We may if we like distinguish four different lines along which the testing of a theory could be carried out.

First there is the logical comparison of the conclusions among themselves, by which the internal consistency of the system is tested.

Secondly, there is the investigation of the logical form of the theory, with the object of determining whether it has the character of an empirical or scientific theory, or whether it is, for example, tautological.

Thirdly, there is the comparison with other theories, chiefly with the aim of determining whether the theory would constitute a scientific advance should it survive our various tests.

And finally, there is the testing of the theory by way of empirical applications of the conclusions which can be derived from it. ”
– Karl Popper; The logic of scientific discovery

The first thing which should be done, is to check that the idea, hypothesis, theory or whatever is logically consistent.

2.1 Check that the idea is logically consistent

“The first principle is that you must not fool yourself and you are the easiest person to fool.”
– Richard P. Feynman

The check for logical consistency is a check which is worth taking seriously. There are a whole lot of logical fallacies to choose from, and even more than those identified here:

Logical Fallacies

Many of our ideas are founded on some kind of logical fallacy. And logical fallacies often come in disguise. The fallacy can be hard to recognize. It can be a tedious challenge to decompose the argument, identify the fallacy and convince the proponent that the idea is based on a fallacy.

Add to this that people tend to be very fond of their pet theories. To tell the proponent of an idea that the idea is based on a logical fallacy is pretty much like trying to tell a mother that her child behaves badly. Most people don't easily let go of theories, and certainly not theories they have invested a lot of effort in.

“I know that most men, including those at ease with problems of the greatest complexity, can seldom accept even the simplest and most obvious truth if it be such as would oblige them to admit the falsity of conclusions which they delighted in explaining to colleagues, which they have proudly taught to others, and which they have woven, thread by thread, into the fabric of their lives.”
– Leo Tolstoy

“Science progresses one funeral at a time.”
— Max Planck

However, we should be ok with people defending their pet ideas. The most wonderful and valuable ideas could be lost if they were not defended by their proponents.  But we should also be reasonable enough to regard uncorroborated or even unfalsifiable theories as temporary, and suspend judgement about them rather than trying to save the world on the basis of such theories.

2.2 Check if the idea is falsifiable – that it has predictive power

If my idea survives the check for logical consistency, it should also be checked for empirical and scientific content. This is the check which tells us whether our idea can be regarded as a scientific idea at all within the norm I relate to here. If it isn't possible to perform a test for which our idea forbids a particular range of outcomes, our idea simply isn't a scientific idea. Our idea is worth no more than having no idea at all.

Further, the predictive power of a theory is related to the range of possible outcomes it forbids. A theory which allows everything forbids nothing; the predictive power of such a theory is zero, zip, zilch, nada.

Here is an example of a scientific theory with great predictive power: if I add 4.187 kJ to 1 kg of H2O at 300 K, the temperature will increase by 1 K. Not 0.9 K or 1.1 K, but 1.0 K. That's what I call predictive power – the theory forbids all outcomes other than 1.0 K.

Here is an example of an idea with no predictive power:
Tomorrow it will rain or not rain.
That's an idea without predictive power – an idea which isn't falsifiable. The theory forbids nothing. Every possible outcome is allowed by my idea. My idea isn't scientific.

It's time to turn to Karl Popper again for a take on the criterion that it must be possible to test a theory in order to regard it as scientific. After all, it was his idea:
“But I shall certainly admit a system as empirical or scientific only if it is capable of being tested by experience. These considerations suggest that not the verifiability but the falsifiability of a system is to be taken as a criterion of demarcation. In other words: I shall not require of a scientific system that it shall be capable of being singled out, once and for all, in a positive sense; but I shall require that its logical form shall be such that it can be singled out, by means of empirical tests, in a negative sense: it must be possible for an empirical scientific system to be refuted by experience.»
– Karl Popper

“Let us now imagine that the class of all possible basic statements is represented by a circular area. The area of the circle can be regarded as representing something like the totality of all possible worlds of experience, or of all possible empirical worlds. Let us imagine, further, that each event is represented by one of the radii (or more precisely, by a very narrow area—or a very narrow sector—along one of the radii) and that any two occurrences involving the same co-ordinates (or individuals) are located at the same distance from the centre, and thus on the same concentric circle. Then we can illustrate the postulate of falsifiability by the requirement that for every empirical theory there must be at least one radius (or very narrow sector) in our diagram which the theory forbids.”
– Karl Popper

It is worth noting that if no conclusive test exists now, but we expect that it will be possible to perform a test some time in the future, or that relevant data will be available in the future, we must then suspend judgement about the theory until a conclusive set of tests or observations has been performed. If the idea is testable in theory but not in practice, it will not be a scientific theory in practice. We are then obliged to suspend judgement about it.

2.3 Check that the available data corroborates the idea

Often the idea is based on data – ranging from a single observation to a range or group of observations. If the idea is based on a data record – a so-called time series – here are some checks which should be conducted before opening the champagne:

  1. Are data for the relevant dependent and independent variables recorded?
  2. Are the measurement uncertainties known, and sufficiently low, for all variables?
  3. Has spurious correlation between variables been excluded? (See the sketch after this list.)
  4. Is there both correlation and causation between the dependent and the independent variables?
  5. Are the observations reproducible for this particular set of conditions?
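Regarding point 3, here is a minimal sketch (Python with numpy, simulated data) of how easily spurious correlation arises: two completely independent random walks typically show a sizeable correlation in their levels, even though there is no relation whatsoever between them.

```python
import numpy as np

# Correlate the *levels* of many pairs of independent random walks.
level_corrs = []
for seed in range(200):
    rng = np.random.default_rng(seed)
    a = np.cumsum(rng.normal(size=500))   # one cumulative quantity
    b = np.cumsum(rng.normal(size=500))   # another, completely independent one
    level_corrs.append(np.corrcoef(a, b)[0, 1])

# Independent trending series routinely look strongly "correlated" by chance.
print(np.mean(np.abs(level_corrs)))   # typically far from zero

# The correlation between their *changes* behaves as expected for independent data.
rng = np.random.default_rng(0)
a = np.cumsum(rng.normal(size=500))
b = np.cumsum(rng.normal(size=500))
print(np.corrcoef(np.diff(a), np.diff(b))[0, 1])   # close to zero
```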

Regarding point 5, it is important to recognize that a test is only valid for the set of conditions under which it has been conducted and reproduced. It is inductive reasoning to apply an idea outside the particular set of conditions under which it has been tested. Applying an idea, model or theory outside the set of conditions under which it has been tested is also commonly known as extrapolation or interpolation.

Here is an example of inductive reasoning:
If I add 4.187 kJ to 1 kg of H2O, the temperature will increase by 1 K – as I just demonstrated by adding 4.187 kJ to 1 kg of H2O at 300 K.

This is inductive reasoning because:

If I add 4.187 kJ to 1 kg of H2O at 268 K, the temperature will increase by approximately 2 K.

The explanation is that at 300 K, H2O is in liquid form and has a specific heat capacity of 4.187 kJ/(kg·K), while at 268 K, H2O is in the form of ice and has a specific heat capacity of 2.027 kJ/(kg·K).
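The same example as a small calculation (a sketch in Python; the helper name is mine, and the heat capacities are the constant values used in the text, although in reality they also vary somewhat with temperature):

```python
def temperature_rise(energy_kj, mass_kg, temperature_k):
    # Specific heat capacity of H2O in kJ/(kg*K): liquid water above 273.15 K, ice below.
    c = 4.187 if temperature_k > 273.15 else 2.027
    return energy_kj / (mass_kg * c)

print(temperature_rise(4.187, 1.0, 300.0))   # ~1.0 K  (liquid water)
print(temperature_rise(4.187, 1.0, 268.0))   # ~2.1 K  (ice)
```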

2.4 Check that the idea is consistent with established theories

If I happen to come up with an idea which might seem brilliant in my mind, it is normally a good idea to check whether it is consistent with established or already corroborated theories. If my idea has less predictive power than existing theories, it may not be worth pursuing.

If, for instance, I come up with a brilliant perpetual motion machine, it is a good idea to check whether the theory is consistent with already established theories, e.g.:
– First law of thermodynamics
– Second law of thermodynamics

3 Deduce

3.1 Deduce testable predictions from the idea

Many ideas will have been proven wrong, or will have failed the checks, before we get this far. Many proponents don't even attempt to bring their ideas this far; they are happy as long as the idea works fine inside their head.

Some will have resorted to inductive reasoning. As soon as an explanation has been found which seems to work fine inside somebody's head – or as soon as a curve matching the data has been found – some will pop the champagne and set out on a crusade to change the world, ignorant of the fact that to a limited number of data points we can fit an infinite number of curves of various forms. Only imagination limits the number of explanations we can come up with for what has been observed, or the range and type of outcomes we can predict.

But if my idea has survived so far – it is now time to expose my idea to testing. And now it is time to remember that my idea will be corroborated by the severity of tests it has been exposed to and survived, and not at all by inductive reasoning in favor of it.

So now is the time to be merciless to the child of my own thoughts – the time to deduce necessary consequences of my idea, and to design ingenious, brilliant, cruel and conclusive tests in an attempt to falsify it. I must remain well aware that the tests my idea has not been exposed to will be the ones my opponents think about when they ask: How do you know that it is … ? Or: how do you know that it isn't … ?

And when I deduce the necessary consequences of my idea, I also have to remember that theories tend to be valid for a particular range of relevant conditions. Hence, when I deduce necessary consequences, or predict particular results of my idea, I should also define the range of conditions for which I predict that the theory will be valid.

3.2 Objective and subjective statements

There is another requirement which is quite fundamental to science: statements within the empirical sciences must be objective. All statements within science must be inter-subjectively testable. It must be possible for individuals other than the proponent of a statement to expose the statement to testing. It must be possible for other individuals to check, verify and repeat testing, and to expose the idea, hypothesis or theory to any conceivable test. If the statement is not testable it isn't falsifiable, and if the statement isn't falsifiable it isn't a scientific statement.

A model must also be regarded as a combination of ideas, hypotheses and theories. Consequently, the model must also be available for testing. It must be possible for opponents to test the model. If not, it isn't falsifiable in practice.

A subjective statement – an idea, hypothesis, model or theory – is characterized by the simple fact that the statement cannot be exposed to a conceivable test by individuals other than the subject putting forward the statement, whether that subject is an individual or a group.

It is time again to turn to Karl Popper and the Logic of scientific discovery:
“a subjective experience, or a feeling of conviction, can never justify a scientific statement … within science it can play no part except that of an object of an empirical (a psychological) inquiry. No matter how intense a feeling of conviction it may be, it can never justify a statement. Thus I may be utterly convinced of the truth of a statement; certain of the evidence of my perceptions; overwhelmed by the intensity of my experience: every doubt may seem to me absurd. But does this afford the slightest reason for science to accept my statement? Can any statement be justified by the fact that Karl Popper is utterly convinced of its truth? The answer is, ‘No’; and any other answer would be incompatible with the idea of scientific objectivity.”
– Karl Popper

The current controversy related to the work of the United Nations Intergovernmental Panel on Climate Change is the main reason why I started to dive into this. A good example of subjective statements can be found in the following document: Guidance Note for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties

In that document, put together in a hasty way, the IPCC made a very fundamental error within science by attempting to make a consistent system for the expression of subjective, qualitative statements: statements about degree of agreement, degree of robustness and qualitative level of confidence. These are the kinds of statements which are incompatible with objective science.

Here are some extracts from the guidance note:
“These notes define a common approach and calibrated language that can be used broadly for developing expert judgments and for evaluating and communicating the degree of certainty in findings of the assessment process. These notes refine background material provided to support the Third and Fourth Assessment Reports1,2,3; they represent the results of discussions at a Cross-Working Group Meeting on Consistent Treatment of Uncertainties convened in July 2010. They also address key elements of the recommendations made by the 2010 independent review of the IPCC by the InterAcademy Council. Review Editors play an important role in ensuring consistent use of this calibrated language within each Working Group report.” Ref: Guidance Note for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties

The kinds of subjective statements involved are illustrated by the following guidance from the note:
“Use the following dimensions to evaluate the validity of a finding: the type, amount, quality, and consistency of evidence (summary terms: “limited,” “medium,” or “robust”), and the degree of agreement (summary terms: “low,” “medium,” or “high”).” Ref: Guidance Note for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties

The qualitative “level of confidence” is illustrated by a confidence scale in a figure in the guidance note (the figure is not reproduced here).

All the statements within that figure are subjective statements – statements which cannot be exposed to any conceivable test. Consequently, this kind of statement does not belong within objective empirical science.

The kind of qualitative statement used by the IPCC was referred to by Karl Popper as the probability of hypotheses, or as non-numerical probability statements. The probability of a hypothesis can be interpreted as the probability that a particular statement is true – about which he had the following to say:
“All this glaringly contradicts the programme of expressing, in terms of a ‘probability of hypotheses’, the degree of reliability which we have to ascribe to a hypothesis in view of supporting or undermining evidence.”

Note that the probability of a hypothesis must not be confused with quantitative statements about uncertainty in measurement – which can be tested in objective ways. See the section below about uncertainty in predictions and measurements.

It is time again for some quotes by Karl Popper:
“Whatever may be our eventual answer to the question of the empirical basis, one thing must be clear: if we adhere to our demand that scientific statements must be objective, then those statements which belong to the empirical basis of science must also be objective, i.e. inter-subjectively testable. Yet inter-subjective testability always implies that, from the statements which are to be tested, other testable statements can be deduced. Thus if the basic statements in their turn are to be inter-subjectively testable, there can be no ultimate statements in science: there can be no statements in science which cannot be tested, and therefore none which cannot in principle be refuted, by falsifying some of the conclusions which can be deduced from them.”
– Karl Popper

“scientific theories are never fully justifiable or verifiable, but that they are nevertheless testable. I shall therefore say that the objectivity of scientific statements lies in the fact that they can be inter-subjectively tested.”
– Karl Popper

4 Test – Compare predictions with empirical observations

Finally we have arrived at the pivot point in science – testing.

The predictive power of my idea is proportional to the range of events which are prohibited by my idea and inversely proportional to the range of events which are allowed by my idea. A theory which allows everything predicts nothing.

It is also important to keep in mind that the test results are only valid for the conditions of the test. The moment we start to extrapolate or interpolate the results, or take the results to be valid under other conditions or other influencing parameters, we are in unknown territory.

It is again time for a few selected quotes from Karl Popper; The logic of scientific discovery:
“Every scientific theory implies that under certain conditions, certain things will happen. Every test consists in an attempt to realize these conditions, and to find out whether we can obtain a counter-example even if these conditions are realized; for example by varying other conditions which are not mentioned in the theory.”

«Testing by experiment thus has two aspects: variation of conditions is one; and keeping constant the conditions which are mentioned as relevant in the hypothesis is another – the one aspect which interests us here. It is decisive for the idea of repeating an experiment.»

“This fundamentally clear and simple procedure of experimental testing can in principle be applied to probabilistic hypotheses in the same way as it can be applied to non-probabilistic or, as we may say, for brevity's sake, ‘causal’ hypotheses.”

“Tests of the simplest probabilistic hypotheses involve such sequences of repeated and therefore independent experiments – as do also tests of causal hypotheses. And the hypothetically estimated probability or propensity will be tested by the frequency distributions in these independent test sequences. (The frequency distribution of an independent sequence ought to be normal, or Gaussian; and as a consequence it ought to indicate clearly whether or not the conjectured propensity should be regarded as refuted or corroborated by the statistical test.)”

“An experiment is thus called ‘independent’ of another, or of certain conditions, or not affected by these conditions, if and only if they do not change the probability of the result. And conditions which in this way have no effect upon the probability of the result are called irrelevant conditions.”

“I should say that those objective conditions which are conjectured to characterize the event (or experiment) and its repetitions determine the propensity, and that we can in practice speak of the propensity only relative to those selected repeatable conditions; for we can of course in practice never consider all the conditions under which an actual event has occurred or an actual experiment has taken place.”

“Thus in any explanatory probabilistic hypothesis, part of our hypothesis will always be that we have got the relevant list of conditions … characteristic of the kind of event which we wish to explain.”
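In the spirit of the quotes above, here is a minimal sketch (Python with numpy, simulated trials; the function name and the three-standard-error criterion are my own illustrative choices, not Popper's prescription) of testing a conjectured propensity against the observed frequency in a sequence of repeated, independent experiments:

```python
import numpy as np

def propensity_refuted(outcomes, conjectured_p, n_sigma=3.0):
    """Return True if the observed frequency deviates from the conjectured
    propensity by more than n_sigma standard errors (an illustrative criterion)."""
    n = len(outcomes)
    observed = np.mean(outcomes)
    std_error = np.sqrt(conjectured_p * (1.0 - conjectured_p) / n)
    return abs(observed - conjectured_p) > n_sigma * std_error

rng = np.random.default_rng(1)
trials_a = rng.random(10_000) < 0.5   # trials generated with propensity 0.5
trials_b = rng.random(10_000) < 0.6   # trials generated with propensity 0.6

# Hypothesis under test in both cases: "the propensity is 0.5"
print(propensity_refuted(trials_a, conjectured_p=0.5))   # expected: False (corroborated)
print(propensity_refuted(trials_b, conjectured_p=0.5))   # expected: True  (refuted)
```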

4.1 About uncertainty of measurement and predictions

Uncertainty is related to the probability distribution of a measurement result. Uncertainty expresses the probability of finding the “true” value of the measurand within a particular range around the measured value. This range is defined by the uncertainty specification.

Hence, for a measurement to be complete, it must consist of the measured value, the unit, and the uncertainty.

Similarly, a necessary part of a prediction is to predict a probability distribution for that prediction: the probability of finding the true value within a quantified range around the predicted value.

For a prediction to be complete, it must consist of the predicted value, the unit, and the uncertainty.

When we have both a complete prediction and a complete measurement, we are able to decide whether the prediction was right or wrong. We expect that, in the long run, the prediction and the empirical result – the measurement – will differ by less than the combined uncertainty of the prediction and the empirical observation.

The following example illustrates that a prediction without a stated uncertainty cannot be falsified, and consequently cannot be regarded as a scientific theory within the norms we are referring to here.

Prediction:
“In accordance with my aerodynamic model, I have calculated that if I drop this feather from a height of 5 meters it will hit the ground in 3 seconds.”
Test result:
“The feather hit the ground in 6 seconds.”
Possible conclusion if no uncertainty was stated up front:
“My aerodynamic model is correct – 6 seconds is pretty close to 3 seconds.”
This illustrates that a prediction without a stated uncertainty – and an upper limit for that uncertainty – isn't falsifiable. A school class would be able to predict the outcome with similar accuracy.
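Here is a minimal sketch (Python; the function name and the uncertainty values for the feather example are made up for illustration) of how a prediction with a stated uncertainty actually can be falsified: the prediction fails the test if it differs from the measurement by more than the combined uncertainty.

```python
import math

def prediction_falsified(predicted, u_prediction, measured, u_measurement, coverage=2.0):
    """A prediction fails the test if it differs from the measurement by more
    than `coverage` times the combined standard uncertainty of the two."""
    combined_u = math.sqrt(u_prediction**2 + u_measurement**2)
    return abs(predicted - measured) > coverage * combined_u

# The feather example with uncertainties stated up front (illustrative values):
# predicted fall time 3.0 s +/- 0.5 s, measured fall time 6.0 s +/- 0.1 s
print(prediction_falsified(3.0, 0.5, 6.0, 0.1))   # True: the model fails this test

# A prediction that survives: 3.0 s +/- 0.5 s against a measurement of 3.4 s +/- 0.1 s
print(prediction_falsified(3.0, 0.5, 3.4, 0.1))   # False: within the combined uncertainty
```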

To be a meaningful statement about uncertainty – and to avoid confusion about it – the uncertainty should be expressed in a standardized way. The following standard is the only broadly accepted international guideline for the expression of uncertainty:

JCGM 100:2008 Evaluation of measurement data – Guide to the expression of uncertainty in measurement

The guideline can be found here:
(Bureau International des Poids et Mesures – Guides in metrology)

About the guide:
“This Guide establishes general rules for evaluating and expressing uncertainty in measurement that are intended to be applicable to a broad spectrum of measurements. The basis of the Guide is Recommendation 1 (CI-1981) of the Comité International des Poids et Mesures (CIPM) and Recommendation INC-1 (1980) of the Working Group on the Statement of Uncertainties. The Working Group was convened by the Bureau International des Poids et Mesures (BIPM) in response to a request of the CIPM. The ClPM Recommendation is the only recommendation concerning the expression of uncertainty in measurement adopted by an intergovernmental organization.

This Guide was prepared by a joint working group consisting of experts nominated by the BIPM, the International Electrotechnical Commission (IEC), the International Organization for Standardization (ISO), and the International Organization of Legal Metrology (OIML).

The following seven organizations* supported the development of this Guide, which is published in their name:
BIPM: Bureau International des Poids et Mesures; IEC: International Electrotechnical Commission; IFCC: International Federation of Clinical Chemistry; ISO: International Organization for Standardization; IUPAC: International Union of Pure and Applied Chemistry; IUPAP: International Union of Pure and Applied Physics; OIML: International Organization of Legal Metrology”

4.2 When predictions don't match observations

If there is one characteristic common to all mothers of a theory, it must be the spinal reflex to blame everything but the theory whenever it fails a test.

So what should we do if the predictions don't come true, if the experiment gives a different output than predicted, or if observations in nature don't match the predictions? It is now time to think very carefully about what to do, because my integrity is at stake. At this point in the process I risk losing my credibility as a researcher. A proponent of a theory can easily evade falsification by ad hoc modifications intended to save the theory.

Again this is very well phrased by Karl Popper:
“… it is always possible to find some way of evading falsification, for example by introducing ad hoc an auxiliary hypothesis, or by changing ad hoc a definition. It is even possible without logical inconsistency to adopt the position of simply refusing to acknowledge any falsifying experience whatsoever. Admittedly, scientists do not usually proceed in this way, but logically such procedure is possible.”

“From my point of view, a system must be described as complex in the highest degree if, in accordance with conventionalist practice, one holds fast to it as a system established forever which one is determined to rescue, whenever it is in danger, by the introduction of auxiliary hypotheses. For the degree of falsifiability of a system thus protected is equal to zero.”

“the empirical method shall be characterized as a method that excludes precisely those ways of evading falsification which … are logically possible. According to my proposal, what characterizes the empirical method is its manner of exposing to falsification, in every conceivable way, the system to be tested. Its aim is not to save the lives of untenable systems but … exposing them all to the fiercest struggle for survival.”
– Karl Popper

If the observations do not match the prediction, there is one reasonable thing to do, and that is to state that something is wrong. It is reasonable to state that, at the moment, I cannot claim to have a proper and full understanding of the issue at hand. There are many possible reasons why the predictions did not match the observations, so it doesn't have to mean that the idea is wrong. It could also be that the measurements are wrong, that the test is poorly designed or poorly conducted – or that the uncertainty specification is too optimistic.

But anyhow, the reasonable thing to do is to suspend judgement about the idea, and continue the endless search for a fault.

4.3 When predictions match observations

If the observations match the prediction, I can now publish an objective statement about the idea. I can state that this theory has passed this very test under these very conditions. The kind of test, and the conditions of the test, are closely related to the degree of corroboration of the theory.

Hence, the idea should be referred to in the same breath as the conditions of the most severe test it has been exposed to and survived. The idea should be presented together with predictions which did materialize and could be reproduced. The corroboration of the theory is stronger the more severe testing the theory has been exposed to and survived.

Here are some more quotes from Karl Popper regarding testing and corroboration:
“Next we seek a decision as regards these (and other) derived statements by comparing them with the results of practical applications and experiments. If this decision is positive, that is, if the singular conclusions turn out to be acceptable, or verified, then the theory has, for the time being, passed its test: we have found no reason to discard it. But if the decision is negative, or in other words, if the conclusions have been falsified, then their falsification also falsifies the theory from which they were logically deduced.

It should be noticed that a positive decision can only temporarily support the theory, for subsequent negative decisions may always overthrow it. So long as a theory withstands detailed and severe tests and is not superseded by another theory in the course of scientific progress, we may say that it has ‘proved its mettle’ or that it is ‘corroborated’ by past experience.

Nothing resembling inductive logic appears in the procedure here outlined. I never assume that we can argue from the truth of singular statements to the truth of theories. I never assume that by force of ‘verified’ conclusions, theories can be established as ‘true’, or even as merely ‘probable’.”
– Karl Popper

Requirements for a quantitative empirical model

Having been through the main steps of the modern scientific method, I think it is useful to have a checklist against which I can evaluate the reliability of a quantitative empirical model. I really think it would have been useful to have an international standard to refer to when evaluating whether a quantitative empirical model is reliable. Unfortunately, no such standard exists.

Consider the case where the empirical model is about predicting the quantity of an output value for a defined number of inputs. There are some standards relating to measurement, uncertainty and testing that are relevant. Among these are, from the International Organization for Standardization (ISO): Guide to the expression of uncertainty in measurement and General requirements for the competence of testing and calibration laboratories; and from the Bureau International des Poids et Mesures (BIPM): International Vocabulary of Metrology – Basic and General Concepts and Associated Terms.

Based on these standards, I regard it as reasonable to expect that a useful and reliable empirical model fulfills the following criteria:

  • The theory is about causal relations between quantities that can be measured
  • The measurands are well defined
  • The measurands can be quantified within a reasonable uncertainty
  • The uncertainty of the measurands has been determined by statistical and/or quantitative analysis
  • The functional relationships – the mechanisms – have been explained in a plausible manner
  • The functional relationships – the mechanisms – have been expressed in mathematical terms
  • The functional relationships between variables and parameters have been combined into a model
  • The influencing variables which have a significant effect on the accuracy of the model are identified
  • The model has been demonstrated to consistently predict outputs within stated uncertainties (see the sketch after this list)
  • The model has been demonstrated to consistently predict outputs without significant systematic errors
  • The model has been tested by an independent party on conditions it has not been adjusted to match
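As a rough illustration of the two points about stated uncertainties and systematic errors, here is a minimal sketch (Python with numpy; the function name and the test data are made up for illustration) of how one might check a model's predictions against independent measurements:

```python
import numpy as np

def evaluate_model(predicted, u_predicted, measured, u_measured, coverage=2.0):
    """Check two of the requirements above on independent test cases:
    the share of predictions within the stated (combined) uncertainty,
    and whether the residuals show a significant systematic error (bias)."""
    residuals = np.asarray(predicted) - np.asarray(measured)
    combined_u = np.sqrt(np.asarray(u_predicted) ** 2 + np.asarray(u_measured) ** 2)

    within_fraction = np.mean(np.abs(residuals) <= coverage * combined_u)

    bias = np.mean(residuals)                                    # estimated systematic error
    bias_u = np.std(residuals, ddof=1) / np.sqrt(len(residuals))
    significant_bias = abs(bias) > coverage * bias_u

    return within_fraction, bias, significant_bias

# Made-up test data: five independent predictions with their measurements
print(evaluate_model(
    predicted=[10.1, 9.8, 10.3, 9.9, 10.0], u_predicted=[0.2] * 5,
    measured=[10.0, 9.9, 10.2, 10.1, 9.9],  u_measured=[0.1] * 5))
```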

I would also expect that data, methods, models and test results are readily available for scrutiny, and that all information is provided in a way that is consistent with established standards.

The explanation behind every point is provided in the post: Requirements to a reliable quantitative theoretical model – explained. In that post I list and explain the requirements above – requirements I think a quantitative empirical model should fulfill before I will be willing to refer to it as reliable.

Extroduction

Before I end this post, there is one thing it took me many years to realize. I used to think that the hypothetico-deductive method is taught to every student at every school all over the world – that all scientists are trained in the method and follow it strictly in their quests for truth. I have finally realized that this is not the case. Many have heard about the method – few live it – no one lives it to the full.

And finally – if you happen to like this post, or parts of it, you are free to use whatever you find useful. I will of course be happy if you also mention the source, or provide a link to this site. I will do my very best to keep the links unchanged. I may, however, make changes to the content. Major changes will be logged, minor changes will not.

With the greatest wish to be corrected
Science or Fiction


3 thoughts on “How to arrive at a reliable model”


  2. The Problem of Induction:
    All of that section is very good.

    But I want to suggest a bit more.

    This springs off of the example of predicting stock prices.

    Here is an innovation that makes it easier to define and describe the scientific method / hypothetical-deductive method.

    The innovation is this: declare that “science” applies to the general physical / mechanico-physical universe, including, but limited to, general, replicable phenomena.

    What does this exclude? Specific historical events and the future. Prediction and “history.”

    Then, we can say it is a matter of historical analysis to explain the changes in a stock price across time. We can use quasi-scientific methods, and good argument, to make the case of why the stock went up one day, and down another.

    Those principles are often quite likely to continue operating in the near future. But they are not guaranteed. The CEO of a firm may change, and its stock price may then step up higher than ever, or step down lower than ever, with the other influences continuing, or maybe not. All of these things are historical, once-in-history situations that will never occur again in the same way.

    In contrast, a company can rely on what is known of the physical world to reliably produce television after television, and your electricity company is able to provide the right electrical supply to operate it. General knowledge of metals, conductivity, electricity, etc., are used for this.

    But you cannot reproduce a great stock buy like you can reproduce a great TV.

    Likewise, predictions cannot be scientific. The future, when it arrives, will be a once-in-history event, with an utterly unique set of causes leading to the effects we see.

    This all sounds good, but leaves some problems.

    For history, problems like this: we cannot scientifically “prove” what caused WWII, or the Great Depression. We cannot replicate those historical events. There is no way to falsify an explanation of either event by science – by hypothetical/deductive effort. It is not about general mechanical principles of the orderly universe.

    It is more acceptable to consider this reality if we realize that the past has already happened, and so no matter what we try we are always coming up with an explanation after the fact; there is no prediction about it.

    To go further, I will disturb the faith of many: we cannot scientifically “prove” evolution. If it is true, it was a unique combination of factors at one point in time. So, following what I am proposing, the best we can say is that Evolution is a powerful explanation of a historical event, but cannot ever be “fact.”

    Similar for the future. We can never say it is a “fact” that we will one day elect a woman as president of the United States, as inevitable as that may now seem, having seen Geraldine Ferraro and Sarah Palin get on the ticket as VP, and having seen H. Clinton get on as the candidate, and be quite close – a matter of some close races in swing states.

    As alternatives, we can devote efforts to teaching and learning principles of good History, as we focus on good methods of good Science. We can also focus on good methods for predictions and projections about the future.


    • You might care to refine your definition of ‘proof’, because proof requires a specific context for validity.
      One may not claim to prove anything except a logical proposition.
      So if a hypothesis has not been formed such that it is amenable to logic, then it is simply not what it pretends to be. Baseless speculation is in the realm of mysticism, not reality.
      For something to be provable, it has to be a logical proposition.
      Only logical propositions are amenable to proof.
      If a statement (or experiment) does not evaluate to ‘true’ or not, then it wasn't a statement (or experiment) directed at proving anything.

