Explaining Failure with Tense Logic

Peter B. Ladkin

Research Report RVS-RR-96-13

Abstract
Accidents involve events, and events causing other events, laid out in time until the final disaster. Boolean logic alone is insufficient to describe such failure, although its use in the shape of fault trees has been very successful. Reasoning about objects and their properties is the domain of predicate logic, and reasoning about events and their temporal relations is the province of tense logic. I show how the reasoning involved in accident analysis may be formalised in tense logic, using the past-time operator since. I also show how a fault tree may be derived from a temporal logic description without use of frame axioms. I critique proposals to use purely future operators, and to identify the causality relation with a purely temporal logic operator. Finally, I consider the important concepts of interactive consistency and tight-coupling not to be worth formalising inside a tense logic.


After equipment fails, or an accident happens, analysis will try to determine the temporal sequence of crucial events and their influence on each other. Much analysis is informal. Formal safety analysis methods such as Fault Trees (1) allow only very limited expression of temporal factors - as is often remarked (2). Most such methods are prophylactic tools and not designed for explanations of historical incidents.

Do we need formal reasoning for accident explanation? Yes: using a minimal formalism to analyse the probable cause statement in the report on the 1979 Chicago DC-10 accident highlights both the incompleteness in the statement and an inconsistency in the report (3). Analysis of the A320 Warsaw runway overrun accident using the same minimal formalism highlights infelicities in the report (4).

So, easy formalism has already helped in analysis. An appropriate logical formalism could put accident reasoning on a firmer foundation: give it a form in which it is open to formal analysis and may be shown correct or incorrect, or complete or incomplete, by those with varying intuition about the subject matter. This route leads to rigor, to complete understanding, to a scientific basis for resolving any disagreement, and hence to safer airplanes in the future.

The Form of Accident Explanations

Consider a simple description of an accident and its explanation.

Accident:
  • my sandwich is squashed.
Causal sequence:
  • (a) sandwich placed on chair, followed sometime later by
  • (b) my backside.
Reasons:
  • (a) undertaken in preparation for placement of sandwich in bag; table already full
  • (b) undertaken to rest my weary soul, in blissful oblivion of past action
The questions answered by this accident report are The purpose of the report is This trivial example highlights some major themes:

Explanation involves human factors and physical causality. First, a brief word about human factors before I concentrate on my main theme, the structure of causal explanations. References (4) and (5) remarked in passing on the complex story of human decisions, intentions and actions . To illustrate the multiple failure modes available within the realm of human behavior, consider the following crude model of rational action: one receives information and holds beliefs (represented let's say as a set of propositions), makes decisions based on these propositions, intends to act according to the decisions, and acts according to the intention. All of these four steps can fail: my information and beliefs may be false; they may be correct but my decision may be inappropriate (say I panic); my decision may be correct but my intented act inappropriate (I think I know what to do but really don't); my intention may be honorable but my implementation faulty (I'm out of practice). Any of these `failure modes' can be factors in a causal chain leading to an accident, and conversely, what factors I consider depends on my theory of action. So analysing human action within a causal chain can be a complex undertaking.

To my main theme: how causality involves the unfolding of events in time. Figuring the progression of events in time is an important part of determining what caused what. Some recent studies have attempted to reverse the condition: to claim that the causal conditions relevant for accident reports can be defined purely in a formal logic of time. The temporal logic of intervals of Moffett et al. (6) was considered in (5). Johnson and Telford (7) propose using a logic specifically developed for formal reasoning about systems by Lamport, the Temporal Logic of Actions, TLA (9), and defining causality and indeed full explanations of accidents entirely within this logic. My critique of their attempt appears in (10).

TLA is a logic devised for verification, that is, for proving formally and rigorously that algorithms or programs or system designs fulfil their specifications. Such a logic must be able to describe sequences of events, or their outcomes, in order to compare what happens with the system to what is wanted. Accidents are sequences of events in which what happens compares unfavorably with what is wanted. So it looks as though such a language could be of use in describing accidents. It we had a definition within the language of what kind of event constitutes an accident, to say `what is wanted' we can then simply take the requirement that `there is no accident'. Then, describing in the logic how an accident happened is tantamount to showing how the system failed to fulfil its requirements. Straightforward stuff for a verification logic. The suggestion looks promising to adapt such a logic for expressing accident reports.

Reasoning About Accidents

Analysing accidents involves considerable detective work. And detective work involves reasoning, not just reasoning from groups of facts to conclusions, but also reasoning about the relation of conclusions to each other, whether they represent incidental facts or essential ones.

We start from a fact: the accident happened. That's past tense. We may as well say <sometime in the past> the accident happens. That's a statement with a tense operator. The assertion the accident happens (present tense) is true sometime, and the tense operator <sometime in the past> tells us when this time is, relative to the present. If I imagine moving the 'present' back into the past, so my 'present' is now the moment at which the accident actually happens, I can assert the accident is happening at this moment. Let me call this imagined present the temporal point of view, tpov for short. So a language with tense operators allows us to move tpovs as we're talking.

We reason about what must have happened in order for the accident to happen, and from other facts we know about prior to the event. One method of analysing reasoning is Boolean logic, which shows how to reason using AND and OR, if not IF and NOT. Fault trees are an incarnation of Boolean logic. Boolean logic can't handle tenses, neither can it handle reasoning about physical objects and their properties. First Order Predicate Logic handles objects and properties. Tense logic handles tense operators, but there isn't just one set of valid rules of tense logic, there are many. The valid rules of a tense logic depend on the underlying mathematical stucture of time. A single time line allows certain inferential tense reasoning conditions that time branching into the future does not, and vice versa. A continuous time line imposes conditions that a discrete time line does not. A time line starting at a particular moment may be suitable for describing a computation, but it imposes logical conditions that a time line extended infinitely into the past does not. And so on.

So there are some choices to make about exactly how we should reason with tenses. Does this make it less precise, less justified, than, say, Boolean logic? I don't believe so. Even for propositional and predicate reasoning, there are alternatives to Boolean logic and first-order logic. Suppose we assert a proposition only when we are guaranteed it will be true sometime now or in the future: say, when we have a proof of it now. Reasoning thusly yields different rules. For example, for any proposition p, p \/ ~p is a validity of Boolean logic. However, it is not assertable under our current supposition: the disjunction reasoning rule allows me to assert A \/ B if I can assert A, alternatively if I can assert B, in Boolean logic also; using this rule under the current supposition, I can assert A \/ B only when I am in possession of a proof either of A or of B. Since for a random proposition p I do not in general possess a proof of p or of ~p, I cannot assert the `law of the excluded middle' p \/ ~p (11). It could be concluded that the reasoning rules to be used depend upon the background to the reasoning task: we use `the right logic for the right job', so to speak (12).

The tense logic rules depend upon the mathematical structure of time, and vice versa. So we can either determine the complete set of rules we need, thus fixing the time structure; or we can determine the time structure, thus fixing the rules we need; the best is to do a bit of both. We'll use a 'recipe book' (13).

The Ontology of TLA

Before the reasoning is considered, one has to determine what one is reasoning about, what kinds of things there are in the universe. We can take the suggestion of Johnson and Telford that a tense logic for verification is likely to use an appropriate ontology, and so I shall discuss and use that of the tense logic TLA. However, the ontology of TLA is mathematical, based on ZF (Zermelo-Fraekel) set theory. Since we wish the sentences of our temporal logic to express truths about the (past) real world, and to reason about these truths, we are faced with explaining the relation between set theory and the world. Do we therefore wish to take this step?

There are two answers: first, we have to do it anyway to explain the structure of time; second, we can acknowledge the need, but leave it to realist philosophers of mathematics such as Maddy (14) to provide the justification. Barwise (15) has explicitly considered the relation in the context of the mathematics of computer program verification. Let's consider briefly just the first answer. Tense logic has been given a mathematical semantics, usually the 'Kripke' semantics, just as Boolean logic has the truth-table semantics and predicate logic the Tarskian model-theoretic semantics. These semantics act as oracles for the validity of the reasoning rules we use. That is, we may determine that a reasoning rule is invalid by providing a counterexample constructed according to these semantics. If one believes that valid reasoning is valid also in mathematics (and it would be strange to consider a rule valid if it had a counterexample in mathematics!), then finding a mathematical counterexample suffices to show the invalidity of the rule. But if a rule is valid, can one show that by just considering mathematical interpretations? That depends to some extent on the second answer. The extra twist with tense logic is that one must decide on a mathematical structure for time before one can determine which rules are valid even for just the mathematical semantics. But philosophically speaking, a satisfactory answer to the relation between formal mathematical semantics and the 'real world' will answer this question with no extra work. There is no problem with the mathematical semantics of tense logic that does not already occur with all forms of logic that have a formal semantics. I don't attempt to address this larger question here.

At the basis of the TLA ontology are state variables and values. From these, we define other categories of objects by standard set-theoretical data-type constructions. State variables and values are objects of two different categories, that is to say, no state variable is a value, no value is a state variable, and we have means to tell whether a given object is a state variable or a value. Values are simply sets in Zermelo-Fraenkel (ZF) set theory, the `usual' set theory. State variables are syntactical objects. It matters only that they can be distinguished from values.

How do we construct these two categories mathematically? Let us base ourselves on ZF set theory, that is, we take everything to be a (ZF) set. We can then build the two distinct categories. This can be accomplished, for example, by taking everything in the TLA ontology to be a pair. Assuming the standard ZF definition of 0 and 1, pairs and sequences, state variables could be defined as pairs <0,s>, where s is a set, and values as pairs <1,s>, where s is a set. (The latter definition provides an embedding of part of the set-theoretical universe into the whole set-theoretical universe, taking <1,s> to s.)

Given the basic ontology, the rest of the important categories in the ontology can be defined using set-theoretic operations over state variables and values. A state is a set of pairs of state variables and values. We speak of a state variable x having value v in the state S if the pair <x,v> is a member of the set S [1].

We now have state variables, states, and values. The next step is to define a data structure that can be used to describe how states may `change' with `time'. Artifacts that have state and change this state dynamically through operation are often called machines (think of a steam engine or a car, not just a computer). A description of a machine, or of its design, is a finite assertion which semantically captures its intended behavior, even though this behavior might be unbounded in time. Thus, a machine description must be in a general sense rule-based.

I shall define an abstract structure representing change through time called a state machine. Intuitively, a state machine is an abstract device that can be thought of as having a `current' state, and changing its current state in discrete steps, defined by the permissible machine actions. One may object that a steam engine or a car doesn't change state discretely, in steps. However, a behavior of one of our abstract machines may be thought of as a series of snapshots of the behavior of such a `continuous' machine. This entails that a continuous machine corresponds to no single one of our abstract machines but rather to a whole collection of them. One can think of one of these abstract discrete machines as a `view' of the continuous machine. It could be objected that there could be certain crucial features of such a continuous machine that could only be represented by handling the passage of time through a continuous function, as in calculus and engineering mechanics. This would be a purely technical, not a philosophical, argument, and as such it seems that it can be refuted. Lamport has shown (16) how a continuous system, a gas burner, which depends on integral calculus for its engineering description, may be specified in TLA using the discrete semantics. There is simply a mathematical technique involved. The gas burner is partly specified using an integral equation, and the state at a particular time t.0 is given by a definite integral over time with t.0 as a bound. The Riemann integral itself is defined in TLA exactly the same way that it is defined in mathematics using set theory. The set of all behaviors of the abstract machine definable in TLA is precisely the set of all snapshot sequences of the continuous machine. This general technique for converting continuous mathematics into snapshot sequences may be used to define dynamic systems, insofar as set theory suffices to describe this mathematics [2]. This example suffices to show that there is no fundamental problem integrating continuous mathematics with snapshot semantics. The reasoning involved in explaining accidents appears to be dependent only on snapshots.

Because a description of a machine is a finitary thing, and we formulate no finitary rules concerning what collection of state variables a machine may use, it follows that this collection must be finite. A state machine therefore `owns' a fixed finite set of state variables, all of which and only which have values in each state of the machine. So a restriction to a finite collection of variables is no logical restriction, given set theory. It is a straightforward set-theoretical operation to replace an infinite collection of variables by a single variable which is a function whose domain is an infinite set. First, we sequence the variables, using, say, the well-ordering principle. The sequence is then our function, with domain Nats, the natural numbers, or some initial segment of the ordinal numbers. Say, n

Vars = {a,b,c,.....} => <b, c, a, .....> = a function X: Nats -> Vars.

Variable a thus becomes X[2]. This semantic equivalence leads to a syntactical transformation. One replaces all occurrences of a in statements and predicates describing the machine by occurrences of X[2].

A run or behavior of such a machine is an infinite sequence (an infinite sequence is a function whose domain is Nats, as above) of states, each of which `has' precisely the same finite set of state variables (i.e., for every state in the behavior, the set of all first components of elements - the set of `program variables' - is the same). We say the behavior `has' this set of state variables, alternatively that it is a behavior `of' this set of state variables. We started from a definition of state machine and obtained a definition of behavior. This may be reversed: a definition of abstract state machine may be obtained instead from the definition of behavior. A state machine could be defined as any set of behaviors, each of which has the same finite set of state variables. We have now defined values, state variables, states, behaviors, and state machines. Thus are state machines equivalent to sets of behaviors. It remains to define the important semantic category of actions.

Machines so far are relatively abstract constructions in set theory, interdefinable with behaviors. However, physical machines can only change state in certain circumscribed ways: a car may normally only change direction by turning its front wheels; may only normally change its position, in two near-enough snapshots, along an arc formed by the directions defined by the positions of its front and back wheels. A computer may normally only change its memory by applying one of the operations defined in its instruction set, although in describing its actual, as opposed to its normative, behavior, we might have to allow for the effect of cosmic rays, airport X-ray machines, or strong electromagnetic fields in the vicinity. The abstract description of machines is similarly circumscribed in a verification logic such as TLA. `Normal' behavior is specified through actions. TLA only allows certain forms of machine description. Conditions are specified on the initial state; the precise set of actions possible in a state is determined; certain `liveness' conditions are specified on what the machine must eventually do. A TLA machine specification is a conjunction of initial conditions with allowed actions with liveness conditions.

An action is ontologically a binary relation on states: the first argument can be thought of as the `before' state and the second as the `after' state. We say that state S results from the action A on state R just in case the pair <R,S> is in the relation A (in set theory, binary relations on sets are defined as sets of pairs).

I believe that this ontology, these semantic categories, suffice to describe machine behavior and to formulate accidents and determine sequences of states leading to the accident. This ontology is that of the logic TLA. However, I shall show below how I do not believe that the syntax of TLA suffices for describing accidents. I shall propose operators and some rules governing them which I believe needed, which operators apparently strictly extend those of RTLA (RTLA is the class of unrestricted formulas written with the TLA syntax and interpreted with the TLA semantics).

Furthermore, as already indicated, I do not believe that either TLA or RTLA or my proposed extension suffice to describe causality.

The Syntactic Categories of TLA

Having described a particular ontology, the question of syntax now arises: what can be said about the objects in the ontology, and how. The definition of well-formed formulas as sequences of symbols, and how formulas may be derived from other formulas using proof rules, is standard and specified elsewhere (9). I shall rather discuss what can be expressed, and how easily - the meta-syntax, if you like.

A machine description is a logical formula whose interpretation is a state machine. Such a formula may be one of temporal logic, and would conform generally to a Pnueli-style semantics (17) (for a book-length discussion, see (18)). A temporal logic formula has two sorts of variables. Normal mathematical variables hold values across an entire behavior - the value doesn't change from state to state. These variables are called rigid variables. State variables (also called flexible variables) may have different values in different states of a behavior.

A formula of temporal logic is built from the usual logical operators of predicate logic, including the universal and existential quantifiers. The universal quantifier may be defined in the usual way as the dual of the existential quantifier. Since we have two sorts of variables, we also have two sorts of existential quantifier: :E: may only bind rigid variables; :EE: may only bind flexible variables. Temporal logic includes the additional sentential operators [], for 'now and always in the future', and and <>, 'sometime now or in the future

A temporal logic employing a Pnueli-style semantics also contains a set of inference rules, which are forms of conditional truth rules. One may infer that a given formula (the 'conclusion') is true provided that a set of formulas (the 'premises') are true. A logic is defined syntactically by the formulas it gives meaning to (the 'well-formed formulas') and its inference rules. The propositional temporal logic inference rules of TLA do not suffice to be able to infer all the valid propositional formulas of RTLA (valid is defined below). A complete system of propositional rules which does this is K4.3Dum (13).

Since we identify machines with sets of behaviors, one can ask whether two machine descriptions which define the same state machine are provably equivalent in the TLA logic. The answer is known to be yes only if the two descriptions are state machine descriptions allowed by TLA. TLA is not known to be complete for all machine decriptions of RTLA, indeed it seems likely that it isn't.

To define a specific action A explicitly in the syntax, one can write any formula involving state variables in R and S. Such a formula must allow reference to two possible values of each state variable x - one in R and one in S. Thus the syntax of action formulae must distinguish between references to the `before' and `after' values of x. A TLA action definition reserves symbol x for the value of variable x in state R, and uses the symbol x' for the value of x in the subsequent state S.

The Formal Semantics

The formal semantics of a temporal logic explains how these formulas denote objects in the ontology. A temporal-logical formula F is interpreted on behaviors S = <s.0, s.1, s.2, s.3, ........>, and we say that S fulfils or satisfies F, or not. Satisfaction is written

<s.0, s.1, s.2, s.3, ........> |= F

The formal definition of satisfaction proceeds as usual by induction over the definition of well-formed formulas of temporal logic, but there are some points to notice. The first two conditions are straightforward.

Given the intended meanings of the temporal operators, we want to say that a formula []F is satisfied on <s.0, s.1, s.2, s.3, ........> just in case F is true in every state in the sequence, and that a formula <>F is satisfied on <s.0, s.1, s.2, s.3, ........> just in case F is true on some state in the sequence. However, we haven't defined what it is for a state predicate to be true on an arbitrary state in a sequence, but only to be satisfied on the sequence as a whole. The definition of the semantics of the temporal operators is:

That is, we chop off the prefix sequence <s.0, s.1, s.2, s.3, ..., s.(n-1), s.(n-1)> consisting of the first n elements of <s.0, s.1, s.2, s.3, ........> and interpret the state predicate F on this sequence; mutatis mutandis for actions. Since the definition of what it is for a state predicate or an action predicate to be satisfied on a state sequence, this is a recursive definition of the interpretation of <>. Similarly,

This definition of the semantics of the temporal operators works for the Pnueli-style logics which are interpreted over behaviors. The Pnueli-style semantics uses operations on the state sequences to interpret the temporal operators. The original Kripke semantics for a temporal logic is more general, allowing a logic to be interpreted over any collection of states, not necessarily just sequences of them. No operators are assumed to be defined on the collections of states. Instead, the set of states must come equipped with a binary relation. In the Pnueli semantics, this is the relation of earlier-in-the-state-sequence-than. This binary relation may in a general logic have very different properties from that of a sequence ordering, thus the Kripke semantics explicitly evaluates a formula in a set of states at a state, so that the satisfaction relation has three arguments, not two:

The Pnueli semantics is, modulo minor technical differences, a special case of the Kripke semantics with the evaluating state always being the first state in the state sequence. We may say that the Pnueli semantics always sets the tpov to be the first state.

One describes the behavior of a TLA state machine thus in the Pnueli semantics. The machine starts in an initial state described by state predicate Init. It always makes one of a specified finite set of allowable actions Allowed-Actions. Finally, it satisfies a liveness condition which is a Boolean combination of weak or strong fairness conditions on Boolean combinations of the Crucial-Actions. Let the formula Allowed-Action be the disjunction of the formulas in set Allowed-Actions. (In TLA, one of the allowed actions is always do-nothing, which may trivially be written as a TLA formula if there are a finite number of flexible variables). Suppose there is one Crucial-Action, and weak fairness only is required. The machine may then be specified by the formula

Init /\ [](Allowed-Action) /\ WF(Crucial-Action)

In order to interpret the formula Init /\ [](Allowed-Action) /\ WF(Crucial-Action), we ask what behaviors satisfy it.

<s.0, s.1, s.2, s.3, ........> |= Init /\ [](Allowed-Action) /\ WF(Crucial-Action)

just in case Init is true in state s.0, Allowed-Action always happens, and the weak fairness condition (which involves both [] and <> operators) is satisfied. For verification, it suffices, when specifying how such a machine shall operate, to define its future behavior entirely from the starting state. This may be confirmed from the multitude of studies using verification in TLA (8)(19), discussed in (3)). One could describe the progress of conditions that led to the passenger airplane stalling on take-off and one would need to state the fact that the airplane remained stalled until it hit the ground (hitting the ground constitutes the `accident' according to the definition of `accident' used by the US National Transportation Safety Board, see below). After it hit the ground, it makes no sense to talk of the aircraft being stalled. Depending on how it hit, it may not even make sense to talk of 'the aircraft', but rather of many 'pieces of the former aircraft'. This shows there may even be problems of the reference of denoting terms if one is to talk about the state of the world after the accident. Talking about conditions up to a specific point of time but not beyond seems to be a reasonable constraint.

Talking about the past would require `past time' operators that we have not yet discussed. Maybe we can use a technical device: express the circumstances relative to a tpov earlier than the accident, and look to the accident as happening in the future? Then we would have to say that the airplane will stall, and will remain stalled until it hits the ground. But we don't want to express any conditions on behavior after this future point, for the reasons mentioned. We can't obviously do this with the operators we have so far. We want to express conditions which may persist up to a certain point, but then no further. There is such an operator, called `until'. The expressive power of 'until' is known to be strictly stronger than that of [] and <> (22) (see (13)). That means adding the 'until' operator to the syntax of the language.

Let P and Q be state predicates. The proposition P U Q, read P until Q, is intended to mean that P is true and remains true until Q becomes true, and that Q eventually does become true. As we have seen, it is natural for expressing a condition such as the stall. It is a correct observation concerning the accident in temporal logic that eventually the aircraft stalled and this stall persisted until the accident. Take the tpov to be some relevant prior state at which can begin explaining some conditions which led to the accident. Then the temporal logic formula

<> (Aircraft-stalled U Accident-happened)

is true in this tpov. (Notice that according to the intuitive meaning, <>Accident-happened is a consequence of this statement.) The semantics of until can be expressed in a similar style to those of the other operators. In the Pnueli style, in which the tpov is implicitly the first state:

The explicit style is similar:

The logic of until is powerful, so it may be that the expressive advantage of using it is more than compensated by the difficulty of using the logic which it requires. In particular, evaluating the truth of arbitrary sentences in a powerful logic may have a high computational complexity. Without such assessments, it's difficult to tell whether it would be ultimately advantageous. Also, it is not clear that accident reasoning needs to use the full power of an until logic. An expressive case for it has been made, based on a tpov taken to be before the accident.

The accident is an event, something which changes the world. One can describe the state of the world before the accident, and the altered state of the world after the accident, in more or less detail. Thus can the accident be considered as an action in the snapshot semantics, a relation between any two states which look similar enough in relevant ways to the before-state and after-state. In reasoning intuitively about an accident, we start from the present, namely a point at which the accident has happened, and consider the state of the world immediately after the accident occurrence and reason back from this point to what must have previously happened. The tpov of our intuitive reasoning is therefore not at all some point previous to the accident, but some point immediately after the accident or in the accident's future. What kind of technical differences can there be between describing an accident from a tpov in its past, and describing it from a tpov in its future or immediate future?

With a tpov in the present or future of an accident, reasoning must be represented in a language which allows expression of statements about the past: a past-time, not just a future-time, logic. Instead of the until operator, we thus need a since operator. This is straightforward to define using the Kripke semantics:

The semantics is a 'mirror-image' about s.m of the semantics of until. The Pnueli-style semantics of past-time logic cannot be given, since when the tpov is the starting state, all states are in the future of the tpov and there is no past. Let us now use the notation [F], <F> for the future operators. We define past-time operators [P], <P> of which the semantics are a mirror of those for the future-time operators: We define the operators always and sometime, [] and <>, now as [] E = [P] E /\ E /\ [F]E and <> E = <P> E \/ E \/ <F>E.

But although the semantics of since and until, [P] and [F], <P> and <F> are mirror-images of each other, the valid reasoning rules of past-time logic are not just the mirror-images of those of future-time logic. The states backwards from a tpov s.k look like <s.0, s.1, ...., s.(k-1)>, a finite sequence, whereas those forwards are <s.(k+1), s.(k+2), ......>, an infinite sequence. `Always in the past', in this interpretation, speaks about a finite, although indeterminate, sequence of past states, and the reasoning rules we may employ are different. For example, the formula

[P]([P]A => A) =>[P]A

is valid without hypotheses on behaviors (one calls such a formula a logical truth), whereas the equivalent formula with [F] instead of [P] is not valid over behaviors (13, p64).

We could reasonably take the tpov to be the state succeeding the accident action, supposing that we are not concerning with what happened after the accident (an assumption that is by no means always true in accident analyses). However, there is a further reason to take the tpov to be the 'snapshot' state immediately succeeding the accident. When we are constructing explanations for an event that has happened, it would be convenient if they were extensible, that is, if when we had constructed a `valid' argument, that this argument could not be vitiated simply by discovery of additional facts. If A is asserted in a temporal logic, it is being asserted at the tpov. We may have reasons for A, which typically in an accident investigation will include events that happened beforehand. Call these Reasons(A). In order that our reasons be sufficient, we need to show Reasons(A) => A in the logic. Suppose we then discover another feature of interest to us: B. Similarly we need to discover Reasons(B) and need to show Reasons(B) => B. In an accident investigation, we discover features of the accident and find reasons for them as the investigation proceeds. The aircraft engine is damaged. Was it damaged during impact, or was it damaged beforehand (and thus played some role in the accident)? If the tpov is the immediate post-accident state, we may assert the postcondition Accident-just-happened. We then discover engine damage and assert Engine-is-damaged . Thus we can assert their conjunction Accident-just-happened /\ Engine-is-damaged. Reasons for either one of these events become reasons for the conjunction, and some of these reasons might be shared: e.g, had the engine been damaged previous to the accident, it might have helped cause the accident; the accident might have caused the engine damage; or the two might be independent of one another. In any case, we assert the conjunction and the conjunction of the reasons: Reasons(A) /\ Reasons(B), and if there is any overlap between Reasons(A) and Reasons(B), that may be of interest. In other words,

Reasons(A) => A /\ Reasons(B) => B;
-------------------------
(Reasons(A) /\ Reasons(B)) => (A /\ B)

is a valid rule.

Suppose that the tpov is in the future of the accident. We may assert <P>Accident-just-happened and suppose we have found engine damage: <P>engine-is-damaged. We will likely jump to concluding <P>(Accident-just-happened /\ engine-damage), although this does not follow logically, since <P> does not distribute over /\, i.e. <P>(A /\ B) is not equivalent to <P>A /\ <P>B. However, if Reasons(A) is sufficient for showing <P>A and Reasons(B) is sufficient for showing <P>B, it no longer follows that there is any logical operator O for combining Reasons(A) and Reasons(B) that makes the following rule valid:

Reasons(A) => <P>A /\ Reasons(B) => <P>B
-------------------------------
O(Reasons(A), Reasons(B)) => <P>(A /\ B)

That is, there is no O known such that the reasons sufficient for showing <P>A and <P>B, can be combined to form reasons O(Reasons(A), Reasons(B)) sufficient for concluding <P>(A /\ B). There are good logical reasons for doubting whether such an O may be found. A similar observation may be made for any of the 'sometime' operators: neither <P> nor <F> nor <> distribute over conjunction in the logics we are considering.

Taking the tpov to be the accident postcondition therefore helps the incremental compilation of reasons: putting reasons together is just conjunction. It is important to understand that this is a purely formal point. The accident happened when it happened and we are reasoning when we are reasoning, and the actual tpov for talk about the accident is always the present moment. Whatever reasons there were for the accident and whatever evidence there is for those reasons is there anyhow. But it makes sense to choose a tpov to facilitate formal properties of the formal reasoning. Our reasoning back from the present moment to understand what was true at the tpov is also temporal reasoning, but it takes place in the meta-language, not in the formal language that we are using to express the accident reasoning. It is meta-reasoning, rather than reasoning in the language. Thus, we can choose which reasoning is to be installed in the formal language and which is to remain as meta-reasoning. It makes sense to simplify the formalities, consistent with embodying as much reasoning as possible in the formal language. Taking the tpov to be the immmediate post-accident state is an appropriate simplification.

Some Sample Reasoning

Let us develop our previous example further to illustrate some pro forma reasoning with the temporal operators. I will assume some knowledge of formal reasoning in temporal logic and will not give the individual axioms, which are standard, or show the individual steps. State predicates and actions may have similar names, so I distinguish them by always using the infix '-is-' for a state predicate in this example. I take the tpov to satisfy the accident postcondition, that is, the tpov is the snapshot immediately post-accident. We know that

Accident-just-happened

and suppose we also know that the crew bailed out before the aircraft crashed. We discover that there's something wrong with the engine, and that the aircraft attitude was not consistent with controlled flight into terrain:

engine-is-damaged /\ ~(aircraft-is-controlled) /\ <P>crew-bail-out

Recall from the previous section that we are able to assert the findings as a simple conjunction because the tpov is the immediate post-condition of the accident. As general principles, when engines are damaged, it is because sometime in the past a damaging event occurred; also, when engines become damaged, they remain damaged until they are repaired

[](engine-is-damaged => (engine-is-damaged S engine-damaged) )
[]((engine-is-damaged S engine-damaged) U (engine-repaired) )

We can logically conclude that (engine-is-damaged S engine-damaged). The question arises: what caused the engine-damaged event? Suppose we also find remains of a large bird like a pelican in the engine. After some looking around and inside, we conclude that it was sucked in during operation.

Pelican-is-in-engine S pelican-ingestion

Pelicans are large birds - we have good engineering reasons for asserting

[](Pelican-ingestion => <F>engine-damaged)

and we know from the maintenance records of the aircraft, in which there is no mention of repair after ingestion of a pelican, that

~(Engine-is-repaired S pelican-ingestion)

We cannot conclude from the engine-damage assertions or the connection of pelican ingestion with engine damage that the engine was damaged because a pelican was ingested. This is as it should be, because it does not logically follow. We must look for specific evidence that the engine damage resulted from the pelican ingestion. Engine-is-damaged is obviously too imprecise a term to function in anything other than general reasoning. Let's be more precise. What we actually saw was damage from a large bird in the engine, and we know that this happens very quickly after ingestion. Ingestion is an event, that is, an action with precondition and postcondition. We can assert that if the engine was undamaged at the time of the ingestion, that as a postcondition it became damaged. With a long enough time between snapshots, the <F> operator does not need to appear. We shall revisit this 'long-enough time' assumption below. We may assert

[](Pelican-ingestion => Large-bird-damage)
[](Large-bird-damage => engine-damage)

and thus we may conclude that the engine damage and the pelican ingestion are correlated:

(Engine-damaged S engine-becomes-damaged) S pelican-ingestion

We have not yet correlated the engine damage with the accident. We might wonder whether the engine-damage led to a loss of control. From the information we have already, we can infer

<P>(engine-damaged /\ <F>loss-of-control) \/ <P>(loss-of-control /\ <F>engine-damaged)

I concern myself briefly with the characteristics of loss-of-control. It seems to be an appropriate set of principles that during and after a loss of control the aircraft remains out of control until it's recovered or there's an accident. Also that if the aircraft was out-of-control at the accident, control must have been lost before and the aircraft remained out of control between then and the accident; that if the crew bail out, the aircraft is by definition out of control until it crashes; finally, that an out-of-control aircraft eventually crashes:

[](loss-of-control => (~aircraft-is-controlled U (Accident-happens \/ aircraft-recovered)))
[](Accident-happens /\ ~aircraft-is-controlled => (~aircraft-is-controlled S loss-of-control)
[](crew-bail-out => ([F]~aircraft-is-controlled \/ (~aircraft-is-controlled U Accident-happens)))
[](([F]~aircraft-is-controlled \/ (~aircraft-is-controlled U Accident-happens)) => (~aircraft-is-controlled U Accident-happens))

These last two principles can be written a little more briefly. The operator known as 'weak until', wU, is weaker than U. Weak-until is similar to strong-until except that the terminating condition need not happen, but if it doesn't, the persisting condition must persist for ever. A wU B is defined as [F]A \/ (A U B). We can thus write the last two principles as

[](crew-bail-out => (~aircraft-is-controlled wU Accident-happens))
[]((~aircraft-is-controlled wU Accident-happens) => (~aircraft-is-controlled U Accident-happens))

(I hasten to point out that it is normally not the case that control is lost when an engine is lost, but I recall a B1B loss at low altitude, when the aircraft encountered pelicans, the engines suffered damage and the crew bailed out, unfortunately not all successfully. So it's not a totally egregious supposition.)

Can we conclude that control was lost because the engine was damaged, and the engine-damage happened because of the pelican-ingestion? Unfortunately not. We may only conclude that a loss-of-control event and a pelican-ingestion event both occurred, and that these were sufficient, provided a recovery event did not occur, to cause a loss-of-control accident. But we also know that the crew bailed out, and at this point the aircraft was out-of-control, and remained so until the accident. We have two sufficient reasons for the accident. This is as it should be. Crudely speaking, we look to find sufficient reasons for an accident event to have occurred, and try to find as many as possible. When we have been over everything thoroughly, we may have evidence about all the possible causes that we believe there can be. But if we believe that we have enumerated all the possible causes of an accident, it is up to us to say so. Logic does not do it automatically for us. We somehow have to add information saying that these and only these events can have caused an accident. So it is in this case. We have already said all we can in logic.

Notice that we may infer

[](crew-bail-out => <F>Accident-happens)

This form of sentence occurs often in the temporal logic of systems and can be written

crew-bail-out ~> Accident-happens

The temporal operator '~>' is known as leads-to. It has been suggested that leads-to encompasses the notion of causality (7). We shall see below that this is implausible.

Inferring From Consequences to Causes

One can rarely infer directly from a state to its cause, unless there is assertion telling us that the only way that this state can have come about is via this cause. In the example, engine-damage is not the only event that can have led to loss of control, however, it is a sufficient event. The other event that led to loss of control is that of the crew bail-out. Either event by itself would have resulted in an accident. This accident was thus, logically speaking, overdetermined. However, it is plausible that the crew bailed out because control was lost. One may take this as a supposition, but of course it cannot be a part of logic. Human decision and action is error-prone, as I noted much earlier. However, the formalised assertions do indicate a line of inquiry to pursue, namely the sequencing of the engine-failure and the bail-out. If the engine failure happened first, then a crew decision to bail out is appropriate, since an accident inevitably ensues (according to our principles above) and one does not want to be inside when it happens. However, this is not sufficient to assert that the crew bailed out because they lost the engine. Maybe they hadn't noticed that the engine was lost, and they bailed out because they wanted to see what it was like, and had some story concocted for the board of enquiry that control of the aircraft had somehow been lost. In that case, it would be happenstance that the two events concurred. What conclusions could be drawn is no longer the domain of pure logic. However, the assumptions and assertions on which further reasoning is based can be added to the supply of principles we already have.

It may be remarked that this example is not unusual in one respect: an accident investigation will often uncover evidence of many sufficient or almost-sufficient causes.

Refining the Granularity of Snapshots

One of the assumptions made in the example was that of almost-simultaneity: simplifying the exact

[](Pelican-ingestion => <F>engine-damaged)

into the less-exact but more inference-rich

[](Pelican-ingestion => Large-bird-damage)
[](Large-bird-damage => engine-damage)

This step was justified by noting that if one takes the snapshots sufficiently far apart, say the order of a few tenths of a second, the post-condition of engine-damage is fulfilled at the second snapshot just as the postcondition of pelican-ingestion is, and therefore the snapshot-pair of states is in both relations. If the snapshots are a few microseconds apart, however, the postcondition of engine-damage may not yet be fulfilled, the latter statements would be false and only he former statement, inferring engine damage in the future of pelican-ingestion, would be true. Hence the validity of these useful principles depends on the granularity of snapshot succession. We can ensure these real-temporal dependencies of the principles by using the `old-fashioned recipe' of (23) for real-time in a temporal logic with this ontology. This method introduces a certain amount of technical apparatus. A special variable now is introduced as taking a value in each state. An axiom is introduced assuring that now increases unboundedly throughout a behavior (behaviors that do not fulfil this condition are called `non-Zeno' in (23) because of their association with the paradoxes of Zeno). One may now introduce bounded future and past operators. Let us assume for simplicity that units of time have been set implicitly, so that natural numbers uniquely identify intervals of time of fixed length: say, natural numbers correspond to microsecond units. Then bounded operators [F](<= n) and <F>(<= n) may be defined. For example, the operators that assert what happens within 10 microseconds of now are

[F](<= 10) A which is defined as (A U 10)
<F>(<= 10) A which is defined as ~(~A U 10).

and similarly for past-time operators. Define also

(10-step) as the action (now' >= 10)
Then the appropriate way to formulate the gross-snapshot assertion is

[](1000-step /\ Pelican-ingestion => Large-bird-damage)

ensuring that any 'step' that is sufficiently large enough will render the pair of snapshots in the Large-bird-damage relation as well as in the Pelican-ingestion relation. The assertion that Large-bird-damage is engine-damage is of course simply a classification of types of events, rather than an assertion of temporal consequence and therefore remains as it is.

What Has Been Clarified?

It is reasonable to ask what has been clarified by the above example. It may seem to some readers that many trivial assertions and trivial reasoning amongst these assertions have been formulated in temporal logic. We have learnt nothing that we did not know before and have done nothing that we could not do before. What has this formalisation therefore gained us? The answer, hopefully, certainly the goal, is that precisely this 'everyday' reasoning has been formalised. We have seen which temporal operators seem to be needed, and how these operators are used to state what we want to say. A look at the formalisation allows us to identify certain kinds of principles which seem to recur. For example, there are actions such as A that lead into a state partly described by state predicate S, which persists until some other action B:

[](A => S')
[](S => (S U B))

For example,

A S B
crew-bail-out ~aircraft-is-controlled Accident-happens
loss-of-control ~aircraft-is-controlled Accident-happens
engine-damaged engine-is-damaged engine-repaired

Experience shows that reasoning with assertions with similar logical features usually follows patterns. These patterns can be clarified. Thereby we learn more about how this reasoning may be pursued, and it becomes pro forma. In formalising reasoning, one must also make explicit under what assumptions and following what principles it is valid. Clarification of the assumptions underlying certain forms of reasoning is a worthwhile goal. Only by making these assumptions explicit and analysing the principles involved can we be assured that applying these forms of reasoning to new situations will lead to valid conclusions. Conclusions that are invalid may often be turned into conclusions of valid rules by adding hypotheses. These hypotheses may rightly be regarded as 'hidden' until they are explicitly identified. Finally, if the reasoning is analysed and turned into valid forms according to understood principles of logic, automated reasoning tools may be built which enable one to follow the consequences of reasoning without being bothered so much about typograpical mistakes or spending time with repetetive forms of inference. Finally, formalisation puts the reasoning on a basis with which all may in principle agree or disagree, and therefore allows participants to avoid unresolvable controversy.

Does clarification come at the expense of complexity? It may seem so. But first of all, the complexity is created by the necessity to be explicit about reasoning. Much of the complexity cannot be avoided except at the expense of rendering the reasoning once more informal and therefore subject to disagreement and also error. Second, much of it is pro forma and therefore is amenable to a degree of automation. A quarter century of automated theorem proving has accumulated experience with the organisation of seemingly complex but pro forma reasoning. Third, formal reasoning seems to me very much like programming. One builds formal arguments as one builds programs, and one has similar kinds of challenges organising the information (24). But we still regularly build programs of many thousands and sometimes millions of lines of code. Information-hiding and variable-scoping techniques exist, and I see no fundamental reason why proof engineers should not regularly build formal proofs of similar complexity. Providing, of course, that we give them accurate formal reasoning rules.

Deriving Fault Trees From Temporal Logic: Another Example

Fault trees have often been used as reasoning tools when attempting to analyse failures. An accident is a particular type of failure. If temporal-logical reasoning could derive fault tree analyses, it would demonstrate that it extends the formal reasoning already used about failures. I use an example to show how a fault tree may be derived.

I consider a simple reservoir vessel with a regular inflow of liquid coolant, and an outflow valve that lets the coolant flow out when too much of it is stored. There is a high-coolant-level sensor, which actuates a switch, which actuates a valve which opens or closes the drain aperture. Thus, there is one component (in the large) that can malfunction: the drain; and there are three components to the drain. A fault tree (1) for this system with top `event' Overflow has one subsidiary condition Drain-malfunction, which has three disjoint subsidiary conditions Sensor-malfunction, Switch-malfunction and Valve-malfunction. For simplicity, I will take the communication lines connecting these devices to belong to a specific component, so that if a wire malfunctions, it is a component malfunction. How do we know that this fault tree is representative or correct? We shall derive it as a consequence of a description of the system in temporal logic.

First, I state a claim about the flow rate of coolant into the vessel. If the level of the coolant is less than the sensor-height, then for at least three seconds the coolant level will remain below the overflow level. This can be formalised as follows, using the real-temporal logic operators introduced in the previous section and the `old-fashioned' approach to real-time (23), as follows.

I introduce two notations. First, I shall write conjunctions and disjunctions frequently as 'bulleted lists' (25), which make complicated formulas easier to read. Thus

A /\ B /\ C
is written as
/\ A
/\ B
/\ C
and
A /\ (B \/ C) /\ D
is written as
/\ A
/\ \/ B
\/ C
/\ D
Second, I define the operator =>(< n sec):

A =>(< n sec) B
is defined to be
A => (B /\ now' < now+n sec)

The operator =>(< n sec) allows actions to be defined without the occurrence of temporal operators in the action. This property of action definitions was found to be of use in verification (9) and so we follow this formulation here, rather than use the until operator, [F](<= n) A ((A U 10)) as before.

NormalFlowRate ==

(LevelOfCoolant < SensorLevel) =>(<3 sec) (LevelOfCoolant' < OverflowLevel)

Using the definition of =>(< 3 sec), the definition of the action NormalFlowRate is seen to be shorthand for

/\ (LevelOfCoolant < SensorLevel) => (LevelOfCoolant' < OverflowLevel)
/\ now' < now + 3

The sensor, switch and valve assume different values in different states, and therefore are flexible variables. Each has two valid values: the sensor may be "off" or "signal", the switch "on" or "off", and the valve position "open", "closed" or "transit. The sensor and switch are thus binary-valued variables, like a single bit. But what is to ensure that only these two values are taken by the variables? In reality, an arbiter. In logic, an axiom:

ArbitersFunction ==

[](/\ Sensor = "signal" \/ Sensor = "off"
/\ Switch = "on" \/ Switch = "off"
/\ Valveposition = "open" \/ Valveposition = "closed" \/ Valveposition = "transit" )

Next I define the actions of the sensor, switch and valve. The action of the sensor is to signal if the level of coolant is greater than or equal to the max, and to be "off" if the level is below the max. We allow a twentieth of a second between snapshots

SensorFunction ==

/\ \/ LevelOfCoolant > max
\/ LevelOfCoolant = max
=>(< 0.05) Sensor' = "signal"
/\ LevelOfCoolant < max =>(< 0.05) Sensor' = "off"

The action of the switch is to turn on if the sensor is at signal, and to turn off if the sensor is off. To formulate this

SwitchFunction ==

/\ (Sensor = "signal" =>(< 0.1) Switch' = "on"
/\ (Sensor = "off" =>(< 0.1) Switch' = "off"

The valve must similarly react to the switch within half a second. The valve has three possible states, one of which is "transit", and we may informally presume that the valve is in transit during the half second it takes to change position from open to closed or vice versa.

ValveFunction ==

/\ (Switch = "on" =>(< 0.5) ValvePosition' = "open"
/\ (Switch = "off" =>(< 0.5) Valveposition' = "closed"

We can derive now a fault tree logically from this formalization. The symbol "|-" means "it is provable that" and indicates that what follows it is a logical theorem.

|- /\ Overflow
/\ NormalFlowRate => (LevelOfCoolant > SensorLevel S -3sec)
|- /\ Overflow
/\ NormalFlowRate
/\ SensorFunction => (Sensor = "signal" S -2.95sec)
|- /\ Overflow
/\ NormalFlowRate
/\ SensorFunction
/\ SwitchFunction => (Switch = "on" S -2.85sec)
|- /\ Overflow
/\ NormalFlowRate
/\ SensorFunction
/\ SwitchFunction
/\ ValvePosition => (ValvePosition = "open" S -2.35sec)
............ => (LevelOfCoolant > MaxLevel S -2.35sec)

Thus we have shown

[]Safety /\ Overflow => []Safety /\ (Overflow S -2.35sec)

I need now to use the following valid rule of the since operator for behaviors:

n :in: PositiveReals
Q => (Q S -n)
-------------
[P]Q

where Q is a state predicate, n is a rigid variable, and PositiveReals are the positive real numbers, we may derive the valid rule

n :in: PositiveReals
[]A /\ Q => []A /\ (Q S -n)
------------------
[]A /\ [P]Q

and thus infer that

[P]Overflow

and thus that

<Init>Overflow

where <Init> is an operator whose semantics is that <s.0, s.1, ...., s.(k-1)> |- Q if and only if Q is true in state s.0, the first state of the behavior.

The assertion <Init>Overflow contradicts ~<Init>Overflow, that Overflow is not true in the first state. Thus are the safety conditions

NormalFlowRate /\ SensorFunction /\ SwitchFunction /\ ValvePosition /\ArbiterFunction

inconsistent with Overflow. The negation of each one of the safety conditions is thus a node of a fault tree whose top node is Overflow. This fault tree says that if there is an overflow, either the normal flow rate has violated, or one of the sensor, switch and valveposition functions has been faulty, or the arbiters failed:

Overflow => \/ <P>~NormalFlowRate
\/ <P>~SensorFunction
\/ <P>~SwitchFunction
\/ <P>~ValvePosition
\/ <P>~ArbiterFunction

Each one of the fault modes can themselves serve as the top of a fault tree, without the operator <P>. To 'attach' this subtree to the appropriate leaf of the original, the formulas at the child nodes may be simply prefixed with <P>, remembering that those already prefixed with <P> remain unchanged (since <P><P> is equivalent to <P>).

In the case of the aircraft accident, we were interested in discovering the causes of the accident, in eliciting evidence that certain causal events occurred or not. For the vessel, we can consider ourselves to control the environment considerably more, because we have engineered it ourselves, and know in greater detail the causal connections therein. It is notable that we have formally derived the top level of a fault tree analysis from our logical formulation of the top-level function of the drain mechanism, without using assumptions about the environment. These assumptions are normally formulated as a `frame axiom'. A `frame axiom' for the vessel drain would say that the only way that the valve changes position is when triggered by the switch; the only way the switch changes is because of the sensor, and the only way the sensor changes is because of the level.

FrameAxiom ==

/\ ~(Valve' = Valve) => ~(Switch' = Switch)
/\ ~(Switch' = Switch) => ~(Sensor' = Sensor)
/\ ~(Sensor' = Sensor) => ~(LevelOfCooland' = LevelOfCoolant)

The problem with frame axioms is in ensuring that they are really true of the system being described. Are we really sure that the switch can only change value through the sensor, and not, say, because somebody is forcing the contacts shut with a screwdriver, or because there was an explosion nearby? Obviously not. Therefore, a frame axiom is an only an approximation to the general correct behavior of the drain. But the point of a fault tree is to be a logically absolute device, valid no matter what the real behavior. Its function is to show us, when the drain goes awry, that there is a problem precisely here or here or here, no matter what. I thus consider it a positive feature of the formalisation above that the fault tree can be derived without use of a frame axiom.

This example has shown how straightforward conditions on physical devices may be formalized in a temporal logic with past time operator, taking the tpov to be the accident time. The formalization makes essential use of past-time operators, as well as the Since operator. Furthermore, we made use of a valid past-time inference rule essentially involving Since. This reasoning enabled us to derive the top node and first level of a fault tree for the Overflow condition, the nodes in the fault tree themselves labelled with assertions in temporal logic, without using a frame axiom. The translation is natural and powerful.

Causality and ~>

The idea is that a sensor triggers the switch, moving it to the alarm position, and that this move should then inevitably trigger the valve-closes action. Inevitably, but perhaps not simultaneously - there could be a slight delay. For accident analysis, the delay may or may not be critical, but the inevitability of the consequence is important. If we want to say that throwing the switch causes the valve to close, we can say that throw-emergency-switch is inevitably followed by a valve-closes action. Can we specify this in a temporal logic? The two operators "always from now on", []; and "sometime now or in the future", <> have been used to propose such a specification (e.g., (7)). Such a proposal goes like this: it's plausible to translate `A inevitably follows B' to mean that whenever B occurs, A must occur sometime either now or in the future. So to say "if B occurs, then A inevitably follows" would be to say (B =><>A), and to say `whenever' would be to say `at all states now and in the future':

[](B => <>A)

This formula is often written

B ~>A

and read "B leads to A". Anomalies arise if we try to take this operator to formalise the causal relation.

First, let A be a temporal-logical truth, that is, any formula of temporal logic that is true in all states in all behaviors. Then for any B whatsoever, B ~>A. But intuitively, logical truths aren't caused by anything, let alone features of the world expressed by arbitrary propositions. The standard phrase is that logical truths are true by virtue of their form alone. Hence the ~ relation includes pairs of assertions that are not in the causal relation

Second, and similarly, if a pilot pushes the throttle levers forward (B), then simultaneously his hand moves forward in relation to his body (A). Both B ~>A and A ~>B are true. (Do both these events cause the other? They are clearly in some sort of causal relation, but what?) For any simultaneous A and B therefore, B ~>A /\ A ~>B. However, causality is standardly taken to be asymmetric, that is, B causes A => ~(A causes B). The property of symmetry for simultaneous events of ~> stands in stark contrast to the asymmetry of causality.

Third, it's often taken to be a standard property of causality that events do not cause themselves. For any event A, let B be a simultaneous event. Then, as above, A ~>B /\ B ~>A. It is easy to check that ~> is transitive (that is, (A ~>B /\ B ~>C) => A ~>C). Therefore by transitivity A ~>A. That is true every event that is simultaneous with some other event (that goes for all events it seems, but ultimately would depend on one's particular view of what an event is). The reflexivity of ~> is in stark contrast to the irreflexivity of causality.

Fourth, A ~> now' > now for any A. However, it's normally understood that the advancement of time is not caused by any material fact, let alone by all of them. This property of causality stands in stark contrast therefore to the property of ~>.

Fifth, it has been held by some, for example Lewis (26), that the relation of causality is not transitive. Lewis explains causality by counterfactual dependence. Counterfactual dependence is not transitive. However, there are arguments for thinking that the type of causality involved in accident analysis (why?....because causality) is indeed counterfactual dependence (5). The intransitivity of counterfactual-dependence contrasts with the transitivity of ~>.

Sixth, a non-formal difficulty. Humean views of causality have not taken it to be a logical property at all (28). Any suggestion that it is must answer the difficulties that Hume described that are still outstanding.

The chances thus look poor that ~> could formalise a causal relation. Many essential logical properties of ~> simply contrast starkly with those of causality. Furthermore, there are classical reasons for thinking that causality cannot be a logical relation at all. This does not mean that a binary relation, intended to be the causal relation, along with appropriate axioms, could not be added to a logic (some approaches to the fundamental problems of situation-calculus planning in AI do exactly this). For example, it would be plausible to take as an axiom that

(B causes A) => (B ~> A)

which would suffice for much reasoning about accidents in a temporal logic. However, chances of an identification look negligible.

Moving Beyond Temporal Logic

Suppose we were to be able satisfactorily to express all relevant accident components in temporal logic. That would be tantamount to claiming that all we need to do to explain an accident is to describe the events, order them in their causal sequence (or maybe using ~>, as above), see if we can derive the causal progression from a few "catastrophic" events, identify the "main causes" - some particular events taken to be particularly crucial, maybe - and draw conclusions about how to avoid this sequence of events in future. But that is not all that is involved in explaining an accident. And chances are slim that this rest could be formulated in a temporal logic.

Charles Perrow's Normal Accidents (29) attempts to identify factors that lead to accidents, and to clarify their explanatory role. He considers industrial accidents in many different fields (including nuclear power and aircraft, like Johnson and Telford) and proposes an analysis method which reckons amongst the most influential amongst academic specialists. The trouble is that his explanatory notions do not obviously appear to fit within the expressive power of temporal logic as we have so far considered it.

Perrow has two notions in particular, that of a system with `interactive complexity' and that of `tightly-coupled' versus `loosely-coupled' systems. (Interactive complexity is also studied in (30) under the name of 'Vernetztheit'). A system which is interactively complex is one in which it is very hard to describe, and therefore to see, the consequences of a given event. This happens often when the causal chains are long and there are many of them but can also happen simply if the causal connections are obscure (30). Most of us have experienced interactive complexity with computer software. When something goes wrong, all kinds of funny things start happening and it's hard both to predict what will happen, and what the root cause was. We admire the geniuses who can look at the source code and determine the interactions and we give them good jobs and lots of money. Tight coupling can be best explained using another metaphor. Think of a row of dominos standing on edge next to each other. As the end domino is pushed over, the next is hit and falls: all the dominos are knocked over in a row and eventually they all lie flat on the table. However, if the upright dominos are placed sufficiently far from each other, then toppling one domino will not affect the next, and the other dominos remain standing. This, in contrast, is a loosely-coupled system. Perrow shows how the tight-coupling and interactive complexity of a variety of systems, from nuclear power-plants to marine navigation to an air traffic control system, contribute to the genesis of accidents. Dörner has conducted experiments to show how people try to handle interactively-complex and tightly-coupled systems, and to explain their behavior. Perrow strongly makes the point that explanations of an accident should mention coupling and the interactive complexity, if it's there, as essential contributions to an accident. Dörner's experiments back up this view.

The domino example yields a suggestion as to how we could check a temporal-logic-based description of a system accident to see whether tight-coupling played a role. We could express the causal relation between domino number N and its successor by

domino number N falls ~> domino number (N+1) falls

and then we can observe the tight coupling by noticing the length of the `causal chain'

domino number 1 falls ~> domino number 2 falls
domino number 2 falls ~> domino number 3 falls
domino number 3 falls ~> domino number 4 falls
......
the penultimate domino falls ~> the last domino falls

This tightly-coupled system is not interactively complex. Even though the causal chain can be very long, it's nevertheless easy to comprehend how the system works and very easy to see the consequence of toppling the first domino, and therefore we would be uninclined to describe it as complex.

An example of a system which is interactively complex could be a system of flight interconnections at a major hub airport. If Birdseed Airlines jumbo flight 55 is 15 minutes late into the hub, are its 400 passengers going to make their connections or not? It depends on which of those connections may themselves be delayed, and which departure times are close to and which further from the arrival of Birdseed 55. Let's suppose each of Birdseed's passengers has a flight of four legs of which this is the first. Putting airplanes late into the air traffic control system is going to affect arrival slots at destination airports further along the chain, and how many flights air traffic control can handle. The consequences of Birdseed's arrival are manifold and probably incalculable, although for each single one of Birdseed's passengers the causal chain is easy to calculate, since it isn't very long.

How could formalisation in temporal logic help? Suppose that we build a relation database, a graph, of causality amongst the events of an accident (as suggested in (4)). Then to demonstrate tight coupling, we could look for very long chains (paths in the graph). To demonstrate interactive complexity, we could look for causal trees -- events with many consequences, each of which has themselves many consequences, each of which ..... and so forth. If the trees are `bushy', we can conclude that there's interactive complexity. These formal properties of the graph a computer would find relatively easy to check.

An analysis of this sort does not fit within the syntax of a temporal logic. Assuming we formalise accident events in temporal logic similarly to the proposal above, the analysis of tight-coupling and interactive complexity would involve reasoning about sets of sentences and their logical interconnections: in other words, meta-linguistic properties of the language as proposed above. These meta-properties are properties of the sets of formulae describing events, properties such as length of paths through relations, and the width of trees. Although Gödel has shown how some - but not all - relevant metalinguistic properties can be formalised within a sufficiently rich logic, which establishes the possibility in principle, it is not at all clear that it would be advantageous to do this. It would complexify the logic to little additional purpose. What would be the advantages over leaving the analysis as meta-linguistic, separate but complimentary?

Perrow also employs a decomposition of a system called DEPOSE (for Design, Equipment, Procedures, Operators, Supplies and materials, and Environment). There are some aspects of a DEPOSE analysis whose facile expression in temporal logic escapes me as yet. Many of these involve human factors, which have not been the topic of this essay. For example it is often the case in accidents that a series of operator actions did not conform to a particular operating procedure, or that an operating procedure did not exist, or was ambiguous. There is no vocabulary for formalising any of this in pure temporal logic. Perhaps one could use a deontic logic which expresses a required procedure as a sequence of `should' statements concerning actions: one can write `It Should be the case that A' for a required action and then can express some operator failures as `Should A but not A'. Formally, we can write

[[]]A /\ ¬A
It turns out that a deontic logic is formally similar to a temporal logic, so it should not be too hard to mix the two. Wieringa and Meyer (31) have explored the possibility of using deontic logic along these lines as a means of stating requirements in software engineering. Their work focuses on what the system should do, rather than what an operator should do, but formally there seems to be little difference in analysis. The use of a modal logic similar to deontic logic for expressing causality within a temporal logic framework was also considered in depth by von Wright (32).

To show how important these extensions are, consider an example of an explanation which involves tightly-coupled design with operator actions and maintenance procedures. NTSB report NTSB-AAR-79-17 (19) concerned the 1979 Chicago DC-10 crash, and the McDonnell Douglas Corporation produced a report (20) arguing their position on the accident. I discussed this incident in (3). A DC-10 lost an engine on take-off from Chicago O'Hare airport - and I do mean `lost an engine' - it fell off, unfortunately taking some hydraulics with it and causing certain high-lift devices on that wing to stow themselves. The wing stalled, and the airplane flipped over and flew into the ground. The entire incident was captured on video by an unfortunate onlooker whose relatives were on the plane. There ensued many public discussions about the safety of the DC-10.

The National Transportation Safety Board is charged with investigating all aircraft accidents inside the United States. The Board's report concluded that causes of the crash were an asymmetrical stall and the ensuing roll because of the uncommanded retraction of the left wing's slats and the loss of the relevant cockpit indicators, resulting from damage caused by improper maintenance procedures leading to failure of an engine pylon and structure at a critical point in take off.

The terminology is curt but expressive. What is a `critical point'? The pilots were more preoccupied than they would have been in, say, cruise, and therefore their ability to diagnose the failure was impaired. McD says, and the NTSB agrees, that the airplane was in principle controllable. However, because the crew were occupied with the takeoff, one of the most intense phases of flight for the crew, they may have had less opportunity to devote immediate and undivided attention to diagnosing the failure, because they had to ensure all other factors remained under control as well.

The discussion of the phenomena surrounding stall onset, and the crew's likely state of knowledge, and reactions, and conformance with procedure, is particularly challenging for those who are contemplating a formalisation of accident causes. It is hard to see how a discussion of maintenance procedures, how they were violated, and how this contributed to failure of the pylon and structure at the exact moment it did (takeoff produces maximum stress on the airframe and engine, says engineering folk wisdom) could be formulated in pure temporal logic. Yet both these situations are crucial to an understanding of the accident.

I conclude that extensions, at least to allow criteria corresponding to Perrow's concepts to be analysed, and to allow encoding of procedures and failure to follow them, are essential. I do not see how to encode these using pure temporal logic rules alone.

Conclusions

Reasoning about accidents in a temporal logic involves a basic ontology of states, state variables and sets, from which one can build all needed objects. A temporal point of view (tpov) must be chosen, and for both intuitive and formal reasons I argued that taking the tpov to be the state at which the accident had just happened is preferable. Reasoning about the past of the accident event is involved, and if the tpov is the immediate aftermath, this is simple reasoning about the past.

I observed there are logical truths (and therefore reasoning rules) about the past whose correlates about the future are invalid. Therefore, we have more logic inference power about the past than about the future. Much reasoning turned out to be reasoning about the persistence of conditions between events, which prima facie requires (in the past) the since operator, and (for the future) the until operator, which are known to be stronger than the [] and <> operators of simple temporal logic.

Reasoning from events to their causes and causal relations is not purely logical - some inference from conditions to the only possible ways these conditions could have come about is required, and this may require some non-logical axioms (such as 'frame axioms'). However, fault trees have been used to reason from failures to their possible causes, and I showed how a fault tree may be formally derived from formal requirements of a system (a vessel holding coolant) without the use of frame axioms. I used a valid rule for since whose correlate for until is not valid.

I raised serious objections to the proposed identification by some of causality with the temporal-logical ~ operator.

Finally, I considered some meta-temporal-linguistic concepts, interactive complexity or Vernetztheit, and tight-coupling, that have been found useful if not essential in analysing accidents, especially with regard to human behavior and other human factors, that it makes little sense to try to formulate within a temporal logic. I suggested it would be preferable to leave such analyses separate but complementary.

I conclude that temporal logic helps greatly in formalising accident reasoning, although much remains to be discovered about the logic that it is most natural to use, and about its use. However, temporal logic does not formalise the crucial notion of causality (although causality implies temporal-logical ~), nor does it yet seem to make sense to formalise Perrow's and Döner's analyses within a temporal logic.


References

References link back to the first mention.
Back to top

(1): W. E. Vesely, F. F. Goldberg, N. H. Roberts and D. F. Hassl, Fault Tree Handbook, NUREG-0492, U.S. Nuclear Regulatory Commission, Washington, D.C., January 1981. Back

(2): G. Bruns and S. Anderson, Validating Safety Models with Fault Trees, in Janusz Gorski, ed., Proceedings of the 12th International Conference on Computer Safety, Reliability and Security SAFECOMP'93, pages 21-30, Springer-Verlag, 1993. Back

(3): P. B. Ladkin, Formalism Helps in Describing Accidents, in http://www.rvs.uni-bielefeld.de/~ladkin/. Back

(4): Peter Ladkin, The X-31 and A320 Warsaw Crashes: Whodunnit?, in http://www.rvs.uni-bielefeld.de/~ladkin/. Back

(5): Peter Ladkin, Reasons and Causes, in http://www.rvs.uni-bielefeld.de/~ladkin/. Back

(6): Jonathan Moffett, Jon Hall, Andrew Coombes and John McDermid, A Model for a Causal Logic for Requirements Engineering, Journal of Requirements Engineering 1(1):27-46, March 1996. Also in ftp://ftp.cs.york.ac.uk/hise_reports/req_capture/causal.ps.Z. Back

(7):C. W. Johnson and A. J. Telford, Extending the Application of Formal Methods to Analyse Human Error and System Failure During Accident Investigations. Back

(8): Leslie Lamport, The TLA Home Page, http://www.research.digital.com/SRC/tla/. Back

(9): Leslie Lamport, The Temporal Logic of Actions, ACM Transactions on Programming Languages and Systems 16(3):872-923, May 1994. Also in (8). Back

(10): P. B. Ladkin, Some Dubious Theses in the Tense Logic of Accidents, Research Report RVS-RR-96-14, at http://www.rvs.uni-bielefeld.de. Back

(11): Michael Dummett, Truth, in Truth and Other Enigmas, Duckworth and Harvard University Press, 1978. Back

(12): Susan Haack, Philosophy of Logics, Cambridge University Press, 1978. Back

(13): Robert Goldblatt, Logics of Time and Computation, CSLI, Stanford, California, 1987. Back

(14): Penelope Maddy, Realism in Mathematics, Oxford University Press, 1990. Back

(15): Jon Barwise, Mathematical Proofs of Computer System Correctness, Notices of the American Mathematical Society 36(7):844-851, September 1989. Back

(16): Leslie Lamport, Hybrid Systems in TLA, in R.L. Grossman, A. Nerode, A.P. Ravn and H. Rischel, eds., Hybrid Systems, vol. 736 of Lecture Notes in Computer Science, Springer-Verlag, 1993. Also in (8). Back

(17): Amir Pnueli, The temporal logic of programs, In Proc. 18th IEEE Symposium on the Foundations of Computer Science, pages 46-57, Providence, RI, 1977. Back

(18): Zohar Manna and Amir Pnueli, The Temporal Logic of Reactive and Concurrent Systems, Volume 1: Specification, Springer-Verlag, 1992. Back

(19): National Transportation Safety Board, Aircraft Accident Report, American Airlines, Inc. DC-10-10, N110AA, Chicago-O'Hare International Airport, Chicago, Illinois, May 25, 1979., Report NTSB-AAR-79-17, NTSB, Washington, DC, 1979. Also in (21). Back

(20): McDonnell Douglas Corporation The DC-10: A Special Report, McDonnell Douglas Corp., 1979. Also in (33). Back

(21): John H. Fielder and Douglas Birch, eds., The DC-10 Case, State University of New York Press, 1992.

(22): J. A. W. Kamp, Tense Logic and the Theory of Order, Ph.D. Dissertation, University of California at Los Angeles, 1968. Back

(23): Martin Abadi and Leslie Lamport, An Old-Fashioned Recipe for Real Time, ACM Transactions on Programming Languages and Systems 16(5):1543-1571, September 1994. Also in (8). Back

(24): P. B. Ladkin, Formal But Lively Buffers in TLA+, in http://www.rvs.uni-bielefeld.de/~ladkin/. Back

(25): Leslie Lamport, How to Write a Long Formula, Formal Aspects of Computing 6:580-584, 1994. Also in (8). Back

(26): David Lewis, Causation, Journal of Philosophy 70, 1973, 556-567. Also in (27), 193-204. Back

(27): Ernest Sosa and Michael Tooley, eds., Causation, Oxford Readings in Philosophy Series, Oxford University Press, 1993.

(28): David Hume, Treatise of Human Nature, Book I, part 3, Sections 1-6, 11, 12, 14, 15, London, 1739; and Enquiry Concerning Human Understanding, Sections 4 and 7, London, 1748. Back

(29): Charles Perrow, Normal Accidents. Living With High-Risk Technologies, Basic Books, 1984. Back

(30): Dietrich Dörner, Die Logik des Misslingens: Strategisches Denken in komplexen Situationen, Rowohlt Verlag 1989, Rohwohlt Taschenbuch Verlag 1992, translated in English as The Logic of Failure, Metropolitan Books (Henry Holt), New York, 1996. Back

(31): J.-J. Ch. Meyer and R. J. Wieringa, Deontic Logic in Computer Science: Normative System Specification, Wiley and Sons, 1993. Back

(32): G. H. von Wright On the Logic and Epistemology of the Causal Relation, in P. Suppes, ed., Logic, Methodology and Philosophy of Science IV, North-Holland 1973. Also in (27). Back


Notes

[1]: Alternatively, one may construct a third primitive category S of unanalyzed objects called states, and specify a collection F of partial functions from the set of states to values. To every state s one may associate the set of all values f(s) where f is a function in F and f(s) is defined. However, since states in TLA cannot be different-but-undifferentiable, as can easily be seen from the `set' definition of state, one must also require that if s and t are two different states, there is a function f in the collection F that has different values on s and t. Thus the collection F must be `rich enough'. This is an extra constraint that simply doesn't appear in the explicit set-theoretical construction of states, which may be a reason for preferring the set-theoretical construction, even though both ways define isomorphic ontologies. Back to text.

[2]: There is little disagreement since Russell and Whitehead, indeed since Frege, that most of the analysis required for physics can be formulated in set theory (although it took until the twenties for set theory to be formulated in its modern form as Zermelo-Fraenkel set theory). There are those who wish to base mathematics on other foundations, such as category theory or the theory of types. Such people are mainly constructivists, but also include those who think that categories are more "natural" for mathematics. Constructivists would probably disagree with the broadly classical (if not Platonist) view of logic taken in this paper. Others who prefer category theory for aesthetic reasons would not disagree that ZF set theory provides enough mathematical mechanisms for me to accomplish all that I need to in this paper. Back to text.