
Formal reasoning is a representation of either ordinary or specialized reasoning on some specific subject, provided the actual reasoning can be translated into a framework allowing a calculus for coping with it.

Formal reasoning requires, first and necessarily, a mathematical framework into which the reasoning can be translated as a calculus, in accordance with the reality behind it. The framework should correspond to the kind of reasoning to be translated and allow, as far as possible, its reproduction with the calculus. It is built from the characteristics the corresponding situation shows, by fixing the basic properties or laws that the involved terms should verify once symbolically translated, under the supposition that the chosen symbols and the laws between them faithfully translate their meanings in the actual reasoning. In this sense, the framework should be as “natural”, or suitable, as possible for each specific kind of reasoning.

Formal reasoning is, in the end, only a mathematical model of some particular specialized type of reasoning on something; hence, there is not exactly a single type of formal reasoning, but several mathematical models of it. For each specialized form of reasoning, it is supposed that the semantics of what is modeled is well translated into the corresponding mathematical model; to this end, the internal laws of the representation’s framework should be established according to what is recognized in the actual, external reasoning and its context. The sciences condense into artificial languages what, thought in plain language with scientific concepts, is considered basic for the corresponding subject and for formally developing the reasoning on it.

In what follows, the models for reasoning with precise words, and with both precise and imprecise words (and later in Part II, with the specialized reasoning physicists conduct on the quantum microworld), are considered. Basically, formal reasoning refers to mathematically formalized deductive reasoning, even if in the first two cases some hints regarding ordinary reasoning are presented. It should be noted that in these three cases the corresponding inference relation is represented by a partial order verifying the transitive law; hence the former results requiring transitive triplets always hold. Moreover, in these cases the negation is usually presumed to be strong, that is, verifying (p′)′ ≈ p; thus the results requiring one of the laws p ≤ (p′)′ or (p′)′ ≤ p also hold.

11.1. The mathematical framework credited as the undisputed one for classical reasoning with precise words is the theory of sets, that is, the structure of a Boolean algebra, as stated by Marshall Stone’s characterization theorem of Boolean algebras in 1936. Such a framework comes from the “specification axiom” under which a precise word P acting in a universe of discourse X specifies a subset of X consisting of those x for which “x is P” holds, with its complement subset containing those x such that “x is P” fails; statements can only be either true or false.

In this case, the operations translating the linguistic conjunction (·), the disjunction (+), and the negation (′) are supposed to verify all the laws of a Boolean algebra, that is, of a distributive lattice with a single strong negation. These laws affect all the statements composed by means of such connectives; the model presupposes that in the corresponding language all Boolean laws hold. In particular, the law of perfect repartition p = p · q + p · q′ holds, a law that, jointly with the law of the negation, the commutative and associative laws of +, and the conjunction defined by duality, p · q = (p′ + q′)′, characterizes Boolean algebras, as Edward V. Huntington proved in 1933. Of course, the laws derived from those that characterize Boolean algebras also hold; for instance, the represented statements are presumed to verify (p · p) + (q · p) = p + q · p = p, because q · p ≤ p; (p · q′) · (p′ · q) = (p · p′) · (q′ · q) = 0 · 0 = 0; p + p = p, and so on.
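As an illustration, the laws just quoted can be checked exhaustively in a small power set. The following is only a sketch: the three-element universe X and the helper names `powerset`, `neg`, and `laws_hold_for` are assumptions made for the example, with · read as intersection, + as union, and ′ as complement.

```python
from itertools import combinations

X = frozenset({1, 2, 3})  # hypothetical universe of discourse


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def neg(p):
    """Negation p' read as the complement of p in X."""
    return X - p


def laws_hold_for(p, q):
    """Check the laws quoted in the text for one pair (p, q)."""
    empty = frozenset()
    return (
        p == (p & q) | (p & neg(q))                 # perfect repartition
        and p & q == neg(neg(p) | neg(q))           # conjunction by duality
        and (p & p) | (q & p) == p                  # (p·p) + (q·p) = p
        and (p & neg(q)) & (neg(p) & q) == empty    # (p·q')·(p'·q) = 0
        and p | p == p                              # idempotency of +
    )


# True when every pair of subsets of X verifies all the laws above.
all_laws_hold = all(laws_hold_for(p, q)
                    for p in powerset(X) for q in powerset(X))
```

A check of this kind proves nothing beyond the chosen X, but it makes concrete what "all Boolean laws hold" means for the represented statements.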

In this case, if/then statements, the conditional ones, are supposed to coincide with “negation of antecedent, or consequent”; that is, “if p, then q” (p → q) is presumed to coincide with p′ + q, which, provided the algebra is complete, in turn coincides with Sup {z; p · z ≤ q}, as was shown earlier. Hence, the truth values t(p → q) are equal to t(p′ + q) = max (1 − t(p), t(q)), which equals 1 if and only if t(p) = 0 or t(q) = 1; the conditional holds only when the consequent holds or the antecedent fails.
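A short sketch may help to see both readings of the conditional at work. It tabulates max (1 − t(p), t(q)) on the two truth values and, in a hypothetical three-element power set, compares p′ + q with Sup {z; p · z ≤ q} computed by brute force. The names `t_conditional` and `residuum` are illustrative only.

```python
from itertools import combinations

X = frozenset({1, 2, 3})  # hypothetical universe of discourse


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def t_conditional(tp, tq):
    """Truth value of p -> q read as p' + q, for tp, tq in {0, 1}."""
    return max(1 - tp, tq)


# Classical truth table of the conditional.
truth_table = {(tp, tq): t_conditional(tp, tq)
               for tp in (0, 1) for tq in (0, 1)}


def residuum(p, q):
    """Sup {z ; p·z ≤ q}, computed as the union of all such subsets z."""
    sup = frozenset()
    for z in powerset(X):
        if (p & z) <= q:
            sup |= z
    return sup


# True when p' + q coincides with the supremum for every pair (p, q).
material_equals_residuum = all(
    (X - p) | q == residuum(p, q)
    for p in powerset(X) for q in powerset(X)
)
```

The only entry of the table with value 0 is the one with a true antecedent and a false consequent.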

It is said that a linguistic statement p is a tautology when p = 1 (the maximum of the lattice); nevertheless, there can be statements q that, without being tautologies, have truth value equal to one, t(q) = 1. An if/then statement p → q represents a tautology if p′ + q = 1, which, as was shown, is equivalent to p ≤ q, the partial order of the Boolean lattice, defined by p · q = p, or equivalently by p + q = q. For instance, p + p′ = (p · p′)′ is a tautology, as are all linguistic statements whose translation into the algebra is represented by p′ + p · q + p · q′ = p′ + p · (q + q′) = p′ + p = 1; but those whose representation is p + p · q + p · q′, which is equal to p + p = p, are not tautologies.
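The equivalence between "p → q is a tautology" and p ≤ q, as well as the two examples, can be checked on a small power set. This is a sketch under the assumption of a three-element universe; `agree` is a hypothetical helper name.

```python
from itertools import combinations

X = frozenset({1, 2, 3})  # hypothetical universe; X plays the role of 1


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def agree(p, q):
    """True when the four characterizations of p ≤ q give the same answer."""
    answers = {
        ((X - p) | q) == X,   # p' + q = 1, the conditional is a tautology
        p <= q,               # lattice order (set inclusion)
        (p & q) == p,         # p·q = p
        (p | q) == q,         # p + q = q
    }
    return len(answers) == 1


subsets = powerset(X)
order_characterizations_agree = all(agree(p, q)
                                    for p in subsets for q in subsets)

# p' + p·q + p·q' is always 1, whereas p + p·q + p·q' reduces to p.
first_is_tautology = all(
    (X - p) | (p & q) | (p & (X - q)) == X
    for p in subsets for q in subsets
)
second_reduces_to_p = all(
    p | (p & q) | (p & (X - q)) == p
    for p in subsets for q in subsets
)
```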

It should be pointed out that the idempotent laws of conjunction and disjunction, respectively, p · p = p, and p + p = p, imply t(p · p) = F(t(p), t(p)) = t(p), and t(p + p) = G(t(p), t(p)) = t(p), showing that F = min, and G = max, are, at least, suitable commutative and associative solutions of these equations, under which t(p · q) = 1 ⇔ t(p) = t(q) = 1 and t(p + q) = 1 ⇔ t(p) = 1 or t(q) = 1.

Notwithstanding, in the case of lattices, and Boolean algebras in particular, it can be proven that with t ranging in [0, 1], the only admissible pair (F, G) is (min, max). For instance, with F = prod and p · p = p, from t(p) = t(p) · t(p) it follows that either t(p) = 0 or t(p) = 1, so that only true or false statements would exist, and not a single q with 0 < t(q) < 1.
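The argument about F = prod can be seen numerically. The sketch below scans a hypothetical grid of truth values in [0, 1], using exact fractions to avoid rounding, and keeps those compatible with the idempotent law under the product and under the minimum.

```python
from fractions import Fraction

# Hypothetical grid of candidate truth values 0, 1/10, ..., 1.
grid = [Fraction(k, 10) for k in range(11)]

# Values t with F(t, t) = t when F is the product: only 0 and 1 survive.
idempotent_under_prod = [t for t in grid if t * t == t]

# Values t with F(t, t) = t when F is the minimum: every value survives.
idempotent_under_min = [t for t in grid if min(t, t) == t]
```

With the product, every intermediate degree is excluded by t = t · t, whereas min leaves the whole grid available.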

What about the basic point, in inference, with p → q = p′ + q? That is, what can be said when the set of premises is P(→) = {p, p → q}, whose résumé is p · (p → q)? This, of course, supposes that p · (p → q) is not self-contradictory, which now simply means p · (p → q) ≠ 0, because in Boolean algebras r ≤ s′ ⇔ r · s = 0. Thus, because p · (p → q) = p · (p′ + q) = p · q, the condition is equivalent to p · q ≠ 0, that is, to p ≰ q′.

An element c is a consequence of P(→), that is, c deductively follows from P(→), provided p · (p → q) ≤ c, and h is a hypothesis for P(→) provided h ≤ p · (p → q). The modus ponens inequality is obtained with c = q and, as proven, is equivalent to p → q ≤ p′ + q, showing that p′ + q is the greatest possible expression of the conditional; because p · (p → q) ≤ p · q then follows, p · q is also a consequence of P(→). Additionally, with the greatest conditional p′ + q, c is a consequence of P(→) if and only if p · q ≤ c.

Concerning a hypothesis h ≠ 0, that is, one that is not self-contradictory, the inequality h ≤ p · (p → q) = p · q shows that h should be a hypothesis for both p and q, and reciprocally, because h ≤ p and h ≤ q imply h = h · h ≤ p · q. In the case where p → q is not p′ + q, the hypotheses for P(→) are just those h ≠ 0 that are hypotheses of both p and p → q.
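To make the consequences and hypotheses of P(→) tangible, the following sketch enumerates them in a small power set with the conditional p′ + q. The universe X and the particular subsets p and q are hypothetical choices for illustration.

```python
from itertools import combinations

X = frozenset({1, 2, 3, 4})     # hypothetical universe of discourse
p = frozenset({1, 2, 3})        # hypothetical premise p
q = frozenset({2, 3, 4})        # hypothetical consequent q


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


conditional = (X - p) | q       # p -> q read as p' + q
resume = p & conditional        # résumé p·(p -> q) of the premises

# Consequences: elements c with résumé ≤ c.
consequences = [c for c in powerset(X) if resume <= c]

# Hypotheses: non-self-contradictory elements h ≠ 0 with h ≤ résumé.
hypotheses = [h for h in powerset(X) if h and h <= resume]
```

Here the résumé is p · q = {2, 3}, q appears among the consequences (modus ponens), and each hypothesis is included in both p and q.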

What about refutations and speculations of P(p′ + q)? Refutations r are characterized by p · q ≤ r′ ⇔ r ≤ p′ + q′. Type-two speculations s cannot be characterized by any inequality, but only by s NC p · q, and s NC (p · q)′, that is, s NC (p′ + q′). Those of type-one should verify s NC p · q, and s′ ≤ p · q, that is, (p · q)′ = p′ + q′ ≤ s; they should be, in the order of the algebra, isolated from p · q but greater than (p · q)′, and, in particular, simultaneously greater than p′ and greater than q′, because p′ ≤ p′ + q′, and q′ ≤ p′ + q′.
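The refutations and the two types of speculations can also be enumerated in a small power set. The sketch below assumes that speculations are conjectures (elements s with résumé ≰ s′) that are not comparable with the résumé, with NC read as "not comparable"; the universe and the subsets p and q are hypothetical.

```python
from itertools import combinations

X = frozenset({1, 2, 3, 4})     # hypothetical universe of discourse
p = frozenset({1, 2, 3})        # hypothetical premise p
q = frozenset({2, 3, 4})        # hypothetical consequent q
resume = p & ((X - p) | q)      # p·(p' + q) = p·q


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def comparable(a, b):
    """True when a ≤ b or b ≤ a in the inclusion order."""
    return a <= b or b <= a


subsets = powerset(X)

# Refutations: résumé ≤ r', equivalently r ≤ p' + q'.
refutations = [r for r in subsets if resume <= X - r]

# Speculations (assumed definition): conjectures not comparable with the résumé.
speculations = [s for s in subsets
                if not resume <= X - s and not comparable(s, resume)]

# Type one: the negation of the résumé lies below s.
type_one = [s for s in speculations if X - resume <= s]

# Type two: s is comparable neither with the résumé nor with its negation.
type_two = [s for s in speculations if not comparable(s, X - resume)]
```

In this instance every speculation falls into exactly one of the two types, and each type-one speculation contains both p′ and q′.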

Of course, although there are always consequences and refutations, it can happen that neither hypotheses nor speculations exist. In this respect, let’s consider a simple example of some interest concerning the reasoning that, for making a bet, is done on the events that can appear in throwing a die.

The elementary, directly observable events in the experiment are “appears one”, “appears two”, …, “appears six” points; hence, the universe of discourse can be taken to be X = {1, 2, 3, 4, 5, 6}, and all the possible events are its subsets, with the empty set Ø corresponding, for instance, to a failure in throwing the die. For instance, the event “appears odd points” corresponds to the subset {1, 3, 5}, the event “appears more than 3” corresponds to {4, 5, 6}, and so on. Note that the full set X corresponds to the “sure event”, consisting of “obtaining any possible number of points”, the only one on which no bet is allowed; X is the only premise for the reasoning.

Hence, because all subsets S of X verify S ⊆ X, and subset inclusion is the counterpart of ≤ in the power algebra 2^X of subsets, the events are but hypotheses; the bets are on hypotheses. The only consequence S, verifying X ⊆ S, is obviously X, and the only refutation is the empty set, because X ⊆ S′ ⇔ S ⊆ X′ = Ø ⇔ S = Ø, the one on which nobody will bet.

In this example, there are no speculations, a single consequence, a single refutation, and many hypotheses. The theory of probability mainly concerns the measuring of the chances hypotheses can have, the hypotheses that can be made on the possible results of a random experiment expressible by means of precise words. The case with either nonrandom experiments, or with “imprecise events”, is considered later on.
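The counts in the die example can be reproduced directly. The following sketch takes X as the single premise and classifies the 64 events; the list names are illustrative, and hypotheses are taken, as above, to be the nonempty subsets included in the résumé.

```python
from itertools import combinations

X = frozenset(range(1, 7))      # universe of the die: {1, ..., 6}


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


events = powerset(X)

# The résumé of the premises is X itself.
consequences = [s for s in events if X <= s]              # X ⊆ S
hypotheses = [s for s in events if s and s <= X]          # Ø ≠ S ⊆ X
refutations = [s for s in events if X <= X - s]           # X ⊆ S'
speculations = [s for s in events
                if not (s <= X or X <= s)]                # S not comparable with X
```

Note that under this reading X itself is counted both as the consequence and as one of the hypotheses.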

Summing up,

  • The uniqueness of the three operations translating the linguistic and, or, and not, as well as the great number of laws a Boolean algebra (or a power set) enjoys, make the model a very simple one in which, for instance, refutations r (p ≤ r′) coincide with those r such that p · r = 0, that is, those subsets with empty intersection with the résumé’s subset.

  • Analogously, conjectures q (p ≰ q′) coincide with those q such that p · q ≠ 0, that is, those subsets with nonempty intersection with the résumé, and first-type speculations with those subsets s not comparable with the résumé but whose negation is included in it, s′ ≤ p, equivalent in this case to p′ ≤ s, and also to p + s = 1, because p′ ≤ s ⇒ 1 = p + s, and 1 = p + s implies p′ = p′ · s ≤ s.

Hence, the Boolean model is uniform for all reasoning in which words are precise, all information on them is at least potentially available, and no degrees of truth beyond 1 and 0 are required. This kind of reasoning is typical of linguistic environments on which a perfect cut can be made between what is and what is not, where “ideal” perfect classifications can actually be obtained, something that is not always possible when the descriptions of situations or phenomena, either physical or virtual, are made with imprecise words, with uncertainty or with ambiguity, as is usual in ordinary reasoning and when, for instance, the behavior of a dynamical physical system is described in a plain language.

Nevertheless, there are descriptions of some situations that are done with precise words, such as some interesting random experiments like that of throwing a die, and that are full of uncertainty; the events are describable in precise linguistic words, but are uncertain. In these experiments, the linguistic description of the events that can be obtained is well translated into a Boolean algebra of crisp sets, and, for computing the uncertainty of events, the idea of probability was first introduced, and later subjected to a very short, simple, and beautiful axiomatization, introduced by Andrey N. Kolmogorov in 1933. This probability is, actually, a measure of the event’s uncertainty when it is precisely describable, but it is not the only interpretation of the concept of probability. The analysis of the meaning in language of the word “probable”, the mother-predicate of the (abstract) concept of probability, which “measures of probability” are supposed to measure, is still open. In this interpretation, is probability a measure? And, what does it measure?

The answer is hidden in the same Kolmogorov definition, namely in the axiom of additivity:

  • If p · q = 0 (⇔ p ≤ q′, that is, p and q are contradictory), then prob(p + q) = prob(p) + prob(q).

Indeed, because in Boolean algebras p ≤ q ⇔ q = p + q = p + p′ · q holds, and p · (p′ · q) = (p · p′) · q = 0 · q = 0, it follows that prob(q) = prob(p) + prob(p′ · q) ≥ prob(p). Thus, taking the lattice’s order of the Boolean algebra as the qualitative meaning of “probable”, the mapping prob, assigning numbers in [0, 1] to the events, is a measure of the word “probable”. Kolmogorov’s probability completes the graph (2^X, ⊆), identified with the qualitative meaning of probable in 2^X, to the triplets (2^X, ⊆, prob), each of which can, in this form, specify a full meaning of the word “probable”. That is, Kolmogorov’s probability is actually a measure of the linguistic qualitative meaning of probable, ≤_probable, under its identification with the relation ⊆ of set inclusion.
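The passage from additivity to monotonicity can be illustrated with the die. The sketch below assumes, only for the example, the uniform probability of a fair die, prob(S) = |S|/6, and checks both properties over all pairs of events.

```python
from fractions import Fraction
from itertools import combinations

X = frozenset(range(1, 7))      # universe of the die


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def prob(event):
    """Uniform probability of a fair die (an assumption of this sketch)."""
    return Fraction(len(event), len(X))


events = powerset(X)

# Additivity: if p·q = 0, then prob(p + q) = prob(p) + prob(q).
additive = all(prob(p | q) == prob(p) + prob(q)
               for p in events for q in events if not (p & q))

# Monotonicity: p ≤ q implies prob(p) ≤ prob(q), via q = p + p'·q.
monotone = all(prob(q) == prob(p) + prob((X - p) & q) >= prob(p)
               for p in events for q in events if p <= q)
```

Monotonicity with respect to ⊆ is what allows reading prob as a measure of the qualitative relation "less probable than" when it is identified with set inclusion.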

This is something that, perhaps acceptable with precise words, is not clearly so with imprecise ones, whose meanings cannot be represented by crisp sets. Note that the supposition ≤_probable = ⊆ comes from accepting that “fewer elements” can be identified with “less probable”.

In conclusion, although the evident successes of the formal Kolmogorov theory of probability, based on Boolean algebras, can lead to placing great confidence in the former interpretation of the meaning of probable when used with precise words, its meaning when “probable” is used with imprecise words is still open to study. Note that in plain language expressions such as “It is with high probability that John is rich”, in which neither “high” nor “rich” can always be constrained to be represented by crisp sets, are often uttered; examples like this cause us to look again at the meaning of the word “probable” in plain language.

11.2. In the classical case, with the inference relation identified with the lattice’s order of the Boolean algebra, and because it is transitive, there is no room for inferential jumps in a deductive process, other than an operational mistake or the ignorance of something, either of which can produce an erroneous proof. In a correct proof there cannot be jumps; correct proofs are conducted in algorithmic form, that is, by chaining statements in such a way that all steps in the chain hold thanks to the first step, which is initially supposed to hold; each step is fired thanks to the preceding step, and by following the rule p:p ≤ q::q, of modus ponens.

A proof of q from p consists of a sequence {p, p_1, p_2, …, p_{n−1}, q}, such that p ≤ p_1, p_1 ≤ p_2, …, p_{n−2} ≤ p_{n−1}, and p_{n−1} ≤ q. Because ≤ is transitive, it follows that p ≤ q; that is, q is deduced from p thanks to the inferential steps p_k, regardless of their number. The chain p ≤ p_1 ≤ … ≤ p_{n−1} ≤ q is but an algorithm that allows reaching q from p; of course, this does not mean such an algorithm is unique. Usually several different proofs of q from p are available, and mathematicians prefer those with a minimum number of steps (often considered among the most beautiful proofs). As soon as the transitive law of ≤ is lost, algorithms can break at some intermediate step without allowing one to conclude, finally and safely, p ≤ q, a proof of q from p.
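The role of transitivity in such chains can be sketched as follows. The function `chain_holds` only checks that each link of a chain is related to the next one; whether the endpoints are then related depends on the relation. Both relations used here, set inclusion and a hypothetical "differs by at most one" relation on integers, are illustrative choices.

```python
def chain_holds(chain, leq):
    """True when every consecutive pair of the chain is related by `leq`."""
    return all(leq(a, b) for a, b in zip(chain, chain[1:]))


def subset_leq(a, b):
    """Set inclusion: a transitive inference relation."""
    return a <= b


def near(a, b):
    """Hypothetical non-transitive relation: values differing by at most 1."""
    return abs(a - b) <= 1


# With a transitive relation, a valid chain yields p ≤ q for its endpoints.
subset_chain = [frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})]
subset_ok = chain_holds(subset_chain, subset_leq)
subset_endpoints = subset_leq(subset_chain[0], subset_chain[-1])

# Without transitivity, every link may hold while the endpoints are unrelated.
near_chain = [1, 2, 3]
near_ok = chain_holds(near_chain, near)
near_endpoints = near(near_chain[0], near_chain[-1])
```

In the second case each step is fired correctly, yet nothing allows concluding that the last element follows from the first.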

Algorithms are essential for mechanizing formal deductive reasoning; from very early in the history of artificial intelligence, there have been computer programs or algorithms that proved some previously known mathematical theorems in fewer steps than the proofs given by mathematicians. Let’s remember the old case of the Logic Theorist program of Allen Newell, Herbert Simon, and Cliff Shaw, which proved a theorem appearing in the book Principia Mathematica by A.N. Whitehead and B. Russell, where it was given a longer proof than the one obtained by the Logic Theorist. It was attained, nevertheless, in the short and closed context of the few axioms constituting the previous information needed for the proof.

11.3. As formerly observed, in the precise case, the NC and EM principles expressed in the former self-contradictory form collapse, respectively, into the equivalent Boolean axioms p · p′ = 0, and p + p′ = 1, because, as is well known, in Boolean algebras x ≤ x′ ⇔ x = 0, and x′ ≤ x ⇔ x = 1. Observe that, because the reciprocal also holds, p · p′ ≤ (p · p′)′ ⇔ p · p′ = 0, and (p + p′)′ ≤ ((p + p′)′)′ ⇔ p + p′ = 1.

Note that in non-Boolean ortholattices these equivalences also hold, but they hold neither in De Morgan algebras nor in basic fuzzy algebras (BAFs) where, nevertheless, the principles do hold in their “self-contradictory form”, provided the inference relation is taken to coincide with their respective orderings. In all these cases, the inference relation is transitive and there is no room for the failing of the principles. But in ordinary reasoning it cannot always be presumed that the inferential relation ≤ is an algebraic order, and still less that it is always transitive. In ordinary reasoning, transitivity is a local property.
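The contrast between the Boolean case and a De Morgan algebra can be made concrete. The sketch below compares a small power set with the standard algebra on [0, 1] given by min, max, and 1 − t, which is taken here as an assumed example of a De Morgan algebra; values are sampled on a hypothetical grid of exact fractions.

```python
from fractions import Fraction
from itertools import combinations

X = frozenset({1, 2, 3})  # hypothetical universe for the Boolean case


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Boolean case: x ≤ x' ⇔ x = 0, and x' ≤ x ⇔ x = 1.
boolean_equivalences = all(
    ((x <= X - x) == (x == frozenset())) and ((X - x <= x) == (x == X))
    for x in powerset(X)
)

# De Morgan case on [0, 1] with · = min, + = max, t' = 1 - t.
grid = [Fraction(k, 10) for k in range(11)]

# Self-contradictory values (t ≤ t') that are nevertheless different from 0.
self_contradictory_nonzero = [t for t in grid if t <= 1 - t and t != 0]

# NC and EM in self-contradictory form: p·p' ≤ (p·p')' and (p+p')' ≤ ((p+p')')'.
nc_self_contradictory = all(min(t, 1 - t) <= 1 - min(t, 1 - t) for t in grid)
em_self_contradictory = all(1 - max(t, 1 - t) <= max(t, 1 - t) for t in grid)

# The Boolean axioms p·p' = 0 and p + p' = 1 fail on the grid.
nc_boolean_axiom = all(min(t, 1 - t) == 0 for t in grid)
em_boolean_axiom = all(max(t, 1 - t) == 1 for t in grid)
```

On the grid, both principles hold in their self-contradictory form while the corresponding Boolean axioms do not.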

11.4. NC and EM can also be analyzed from the inferential point of view, and a hint on it follows. Note that in its former interpretation (p · p′)′ can be seen as simply being a refutation of p · p′, and ((p + p′)′)′ as being one of (p + p′)′. Hence, they cannot be conjectures; but, what about p · p′ and p + p′?

Provided p is not self-contradictory, taking the singleton P = {p} as the set of premises, and presuming transitivity, p · p′ cannot be a consequence of P, that is, p ≤ p · p′ cannot hold, nor can p be a hypothesis for p · p′. Were it a consequence, and the triplet (p, p · p′, p′) transitive, then, because p · p′ ≤ p′, it would follow that p ≈ p · p′, and because of NC, the absurd p ≤ p′ would also follow. In addition, the possibility that p · p′ is a speculation can be excluded inasmuch as p · p′ ≤ p implies that p NC p · p′ does not hold. Statement p · p′ is not a conjecture of {p} and, hence, should be a refutation of {p}, and it is so because p · p′ ≤ p′. But under which conditions does p ≤ (p · p′)′ hold? Because p · p′ ≤ p′ gives (p′)′ ≤ (p · p′)′, it suffices to have p ≤ (p′)′ and the transitivity of the triplet (p, (p′)′, (p · p′)′) to obtain p ≤ (p · p′)′.

On the contrary, because it is always p ≤ p + p′, p + p′ is a consequence of P, and p a hypothesis for p + p′.