1 Introduction

In his reconstruction of assertoric syllogistic, Łukasiewicz claims that it presupposes a metalogic (underlying logic), and that that metalogic is propositional logic [8, 20f]. Nevertheless, he has to avail himself of predicate logic to explain Aristotle’s proofs by ecthesis [8, §19]. Contrary to Łukasiewicz, Corcoran argues that Aristotle’s theory of deduction contains a self-sufficient natural deduction system which presupposes no other logic [5, 97f].

In this article, we intend to show that Aristotle’s syllogistic indeed presupposes a metalogic. Based on text passages that can be found in Prior Analytics and Metaphysics, we shall show how a semantics can be reconstructed. This means that it can be shown that Aristotle was aware of a distinction between object language and metalanguage and that he tried to convey that distinction in his writings.

In order to investigate assertoric syllogistic using modern semantics, it is necessary to draw a clear distinction between object language and metalanguage. First of all, the object language, which is a term-language, will be introduced and syllogisms defined as arguments with two premisses (Sect. 2).

The presentation of the semantics (Sect. 3) is based on Aristotle’s texts and ecthetic proofs. The truth conditions of universal categorical sentences are derived from the dictum de omni et nullo. It will be shown that Aristotle was aware of the concepts of extensions of predicates and of set inclusion and how he intended to impart them to us (Sect. 3.1). The truth conditions of particular categorical sentences are obtained using the truth conditions of universal sentences and ecthetic proofs. All truth conditions are given by means of set inclusion instead of, for instance, empty intersection (for universal negative sentences) and non-empty intersection (for particular affirmative sentences) as in [5, p. 103] and [15, p. 225]. As a by-product, it follows that ecthetic proofs play a central role in the metatheory of syllogistic (Sect. 3.2). Based on the analysis of the truth conditions for categorical sentences, it is possible to define a formal semantics (Sect. 3.3) which includes a definition of syllogistic validity (s-validity, for short).

Next (Sect. 4) follows the investigation of Aristotle’s concept of a ‘perfect syllogism’. It will be shown that he did not consider perfect syllogisms as evident [10, 43f], as axioms [8, p. 42] or as rules of inference [5, p. 126], [15, p. 225], but rather as valid arguments that deserve an s-validity proof. The proofs of s-validity of perfect syllogisms are made with use of the semantics presented in Sect. 3.3. A strong relation between perfection and s-validity emerges from these proofs, in which the main roles are played by the extension of the middle term as well as transitivity of set inclusion. It will also be shown that Aristotle was aware of the concept of transitivity of set inclusion, and his criterion of perfection can be precisely conveyed by a modern (metalogical) definition of a perfect syllogism. This means that it is possible to establish what the necessary and sufficient condition must be to ascertain that a syllogism is perfect. The same condition is also the condition of s-validity for these syllogisms.

In Sect. 5, it will be shown how, from the definition of a perfect syllogism, it can be established (i) what it means for a syllogism to be imperfect, (ii) what the main metalogical difference between a perfect and an imperfect syllogism is, (iii) what metalogical features perfect and imperfect syllogisms have in common, and (iv) what role transitivity of set inclusion plays for the s-validity of imperfect syllogisms. It is possible to work out these distinctions and similarities because the proofs of s-validity for imperfect syllogisms are direct proofs without conversion and ecthesis proofs in a calculus of natural deduction. The s-validity of the laws of conversion is obtained by direct proofs.

Finally, in Sect. 6, it will be shown that and explained why some imperfect syllogisms satisfy the definition of a perfect syllogism.

2 The Syllogistic Language \(\mathbb {L}\)

The formal language \(\mathbb {L}\) of syllogistic contains the following descriptive signs: countably infinitely many general terms ‘F’, ‘G’, ‘H’, etc. The non-descriptive signs of \(\mathbb {L}\) are the C-functors ‘a’, ‘e’, ‘i’, ‘o’ (‘belongs to all’, ‘belongs to none’, ‘belongs to some’, ‘does not belong to some’); the negation sign ‘\(\sim \)’; the conclusion sign ‘\(\therefore \)’; the auxiliary signs ‘\(\{\)’, ‘\(\}\)’, and ‘, ’. We read ‘\({F\hspace{-0.75pt}a\hspace{-0.75pt}G}{}\)’ as ‘F belongs to all G’ etc., as does Aristotle.Footnote 1 An expression of \(\mathbb {L}\) is composed of a finite number of signs of \(\mathbb {L}\). The non-descriptive signs of \(\mathbb {L}\) are used autonymously in the metalanguage of \(\mathbb {L}\). We define:

  1. D1

    Let \(\mathbb {G}\) = \(\{F, G, H, \ldots {}\}\) be the set of general terms of \(\mathbb {L}\). s is a categorical sentence of \(\mathbb {L}\) of type \(x \leftrightarrow _{\textrm{Def}} (\exists A, B)(A, B \in \mathbb {G} \text { and } x \in \{a, e, i, o\} \text { and } s = AxB)\). A categorical sentence s of type x is also called ‘x-sentence’.

  2. D2

    s is a categorical sentence of \(\mathbb {L} \leftrightarrow _{\textrm{Def}} (\exists x)(s\) is a categorical sentence of \(\mathbb {L}\) of type x). \(\mathbb {C} = \{s \mid s\) is a categorical sentence of \(\mathbb {L}\}\) is the set of categorical sentences of \(\mathbb {L}\).

  3. D3

    The set \(\mathbb {S}\) of all sentences of \(\mathbb {L}\) is defined by

    (a):

    \(\mathbb {C} \subseteq \mathbb {S}\);

    (b):

    \(s \in \mathbb {S} \rightarrow (\sim \hspace{-3.0pt}s) \in \mathbb {S}\);

    (c):

    Nothing else is an element of \(\mathbb {S}\).

    The two categorical sentences ‘\({F\hspace{-0.75pt}a\hspace{-0.75pt}G}{}\)’ and ‘\({H\hspace{-0.75pt}a\hspace{-0.75pt}F}{}\)’ are of the same form: \({A\hspace{-0.75pt}a\hspace{-0.5pt}B}{}\).

  4. D4

    \(\Gamma \therefore s\) is an argument of \(\mathbb {L} \leftrightarrow _{\textrm{Def}} \Gamma \therefore s\) is an expression of \(\mathbb {L}\) and \(\Gamma \subseteq \mathbb {S}\) and \(s \in \mathbb {S}\).

    The laws of the square of opposition and the laws of conversion are examples of one-premiss arguments of \(\mathbb {L}\).

  5. D5

    \(\Gamma \therefore s\) is a first figure syllogism of \(\mathbb {L} \leftrightarrow _{\textrm{Def}} (\exists p, q, s, P, M, S, u, v, w)\), such that

    (a):

    \(\Gamma \therefore s\) is an argument of \(\mathbb {L}\);

    (b):

    \(p, q, s \in \mathbb {C}\); \(P, M, S \in \mathbb {G}\);

    (c):

    \(\Gamma = \{p, q\}\);

    (d):

    \(u, v, w \in \{a, e, i, o\}\);

    (e):

    \(p = {P\hspace{-0.5pt}u\hspace{-0.5pt}M}{} \wedge q = {M\hspace{-0.5pt}v\hspace{-0.5pt}S}{} \wedge s = {P\hspace{-0.5pt}w\hspace{-0.5pt}S}{}\).

    Thus, by D5, the argument ‘\(\{{F\hspace{-0.75pt}a\hspace{-0.75pt}G}{}, {G\hspace{-0.75pt}a\hspace{-0.75pt}H}{}\} \therefore {F\hspace{-0.75pt}a\hspace{-0.75pt}H}{}\)’ is a first figure syllogism. The syllogisms of the remaining three figures can be defined in a similar way, differing only in their respective condition (e). Furthermore, we define

  6. D6

    \(\Gamma \therefore s\) is a syllogism of \(\mathbb {L} \leftrightarrow _{\textrm{Def}} \Gamma \therefore s\) is a syllogism of \(\mathbb {L}\) of the first figure or of the second figure or of the third figure or of the fourth figure.

  7. D7

    Let \(n \in \{1, 2, 3, 4\}\). \(\Gamma \therefore s\) is of the form of a syllogism of \(\mathbb {L}\) of the nth figure \(\leftrightarrow _{\textrm{Def}} \Gamma \therefore s\) is a syllogism of \(\mathbb {L}\) of the nth figure.

By D7, a syllogism such as ‘\(\{{F\hspace{-0.75pt}a\hspace{-0.75pt}G}{}, {G\hspace{-0.75pt}a\hspace{-0.75pt}H}{}\} \therefore {F\hspace{-0.75pt}a\hspace{-0.75pt}H}{}\)’ is of the form of a first figure syllogism because it is a first figure syllogism, its form being: \({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}, {M\hspace{-0.5pt}a\hspace{-0.5pt}S}{} \therefore {P\hspace{-0.5pt}a\hspace{-0.5pt}S}{}\).Footnote 2

The form of a syllogism of \(\mathbb {L}\) in one of the four figures is thusFootnote 3:

First figure \({P\hspace{-0.5pt}u\hspace{-0.5pt}M}\), \({M\hspace{-0.5pt}v\hspace{-0.5pt}S}\) \(\therefore \) \({P\hspace{-0.5pt}w\hspace{-0.5pt}S}\)

Second figure \({M\hspace{-0.5pt}u\hspace{-0.5pt}P}\), \({M\hspace{-0.5pt}v\hspace{-0.5pt}S}\) \(\therefore \) \({P\hspace{-0.5pt}w\hspace{-0.5pt}S}\)

Third figure \({P\hspace{-0.5pt}u\hspace{-0.5pt}M}\), \({S\hspace{-0.5pt}v\hspace{-0.5pt}M}\) \(\therefore \) \({P\hspace{-0.5pt}w\hspace{-0.5pt}S}\)

Fourth figure \({M\hspace{-0.5pt}u\hspace{-0.5pt}P}\), \({S\hspace{-0.5pt}v\hspace{-0.5pt}M}\) \(\therefore \) \({P\hspace{-0.5pt}w\hspace{-0.5pt}S}\)
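The syntactic definitions D1 and D5 can be illustrated by a small sketch. It is not part of the original reconstruction: the names `Sentence` and `figure` are hypothetical, and the function simply returns the lowest-numbered figure whose condition (e) is met by the two premisses and the conclusion.

```python
from dataclasses import dataclass

FUNCTORS = ("a", "e", "i", "o")  # the C-functors of the language


@dataclass(frozen=True)
class Sentence:
    """Categorical sentence AxB, e.g. FaG read as 'F belongs to all G'."""
    pred: str     # A, the term that is predicated
    functor: str  # x, one of a, e, i, o
    subj: str     # B, the term of which A is predicated

    def __post_init__(self):
        if self.functor not in FUNCTORS:
            raise ValueError(f"unknown C-functor: {self.functor!r}")


def figure(p, q, s):
    """Return the figure (1 to 4) of the argument {p, q} therefore s, or None.

    Assumption of this sketch: p is the premiss containing the predicate P
    of the conclusion, q the premiss containing its subject S.
    """
    P, S = s.pred, s.subj
    candidates = {
        1: (p.pred == P and q.subj == S, p.subj, q.pred),  # PuM, MvS
        2: (p.subj == P and q.subj == S, p.pred, q.pred),  # MuP, MvS
        3: (p.pred == P and q.pred == S, p.subj, q.subj),  # PuM, SvM
        4: (p.subj == P and q.pred == S, p.pred, q.subj),  # MuP, SvM
    }
    for n, (fits, middle_in_p, middle_in_q) in candidates.items():
        if fits and middle_in_p == middle_in_q:
            return n
    return None


# The example from the text: {FaG, GaH} therefore FaH.
example = figure(Sentence("F", "a", "G"), Sentence("G", "a", "H"), Sentence("F", "a", "H"))
```

Under these assumptions, `example` is classified as a first figure syllogism, in agreement with D5.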

Aristotle’s notion of a syllogism also has a semantic property [12]. The term ‘syllogism’ is not only characterised by the above-listed syntactic features of a syllogism but also by the fact that its conclusion follows necessarily from its premisses:

A syllogism is a discourse in which, certain things being stated, something other than what is stated follows of necessity from their being so. I mean by the last phrase that they produce the consequence, and by this, that no further term is required from without in order to make the consequence necessary. [Prior Analytics, A1, 24\(^\textrm{b}\)18–22] [2, 13]

3 The Semantics of \(\mathbb {L}\)

In his proofs of syllogistic validity (s-validity), Aristotle first assumes that all premisses of the argument in question are true and then tries to show that the conclusion must be true as well:

It is possible for the premises of the syllogism to be true, or to be false. The conclusion is either true or false necessarily. From true premises it is not possible to draw a false conclusion [...]. [Prior Analytics, B2, 53\(^\textrm{b}\)4–9]

This quote concerns the notion of logical consequence. In order to define this term, we need the truth conditions of categorical sentences. In what follows, we shall determine how they can be specified, based on Aristotle’s explanations and some of his proofs.

3.1 Truth Conditions of Universal Sentences

We shall next show how in the following quote, called dictum de omni et nullo, Aristotle explains how universal sentences, which are of the form \({A\hspace{-0.75pt}a\hspace{-0.5pt}B}{}\) and \({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}\), are to be interpreted (A, B are general terms and \(\alpha \), \(\beta \) their extensions):

That one term [\(\beta \)] should be included in another [\(\alpha \)] as in a whole is the same as for the other [A] to be predicated of all of the first [B]. And we say that one term [A] is predicated of all another [B], whenever no instance of the subject [\(\beta \)] can be found of which the other term [A] cannot be asserted: ‘to be predicated of none’ [B] must be understood in the same way. [Prior Analytics, A1, 24\(^\textrm{b}\)26–30]

First of all, we have to explain what Aristotle means by ‘a whole’:

A whole means [...] that which so contains the things it contains that they form a unity [...] For that which is true of a whole class and is said to hold good as a whole (which implies that it is a kind of whole) is true of a whole in the sense that it contains many things by being predicated of each, and by all of them, e. g.  man, horse [...]. [Metaphysics, V26, 1023\(^\textrm{b}\)27–28] [1]

Something that ‘contains many things by being predicated of each, and by all of them’ is the extension of a predicate (general term). The extensions \(\alpha \), \(\beta \) of the general terms A, B are extensional concepts or extensions of a concept (i.e. sets of individuals) and can be defined as \(\varepsilon (A) = \{x \in U \mid x\text { is }A\}\) and \(\varepsilon (B) = \{x \in U \mid x\text { is }B\}\), where U is the universe of discourse [19, pp. 119ff.]. An individual’s falling under an extensional concept is identified with the elementhood of the extension of said concept. A ‘whole’, therefore, is the extension of a predicate.

A predicate A determines which individuals of U are elements of \(\varepsilon (A)\) and in so doing separates those individuals from individuals of U which do not share this predicate. This means that A establishes a boundary between individuals of U. Aristotle calls the boundary of a predicate, which we call ‘term’, ‘ὅρος’ (‘hóros’, Lat. terminus, meaning ‘term’ or ‘boundary’).

I call that a term [‘ὅρος’ (‘hóros’)] into which the premiss is resolved, i. e.  both the predicate and that of which it is predicated [i. e. the extension of the predicate], ‘being’ being added and ‘not being’ removed, or vice-versa. [Prior Analytics, A1, 24\(^\textrm{b}\)16–18]

Given a categorical sentence of the form \({A\hspace{-0.75pt}a\hspace{-0.5pt}B}{}\), how must the boundaries of the extensions \(\varepsilon (A)\) and \(\varepsilon (B)\) be related to one another in order for this sentence to be true? Aristotle states that ‘one term [\(\varepsilon (B)\)] should be included in another [\(\varepsilon (A)\)] as in a whole’, that is, the boundary of \(\varepsilon (B)\) should be included in the boundary of \(\varepsilon (A)\) as in a whole. This is his way of conveying that \(\varepsilon (B)\) and \(\varepsilon (A)\) must be related to one another by set inclusion:

$$\begin{aligned}(\forall x \in U) (x \in \varepsilon (B) \rightarrow x \in \varepsilon (A)).\end{aligned}$$

Aristotle reiterates this in the second sentence of his dictum de omni et nullo, where he says that ‘one term [A] is predicated of all another [B]’ means that ‘no instance of the subject [\(\varepsilon (B)\)] can be found of which the other term [A] cannot be asserted’, i.e.

$$\begin{aligned} (\lnot \exists x\in U)(x \in \varepsilon (B) \wedge \lnot (x \in \varepsilon (A))), \end{aligned}$$

which is logically equivalent to

$$\begin{aligned} (\forall x \in U) (x \in \varepsilon (B) \rightarrow x \in \varepsilon (A)). \end{aligned}$$

Thus, a sentence of the form \({A\hspace{-0.75pt}a\hspace{-0.5pt}B}{}\) means:

$$\begin{aligned} (\forall x \in U) (x \in \varepsilon (B) \rightarrow x \in \varepsilon (A)),\text { i.e. }\varepsilon (B) \subseteq \varepsilon (A). \end{aligned}$$

The truth condition of a-sentences can now be specified as follows:

$$\begin{aligned} \text {(1) A sentence of the form }{A\hspace{-0.75pt}a\hspace{-0.5pt}B}{}\text { is true }\leftrightarrow \varepsilon (B) \subseteq \varepsilon (A). \end{aligned}$$

Now we have to establish the truth condition for a sentence of the form \({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}\). A sentence of the form \({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}\) signifies that no individual that falls under the subject concept \(\varepsilon (B)\) falls under the predicate concept \(\varepsilon (A)\):

$$\begin{aligned} (\lnot \exists x\in U)(x \in \varepsilon (B) \wedge x \in \varepsilon (A)), \end{aligned}$$

which is logically equivalent to

$$\begin{aligned} (\forall x \in U) (x\in \varepsilon (B) \rightarrow \lnot (x \in \varepsilon (A))). \end{aligned}$$

But this is set-theoretically equivalent toFootnote 4

$$\begin{aligned} (\forall x \in U) (x\in \varepsilon (B) \rightarrow x \in \varepsilon (A)^c_U). \end{aligned}$$

Hence, a sentence of the form \({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}\) means:

$$\begin{aligned} (\forall x \in U) (x \in \varepsilon (B) \rightarrow x \in \varepsilon (A)^c_U),\text { i.e. }\varepsilon (B) \subseteq \varepsilon (A)^c_U. \end{aligned}$$

The truth condition of e-sentences can be specified thus:

$$\begin{aligned} \text {(2) A sentence of the form }{A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}\text { is true }\leftrightarrow \varepsilon (B) \subseteq \varepsilon (A)^c_U. \end{aligned}$$

Thus, the dictum de omni et nullo explains the truth conditions for universal categorical sentences.Footnote 5
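The truth conditions (1) and (2) can be tried out on finite sets. The following is a minimal sketch, not part of the original argument: the function names and the sample universe are hypothetical, and extensions are modelled as Python `frozenset` values.

```python
def true_a(eps_A, eps_B):
    """(1): AaB is true iff eps(B) is included in eps(A)."""
    return eps_B <= eps_A


def true_e(eps_A, eps_B, U):
    """(2): AeB is true iff eps(B) is included in the complement of eps(A) in U."""
    return eps_B <= U - eps_A


def no_counterinstance(eps_A, eps_B, U):
    """Second clause of the dictum: no instance of B of which A cannot be asserted."""
    return not any(x in eps_B and x not in eps_A for x in U)


# Hypothetical universe and extensions, chosen only for illustration.
U = frozenset({"socrates", "bucephalus", "fido", "oak"})
animal = frozenset({"socrates", "bucephalus", "fido"})
man = frozenset({"socrates"})
horse = frozenset({"bucephalus"})

animal_a_man = true_a(animal, man)     # 'animal belongs to all man'
horse_e_man = true_e(horse, man, U)    # 'horse belongs to no man'
man_a_animal = true_a(man, animal)     # 'man belongs to all animal'
```

With these sample sets, the first two sentences come out true and the third false, and `no_counterinstance` agrees with `true_a` on each pair, which mirrors the equivalence of the two formulations of the dictum.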

3.2 Ecthesis and Truth Conditions of Particular Sentences

We shall next formulate the truth conditions of particular sentences, which are of the form \({A\hspace{-0.75pt}i\hspace{-0.5pt}B}{}\) and \({A\hspace{-0.75pt}o\hspace{-0.5pt}B}{}\), in terms of set inclusion.

3.2.1 Affirmative Particular Sentences

Aristotle’s proof by ecthesis (Ancient Greek ἔκθεσις, Latin expositio, ‘exposition’) of the s-validity of syllogisms of the form Darapti (\({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}, {S\hspace{-0.5pt}a\hspace{-0.5pt}M}{} \therefore {P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}\)) of the third figure yields clues to the nature of the truth condition of sentences of the form \({A\hspace{-0.75pt}i\hspace{-0.5pt}B}{}\). He writes (P, S, M, C being general terms):

It is possible to demonstrate this [i. e.  that syllogisms of the form Darapti are s-valid] [...] by exposition [i. e.  ecthesis]. For if both P and S belong to all M, should one of the Ms, e. g.  C, be taken, both P and S will belong to this, and thus P will belong to some S. [Prior Analytics, A5, 28\(^\textrm{a}\)22–26]

Aristotle claims that syllogisms of the form Darapti are s-valid, viz. that assuming \({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}\) and \({S\hspace{-0.5pt}a\hspace{-0.5pt}M}{}\) are true, \({P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}\) must be true as well. In other words, by Sect. 3.1 (1), if both \(\varepsilon (M) \subseteq \varepsilon (P)\) and \(\varepsilon (M) \subseteq \varepsilon (S)\) hold, then \({P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}\) must be true as well.

In this respect, questions arise as to the nature of what exactly is exposed, as to the set-theoretical prerequisites under which Aristotle applies this exposition, and as to the nature of the truth condition of i-sentences required for his assertion of s-validity to be true.

One possible answer is that the entity exposed is an element of the universe.Footnote 6 However, we shall show that it is instead a non-empty subset of the universe. The advantages of such an interpretation will become clear as we study Aristotle’s criterion of perfection (Sect. 4).

What follows is a semi-formal reconstruction of his ecthetic proof for syllogisms of the form Darapti. The first two steps in every ecthetic proof are to assume that the premisses are true and to expose a non-empty subset \(\delta _1 (= \varepsilon (C))\) from \(\varepsilon (M)\).Footnote 7\(^{,}\)Footnote 8

$$\begin{array}{lll}
1. & \varepsilon (M) \subseteq \varepsilon (P) \wedge \varepsilon (M) \subseteq \varepsilon (S) & \text{Section 3.1 (1)} \\
 & \text{‘if both } P \text{ and } S \text{ belong to all } M\text{’} & \\
2. & \varepsilon (M) \ne \emptyset & \text{Assumption} \\
3. & \varepsilon (M) \ne \emptyset \leftrightarrow (\exists \delta )(\delta \ne \emptyset \wedge \delta \subseteq \varepsilon (M)) & \text{2, ST2 UI}^{7} \\
4. & (\exists \delta )(\delta \ne \emptyset \wedge \delta \subseteq \varepsilon (M)) & \text{2, 3 BMP} \\
5. & \delta _1 \ne \emptyset \wedge \delta _1 \subseteq \varepsilon (M) & \text{4 EI} \\
 & \text{‘should one of the } M\text{s, e.g. } C \text{ [i.e. } \delta _1\text{], be taken’} & \\
6. & \delta _1 \subseteq \varepsilon (M) \wedge \varepsilon (M) \subseteq \varepsilon (S) & \text{5 S, 1 S Adj.} \\
7. & \delta _1 \subseteq \varepsilon (M) \wedge \varepsilon (M) \subseteq \varepsilon (S) \rightarrow \delta _1 \subseteq \varepsilon (S) & \text{6, T}\subseteq \text{ UI}^{8} \\
8. & \delta _1 \subseteq \varepsilon (S) & \text{6, 7 MP} \\
9. & \delta _1 \subseteq \varepsilon (M) \wedge \varepsilon (M) \subseteq \varepsilon (P) & \text{5 S, 1 S Adj.} \\
10. & \delta _1 \subseteq \varepsilon (M) \wedge \varepsilon (M) \subseteq \varepsilon (P) \rightarrow \delta _1 \subseteq \varepsilon (P) & \text{9, T}\subseteq \text{ UI} \\
11. & \delta _1 \subseteq \varepsilon (P) & \text{9, 10 MP} \\
12. & \delta _1 \ne \emptyset \wedge \delta _1 \subseteq \varepsilon (S) \wedge \delta _1 \subseteq \varepsilon (P) & \text{5 S, 8, 11 Adj.} \\
 & \text{‘both } P \text{ and } S \text{ will belong to this [} \delta _1\text{]’} & \\
13. & (\exists \delta ) (\delta \ne \emptyset \wedge \delta \subseteq \varepsilon (S) \wedge \delta \subseteq \varepsilon (P)) & \text{12 EG} \\
 & \text{‘} P \text{ will belong to some } S\text{’} &
\end{array}$$

Given that the premisses of DaraptiFootnote 9 are true and assuming that a non-empty subset \(\delta _1\) can be exposed from \(\varepsilon (M)\), it follows, by applying transitivity of set inclusion and EG, that a sentence of the form \({P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}\) is true iff there is a non-empty set \(\delta \) that is contained in both \(\varepsilon (S)\) and \(\varepsilon (P)\). In this way, Aristotle conveys in his ecthetic proof of Darapti how he imagines the truth condition of an affirmative particular sentence.Footnote 10 From the semi-formal reconstruction of this proof, we gather that the truth condition of a sentence of the form \({A\hspace{-0.75pt}i\hspace{-0.5pt}B}{}\) is an existentially quantified sentence:

$$\begin{aligned}{} & {} \text {(3) A sentence of the form}\,\, {A\hspace{-0.75pt}i\hspace{-0.5pt}B}{}\,\, \text {is true} \leftrightarrow \\{} & {} (\exists \delta ) (\delta \ne \emptyset \wedge \delta \subseteq \varepsilon (B) \wedge \delta \subseteq \varepsilon (A)) \end{aligned}$$

According to [9, p. 121], [5, p. 103], [15, p. 225], \({A\hspace{-0.75pt}i\hspace{-0.5pt}B}{}\) is true \(\leftrightarrow \varepsilon (B) \cap \varepsilon (A) \ne \emptyset \). However, formulating the truth condition of i-sentences in terms of a non-empty intersection leaves the connexion with ecthesis wanting: according to Aristotle, ‘one of the Ms, e.g. C, should be taken’. For a proof by ecthesis, it therefore does not suffice to know that the intersection is not empty; in addition, it is required to expose a part of \(\varepsilon (M)\) from this non-empty intersection.
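The difference between the two formulations can be made concrete with a short sketch. It is an illustration only, with hypothetical names: `expose` searches for a set \(\delta \) as required by (3) and returns it, whereas the intersection test merely reports that such a set exists. On a small universe the two conditions can be checked to agree in truth value, so the point at issue is what the proof is given to work with, not which sentences come out true.

```python
from itertools import combinations


def nonempty_subsets(s):
    """All non-empty subsets of a finite set, smallest first."""
    items = sorted(s)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def expose(eps_A, eps_B):
    """Return a non-empty delta included in eps(B) and in eps(A), or None."""
    for delta in nonempty_subsets(eps_B):
        if delta <= eps_A:
            return delta
    return None


def true_i(eps_A, eps_B):
    """(3): AiB is true iff some non-empty delta can be exposed."""
    return expose(eps_A, eps_B) is not None


# Exhaustive comparison with the intersection formulation on a 3-element universe.
U = frozenset(range(3))
all_sets = [frozenset()] + nonempty_subsets(U)
agree = all(true_i(A, B) == bool(A & B) for A in all_sets for B in all_sets)

# Darapti-style example: eps(M) = {0} lies in both eps(P) and eps(S).
witness = expose(frozenset({0, 1}), frozenset({0, 2}))
```

Here `witness` plays the role of \(\delta _1\) in the reconstruction above; the bounded comparison in `agree` is a sanity check on a toy universe, not a proof.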

3.2.2 Negative Particular Sentences

To obtain the truth condition of a particular negative sentence, let us examine the following semi-formal proof for Ferio (\({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}, {M\hspace{-0.5pt}i\hspace{-0.5pt}S}{} \therefore {P\hspace{-0.5pt}o\hspace{-0.5pt}S}{}\)) of the first figure. Given the premisses of Ferio, we have

$$\begin{array}{lll}
1. & \varepsilon (M) \subseteq \varepsilon (P)^c_U & \text{Section 3.1 (2)} \\
 & \text{[} P \text{ belongs to no } M\text{]} & \\
2. & (\exists \delta ) (\delta \ne \emptyset \wedge \delta \subseteq \varepsilon (S) \wedge \delta \subseteq \varepsilon (M)) & \text{Section 3.2.1 (3)} \\
3. & \delta _1 \ne \emptyset \wedge \delta _1 \subseteq \varepsilon (S) \wedge \delta _1 \subseteq \varepsilon (M) & \text{2 EI} \\
 & \text{[} M \text{ belongs to some } S\text{]} & \\
4. & \delta _1 \subseteq \varepsilon (M) \wedge \varepsilon (M) \subseteq \varepsilon (P)^c_U & \text{3 S, 1 Adj.} \\
5. & \delta _1 \subseteq \varepsilon (M) \wedge \varepsilon (M) \subseteq \varepsilon (P)^c_U \rightarrow \delta _1 \subseteq \varepsilon (P)^c_U & \text{4, T}\subseteq \text{ UI} \\
6. & \delta _1 \subseteq \varepsilon (P)^c_U & \text{4, 5 MP} \\
7. & \delta _1 \ne \emptyset \wedge \delta _1 \subseteq \varepsilon (S) \wedge \delta _1 \subseteq \varepsilon (P)^c_U & \text{3 S, 6 Adj.} \\
 & \text{[} S \text{ belongs to } \delta _1\text{, but } P \text{ does not belong to } \delta _1\text{]} & \\
8. & (\exists \delta ) (\delta \ne \emptyset \wedge \delta \subseteq \varepsilon (S) \wedge \delta \subseteq \varepsilon (P)^c_U) & \text{7 EG} \\
 & \text{[} P \text{ does not belong to some } S\text{]} &
\end{array}$$

This proof of Ferio goes to show that the truth condition for a sentence of the form \({A\hspace{-0.75pt}o\hspace{-0.5pt}B}\) is also best understood as an existentially quantified sentence; that is,

$$\begin{aligned}{} & {} \text {(4) A sentence of the form}\,\, {A\hspace{-0.75pt}o\hspace{-0.5pt}B}{}\,\, \text {is true} \leftrightarrow \\{} & {} (\exists \delta ) (\delta \ne \emptyset \wedge \delta \subseteq \varepsilon (B) \wedge \delta \subseteq \varepsilon (A)^c_U) \end{aligned}$$

The conditions (3) and (4) are better suited to describe the truth conditions of particular sentences, for they clearly show the strong connexion between the truth conditions we are looking for on the one hand and ecthesis on the other. The formulations of the truth conditions of particular sentences obtained here show that ecthesis is more than merely an additional method of proof. By no means is it extra-systematic, as [5, fn. 20] claims; rather, it plays a paramount role in the metatheory of syllogistic.

In addition to showing how particular sentences are to be interpreted, these proofs also demonstrate that, first, an ecthetic proof is a semantic proof: given the premisses, we derive the truth condition of the conclusion, i.e. it is a proof of s-validity. Secondly, the truth condition of the conclusion is obtained by applying transitivity of set inclusion. Thirdly, in so doing, the extension of the middle term \(\varepsilon (M)\) serves to connect the exposed \(\delta _1\) with \(\varepsilon (S)\) and with \(\varepsilon (P)\) in Darapti (an imperfect syllogism) or \(\varepsilon (P)^c_U\) in Ferio (a perfect syllogism). Finally, we have also shown that it is possible to prove the validity of Ferio. Indeed, it is even possible to prove the validity of all perfect syllogisms (see Sect. 4).
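The way the exposed set is carried through the middle term in the Ferio reconstruction can be replayed on concrete finite sets. This is only a sketch under stated assumptions: the function name `ferio_replay` and the sample sets are hypothetical, and a single run illustrates the steps without proving anything about all interpretations.

```python
def ferio_replay(eps_P, eps_M, eps_S, delta1, U):
    """Replay steps 1 to 7 of the Ferio reconstruction for one exposed set delta1."""
    not_P = U - eps_P                                   # complement of eps(P) in U
    premiss_1 = eps_M <= not_P                          # step 1: PeM
    premiss_2 = (bool(delta1) and delta1 <= eps_S
                 and delta1 <= eps_M)                   # step 3: delta1 witnesses MiS
    if not (premiss_1 and premiss_2):
        raise ValueError("the premisses of Ferio are not satisfied by these sets")
    # Steps 4 to 6: delta1 is in eps(M) and eps(M) is in the complement of eps(P),
    # so by transitivity of inclusion delta1 is in the complement of eps(P).
    # Step 7: the same delta1 therefore witnesses PoS.
    return bool(delta1) and delta1 <= eps_S and delta1 <= not_P


# Hypothetical sample sets.
U = frozenset({1, 2, 3, 4})
eps_P = frozenset({1})
eps_M = frozenset({2, 3})
eps_S = frozenset({3, 4})
delta1 = frozenset({3})

conclusion_holds = ferio_replay(eps_P, eps_M, eps_S, delta1, U)
```

In this run the set exposed for the i-premiss is the very set that makes the o-conclusion true, which is the connecting role of \(\varepsilon (M)\) described above.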

We have seen that Aristotle conceived of sets as extensions of predicates. He was also aware of set inclusion and transitivity of set inclusion. This means that he was cognisant of a form of naïve set theory, though he did not have set theory at his disposal. In order to formulate a semantics, we need to assert the existence of non-empty subsets of non-empty sets, the existence of non-empty complements of such sets, as well as the assumption that every set (of elements of U) has a complement in U: \((\forall \beta \exists \alpha ) (\beta \subseteq U \rightarrow \alpha = \beta ^c_U)\).

Furthermore, the universe U must not be empty since it is assumed in (3) and (4) that the three sets involved are not empty. For the same reason, we must demand that every general term have a non-empty subset of U as its extension. Finally, no general term may have the universe U itself as its extension. Otherwise, since every set has a complement, the complement that is the anti-extension of such a universal general term would be empty. This, however, would run afoul of a condition involving o-sentences where one of the three sets involved is a non-empty complement: (4) does not permit empty complements since said complement, which is the anti-extension of A, is required to have a non-empty subset.

3.3 Interpretations of \(\mathbb {L}\)

A formal reconstruction of Aristotle’s proof sketches requires a semantics that implements our analysis of truth conditions.

An interpretation I of \(\mathbb {L}\) is an ordered pair \(\langle U, \varepsilon \rangle \), where U is a non-empty set and \(\varepsilon \) a function that assigns to each \(C \in \mathbb {G}\) one and only one non-empty subset \(\varepsilon (C)\) of U. Thus, the extension of a general term is a non-empty set of elements of U. On the basis of such an interpretation, the extension of a sentence s of \(\mathbb {L}\) under I, \(\varepsilon _{\hspace{-1.5pt}I}(s)\), can be defined as follows:

  1. D8:

    Let \(I = \langle U, \varepsilon \rangle \) be an interpretation of \(\mathbb {L}\); let \(A, B \in \mathbb {G}\) and \(s \in \mathbb {C}\), and let \(\alpha ^c_U\) be the complement of \(\alpha \) in U.

    (a):

    \((\forall C \in \mathbb {G}) (\varepsilon _{\hspace{-1.5pt}I}(C) = \varepsilon (C) \ne \emptyset \wedge \varepsilon (C) \ne U)\)

    (b):

    \( \varepsilon _{\hspace{-1.5pt}I}({A\hspace{-0.75pt}a\hspace{-0.5pt}B}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(B) \subseteq \varepsilon _{\hspace{-1.5pt}I}(A)\)

    (c):

    \( \varepsilon _{\hspace{-1.5pt}I}({A\hspace{-0.75pt}i\hspace{-0.5pt}B}{}) = 1 \leftrightarrow (\exists \delta ) (\delta \ne \emptyset \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(B) \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(A))\)

    (d):

    \( \varepsilon _{\hspace{-1.5pt}I}({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(B) \subseteq \varepsilon _{\hspace{-1.5pt}I}(A)^c_U\)

    (e):

    \( \varepsilon _{\hspace{-1.5pt}I}({A\hspace{-0.75pt}o\hspace{-0.5pt}B}{}) = 1 \leftrightarrow (\exists \delta ) (\delta \ne \emptyset \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(B) \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(A)^c_U)\)

    (f):

    \( \varepsilon _{\hspace{-1.5pt}I}(\sim \hspace{-3.0pt}s) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(s) = 0\).

The truth conditions of categorical sentences are thus formulated in terms of set inclusion and set complements only. Furthermore, the extension of each of these sentences is a truth value that is an element of the set of the two truth values True (1) and False (0).
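D8 can be read as a recipe for evaluating sentences, and the sketch below follows that reading for finite universes. It is an illustration rather than the author's formal apparatus: the tuple encoding of sentences and the names `is_interpretation` and `value` are assumptions of this sketch. For clauses (c) and (e), only singleton candidates for \(\delta \) are tried, which suffices because every non-empty set has a singleton subset.

```python
def is_interpretation(U, eps):
    """U is non-empty and every extension is a non-empty proper subset of U (D8(a))."""
    return bool(U) and all(bool(ext) and ext < U for ext in eps.values())


def value(sentence, U, eps):
    """Truth value (1 or 0) of a sentence under <U, eps>, following D8(b) to (f).

    Hypothetical encoding: ("a", A, B) stands for AaB, and likewise for
    e, i, o; ("~", s) stands for the negation of the sentence s.
    """
    if sentence[0] == "~":
        return 1 - value(sentence[1], U, eps)           # clause (f)
    x, A, B = sentence
    alpha = U - eps[A] if x in ("e", "o") else eps[A]   # complement for e and o
    beta = eps[B]
    if x in ("a", "e"):
        return int(beta <= alpha)                       # clauses (b) and (d)
    # Clauses (c) and (e): look for a non-empty delta inside both sets.
    return int(any(frozenset({d}) <= beta and frozenset({d}) <= alpha for d in U))


# Hypothetical interpretation.
U = frozenset({1, 2, 3})
eps = {"F": frozenset({1, 2}), "G": frozenset({1}), "H": frozenset({3})}

values = {
    "FaG": value(("a", "F", "G"), U, eps),
    "FeH": value(("e", "F", "H"), U, eps),
    "FiH": value(("i", "F", "H"), U, eps),
    "FoH": value(("o", "F", "H"), U, eps),
    "~FiH": value(("~", ("i", "F", "H")), U, eps),
}
```

The check `is_interpretation(U, eps)` makes the side conditions explicit: it rejects an assignment in which some general term receives the empty set or the whole universe.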

The truth of a sentence s of \(\mathbb {L}\) under I can then be defined as follows:

  1. D9:

    s is true under \(I \leftrightarrow _{\textrm{Def}} \varepsilon _{\hspace{-1.5pt}I}(s) = 1\).

Next, we can define the Aristotelian notion of s-validity:

  1. D10:

    An argument \(\Gamma \therefore s\) of \(\mathbb {L}\) is s-valid \(\leftrightarrow _{\textrm{Def}} (\forall I) ((\forall p \in \Gamma )\) (p is true under \(I) \rightarrow s\) is true under I).
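D10 quantifies over all interpretations, which no program can enumerate. What can be sketched is a bounded search for a counter-interpretation on small finite universes. The code below is such a sketch, with hypothetical names and a tuple encoding `(x, A, B)` for the sentence AxB: a returned interpretation refutes s-validity, whereas a result of `None` only means that no counterexample exists up to the chosen size.

```python
from itertools import combinations, product


def proper_nonempty_subsets(U):
    """Admissible extensions: non-empty subsets of U other than U itself."""
    items = sorted(U)
    return [frozenset(c) for r in range(1, len(items))
            for c in combinations(items, r)]


def holds(sentence, U, eps):
    """Truth of a categorical sentence (x, A, B) under <U, eps>."""
    x, A, B = sentence
    alpha = U - eps[A] if x in ("e", "o") else eps[A]
    beta = eps[B]
    if x in ("a", "e"):
        return beta <= alpha
    return any(frozenset({d}) <= beta and frozenset({d}) <= alpha for d in U)


def counterexample(premisses, conclusion, max_size=4):
    """Search universes {0..n-1}, 2 <= n <= max_size, for true premisses and a false conclusion."""
    terms = sorted({t for s in premisses + [conclusion] for t in s[1:]})
    for n in range(2, max_size + 1):
        U = frozenset(range(n))
        for exts in product(proper_nonempty_subsets(U), repeat=len(terms)):
            eps = dict(zip(terms, exts))
            if all(holds(p, U, eps) for p in premisses) and not holds(conclusion, U, eps):
                return U, eps
    return None


barbara = counterexample([("a", "P", "M"), ("a", "M", "S")], ("a", "P", "S"))
darapti = counterexample([("a", "P", "M"), ("a", "S", "M")], ("i", "P", "S"))
ferio = counterexample([("e", "P", "M"), ("i", "M", "S")], ("o", "P", "S"))
aaa_3 = counterexample([("a", "P", "M"), ("a", "S", "M")], ("a", "P", "S"))
```

For Barbara, Darapti and Ferio the search finds nothing, while the third-figure form with a universal affirmative conclusion (`aaa_3`) is refuted by a concrete interpretation.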

All upcoming formal proofs of s-validity will be realised in a calculus of natural deduction (cf. [7]) of first-order predicate logic with identity and set inclusion, and this is the metalanguage of assertoric syllogistic.

In my Ph.D. thesis [11], I was able to use this semantics to prove the s-validity of the laws of the square of opposition and the laws of conversion by reductio ad impossibile, as well as the s-validity of the syllogisms by means of the methods of conversion, reductio ad impossibile, and ecthesis.

There, I also investigated the problem of empty extensions and proved that some laws of the square of opposition and some syllogisms are not s-valid if this semantics is modified to a semantics that admits empty extensions. Similarly to Schröder’s suggestion [14, p. 244], I have also shown how the language \(\mathbb {L}\) has to be enlarged in order to turn syllogisms that are not s-valid under such a semantics (e.g. Darapti) into s-valid ones.
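Why the non-emptiness requirement matters for Darapti can be seen from a single assignment. The sketch below is illustrative only: it deliberately violates D8(a) by giving the middle term the empty set as its extension, and the sample sets are hypothetical.

```python
# A modified 'interpretation' that admits an empty extension for M.
U = frozenset({1, 2})
eps = {"M": frozenset(), "P": frozenset({1}), "S": frozenset({2})}

# Premisses of Darapti: PaM and SaM hold vacuously, since eps(M) is empty.
PaM = eps["M"] <= eps["P"]
SaM = eps["M"] <= eps["S"]

# Conclusion PiS: no non-empty delta lies in both eps(S) and eps(P).
PiS = any(frozenset({x}) <= eps["S"] and frozenset({x}) <= eps["P"] for x in U)

# Subalternation from PaM to PiM fails under the same assignment.
PiM = any(frozenset({x}) <= eps["M"] and frozenset({x}) <= eps["P"] for x in U)
```

Both premisses are true and the conclusion is false, so Darapti would not be s-valid if such assignments counted as interpretations; the same assignment also falsifies the step from \({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}\) to \({P\hspace{-0.5pt}i\hspace{-0.5pt}M}{}\).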

In the present article, I shall show that the necessary condition for s-validity and perfection that I established in the aforementioned thesis has turned out to be sufficient as well (Sect. 4).

Since the upcoming direct proofs of the s-validity of all syllogisms of the second to fourth figures (Sect. 5) do not require the use of any laws of conversion or subalternation, it follows that it can be precisely ascertained whether or not a syllogism is imperfect.

The proofs of s-validity of the laws of conversion and subalternation are also direct proofs (Sect. 5.1). The method of reductio ad impossibile and the laws of the square of opposition are not used at all.

4 Perfect Syllogisms

Aristotle defined a perfect syllogism as follows:

I call that a perfect (τέλειος) syllogism which needs nothing other than what has been stated to make plain what necessarily follows; [Prior Analytics, A1, 24\(^\textrm{b}\)22–24]

Aristotle does not say that the conclusion of a perfect syllogism evidently or obviously follows from its premisses. Rather, what he says is that nothing but the premisses given is needed ‘to make plain’ that the conclusion must follow. But how can we prove that the conclusions of these syllogisms follow necessarily from their premisses? For that, we take a look at Aristotle’s criterion of perfection. Let the non-empty sets \(\alpha , \beta , \gamma \) be the extensions of the general terms A, B, C:

Whenever three terms [hóroi, ‘sets’] are so related to one another that the last [\(\alpha \)] is contained in the middle [\(\beta \)] as in a whole [i. e. \(\alpha \subseteq \beta \)] and the middle [\(\beta \)] is either contained in or excluded from the first [\(\gamma \)] as in or from a whole [i. e. \(\beta \subseteq \gamma \)] the extremes [i. e. \(\alpha \) and \(\gamma \)] must be related by a perfect syllogism. I call that term [set] middle [\(\beta \)] which is itself contained in another [i. e. \(\beta \subseteq \gamma \)] and contains another in itself [i. e. \(\alpha \subseteq \beta \)]: in position also, this comes in the middle [\(\alpha \subseteq \beta \) and \(\beta \subseteq \gamma \)]. By extremes I mean both that term [set] that is itself contained in another [\(\alpha \)] and that in which another is contained [\(\gamma \)]. [Prior Analytics, A4, 25\(^\textrm{b}\)32–37]

In Sect. 3.1, we have seen that ‘the last [\(\alpha \)] is contained in the middle [\(\beta \)] as in a whole’ means that \(\alpha \subseteq \beta \), and ‘the middle [\(\beta \)] is either contained in or excluded from the first [\(\gamma \)] as in or from a whole’ means that \(\beta \subseteq \gamma \). Now, if three sets are interrelated as described in this quote, that is, \(\alpha \subseteq \beta \) and \(\beta \subseteq \gamma \), then \(\alpha \), \(\beta \), \(\gamma \) are related to each other by the transitive relation of set inclusion.

‘To make plain what necessarily follows’, we must show that for certain syllogisms the extensions of the three general terms are in this relation to each other, and then, we shall have proven that these syllogisms are s-valid and perfect.

Here, Aristotle does not use the terms ‘last’, ‘middle’, and ‘first’ to refer to the general terms S, M, P, but rather, to their extensions. He also does not swap any premisses, and neither is this quote part of his description of Barbara and Celarent.Footnote 11 Rather, it is a semantic condition, i.e. what the relation between the extensions of the general terms has to be in order for a syllogism to be perfect. Next, Aristotle introduces syllogisms of the form Barbara and Celarent:

If P is predicated of all M, and M is predicated of all S, P must be predicated of all S: we have already explained what we mean by ‘predicated of all’. Similarly also, if P is predicated of no M, and M of all S, it is necessary that no S will be P. [Prior Analytics, A4, 25\(^\textrm{b}\)37–26\(^\textrm{a}\)2]

After reminding us how a-sentences are to be interpreted (‘we have already explained what we mean by “predicated of all” ’), i.e. according to the dictum de omni et nullo, he leaves it to the reader to ascertain whether Barbara and Celarent meet his criterion. The following metatheorem contains the complete proof of the s-validity of syllogisms of the form Celarent.

MT-Celarent

Every argument of \(\mathbb {L}\) of the form \({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}, {M\hspace{-0.5pt}a\hspace{-0.5pt}S}{} \therefore {P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}\) is s-valid.

Proof

By definitions D9 and D10, we have to show that

$$\begin{aligned} (\forall I)(\varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}) = 1). \end{aligned}$$

$$\begin{array}{rll}
1. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 & \text{Ass. for CP} \\
 & \text{‘If P is predicated of no M, and M of all S} & \\
2. & \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 & \text{1 S} \\
3. & \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) & \text{D8(b)} \\
4. & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)\quad [\alpha \subseteq \beta ] & \text{2,3 BMP} \\
5. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}) = 1 & \text{1 S} \\
6. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U & \text{D8(d)} \\
7. & \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U\quad [\beta \subseteq \gamma ] & \text{5,6 BMP} \\
8. & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U\quad [\alpha \subseteq \beta \wedge \beta \subseteq \gamma ] & \text{4,7 Adj.} \\
9. & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U \rightarrow \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U & \text{8, T}\subseteq \text{ UI} \\
10. & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U\quad [\alpha \subseteq \gamma ] & \text{8,9 MP} \\
 & \text{‘it is necessary that no S will be P} & \\
11. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U & \text{D8(d)} \\
12. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}) = 1 & \text{10,11 BMP} \\
13. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}) = 1 & \text{1–12 CP} \\
14. & (\forall I)( \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}) = 1) & \text{13 UG}
\end{array}$$

\(\square \)

Interpreting the premisses, we observe that the extensions of the general terms are related to one another by the transitive relation of set inclusion. Syllogisms of the form Celarent are s-valid by transitivity of set inclusion.

We shall now consider the main steps of the formal proof of the s-validity of Barbara. Given the set of the premisses of Barbara \(\{{P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}\), \({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}\}\), we get:

$$\begin{array}{rll}
1. & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)\quad [\alpha \subseteq \beta \wedge \beta \subseteq \gamma ] & \text{D8(b), Adj.} \\
2. & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)\quad [\alpha \subseteq \gamma ] & \text{1, T}\subseteq \text{ UI, MP} \\
3. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 & \text{2, D8(b), BMP} \\
 & \text{‘P must be predicated of all S’} &
\end{array}$$

The main steps of the formal proof of s-validity for Barbara are analogous to those of the proof for Celarent. In the case of both Barbara and Celarent, the middle is \(\beta = \varepsilon _{\hspace{-1.5pt}I}(M)\) and one of the extremes (the last) is \(\alpha = \varepsilon _{\hspace{-1.5pt}I}(S)\). The other extreme (the first), \(\gamma \), is \(\varepsilon _{\hspace{-1.5pt}I}(P)\) in the case of Barbara and \(\varepsilon _{\hspace{-1.5pt}I}(P)^c_U\) in the case of Celarent. These terms (hóroi) are also semantic concepts.Footnote 12 Barbara and Celarent satisfy Aristotle’s criterion of perfection, and the proof of their perfection proves their s-validity. Both results follow from transitivity of set inclusion. Aristotle introduces the three sets \(\alpha \), \(\beta \), \(\gamma \) in that exact order so as to stress this point. The following passage shows that Aristotle was aware of the transitivity of set inclusion:

[...] for if [the extension of] D is included in [the extension of] B as in a whole, and [the extension of] B is included in [the extension of] A, then [the extension of] D will be included in [the extension of] A. [Prior Analytics, B1, 53\(^\textrm{a}\)22–24]

If \(\alpha \), \(\beta \), \(\gamma \) are the extensions of D, B, A, then we have \(\alpha \subseteq \beta \wedge \beta \subseteq \gamma \rightarrow \alpha \subseteq \gamma \).
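The proofs for Celarent and Barbara can be cross-checked mechanically on a small model. The following Python sketch is only an illustration: the three-element universe `U`, the helper names (`nonempty_subsets`, `true_a`, `true_e`, `s_valid`), and the restriction to non-empty extensions are assumptions made here, and the truth conditions are those used in the proof lines citing D8(b) and D8(d). An exhaustive check over one finite universe is a sanity test, not a substitute for the metatheorems.

```python
from itertools import combinations, product

U = frozenset({0, 1, 2})  # illustrative universe of discourse (an assumption)

def nonempty_subsets(universe):
    """Return every non-empty subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def true_a(pred, subj):
    """XaY is true iff the extension of Y is included in that of X."""
    return subj <= pred

def true_e(pred, subj):
    """XeY is true iff the extension of Y is included in the complement of X in U."""
    return subj <= U - pred

def s_valid(premisses_hold, conclusion_holds):
    """Check the conditional under every assignment of non-empty extensions to P, M, S."""
    return all(conclusion_holds(P, M, S)
               for P, M, S in product(nonempty_subsets(U), repeat=3)
               if premisses_hold(P, M, S))

# Celarent: PeM, MaS therefore PeS
celarent = s_valid(lambda P, M, S: true_e(P, M) and true_a(M, S),
                   lambda P, M, S: true_e(P, S))
# Barbara: PaM, MaS therefore PaS
barbara = s_valid(lambda P, M, S: true_a(P, M) and true_a(M, S),
                  lambda P, M, S: true_a(P, S))
print(celarent, barbara)
```

In both cases the check succeeds because every interpretation that satisfies the premisses yields a chain \(\alpha \subseteq \beta \subseteq \gamma \) of the kind described above.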

In what follows, we analyse Aristotle’s proof sketches of the s-validity of syllogisms of the form Darii (\({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}, {M\hspace{-0.5pt}i\hspace{-0.5pt}S}{} \therefore {P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}\)) and Ferio (\({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}, {M\hspace{-0.5pt}i\hspace{-0.5pt}S}{} \therefore {P\hspace{-0.5pt}o\hspace{-0.5pt}S}{}\)):

But if one term is related universally, the other in part only, to its subject, there must be a perfect syllogism whenever universality is posited with reference to the major term either affirmatively or negatively, and particularly with reference to the minor term affirmatively [...] I call that term the major [\(\gamma \)] in which the middle [\(\beta \)] is contained and that term the minor [\(\alpha \)] which comes under the middle [\(\beta \)]. [Prior Analytics, A4, 26\(^\textrm{a}\)17–23]

In the second sentence, he defines the meaning of two new terms: ‘major’ and ‘minor’. Why does he not use the terms ‘first’ and ‘last’, as he does in his criterion of perfection? First of all, it is noticeable that he uses the expression ‘is contained’ here, which he rarely uses, to define those terms. But when he uses it, he always does so to signal that he is not talking about general terms, but rather about their extensions, which are related to one another by set inclusion. For instance, he uses it in the dictum de omni et nullo and in the description of his criterion of perfection.

Furthermore, Aristotle intends to show that Darii and Ferio are perfect; for that, one has to show that the premisses of Darii and Ferio give rise to three sets \(\alpha \), \(\beta \), \(\gamma \) that satisfy his criterion of perfection. We shall show that, for Darii, the major is \(\gamma = \varepsilon (P)\) and, for Ferio, \(\gamma = \varepsilon (P)^c_U\), and that, for both, the middle is \(\beta = \varepsilon (M)\). But \(\alpha \), the minor, cannot be the last, i.e. \(\varepsilon (S)\), since here it is not a subset of \(\varepsilon (M)\).

In order to show which set is the minor, we have to take a look at the proofs of the s-validity of both syllogisms. In the quote below, it is quite noticeable that he does not speak of ‘predicated of’ or ‘belong to’, as he usually does:

Let M be P and some S be M. Then if ‘predicated of all’ means what we said above, it is necessary that some S is P. [Prior Analytics, A4, 26\(^\textrm{a}\)23–25]

The variables seem transposed; the general terms, however, are not transposed. Here, M, S, P, are not intended to be general terms, but rather their extensions. This is in fact a sketch of an ecthetic proof of the s-validity of Darii, and for that, he needs the truth conditions of its premisses. This is what he means by ‘Let M be P and some S be M’. ‘M’ should be read as \(\varepsilon (M)\), ‘P’ as \(\varepsilon (P)\) and ‘S’ as \(\varepsilon (S)\). With ‘if “predicated of all” means what we said above’, he explains how the first premiss of Darii has to be interpreted, i.e. according to D8(b). The following proof completes Aristotle’s sketch.

MT-Darii

Every argument of \(\mathbb {L}\) of the form \({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}, {M\hspace{-0.5pt}i\hspace{-0.5pt}S}{} \therefore {P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}\) is s-valid.

Proof

By definitions D9 and D10, we have to show that

$$\begin{aligned} (\forall I)( \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}i\hspace{-0.5pt}S}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}) = 1). \end{aligned}$$

$$\begin{array}{rll}
1. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}i\hspace{-0.5pt}S}{}) = 1 & \text{Ass. for CP} \\
2. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 & \text{1 S} \\
3. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) & \text{D8(b)} \\
4. & \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)\quad [\beta \subseteq \gamma ] & \text{2,3 BMP} \\
 & \text{‘Let M be P} & \\
5. & \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}i\hspace{-0.5pt}S}{}) = 1 & \text{1 S} \\
6. & \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}i\hspace{-0.5pt}S}{}) = 1 \leftrightarrow (\exists \delta )(\delta \ne \emptyset \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(S) \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)) & \text{D8(c)} \\
7. & (\exists \delta )(\delta \ne \emptyset \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(S) \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)) & \text{5,6 BMP} \\
8. & \delta _1 \ne \emptyset \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(S) \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)\quad [\alpha = \delta _1] & \text{7 EI} \\
 & \text{‘[let] some S [}\delta _1\text{] be M} & \\
9. & \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)\quad [\alpha \subseteq \beta \wedge \beta \subseteq \gamma ] & \text{8 S, 4 Adj.} \\
10. & \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) \rightarrow \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) & \text{9, T}\subseteq \text{ UI} \\
11. & \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)\quad [\alpha \subseteq \gamma ] & \text{9,10 MP} \\
12. & \delta _1 \ne \emptyset \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(S) \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) & \text{8 S, 11 Adj.} \\
 & \text{‘it is necessary that some S [}\delta _1\text{] is P} & \\
13. & (\exists \delta )(\delta \ne \emptyset \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(S) \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)) & \text{12 EG} \\
14. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}) = 1 \leftrightarrow (\exists \delta )(\delta \ne \emptyset \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(S) \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)) & \text{D8(c)} \\
15. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}) = 1 & \text{13, 14 BMP} \\
16. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}i\hspace{-0.5pt}S}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}) = 1 & \text{1–15 CP} \\
17. & (\forall I)(\varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}i\hspace{-0.5pt}S}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}) = 1) & \text{16 UG}
\end{array}$$

\(\square \)

Our reconstruction of his sketch of a proof by ecthesis shows that the middle \(\beta = \varepsilon _{\hspace{-1.5pt}I}(M)\), the major \(\gamma = \varepsilon _{\hspace{-1.5pt}I}(P)\), and the minor \(\alpha = \delta _1\). This explains why Aristotle introduces these new terms: they are needed to adapt his criterion of perfection in order to apply it to Darii, where \(\alpha \) is not \(\varepsilon _{\hspace{-1.5pt}I}(S)\), but rather the exposed set \(\delta _1\). Darii satisfies the criterion of perfection. Like the term ‘middle’, ‘major’ and ‘minor’ are not syntactic, but semantic concepts.Footnote 13
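The role of the exposed set in the proof of MT-Darii can be illustrated on a concrete interpretation. The Python sketch below is a minimal illustration under stated assumptions: the function names `expose` and `darii_chain` and the sample extensions are hypothetical, and `expose` simply returns the first admissible witness it finds, which corresponds to the step from line 7 to line 8 (EI) only in the sense that some such \(\delta _1\) is picked.

```python
from itertools import combinations

def nonempty_subsets(s):
    """Return every non-empty subset of `s` as a frozenset."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def expose(ext_s, ext_m):
    """Return some non-empty delta with delta ⊆ ext_s and delta ⊆ ext_m, or None."""
    for delta in nonempty_subsets(ext_s):
        if delta <= ext_m:
            return delta
    return None

def darii_chain(ext_p, ext_m, ext_s):
    """If PaM and MiS are true, return (alpha, beta, gamma); otherwise None."""
    if not ext_m <= ext_p:            # PaM is false
        return None
    delta1 = expose(ext_s, ext_m)     # MiS: the exposed set, if there is one
    if delta1 is None:                # MiS is false
        return None
    return delta1, ext_m, ext_p       # minor, middle, major

# Hypothetical extensions: S overlaps M without being included in it.
P = frozenset({"a", "b", "c"})
M = frozenset({"a", "b"})
S = frozenset({"b", "d"})
alpha, beta, gamma = darii_chain(P, M, S)
print(alpha, alpha <= beta <= gamma, S <= M)
```

Here the minor is the exposed set rather than the extension of S, which is not included in the extension of M; the chain from minor to major nevertheless holds, and the minor serves as the witness for the conclusion.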

Aristotle’s remark regarding \(\alpha \), \(\beta \), \(\gamma \) also applies to his proof sketch for Ferio, which is similarly terse:

And if no M is P, but some S is M, it is necessary that some S is not P. (The meaning of ‘predicated of none’ has also been defined.) [Prior Analytics, A4, 26\(^\textrm{a}\)25–27]

As with Darii, the variables are not transposed. Again, M, S, P are not intended to be general terms, but rather their extensions, and ‘predicated of none’ has to be interpreted according to D8(d). Aristotle’s ecthetic proof encompasses just 3 steps of our semi-formal proof (see Sect. 3.2.2):

$$\begin{array}{rl}
1. & \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U\quad [\beta \subseteq \gamma ] \\
 & \text{‘if no M is P} \\
3. & \delta _1 \ne \emptyset \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(S) \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)\quad [\alpha = \delta _1] \\
 & \text{‘but some S [}\delta _1\text{] is M} \\
7. & \delta _1 \ne \emptyset \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(S) \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U \\
 & \text{‘it is necessary that some S [}\delta _1\text{] is not P.’}
\end{array}$$

In Ferio, the minor \(\alpha = \delta _1\), the middle \(\beta = \varepsilon _{\hspace{-1.5pt}I}(M)\), and the major \(\gamma = \varepsilon _{\hspace{-1.5pt}I}(P)^c_U\). Thus, Ferio satisfies Aristotle’s criterion of perfection: there are three non-empty sets \(\alpha \), \(\beta \), \(\gamma \) such that \(\alpha \subseteq \beta \) and \(\beta \subseteq \gamma \), which gives \(\alpha \subseteq \gamma \). However, as with Darii, \(\alpha \) is not \(\varepsilon _{\hspace{-1.5pt}I}(S)\) but rather the exposed set \(\delta _1\). The proofs for Darii and Ferio corroborate our view on the truth conditions of particular sentences: only if that which is exposed is interpreted as a set (\(\delta _1\)) can we prove the s-validity of the last two syllogisms and satisfy Aristotle’s criterion of perfection.Footnote 14

These proofs show that by no means did Aristotle regard perfect syllogisms as ‘evident’, much less unprovable. Rather, he shows, albeit unfortunately tersely, how both the perfection and the s-validity of a syllogism of the form Darii or Ferio can be proven.

The general pattern of the proofs of validity of the four perfect syllogisms is summarised in the following table:

 

$$\begin{array}{llllll}
 & \alpha & \beta & \gamma & \alpha \subseteq \beta \wedge \beta \subseteq \gamma & \alpha \subseteq \gamma \\
\hline
\text{Barbara} & \varepsilon _{\hspace{-1.5pt}I}(S) & \varepsilon _{\hspace{-1.5pt}I}(M) & \varepsilon _{\hspace{-1.5pt}I}(P) & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) \\
\text{Celarent} & \varepsilon _{\hspace{-1.5pt}I}(S) & \varepsilon _{\hspace{-1.5pt}I}(M) & \varepsilon _{\hspace{-1.5pt}I}(P)^c_U & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U \\
\text{Darii} & \delta _1 & \varepsilon _{\hspace{-1.5pt}I}(M) & \varepsilon _{\hspace{-1.5pt}I}(P) & \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) & \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) \\
\text{Ferio} & \delta _1 & \varepsilon _{\hspace{-1.5pt}I}(M) & \varepsilon _{\hspace{-1.5pt}I}(P)^c_U & \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U & \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U
\end{array}$$
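The four rows of the table can be instantiated on one concrete model. The following Python sketch is illustrative only: the universe `U`, the sample extensions `S`, `M`, `P`, `Q`, `T`, and the helper names are hypothetical choices made here so that the premisses of each mood come out true; `Q` plays the part of the predicate in the two moods with an e-premiss, and `T` the part of the subject in the two moods with an i-premiss.

```python
U = frozenset(range(1, 7))  # illustrative universe (an assumption)

def complement(x):
    """Complement of x relative to U."""
    return U - x

def satisfies_criterion(alpha, beta, gamma):
    """Three non-empty sets with alpha ⊆ beta and beta ⊆ gamma."""
    return bool(alpha and beta and gamma) and alpha <= beta and beta <= gamma

S, M, P = frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})
Q = frozenset({5, 6})   # predicate disjoint from M, so 'Q e M' is true
T = frozenset({2, 4})   # subject that overlaps M without being included in it
delta1 = T & M          # one admissible exposed set for 'M i T'

rows = {
    "Barbara":  (S, M, P),
    "Celarent": (S, M, complement(Q)),
    "Darii":    (delta1, M, P),
    "Ferio":    (delta1, M, complement(Q)),
}
for name, (alpha, beta, gamma) in rows.items():
    # second value: the criterion holds; third value: the resulting inclusion
    print(name, satisfies_criterion(alpha, beta, gamma), alpha <= gamma)
```

In each row the same pattern recurs: the middle set links the other two, and the final inclusion is obtained from the chain alone.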

As we can see, perfection is not contingent on the position of the middle term M but on the fact that its extension \(\beta \) assumes a connecting role between \(\alpha \) and \(\gamma \). Transitivity of set inclusion is applicable only if there is a set \(\beta \) that acts as a link between the other two sets \(\alpha \), \(\gamma \). This set \(\beta \) is always the extension of the middle term. It is therefore not the middle term itself, but rather its extension that plays an important part in all proofs of s-validity and in determining the perfection of a syllogism.Footnote 15

Based on 25\(^{\textrm{b}}\)32–37 as well as our proofs, we propose the following definition of a perfect syllogism:

D11

A syllogism \(\Gamma \therefore s\) is perfect \(\leftrightarrow _{\textrm{Def}}\) the only set-theoretical theorem required to prove \(\Gamma \therefore s\) to be s-valid on the basis of our semantic metatheory is transitivity of set inclusion.

That means that the extensions of the general terms contained in the premisses are related to one another by the transitive relation of set inclusion. Transitivity of set inclusion is a necessary and sufficient condition for perfection and for the s-validity of perfect syllogisms. This is exactly what Aristotle has in mind when stating that ‘[...] a perfect syllogism [...] needs nothing other than what has been stated [in the premisses] to make plain [by transitivity of set inclusion] what necessarily follows.’

Aristotle had no explicit concept of ‘transitivity of set inclusion’, and yet he was still able to communicate that this was exactly what he intended when speaking of ‘perfection’.

5 Imperfect Syllogisms

Given our definition of perfection (D11), we must now show, by contraposition, that a syllogism is imperfect iff other theorems of set theory in addition to transitivity of set inclusion are required to prove its s-validity. Aristotle defines an imperfect syllogism as follows:

[...] a syllogism is imperfect, if it needs either one or more propositions, which are indeed the necessary consequences of the terms set down, but have not been expressly stated as premises. [Prior Analytics, A1, 24\(^\textrm{b}\)24–26]

To answer the question of how we can show that a syllogism is imperfect, let us first consider the second figure: \({M\hspace{-0.5pt}u\hspace{-0.5pt}P}{}, {M\hspace{-0.5pt}v\hspace{-0.5pt}S}{} \therefore {P\hspace{-0.5pt}w\hspace{-0.5pt}S}{}\). For instance, from the premisses of Cesare (\({M\hspace{-0.5pt}e\hspace{-0.5pt}P}{}, {M\hspace{-0.5pt}a\hspace{-0.5pt}S}{} \therefore {P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}\)) we obtain by conjunction of their truth conditions (see D8(b) and (d)):

$$\begin{aligned} \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(P) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)^c_U. \end{aligned}$$

The connector \(\beta \) is missing, which prevents us from connecting \(\alpha \) and \(\gamma \) and applying T\(\subseteq \). By Aristotle’s criterion of perfection (\(\alpha \subseteq \beta \wedge \beta \subseteq \gamma \)), Cesare is therefore not perfect. Nevertheless, we can show that Cesare is s-valid by applying theorem ST4 to \(\varepsilon _{\hspace{-1.5pt}I}(P) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)^c_U\).Footnote 16 From \(\varepsilon _{\hspace{-1.5pt}I}(P) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)^c_U\), we obtain \(\varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U\) by theorem ST4 UI and, consequently, the conjunction

$$\begin{aligned} \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U. \end{aligned}$$

Now, there is a set \(\beta = \varepsilon _{\hspace{-1.5pt}I}(M)\), which allows us to connect \(\alpha = \varepsilon _{\hspace{-1.5pt}I}(S)\) and \(\gamma = \varepsilon _{\hspace{-1.5pt}I}(P)^c_U\), thus enabling us to apply T\(\subseteq \). The following metatheorem contains the complete proof without conversion of the s-validity of syllogisms of the form Cesare.

MT-Cesare

Every argument of \(\mathbb {L}\) of the form \({M\hspace{-0.5pt}e\hspace{-0.5pt}P}{}, {M\hspace{-0.5pt}a\hspace{-0.5pt}S}{} \therefore {P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}\) is s-valid.

Proof

By definitions D9 and D10, we have to show that

$$\begin{aligned} (\forall I)( \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}e\hspace{-0.5pt}P}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}) = 1). \end{aligned}$$

$$\begin{array}{rll}
1. & \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}e\hspace{-0.5pt}P}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 & \text{Ass. for CP} \\
2. & \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 & \text{1 S} \\
3. & \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) & \text{D8(b)} \\
4. & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)\quad [\alpha \subseteq \beta ] & \text{2,3 BMP} \\
5. & \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}e\hspace{-0.5pt}P}{}) = 1 & \text{1 S} \\
6. & \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}e\hspace{-0.5pt}P}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(P) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)^c_U & \text{D8(d)} \\
7. & \varepsilon _{\hspace{-1.5pt}I}(P) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)^c_U & \text{5,6 BMP} \\
8. & \varepsilon _{\hspace{-1.5pt}I}(P) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)^c_U \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U & \text{7, ST4 UI} \\
9. & \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U\quad [\beta \subseteq \gamma ] & \text{7,8 BMP} \\
10. & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U\quad [\alpha \subseteq \beta \wedge \beta \subseteq \gamma ] & \text{4,9 Adj.} \\
11. & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U \rightarrow \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U & \text{10, T}\subseteq \text{ UI} \\
12. & \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U\quad [\alpha \subseteq \gamma ] & \text{10,11 MP} \\
13. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U & \text{D8(d)} \\
14. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}) = 1 & \text{12,13 BMP} \\
15. & \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}e\hspace{-0.5pt}P}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}) = 1 & \text{1–14 CP} \\
16. & (\forall I)(\varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}e\hspace{-0.5pt}P}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}) = 1) & \text{15 UG}
\end{array}$$

\(\square \)

By T\(\subseteq \) (line 11), Cesare is therefore s-valid in the same way as perfect syllogisms are. As in the proofs for perfect syllogisms, the extension of M (\(\beta \)) plays the role of connecting \(\varepsilon _{\hspace{-1.5pt}I}(S)\) and \(\varepsilon _{\hspace{-1.5pt}I}(P)^c_U\). But we are only able to apply transitivity of set inclusion once the subset relation that follows from interpreting the first premiss (recall line 7) has been replaced with a set-theoretically equivalent subset relation (line 9). This additional precondition corresponds to applying the set-theoretical theorem ST4 (line 8). In this case, transitivity of set inclusion is only a necessary condition for s-validity.
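The extra step that separates Cesare from the perfect syllogisms can be made visible in code. The Python sketch below rests on assumptions introduced here: a three-element universe `U`, the helper names `comp`, `st4_holds`, and `cesare_chain`, and a reading of ST4 taken from its instance in line 8 of the proof (the general statement of ST4 is not reproduced in this section). The exhaustive check is a sanity test on one finite model only.

```python
from itertools import combinations, product

U = frozenset({0, 1, 2})  # illustrative universe (an assumption)

def nonempty_subsets(universe):
    """Return every non-empty subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def comp(x):
    """Complement of x relative to U."""
    return U - x

def st4_holds():
    """ST4 as instantiated in line 8: X ⊆ comp(Y) iff Y ⊆ comp(X)."""
    return all((x <= comp(y)) == (y <= comp(x))
               for x, y in product(nonempty_subsets(U), repeat=2))

def cesare_chain(ext_p, ext_m, ext_s):
    """From MeP and MaS, return (alpha, beta, gamma); None if a premiss is false."""
    mep = ext_p <= comp(ext_m)   # line 7: no connecting middle yet
    mas = ext_s <= ext_m         # line 4
    if not (mep and mas):
        return None
    # Line 9: the ST4 step swaps the inclusion so that ext_m becomes the link.
    assert ext_m <= comp(ext_p)
    return ext_s, ext_m, comp(ext_p)

print(st4_holds())
print(cesare_chain(frozenset({2}), frozenset({0, 1}), frozenset({0})))
```

Only after the swap does the middle set stand between the other two, which is the sense in which transitivity alone does not suffice for Cesare.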

Here, to ‘make a syllogism perfect’ means to find a theorem of set theory that allows us to make use of transitivity of set inclusion. Indeed, we have shown that Cesare requires an additional theorem of set theory other than transitivity of set inclusion for a proof of its s-validity without conversion, which is also the case for the remaining syllogisms of the second figure. For example, it is possible to prove the s-validity of syllogisms of the form Baroco (\({M\hspace{-0.5pt}a\hspace{-0.5pt}P}{}, {M\hspace{-0.5pt}o\hspace{-0.5pt}S}{} \therefore {P\hspace{-0.5pt}o\hspace{-0.5pt}S}{}\)) of the second figure by ecthesis. As with Cesare, transitivity of set inclusion is necessary, but not sufficient, for its s-validity. In addition to applying T\(\subseteq \), we must make use of theorem ST5.Footnote 17 Baroco is therefore imperfect.Footnote 18

In Sect. 3.2.1, we presented a semi-formal proof of the s-validity of syllogisms of the form Darapti of the third figure. In this case, there are two universal affirmative premisses. Unlike in the proofs for syllogisms that have a particular premiss, such as Darii and Ferio, the exposition of \(\delta _1\) is here handled by theorem ST2 (fn. 7). The conclusion is obtained only after applying T\(\subseteq \) twice. The ecthetic proof of Felapton (\({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}, {S\hspace{-0.5pt}a\hspace{-0.5pt}M}{} \therefore {P\hspace{-0.5pt}o\hspace{-0.5pt}S}{}\)) of the third figure is completely analogous. Darapti and Felapton are therefore imperfect.
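The double use of transitivity in Darapti can be sketched as follows. This Python fragment is a loose illustration and makes a strong assumption: since ST2 is not reproduced in this section, it is read here merely as licensing the exposure of some non-empty subset \(\delta _1\) of the non-empty extension of M. The function names `expose_from` and `darapti_witness`, the choice of a singleton as the exposed set, and the sample extensions are all hypothetical.

```python
def expose_from(ext):
    """Pick a non-empty subset of a non-empty extension (here, a singleton)."""
    if not ext:
        raise ValueError("extensions are assumed to be non-empty")
    return frozenset({min(ext)})

def darapti_witness(ext_p, ext_m, ext_s):
    """From PaM and SaM, return a witness delta1 for PiS; None if a premiss is false."""
    if not (ext_m <= ext_p and ext_m <= ext_s):
        return None
    delta1 = expose_from(ext_m)        # assumed reading of ST2: delta1 ⊆ ext_m, non-empty
    chain_to_p = delta1 <= ext_m and ext_m <= ext_p   # first chain: delta1, M, P
    chain_to_s = delta1 <= ext_m and ext_m <= ext_s   # second chain: delta1, M, S
    assert chain_to_p and chain_to_s
    return delta1                      # delta1 ⊆ ext_p and delta1 ⊆ ext_s by transitivity

P = frozenset({1, 2, 3})
M = frozenset({1, 2})
S = frozenset({1, 2, 4})
delta1 = darapti_witness(P, M, S)
print(delta1, delta1 <= P, delta1 <= S)
```

Two chains through the extension of M are needed, one towards each extreme, before the particular conclusion can be read off.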

Thus, transitivity of set inclusion is a necessary condition for the validity of all assertoric syllogisms.Footnote 19 In light thereof, the following metatheorem follows from the definition of a perfect syllogism:

MT-Imp

A syllogism \(\Gamma \therefore s\) is imperfect iff, to prove \(\Gamma \therefore s\) to be s-valid on the basis of our semantics, some other set-theoretical theorem in addition to transitivity of set inclusion is required.

Furthermore, in all proofs of s-validity for perfect as well as for imperfect syllogisms, the extension of the middle term M is the connector between the extensions of the other two general terms.

5.1 The Laws of Conversion

Unlike Aristotle, we do not make use of the method of indirect proof (reductio ad impossibile) but rather of that of direct proof to prove that the laws of conversion are s-valid.

MT-e-conv

Every argument of \(\mathbb {L}\) of the form \({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{} \therefore {B\hspace{-0.75pt}e\hspace{-0.5pt}A}{}\) is s-valid.

Proof

By definitions D9 and D10, we have to show that

$$\begin{aligned}(\forall I) (\varepsilon _{\hspace{-1.5pt}I}({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({B\hspace{-0.75pt}e\hspace{-0.5pt}A}{}) = 1).\end{aligned}$$

$$\begin{array}{rll}
1. & \varepsilon _{\hspace{-1.5pt}I}({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}) = 1 & \text{Ass. for CP} \\
2. & \varepsilon _{\hspace{-1.5pt}I}({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(B) \subseteq \varepsilon _{\hspace{-1.5pt}I}(A)^c_U & \text{D8(d)} \\
3. & \varepsilon _{\hspace{-1.5pt}I}(B) \subseteq \varepsilon _{\hspace{-1.5pt}I}(A)^c_U & \text{1,2 BMP} \\
4. & \varepsilon _{\hspace{-1.5pt}I}(B) \subseteq \varepsilon _{\hspace{-1.5pt}I}(A)^c_U \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(A) \subseteq \varepsilon _{\hspace{-1.5pt}I}(B)^c_U & \text{3, ST4 UI} \\
5. & \varepsilon _{\hspace{-1.5pt}I}(A) \subseteq \varepsilon _{\hspace{-1.5pt}I}(B)^c_U & \text{3,4 BMP} \\
6. & \varepsilon _{\hspace{-1.5pt}I}({B\hspace{-0.75pt}e\hspace{-0.5pt}A}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(A) \subseteq \varepsilon _{\hspace{-1.5pt}I}(B)^c_U & \text{D8(d)} \\
7. & \varepsilon _{\hspace{-1.5pt}I}({B\hspace{-0.75pt}e\hspace{-0.5pt}A}{}) = 1 & \text{5,6 BMP} \\
8. & \varepsilon _{\hspace{-1.5pt}I}({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({B\hspace{-0.75pt}e\hspace{-0.5pt}A}{}) = 1 & \text{1–7 CP} \\
9. & (\forall I) (\varepsilon _{\hspace{-1.5pt}I}({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({B\hspace{-0.75pt}e\hspace{-0.5pt}A}{}) = 1) & \text{8 UG}
\end{array}$$

\(\square \)

As we can see, the s-validity of e-conversion follows from exactly the same theorem (line 4, ST4 UI) that we used earlier to prove the s-validity of Cesare.

That direct proofs for syllogisms and for the laws of conversion must have been discussed by Aristotle and his disciples and later successors is evident from the fact that Aristotle provided the instructions for ecthetic proofs. Later, Theophrastus and Eudemus came up with the first direct proof for e-conversion (the line numbers refer to our proof of MT-e-conv above):

Theophrastus, however, and Eudemus have proven in a simpler way that the universal negative (premiss) can be converted ... They conduct the proof thus: Let A belong to no B [line 1: \(\varepsilon _{\hspace{-1.5pt}I}({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}) = 1\)]. If it belongs to none [if \({A\hspace{-0.75pt}e\hspace{-0.5pt}B}{}\) is true], then A has to be separate [...] and isolated [...] from B [line 3: \(\varepsilon _{\hspace{-1.5pt}I}(B) \subseteq \varepsilon _{\hspace{-1.5pt}I}(A)^c_U\)]. That which is separated, however, is separate from the separated [line 4: ST4 UI]. Thus, B is also wholly separate from A [line 5: \(\varepsilon _{\hspace{-1.5pt}I}(A) \subseteq \varepsilon _{\hspace{-1.5pt}I}(B)^c_U\)]. And if it is so, then it belongs to no (A).Footnote 20 [line 7: \(\varepsilon _{\hspace{-1.5pt}I}({B\hspace{-0.75pt}e\hspace{-0.5pt}A}{}) = 1\)]

Their proof shows that they too had an extensional interpretation of the general terms, but they did not realise that in their proof they were making use of a set-theoretic theorem.Footnote 21 According to Themistius, Boethus of Sidon later managed to provide proofs without conversion for all syllogisms regarded as s-valid by Aristotle.Footnote 22

Whereas the proof of e-conversion requires only theorem ST4, the proof of the s-validity of a-conversion, i.e. \({A\hspace{-0.75pt}a\hspace{-0.5pt}B}{} \therefore {B\hspace{-0.75pt}i\hspace{-0.5pt}A}{}\), by ecthesis, like those of Darapti and Felapton, requires theorem ST2 (fn. 7) as well as T\(\subseteq \). The s-validity of i-conversion, i.e. \({A\hspace{-0.75pt}i\hspace{-0.5pt}B}{} \therefore {B\hspace{-0.75pt}i\hspace{-0.5pt}A}{}\), follows from commutativity of conjunction only.Footnote 23
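The difference between the two conversions can be illustrated with a small enumeration. This is a sketch under stated assumptions: the truth condition for i-sentences is taken to be the existence of a non-empty exposed set included in both extensions (by analogy with D8(e), since the clause itself is not reproduced in this section), and the option of excluding empty extensions is an assumption of the sketch, not a claim about the content of ST2. All function names are hypothetical.

```python
from itertools import combinations

def subsets(universe, allow_empty=True):
    """Return the subsets of `universe`, optionally without the empty set."""
    items = sorted(universe)
    start = 0 if allow_empty else 1
    return [frozenset(c) for r in range(start, len(items) + 1)
            for c in combinations(items, r)]

def a_true(pred, subj):
    """'pred a subj' (pred belongs to all subj): ext(subj) is included in ext(pred)."""
    return subj <= pred

def i_true(pred, subj):
    """'pred i subj': some non-empty set lies inside both extensions.
    Such a set exists exactly when the intersection is non-empty (assumed reading)."""
    return bool(pred & subj)

def a_conversion_counterexamples(universe, allow_empty):
    """Pairs (A, B) of extensions for which AaB is true but BiA is false."""
    exts = subsets(universe, allow_empty)
    return [(a, b) for a in exts for b in exts
            if a_true(a, b) and not i_true(b, a)]

def i_conversion_holds(universe):
    """Check that AiB implies BiA for every pair of extensions."""
    exts = subsets(universe)
    return all(i_true(b, a) for a in exts for b in exts if i_true(a, b))

U = frozenset({0, 1})
print(a_conversion_counterexamples(U, allow_empty=False))      # []
print(len(a_conversion_counterexamples(U, allow_empty=True)))  # 4
print(i_conversion_holds(U))                                   # True
```

In this sketch, i-conversion holds without any restriction because the condition is symmetric in the two extensions, whereas a-conversion fails precisely when the extension of B is empty. This is consistent with a-conversion needing more than the commutativity of conjunction.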

Theorem ST2 and T\(\subseteq \) are also needed for the direct proofs of the s-validity of the laws of subalternation as well as of all syllogisms whose s-validity is usually proven by means of those laws. This implies that those syllogisms are imperfect.

When possible, Aristotle makes use of the method of conversion. For example, in lieu of using theorem ST4, the first premiss of Cesare, \({M\hspace{-0.5pt}e\hspace{-0.5pt}P}{}\), is replaced with \({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}\) by e-conversion. If we then apply MT-Celarent UI, we end up with the desired conclusion of Cesare. Aristotle, when stating that Cesare can be reduced to Celarent, is therefore referring to the fact that the s-validity of Cesare can be proven by universal instantiation in MT-Celarent. A proof by conversion, like an ecthetic proof, is therefore a semantic proof of s-validity, and the same applies to the proofs by reductio ad impossibile. In all of his proofs, Aristotle first assumes that the premisses are true and then shows that the conclusion must also be true.

Now, Celarent is s-valid by T\(\subseteq \), which means that Cesare is also s-valid by T\(\subseteq \). However, this vital role of T\(\subseteq \) is not apparent in Aristotle’s proofs by conversion if one is not familiar with the proofs of the s-validity of perfect syllogisms.
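The reduction can be written out as a chain of inclusions. The following sketch assumes the traditional form of Cesare, \({M\hspace{-0.5pt}e\hspace{-0.5pt}P}{}, {M\hspace{-0.5pt}a\hspace{-0.5pt}S}{} \therefore {P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}\) (only its first premiss is quoted above), and it assumes that ST4 licenses the same step as in lines 3 to 5 of the proof of MT-e-conversion.

$$\begin{aligned} \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}e\hspace{-0.5pt}P}{}) = 1 &\rightarrow \varepsilon _{\hspace{-1.5pt}I}(P) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)^c_U && \text{D8(d)} \\ &\rightarrow \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U && \text{ST4, i.e. e-conversion} \\ \varepsilon _{\hspace{-1.5pt}I}({M\hspace{-0.5pt}a\hspace{-0.5pt}S}{}) = 1 &\rightarrow \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) && \text{D8(b)} \\ \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U &\rightarrow \varepsilon _{\hspace{-1.5pt}I}(S) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)^c_U && \text{T}\subseteq \\ &\rightarrow \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}e\hspace{-0.5pt}S}{}) = 1 && \text{D8(d)} \end{aligned}$$

On this reading, the second line does the work of e-conversion, and everything after it is the pattern of Celarent, which rests on T\(\subseteq \) alone.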

Therefore, the main metalogical question of assertoric syllogistic seems to be: given three sets that are the extensions of the three general terms that occur in the premisses of a syllogism, can we apply transitivity of set inclusion? And if so, under what conditions? In some cases, the relationships between the extensions of these general terms are such that they allow direct application of transitivity of set inclusion. Aristotle calls these syllogisms ‘perfect’. If, on the other hand, the extensions of the general terms that occur in the premisses are not related to one another by the transitive relation of set inclusion even though the syllogism is s-valid, the syllogism is called ‘imperfect’. In most cases, it is possible to correlate the extensions of the three general terms by the transitive relation of set inclusion by using another set-theoretic theorem. In direct proofs without conversion and in proofs by ecthesis, we make direct use of transitivity of set inclusion. In proofs by conversion and by reductio ad impossibile, it is used only indirectly, which is why its role is not apparent there.

It seems to be possible to divide assertoric syllogisms into two groups: the proof of the s-validity of a perfect syllogism requires only transitivity of set inclusion (D11), whereas that of an imperfect syllogism requires an additional theorem of set theory.

However, it is not as simple as that. In the next section, we shall show that some syllogisms which Aristotle considers to be imperfect satisfy our definition of a perfect syllogism (D11).

6 Perfect Syllogisms of the Third and Fourth Figures

We shall show that syllogisms of the form Bocardo of the third figure satisfy our definition of a perfect syllogism. Taking a look at Aristotle’s formulation of Bocardo, we can compare our proof with his sketch:

For if P belongs to all M, but S does not belong to some M, it is necessary that S does not belong to some P. [...] Proof is possible also without reduction ad impossibile, if one of the Ms be taken to which S does not belong. [Prior Analytics, A6, 28\(^\textrm{b}\)17–20]

This amounts to \({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}, {S\hspace{-0.5pt}o\hspace{-0.5pt}M}{} \therefore {S\hspace{-0.5pt}o\hspace{-0.5pt}P}{}\). We now prove the s-validity of Bocardo by ecthesis.

MT-Bocardo

Every argument of \(\mathbb {L}\) of the form: \({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}, {S\hspace{-0.5pt}o\hspace{-0.5pt}M}{} \therefore {S\hspace{-0.5pt}o\hspace{-0.5pt}P}{}\) is s-valid.

Proof

By definitions D9 and D10, we have to show that

$$\begin{aligned} (\forall I) (\varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({S\hspace{-0.5pt}o\hspace{-0.5pt}M}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({S\hspace{-0.5pt}o\hspace{-0.5pt}P}{}) = 1). \end{aligned}$$

$$\begin{array}{rll}
1. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({S\hspace{-0.5pt}o\hspace{-0.5pt}M}{}) = 1 & \text{Ass. for CP} \\
2. & \varepsilon _{\hspace{-1.5pt}I}({S\hspace{-0.5pt}o\hspace{-0.5pt}M}{}) = 1 & \text{1 S} \\
3. & \varepsilon _{\hspace{-1.5pt}I}({S\hspace{-0.5pt}o\hspace{-0.5pt}M}{}) = 1 \leftrightarrow (\exists \delta ) (\delta \ne \emptyset \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(S)^c_U) & \text{D8(e)} \\
4. & (\exists \delta ) (\delta \ne \emptyset \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(S)^c_U) & \text{2, 3 BMP} \\
5. & \delta _1 \ne \emptyset \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(S)^c_U\quad [\alpha = \delta _1] & \text{4 EI} \\
6. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 & \text{1 S} \\
7. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 \leftrightarrow \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) & \text{D8(b)} \\
8. & \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)\quad [\beta \subseteq \gamma ] & \text{6, 7 BMP} \\
9. & \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)\quad [\alpha \subseteq \beta \wedge \beta \subseteq \gamma ] & \text{5 S, 8 Adj.} \\
10. & \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(M) \wedge \varepsilon _{\hspace{-1.5pt}I}(M) \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) \rightarrow \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) & \text{9, T}\subseteq \text{ UI} \\
11. & \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(P)\quad [\alpha \subseteq \gamma ] & \text{9, 10 MP} \\
12. & \delta _1 \ne \emptyset \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) \wedge \delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(S)^c_U & \text{5 S, 11, 5 S Adj.} \\
13. & (\exists \delta ) (\delta \ne \emptyset \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(S)^c_U) & \text{12 EG} \\
14. & \varepsilon _{\hspace{-1.5pt}I}({S\hspace{-0.5pt}o\hspace{-0.5pt}P}{}) = 1 \leftrightarrow (\exists \delta ) (\delta \ne \emptyset \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(P) \wedge \delta \subseteq \varepsilon _{\hspace{-1.5pt}I}(S)^c_U) & \text{D8(e)} \\
15. & \varepsilon _{\hspace{-1.5pt}I}({S\hspace{-0.5pt}o\hspace{-0.5pt}P}{}) = 1 & \text{13, 14 BMP} \\
16. & \varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({S\hspace{-0.5pt}o\hspace{-0.5pt}M}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({S\hspace{-0.5pt}o\hspace{-0.5pt}P}{}) = 1 & \text{1–15 CP} \\
17. & (\forall I) (\varepsilon _{\hspace{-1.5pt}I}({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}) = 1 \wedge \varepsilon _{\hspace{-1.5pt}I}({S\hspace{-0.5pt}o\hspace{-0.5pt}M}{}) = 1 \rightarrow \varepsilon _{\hspace{-1.5pt}I}({S\hspace{-0.5pt}o\hspace{-0.5pt}P}{}) = 1) & \text{16 UG}
\end{array}$$

\(\square \)

The only set-theoretical theorem used to prove by ecthesis that Bocardo is s-valid is transitivity of set inclusion. The extension of M once again assumes its connecting role (line 10). The exposed set \(\delta _1\) is ‘one of the Ms [...] to which S does not belong’: \(\delta _1\) is a subset of \(\varepsilon _{\hspace{-1.5pt}I}(M)\) but not of \(\varepsilon _{\hspace{-1.5pt}I}(S)\) since \(\delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(S)^c_U\) (line 5). The truth condition of the conclusion is derived by the rules of inference of predicate logic and by T\(\subseteq \). By D11, Bocardo is perfect.Footnote 24
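The role of the exposed set can be made concrete in a few lines of code. The sketch below follows the steps of the proof for one given interpretation. The function name `bocardo_witness` and the example extensions are hypothetical, and choosing the largest candidate for \(\delta _1\) (the members of M to which S does not belong) is a choice of this sketch, since the proof only requires some non-empty witness.

```python
def bocardo_witness(ext_p, ext_m, ext_s, universe):
    """Return an exposed set delta_1 witnessing SoP, given extensions that
    satisfy PaM and SoM; return None if one of the premisses is false."""
    if not ext_m <= ext_p:              # PaM: ext(M) is included in ext(P)
        return None
    delta_1 = ext_m - ext_s             # largest part of ext(M) outside ext(S)
    if not delta_1:                     # SoM requires a non-empty witness
        return None
    # Transitivity of set inclusion: delta_1 in ext(M) and ext(M) in ext(P)
    # give delta_1 in ext(P), as in lines 9 to 11 of the proof.
    assert delta_1 <= ext_p
    # delta_1 lies in the complement of ext(S), as in line 5 of the proof.
    assert delta_1 <= universe - ext_s
    return delta_1

universe = frozenset(range(6))
ext_m = frozenset({1, 2, 3})
ext_p = frozenset({1, 2, 3, 4})
ext_s = frozenset({3, 4})
print(sorted(bocardo_witness(ext_p, ext_m, ext_s, universe)))  # [1, 2]
```

The same set serves as witness for the premiss SoM and for the conclusion SoP. Only its inclusion in the extension of M is exchanged for inclusion in the extension of P, and that exchange is the single application of T\(\subseteq \).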

Furthermore, i-conversion is not required to prove the s-validity of syllogisms of the form Datisi (\({P\hspace{-0.5pt}a\hspace{-0.5pt}M}{}, {S\hspace{-0.5pt}i\hspace{-0.5pt}M}{} \therefore {P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}\)), Disamis (\({P\hspace{-0.5pt}i\hspace{-0.5pt}M}{}, {S\hspace{-0.5pt}a\hspace{-0.5pt}M}{} \therefore {P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}\)), and Ferison (\({P\hspace{-0.5pt}e\hspace{-0.5pt}M}{}, {S\hspace{-0.5pt}i\hspace{-0.5pt}M}{} \therefore {P\hspace{-0.5pt}o\hspace{-0.5pt}S}{}\)) of the third figure and Dimaris (\({M\hspace{-0.5pt}i\hspace{-0.5pt}P}{}, {S\hspace{-0.5pt}a\hspace{-0.5pt}M}{} \therefore {P\hspace{-0.5pt}i\hspace{-0.5pt}S}{}\)) of the fourth figure. This is due to the fact that i-conversion follows from commutativity of conjunction alone and, therefore, there is no need for it in our direct proofs. In all five cases, it is possible to derive the subset relation \(\delta _1 \subseteq \varepsilon _{\hspace{-1.5pt}I}(M)\) by simplification. It is the same pattern over and over again: \(\beta \) is always \(\varepsilon _{\hspace{-1.5pt}I}(M)\) and \(\alpha = \delta _1\), and no set-theoretical theorem other than transitivity of set inclusion is needed to derive the conclusions. By D11, they are therefore perfect.Footnote 25 All remaining syllogisms of the fourth figure are imperfect.
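A brute-force check over a small universe illustrates that these four forms hold under the truth conditions used here. This is a sketch only: the encoding of sentences as triples, the names `holds`, `is_s_valid` and `FORMS`, and the reading of i-sentences as non-empty intersection are assumptions of the example, and a finite enumeration is a sanity check rather than a proof of s-validity.

```python
from itertools import combinations, product

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def holds(sentence, ext, universe):
    """Evaluate a categorical sentence (predicate, kind, subject) under `ext`."""
    pred, kind, subj = sentence
    p, s = ext[pred], ext[subj]
    if kind == "a":
        return s <= p                   # ext(subj) included in ext(pred)
    if kind == "e":
        return s <= universe - p        # ext(subj) included in the complement
    if kind == "i":
        return bool(s & p)              # a non-empty set inside both extensions
    if kind == "o":
        return bool(s - p)              # a non-empty part of ext(subj) outside ext(pred)
    raise ValueError(f"unknown sentence kind: {kind}")

def is_s_valid(premisses, conclusion, universe):
    """True if no interpretation over `universe` makes the premisses true
    and the conclusion false."""
    terms = sorted({t for (p, _, s) in premisses + [conclusion] for t in (p, s)})
    for exts in product(subsets(universe), repeat=len(terms)):
        ext = dict(zip(terms, exts))
        if all(holds(pr, ext, universe) for pr in premisses) \
                and not holds(conclusion, ext, universe):
            return False
    return True

FORMS = {
    "Datisi":  ([("P", "a", "M"), ("S", "i", "M")], ("P", "i", "S")),
    "Disamis": ([("P", "i", "M"), ("S", "a", "M")], ("P", "i", "S")),
    "Ferison": ([("P", "e", "M"), ("S", "i", "M")], ("P", "o", "S")),
    "Dimaris": ([("M", "i", "P"), ("S", "a", "M")], ("P", "i", "S")),
}

U = frozenset({0, 1, 2})
for name, (premisses, conclusion) in FORMS.items():
    print(name, is_s_valid(premisses, conclusion, U))
```

The table-driven layout makes it easy to try other forms. For instance, replacing the conclusion of a form by one that does not follow from its premisses yields `False`.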

Out of the 24 syllogisms we investigated in the four figures, nine (fn. 26) have been shown to satisfy our definition of perfect syllogism. Indeed, we have shown that every syllogism which has one of these forms, such as for instance Barbara or Bocardo, is perfect. This means that there are not only nine perfect syllogisms, but nine argument forms which give rise to an unlimited number of perfect syllogisms.

This result is contrary to Aristotle’s conviction that only first figure syllogisms are perfect. How, then, can such a discrepancy be explained?

Aristotle knew how to prove the s-validity of the four perfect syllogisms of the first figure, and he proved the s-validity of the conversion laws per impossibile. This shows that he wanted to have a solid basis for proving that the syllogisms of the other figures are s-valid, instead of simply assuming that the perfect syllogisms are ‘evident’. Only after establishing this does he go on to show how the s-validity of the syllogisms of the other figures can be proven by conversion, with the exception of Baroco and Bocardo. The result is a beautiful system where only the perfect syllogisms of the first figure are needed, even in the proofs by reductio ad impossibile:

But it is evident also that all the syllogisms in this figure are imperfect: for all are made perfect by certain supplementary statements, which either are contained in the terms of necessity [laws of conversion] or are assumed as hypotheses [laws of the square of opposition], i.e. when we prove per impossibile. [Prior Analytics, A5, 28\(^\textrm{a}\)4–7]

The reason for his claim that only the four syllogisms of the first figure are perfect seems to be that in proofs by conversion and by reductio ad impossibile, only syllogisms of these four forms are needed to prove the s-validity of the syllogisms of the other two figures.

Aristotle probably only introduced his criterion of perfection so as to prove the s-validity of the four perfect syllogisms of the first figure. He did not grasp the concept of transitivity of set inclusion, which is needed for these proofs, as a set-theoretic theorem. There was no set theory in Aristotle’s time. He also did not know, and could not have known, the other theorems that are necessary to prove the s-validity of imperfect syllogisms without conversion. Although first-order predicate logic did not yet exist either, he was able to use its main laws of deduction. But he did not understand these laws as such. In order to define perfection and establish what it means for a syllogism to be imperfect in the way we did, i.e. by proving the s-validity of imperfect syllogisms without using conversion laws, he would have had to make use of theories that were not at his disposal. His ecthetic proofs suggest that he may have seen that some (imperfect) syllogisms are indeed perfect. But he did not have at his disposal the theorems required to differentiate the five syllogisms that we have found to be perfect from the remaining, imperfect ones.

7 Conclusion

Our investigation of the metatheory of assertoric syllogistic has revealed many new and interesting aspects. First of all, we have shown that it is possible to construct a semantics based on Aristotle’s text. The truth conditions obtained conform to those of Aristotle. Secondly, the investigation of the dictum de omni et nullo has revealed that, in it, Aristotle makes use of sets as extensions of predicates, and that he was aware of set inclusion. Both are needed in order to establish the truth conditions of universal categorical sentences (Sect. 3.1).

Thirdly, the truth conditions of particular sentences are obtained from the truth conditions of universal sentences and from ecthetic proofs (Sect. 3.2). As a consequence, it follows that the metalanguage of categorical syllogistic is the first-order predicate logic with identity and set inclusion. All proofs of s-validity are realised in a calculus of natural deduction.

Additionally, it turns out that Aristotle did not think of perfect syllogisms as ‘evident’ or ‘more evident’ than imperfect ones, nor did he regard them as rules of inference or axioms that need no proof of s-validity. Their apparent evidentness is due to transitivity of set inclusion, which is furthermore a necessary and sufficient condition for their s-validity. Aristotle was aware of transitivity of set inclusion. Our definition of a perfect syllogism (Sect. 4, D11) is based on Aristotle’s criterion of perfection and on completing his proof sketches.

We have also demonstrated that, and how, the perfection of a syllogism depends on the position of the extension of the middle term, and not on the position of the middle term itself.

Furthermore, the s-validity of imperfect syllogisms can be directly proven by means of some set-theoretic theorems plus transitivity of set inclusion, in lieu of conversion laws. These theorems are needed in order to correlate the extensions of the three general terms by the transitive relation of set inclusion. A consequence of these proofs is that transitivity of set inclusion turns out to be a necessary condition for the s-validity of imperfect syllogisms (Sect. 5, MT-Imp).

What is more, the very same theorems used to prove without conversion the s-validity of a syllogism are needed to prove the s-validity of the corresponding conversion laws directly (Sect. 5.1). The laws of the square of opposition are not needed at all.

Finally, our investigation has revealed that every syllogism of nine different forms, and not merely four, satisfies Aristotle’s criterion of perfection (see Sect. 6).

In sum, our investigation of the metatheory of syllogistic has revealed that it contains many elements of semantics. Not only did Aristotle discover the concept of logical consequence, he also knew how it can be used to establish whether an argument is s-valid and that, in order to do this, it is necessary to know the truth conditions of categorical sentences. These truth conditions are defined in terms of set inclusion.

Furthermore, it is beyond doubt that he was aware of transitivity of set inclusion: his syllogistic contains elements of naïve set theory, even though he did not have set theory at his disposal. He was also able to use the main laws of inference of first-order predicate logic, albeit without grasping them as such. Not only did he discover how to prove that some arguments are s-valid, he did so by using elements of semantics, of set theory, and of first-order predicate logic.