This chapter discusses some philosophical issues concerning the nature of formal logic. Particular attention will be given to the concept of logical form, the goal of formal logic in capturing logical form, and the explanation of validity in terms of logical form. We shall see how this understanding of the notion of validity allows us to identify what we call formal fallacies, which are mistakes in an argument due to its logical form. We shall also discuss some philosophical problems about the nature of logical forms. For the sake of simplicity, our focus will be on propositional logic. But many of the results to be discussed do not depend on this choice, and are applicable to more advanced logical systems.

Logic, Validity, and Logical Forms

Different sciences have different subject matters: physics tries to discover the properties of matter, history aims to discover what happened in the past, biology studies the development and evolution of living organisms, mathematics is, or at least seems to be, about numbers, sets, geometrical spaces, and the like. But what is it that logic investigates? What, indeed, is logic?

This is an essentially philosophical question, but its answer requires reflection on the status and behavior of logical rules and inferences. Textbooks typically present logic as the science of the relation of consequence that holds between the premises and the conclusion of a valid argument, where an argument is valid if it is not possible for its premises to be true and its conclusion false. On this conception, logicians are concerned with whether the conclusion of an argument is, or is not, a consequence of its premises.

Let us examine the notion of validity with more care. For example, consider the following argument:

(1) If Alex is a sea bream, then Alex is not a rose.

(2) Alex is a rose.

(3) [latex]/ \therefore[/latex] Alex is not a sea bream.

It can be shown that it is not possible for (1) and (2) to be true yet (3) false. Hence, the whole argument is valid. For convenience, let us represent each sentence of the argument in the language of standard propositional logic, which aims to analyze the structure and meaning of various propositions. To do this, we must first introduce the language of our logic.

The alphabet of propositional logic contains letters standing for sentences: A, B, C, and so on. For example, we can translate “Alex is a rose” by just using B. Similarly, we can use S to translate “I would love to smell it.” The alphabet of propositional logic contains other symbols known as logical connectives. One is a symbol for “not” or negation [latex](\neg)[/latex]. When we say that Alex is not a rose, we, in effect, say that it is not the case that Alex is a rose. If we translate “Alex is a rose” by B, we translate “Alex is not a rose” as “[latex]\neg B[/latex].” Another is a symbol [latex](\rightarrow)[/latex] for conditional sentences of the form “if … then ….” For example, we can translate “If Alex is a rose, then I would love to smell it” as “[latex]B \rightarrow S[/latex].” When we say that if Alex is a rose, then I would love to smell it, we say something conditional: on the condition that Alex is a rose, I would love to smell it. In general, a conditional sentence has two components. We call the first component the antecedent, the second component the consequent, and the whole proposition a conditional. The language of our logic also includes “and” [latex](\wedge)[/latex], otherwise known as conjunction, and “or” [latex](\vee)[/latex], otherwise known as disjunction. But in this chapter, we shall only deal with negation and the conditional.

Thus, if we use A for “Alex is a sea bream,” we can represent (1) with [latex]A \rightarrow \neg B[/latex], and represent our above argument (1)-(3) as follows:

[latex]A \rightarrow \neg B[/latex]

[latex]B[/latex]

[latex]/ \therefore \neg A[/latex]

But, recall, our aim was to examine why this argument, if at all, is valid. The mere representation of “not” by “[latex]\neg[/latex]” and “if … then” by “[latex]\rightarrow[/latex]” will not be sufficient to verify the validity or invalidity of a given argument: we also need to know what these symbols and the propositions they express mean. But how can we specify the meaning of “[latex]\neg[/latex]” and “[latex]\rightarrow[/latex]”?

It is plausible to say that if A is true, then its negation is false, and vice versa. For example, if “Alex is a rose” is true, then “Alex is not a rose” is false. This gives us the meaning of “[latex]\neg[/latex]”. We can represent this information about the meaning of negation in terms of a truth-table in the following way (with T symbolising true, and F false):

Truth table for negation

[latex]A[/latex] | [latex]\neg A[/latex]
T | F
F | T

Here, we can read each row of the truth-table as a way the world could be. That is, in situations or possible worlds where A is true (for example, where Alex is indeed a sea bream), [latex]\neg \textit{A}[/latex] is false (it is false that Alex is not a sea bream); and vice versa. Thus construed, a truth-table gives us the situations in which a proposition such as A is true, and those in which it is false. In addition, it tells us in what situations [latex]\neg \textit{A}[/latex] is true, and in what situations it is false.
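The truth-table for negation can also be mirrored in a few lines of code. Here is a minimal sketch in Python (not part of the chapter; the function name `neg` is my own label):

```python
# Truth-function for the connective "not": each row of the truth-table
# corresponds to one possible truth value of A.
def neg(a: bool) -> bool:
    return not a

# Enumerate both rows of the truth-table for negation.
for a in (True, False):
    print(a, neg(a))
```

Running the loop prints the two rows of the table above: a true A yields a false [latex]\neg A[/latex], and vice versa.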

In a similar way, we can specify the meaning of “[latex]\rightarrow[/latex]” by specifying the situations in which conditional propositions of the form “[latex]\textit{A} \rightarrow \textit{B}[/latex]” are true or false. Here is the standard truth-table for “[latex]\rightarrow[/latex]”:

Truth table for material conditional

[latex]A[/latex] | [latex]B[/latex] | [latex]A \rightarrow B[/latex]
T | T | T
T | F | F
F | T | T
F | F | T

As can be seen, there is only one row in which [latex]\textit{A} \rightarrow \textit{B}[/latex] is false; i.e. the second row in which the consequent is false, but the antecedent is true. As the first row tells us, if both A and B are true, then so is [latex]\textit{A} \rightarrow \textit{B}[/latex]. Further, the third and fourth rows tell us that if the antecedent is false, then the whole conditional is true, regardless of whether the consequent is true or false. Hence, all conditionals with false antecedents are true.
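The same row-by-row information can be encoded as a truth-function for the material conditional; here is a minimal Python sketch (the name `implies` is my own, not from the chapter):

```python
# Truth-function for the material conditional: A -> B is false only in the
# row where the antecedent A is true and the consequent B is false.
def implies(a: bool, b: bool) -> bool:
    return (not a) or b

# Enumerate the four rows of the truth-table.
for a in (True, False):
    for b in (True, False):
        print(a, b, implies(a, b))
```

In particular, both rows with a false antecedent come out true, matching the third and fourth rows of the table.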

But how is it possible for a conditional to be true if its antecedent is false? Here is one suggestion to answer this question: if your assumption is false, then you can legitimately conclude whatever you would like to. For example, if you assume that Amsterdam is the capital of England, you can legitimately conclude anything whatsoever; it does not matter whether it’s true or false. Thus, from the assumption that Amsterdam is the capital of England, you can conclude that Paris is the capital of France. You can also conclude that Paris is the capital of Brazil.

We can see that one important piece of information that truth-tables convey concerns how the truth or falsity of complex sentences such as [latex]\textit{A} \rightarrow \textit{B}[/latex] and [latex]\neg \textit{A}[/latex] depends on the truth or falsity of the propositional letters they contain: the truth or falsity of [latex]\textit{A} \rightarrow \textit{B}[/latex] depends solely on the truth or falsity of A and of B. Similarly, the truth or falsity of [latex]\neg \textit{A}[/latex] depends solely on that of A.

Now we are in a position to verify whether our argument (1)-(3) is valid or not. And, as we shall see in a moment, the validity or invalidity of an argument depends on the meaning of the logical connectives (such as “[latex]\rightarrow[/latex]” and “[latex]\neg[/latex]”) which is specified by the corresponding truth-tables. In other words, if the truth-tables of these connectives were different to what they actually are, we would have a different collection of valid arguments.

We defined an argument as valid if it is not possible for its premises to be true and the conclusion false. By designing a truth-table, we can see under what conditions the premises [latex](\textit{A} \rightarrow \neg \textit{B}, \textit{B})[/latex] and the conclusion [latex](\neg \textit{A})[/latex] of our argument (1)-(3) are true or false:

Truth table for argument (1)-(3)

[latex]A[/latex] | [latex]B[/latex] | [latex]A \rightarrow \neg B[/latex] | [latex]B[/latex] | [latex]\neg A[/latex]
T | T | F | T | F
T | F | T | F | F
F | T | T | T | T
F | F | T | F | T

Since in the above truth-table, there is no row in which the premises [latex](\textit{A} \rightarrow \neg \textit{B}, \textit{B})[/latex] are true and the conclusion [latex](\neg A)[/latex] false, the argument is valid. The only row in which the premises are both true is the third row, and in that row the conclusion is also true. In other words, there is no world or situation in which (1) and (2) are true, but (3) is not. This just means that the argument is valid.
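The truth-table test for this argument can be automated: enumerate every assignment of truth values to A and B, and check that no assignment makes both premises true and the conclusion false. A sketch in Python (my own illustration, not part of the text):

```python
from itertools import product

def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

def valid():
    # Argument (1)-(3): premises A -> not-B and B; conclusion not-A.
    for a, b in product((True, False), repeat=2):
        premises_true = implies(a, not b) and b
        conclusion = not a
        if premises_true and not conclusion:
            return False  # a counterexample row exists
    return True

print(valid())  # True: no row makes the premises true and the conclusion false
```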

Now, consider the following argument:

(4) If Alex is a tiger, then Alex is an animal.

(5) Alex is not a tiger.

(6) [latex]/ \therefore[/latex] Alex is not an animal.

There are situations in which the argument works perfectly well. For example, suppose that Alex is not a tiger but is, in fact, a table. In this case, Alex would not be an animal, either. And thus, the sentences (4), (5), and (6) would be true. But this is not always the case, for we can imagine a situation in which the premises are true but the conclusion false, such as where Alex is not a tiger but is, in fact, a dog. Thus, by imagining the situation just described, we would have produced a counterexample: in this situation, (6) would be false, and hence it would not be a consequence of (4) and (5). The argument is invalid.

That the argument is invalid can also be verified by the method of truth-tables. For we can find a situation in which (4) and (5) are both true and yet (6) false. That is, in the truth-table, if we represent (4) as [latex]\textit{C} \rightarrow \textit{D}[/latex], (5) as [latex]\neg \textit{C}[/latex], and (6) as [latex]\neg \textit{D}[/latex], there will be at least one row in which the premises are true and the conclusion false (which row is that?):

Truth table for argument (4)-(6)

[latex]C[/latex] | [latex]D[/latex] | [latex]C \rightarrow D[/latex] | [latex]\neg C[/latex] | [latex]\neg D[/latex]
T | T | T | F | F
T | F | F | F | T
F | T | T | T | F
F | F | T | T | T
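The counterexample row that the text asks about can be found mechanically by the same enumeration idea; a Python sketch (my own illustration, not part of the text):

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def counterexamples():
    """Rows where C -> D and not-C are true but not-D is false."""
    rows = []
    for c, d in product((True, False), repeat=2):
        if implies(c, d) and (not c) and not (not d):
            rows.append((c, d))
    return rows

print(counterexamples())  # [(False, True)]: the third row of the table
```

The single counterexample is the row where C is false and D is true: Alex is not a tiger but is nonetheless an animal.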

We said that logicians are concerned with validity or invalidity of arguments, and we proposed the method of truth-tables for undertaking this task. But which arguments are valid, and which are not? It is here that the notion of logical form emerges. Suppose that a logician embarks on the ridiculous task of recording each and every valid argument. In this case, she would surely record that (1)-(3) is valid. Now, suppose she faces the following argument:

(7) If Alice is reading Hegel, she is not frustrated.

(8) Alice is frustrated.

(9) [latex]/ \therefore[/latex] Alice is not reading Hegel.

To see whether this argument is valid or not, she can rewrite each sentence of the argument in her logical language: Alice is reading Hegel [latex](\textit{P})[/latex]; Alice is frustrated [latex](\textit{Q})[/latex]; and, if Alice is reading Hegel, then Alice is not frustrated [latex](\textit{P} \rightarrow \neg \textit{Q})[/latex]. She can then design a suitable truth-table, and check whether there is any row or situation in which the premises are both true and the conclusion false. Since there is no such row (why?), she will correctly announce that the argument is valid.

But it is obvious that in order to check the validity of (7)-(9), our logician did not need to go to this effort. It would suffice to note that the two arguments (1)-(3) and (7)-(9), and their respective truth-tables, are to a great extent similar; they have the same form. In fact, their only difference is that where the first uses the letters A and B, the second uses P and Q, respectively. The logical connectives [latex]\rightarrow[/latex] and [latex]\neg[/latex] have not changed.

To see the point, let us translate each argument into the language of propositional logic we introduced above:

[latex]A \rightarrow \neg B[/latex], [latex]B[/latex]; [latex]/ \therefore \neg A[/latex]

[latex]P \rightarrow \neg Q[/latex], [latex]Q[/latex]; [latex]/ \therefore \neg P[/latex]

The two arguments have something in common. Let us say that what they have in common is their logical form. As you can see, the logical connectives of the arguments have not changed. Since the two arguments have the same form, if one is valid, then the other must be valid, too. More generally, all arguments of this same form are valid. The liberating news is that our logician does not need to embark on the exasperating task of checking the validity of each and every argument separately. For if she already knows that a given argument is valid, and if she can also show that another argument has the same form as the first one, then she can be sure that the second argument is valid without having to design its truth-table.

We said that an argument is valid if it is not possible for the premises to be true and the conclusion false. Now, we can say that every argument which shares its form with a valid argument is also valid, and consequently, every argument which shares its form with an invalid argument is also invalid.^{[1]} It is in this sense that the idea of logical form can be used to establish the (in)validity of arguments. For example, suppose that we want to check the validity of the following argument:

(10) If Alice is reading Russell, then Alice is thinking of logic.

(11) Alice is not reading Russell.

(12) [latex]/ \therefore[/latex] Alice is not thinking of logic.

As soon as we see that (10)-(12) has the same form as (4)-(6), which we already know to be invalid, we can be assured that the former is also invalid without having to construct its truth-table.

Thus, we can see that understanding the notion of validity in terms of logical form allows us to identify various formal fallacies. For example, the argument (10)-(12) is an instance of the fallacy of denying the antecedent. Thus, every argument which shares its form with (10)-(12) is also invalid.

There are three further questions we may ask about logical forms: (i) How can we “extract” the common logical form that various arguments share? That is, how can we show that various arguments are instances of a common logical form? (ii) What is the nature of a logical form? Is a logical form a thing, and if so, what sort of thing is it? (iii) Does each argument have only one logical form? In the following three sections, we shall discuss these three questions, respectively.

Extracting Logical Forms

Let us, again, consider the arguments (1)-(3) and (7)-(9), which seem to share one and the same logical form. How can we show that they have a common logical form? First, we should represent them in logical symbols:

[latex]A \rightarrow \neg B[/latex], [latex]B[/latex]; [latex]/ \therefore \neg A[/latex]

[latex]P \rightarrow \neg Q[/latex], [latex]Q[/latex]; [latex]/ \therefore \neg P[/latex]

To see what these two arguments have in common, we must abstract away from (or ignore or leave aside) the specific contents of their particular premises and conclusions, and thereby reveal a general form that is common to these arguments. For example, we must ignore whether Alex is or is not a rose; all that matters is to replace “Alex is a rose” with B. In this sense, to obtain or extract the logical form of an argument, we must abstract from the content of the premises and the conclusion by regarding them as mere place-holders in the form that the argument exhibits. As you may have noted, we do not abstract away the content of the logical connectives. It is an important question why we do not abstract away from the logical connectives. The basic thought is that their meaning constitutes an important part of the logical form of an argument, and thereby plays an important part in determining its (in)validity.

To talk about logical forms, we shall use the lowercase Greek letters such as [latex]\alpha, \beta, \gamma,[/latex] and [latex]\delta[/latex]. For example, we can represent the logical form that (1)-(3) and (7)-(9) share as follows:

(i) [latex]\alpha \rightarrow \neg \beta[/latex]

(ii) [latex]\beta[/latex]

(iii) [latex]/ \therefore \neg \alpha[/latex]

An analogy may help here: In mathematics, we think about particular arithmetical propositions such as “[latex]1 + 2 = 2 + 1[/latex]” and “[latex]0 + 2 = 2 + 0[/latex].” But when we want to generalize, we use formulas that contain variables, and not specific numbers. For example, “[latex]x + y = y + x[/latex]” expresses something general about the behaviour of the natural numbers. Whatever natural numbers x and y stand for, “[latex]x + y = y + x[/latex]” remains true. The same goes for the variables [latex]\alpha, \beta, \gamma,[/latex] and [latex]\delta[/latex], which enable us to talk in a general way about the premises and conclusions of arguments. Whatever meaning [latex]\alpha[/latex] and [latex]\beta[/latex] are given, that is, whatever propositions they express, (i)-(iii) remains valid, and so do all of its instances, such as (1)-(3) and (7)-(9).
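This generality can be made vivid in code: represent the schematic letters as variables ranging over truth values, and check the form once and for all. A Python sketch (my own illustration; the names are not from the text):

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def schema_valid(n_letters, premises, conclusion):
    """A schema is valid if no assignment of truth values to its schematic
    letters makes every premise true and the conclusion false."""
    for values in product((True, False), repeat=n_letters):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# The form shared by (1)-(3) and (7)-(9): alpha -> not-beta, beta; so not-alpha.
form_is_valid = schema_valid(
    2,
    [lambda a, b: implies(a, not b), lambda a, b: b],
    lambda a, b: not a,
)
print(form_is_valid)  # True, whatever propositions alpha and beta express
```

Feeding in the form of denying the antecedent instead would return False, which is exactly why every instance of that form can be rejected without designing a fresh truth-table.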

As mentioned above, extracting a certain logical form allows us to talk, in a general way, about premises and conclusions of arguments. It does not matter what specific objects and properties—what specific subject matter—they talk about. And this leads us, again, to our initial concern about the real subject matter of logic:

Form can thus be studied independently of subject-matter, and it is mainly in virtue of their form, as it turns out, rather than their subject-matter that arguments are valid or invalid. Hence it is the forms of argument, rather than actual arguments themselves, that logic investigates. (Lemmon 1971, 4)

According to this conception of logic, logicians are in a position to evaluate the validity of an argument, even if they do not strictly understand the content of the claims within the argument, nor under what conditions they would be true. Whether or not the claims within arguments are true, therefore, is not a matter for logic. Instead, what logic does is to explore the logical forms of arguments, and thereby establish their (in)validity.

The Nature of Logical Forms

In this and the next section, we will look into more philosophical matters. In this section, we shall discuss our second question: what is the nature of a logical form? The question about the nature of logical form is reminiscent of the ancient question about the nature of universals. All red roses have something in common; they all share or instantiate something. But what is that thing, if it is a thing at all? Is the property of being red akin to a Platonic universal that exists independently of the red roses that instantiate it? Or is it like an Aristotelian universal whose existence depends on the existence of the particular roses? Perhaps, it does not have any existence at all; it is nothing more than a name or a label that we use to talk about red roses. We can ask exactly the parallel questions about logical forms: What is it that all valid arguments of the same form share or instantiate? Is it an entity in the world, or a symbol in language, or a mental construction formed and created by us?

Assuming that logical forms exist, what are they? There are, generally speaking, two lines of thought here. According to the first, logical forms are schemata, and hence, are linguistic entities. According to the second, logical forms are properties: they are extra-linguistic entities, akin to universals. They are what schemata express or represent. (An analogy may help here: The expression “is happy” is a predicate; it is a linguistic item. But it expresses an extra-linguistic entity, such as the property of being happy.)

Identifying logical forms with schemata appears to be quite intuitive. But it leads to a fallacy. As Timothy Smiley points out, the fallacy lies in “treating the medium as the message” (Smiley 1982, 3). Consider the logical form of (1)-(3):

(i) [latex]\alpha \rightarrow \neg \beta[/latex]

(ii) [latex]\beta[/latex]

(iii) [latex]/ \therefore \neg \alpha[/latex]

You may like, with equal right, to identify the logical form of (1)-(3) with:

(iv) [latex]\gamma \rightarrow \neg \eta[/latex]

(v) [latex]\eta[/latex]

(vi) [latex]/ \therefore \neg \gamma[/latex]

And yet another logician may prefer to capture its logical form with a distinct set of variables:

(vii) [latex]\chi \rightarrow \neg \delta[/latex]

(viii) [latex]\delta[/latex]

(ix) [latex]/ \therefore \neg \chi[/latex]

Which of these is the logical form of (1)-(3)? There are many different ways to capture its logical form. Which one of them deserves to be qualified as the logical form of (1)-(3)? This question is pressing if logical forms are taken to be schemata, and hence to be linguistic entities. If a logical form is just a string of symbols, then it changes whenever a distinct set of variables is used. There will be no non-arbitrary way to choose one as opposed to any other as the logical form of a given argument. In other words, there will be nothing to choose between these linguistically distinct entities and, hence, none of them can be identified with the logical form of the original argument.

This may encourage us to identify logical forms as language-independent or language-invariant entities. On this view, logical forms are identified not with schemata, but with what schemata express or represent. They are worldly, rather than linguistic, entities. This view does not succumb to the above problem. Since, on this view, logical forms are worldly entities, none of the above candidates–i.e. (i)-(iii), (iv)-(vi), and (vii)-(ix)–is the logical form of (1)-(3). Rather, each of them expresses or represents its logical form.

One Logical Form or Many?

It seems then that we will be in a better position if we assume that logical forms are worldly entities. But this does not leave us completely home and dry, either. So far, we have assumed that logical forms are unique entities. That is, we assumed that arguments such as (1)-(3) and (7)-(9) have one and the same logical form. But is that the case?

In general, objects can take many forms. For example, a particular sonnet can be both Petrarchan and Miltonic, and a vase can be both a cuboid and a cube.^{[2]} Also, it seems that a single sentence can take many forms (or at least more than one). Consider [latex]\neg(\textit{P} \rightarrow \neg \textit{Q})[/latex]. What is its logical form? It seems that each of the following options works perfectly well as an answer to our question: it is a negation; it is a negation of a conditional; and it is a negation of a conditional whose consequent is a negation.^{[3]}

Now, suppose that each of these logical forms is a logical form of a given argument. In virtue of what is each of them a logical form of one and the same argument? That is, what explains the fact that different logical forms are forms of one and the same argument? What unifies them in this respect? One answer is to say that all of these forms have a common logical form. But then you can ask the same question about this common logical form, since this very form has further different forms. In virtue of what are these logical forms forms of one and the same form? And this process can go on endlessly. You have a logical form which itself has other logical forms, and so on. But this is not compatible with the thesis that logical forms are unique entities.^{[4]}

Question for Reflection

It seems that we cannot always talk of the logical form that an argument or various arguments share. If this view is correct, then what are its philosophical implications? Can we still understand the notion of validity in terms of the notion of logical form?

Summary

This chapter started with a question about the subject matter of formal logic: what is it that formal logic studies? We discussed the thesis that formal logic studies logical consequence through the form of arguments. We then explicated the notion of validity in terms of truth-tables, which specify the conditions under which a proposition is true or false–for example, a conditional proposition is false only when its antecedent is true and its consequent false; otherwise, it is true. Thus, as we discussed above, truth-tables can be employed to determine whether arguments formulated in the language of propositional logic are valid.

We then dug further into what it means for arguments to have a logical form, and how their logical form impacts their (in)validity. The chief idea is that every argument which shares its logical form with a valid argument is also valid, and consequently, every argument which shares its logical form with an invalid argument is also invalid. We saw how this understanding of the notion of validity enables us to identify formal fallacies, such as the fallacy of denying the antecedent. We ended this chapter by asking three philosophical questions about the nature, existence, and uniqueness of logical forms.

EXERCISES

Exercise One

Using a truth-table, show that the following argument, which is known as the fallacy of affirming the consequent, is invalid: [latex]A \rightarrow B, B; / \therefore A[/latex].

Exercise Two

Using a truth-table, show that the following argument, which is known as the hypothetical syllogism, is valid: [latex]A \rightarrow B, B \rightarrow C; / \therefore A \rightarrow C[/latex]. [Hint: Your truth-table should have eight rows, as there are three propositional variables (A, B and C) that you need to include within it.]

Exercise Three

Use the truth-tables already given to you for the conditional [latex](\rightarrow)[/latex] and negation [latex](\neg)[/latex], together with the two new truth-tables for conjunction [latex](\wedge)[/latex] and disjunction [latex](\vee)[/latex] below, which logically express common uses of the vernacular ‘and’ and ‘or’, respectively:

Truth table for conjunction

[latex]A[/latex] | [latex]B[/latex] | [latex]A \wedge B[/latex]
T | T | T
T | F | F
F | T | F
F | F | F

Truth table for disjunction

[latex]A[/latex] | [latex]B[/latex] | [latex]A \vee B[/latex]
T | T | T
T | F | T
F | T | T
F | F | F

Evaluate whether the following arguments are valid or invalid. Firstly, identify their logical form, and then use truth-tables to establish their (in)validity.

We now know the situation. The Yankees either have to beat the Red Sox or they won’t make it to the World Series, and they won’t do the former.

Sarah will only pass the discrete mathematics exam if she knows her set theory. Fortunately, she does know set theory well, so she will pass the exam.

It just isn’t the case that you can be a liberal and a Republican, so either you’re not a Republican or you’re not a liberal.

If Dylan goes to law or medical school then he’ll be OK financially. Fortunately, he’s going to law school.

It is more accurate to say that every argument which shares its form with an invalid argument is also invalid within that logic, but not necessarily for every logic. For example, in propositional logic,

All men are mortal

Socrates is a man

[latex]/ \therefore[/latex] Socrates is mortal

is of the same logical form as:

All men are immortal

Socrates is a man

[latex]/ \therefore[/latex] Socrates is mortal

Both of these arguments can be translated as follows:

P

Q

[latex]/ \therefore[/latex] R

But the second of these arguments, as opposed to the first, is invalid, for if all men are immortal and Socrates is a man, then Socrates is immortal, not mortal. Thus, in propositional logic, both of these arguments have the same logical form, even though, from the perspective of a more expressive logic, such as first-order logic, which explains the role that quantifiers such as “all” and “some” play within arguments, only the first is valid. Thus, every argument which shares its form with a valid argument is valid within that logic, but not necessarily across the board.

See Oliver (2010, 172), where he disagrees with Strawson (195, 54).

This way of putting the point is due to Smith (2012, 81).

This is reminiscent of the Aristotelian Third Man argument against Plato’s theory of Forms.


All Israel have a share in the world to come, as it is said: “And your people are all righteous; they shall inherit the land forever; they are the branch of My planting, the work of My hands, in which I take pride.”

1.

Moses received the Torah from Sinai and transmitted it to Joshua, and Joshua to the elders, and the elders to the prophets, and the prophets transmitted it to the Men of the Great Assembly.

They said three things:

Be deliberate in judgment, raise up many students, and make a fence for the Torah.

2.

Shimon the Righteous was among the last survivors of the Great Assembly.

He used to say:

On three things the world stands: on the Torah, on the Temple service, and on acts of lovingkindness.

3.

Antigonus of Sokho received from Shimon the Righteous.

He used to say:

Do not be like servants who serve the master in order to receive a reward; rather, be like servants who serve the master not in order to receive a reward; and let the awe of Heaven be upon you.

4.

Yose ben Yoezer of Tzeredah and Yose ben Yochanan of Jerusalem received from them.

Yose ben Yoezer of Tzeredah says:

Let your house be a meeting place for the sages, sit in the dust of their feet, and drink in their words thirstily.

5.

Yose ben Yochanan of Jerusalem says:

Let your house be open wide, let the poor be members of your household, and do not converse excessively with a woman.

This was said about one’s own wife; all the more so about another man’s wife. Hence the sages said: whoever converses excessively with a woman brings harm upon himself, neglects the words of the Torah, and will in the end inherit Gehinnom.

6.

Yehoshua ben Perachyah and Nittai the Arbelite received from them.

Yehoshua ben Perachyah says:

Make for yourself a teacher, acquire for yourself a friend, and judge every person favorably.

7.

Nittai the Arbelite says:

Keep far from a bad neighbor, do not associate with the wicked, and do not despair of retribution.

8.

Yehudah ben Tabbai and Shimon ben Shetach received from them.

Yehudah ben Tabbai says:

Do not act as an advocate; while the litigants stand before you, regard them as guilty, and when they depart from you, regard them as innocent, provided they have accepted the judgment.

9.

Shimon ben Shetach says:

Examine the witnesses thoroughly, and be careful with your words, lest through them they learn to lie.

10.

Shemayah and Avtalyon received from them.

Shemayah says:

Love work, hate positions of mastery, and do not make yourself known to the ruling power.

11.

Avtalyon says:

Sages, be careful with your words, lest you incur the penalty of exile and be exiled to a place of evil waters, and the students who come after you drink and die, and the name of Heaven thereby be profaned.

12.

Hillel and Shammai received from them.

Hillel says:

Be among the disciples of Aaron: love peace and pursue peace, love people and draw them near to the Torah.

How can we use the algebraic structure of a function [latex]f(x)[/latex] to compute a formula for [latex]f'(x)\text{?}[/latex]

What is the derivative of a power function of the form [latex]f(x) = x^n\text{?}[/latex] What is the derivative of an exponential function of form [latex]f(x) = a^x\text{?}[/latex]

If we know the derivative of [latex]y = f(x)\text{,}[/latex] what is the derivative of [latex]y = k f(x)\text{,}[/latex] where [latex]k[/latex] is a constant?

If we know the derivatives of [latex]y = f(x)[/latex] and [latex]y = g(x)\text{,}[/latex] how do we compute the derivative of [latex]y = f(x) + g(x)\text{?}[/latex]

In Chapter 1, we developed the concept of the derivative of a function. We now know that the derivative [latex]f'[/latex] of a function [latex]f[/latex] measures the instantaneous rate of change of [latex]f[/latex] with respect to [latex]x\text{.}[/latex] The derivative also tells us the slope of the tangent line to [latex]y=f(x)[/latex] at any given value of [latex]x\text{.}[/latex] So far, we have focused on interpreting the derivative graphically or, in the context of a physical setting, as a meaningful rate of change. To calculate the value of the derivative at a specific point, we have relied on the limit definition of the derivative,

\begin{equation*}
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}\text{.}
\end{equation*}

In this chapter, we investigate how the limit definition of the derivative leads to interesting patterns and rules that enable us to find a formula for [latex]f'(x)[/latex] quickly, without using the limit definition directly. For example, we would like to apply shortcuts to differentiate a function such as [latex]g(x) = 4x^7 - \sin(x) + 3e^x\text{.}[/latex]
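To preview how such shortcut rules can be checked, one can compare a difference quotient from the limit definition against a candidate formula. The Python sketch below is my own illustration, assuming that the rules developed later in this chapter give [latex]g'(x) = 28x^6 - \cos(x) + 3e^x[/latex] for the function above:

```python
import math

def g(x):
    return 4 * x**7 - math.sin(x) + 3 * math.exp(x)

def derivative_estimate(f, x, h=1e-6):
    # Difference quotient (f(x + h) - f(x)) / h from the limit definition.
    return (f(x + h) - f(x)) / h

# Candidate shortcut formula for g'(x), checked numerically at x = 1.
x = 1.0
shortcut = 28 * x**6 - math.cos(x) + 3 * math.exp(x)
estimate = derivative_estimate(g, x)
print(abs(shortcut - estimate) < 1e-3)  # True: the two values agree closely
```

The agreement at a sample point does not prove the rule, but it is a quick sanity check of the patterns the chapter develops.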

Preview Activity 2.1.1.

Functions of the form [latex]f(x) = x^n\text{,}[/latex] where [latex]n = 1, 2, 3, \ldots\text{,}[/latex] are often called power functions. The first two questions below revisit work we did earlier in Chapter 1, and the following questions extend those ideas to higher powers of [latex]x\text{.}[/latex]

Use the limit definition of the derivative to find [latex]f'(x)[/latex] for [latex]f(x) = x^2\text{.}[/latex]

Use the limit definition of the derivative to find [latex]f'(x)[/latex] for [latex]f(x) = x^3\text{.}[/latex]

Use the limit definition of the derivative to find [latex]f'(x)[/latex] for [latex]f(x) = x^4\text{.}[/latex] (Hint: [latex](a+b)^4 = a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4\text{.}[/latex] Apply this rule to [latex](x+h)^4[/latex] within the limit definition.)

Based on your work in (a), (b), and (c), what do you conjecture is the derivative of [latex]f(x) = x^5\text{?}[/latex] Of [latex]f(x) = x^{13}\text{?}[/latex]

Conjecture a formula for the derivative of [latex]f(x) = x^n[/latex] that holds for any positive integer [latex]n\text{.}[/latex] That is, given [latex]f(x) = x^n[/latex] where [latex]n[/latex] is a positive integer, what do you think is the formula for [latex]f'(x)\text{?}[/latex]
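If you have access to computing technology, any conjecture you form in this activity can be tested numerically. The helper below is a hypothetical aid (not part of the activity) that compares a conjectured derivative formula against difference quotients at several points; it is shown here with the Chapter 1 result for [latex]f(x) = x^2\text{.}[/latex]

```python
# Hypothetical helper: does a conjectured derivative formula fprime
# agree with difference quotients of f at each of the given points?
def check_conjecture(f, fprime, points, h=1e-6, tol=1e-3):
    return all(abs((f(x + h) - f(x)) / h - fprime(x)) < tol for x in points)

# Example with the known derivative from Chapter 1: f(x) = x^2, f'(x) = 2x.
print(check_conjecture(lambda x: x**2, lambda x: 2*x, [1.0, 2.0, 3.0]))  # True
```

The same helper can be used to test your conjectured formulas for higher powers of [latex]x\text{.}[/latex]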

Subsection 2.1.1 Some Key Notation

In addition to our usual [latex]f'[/latex] notation, there are other ways to denote the derivative of a function, as well as the instruction to take the derivative. If we are thinking about the relationship between [latex]y[/latex] and [latex]x\text{,}[/latex] we sometimes denote the derivative of [latex]y[/latex] with respect to [latex]x[/latex] by the symbol

\begin{equation*}
\frac{dy}{dx}
\end{equation*}

which we read “dee-y dee-x.” For example, if [latex]y = x^2\text{,}[/latex] we'll write that the derivative is [latex]\frac{dy}{dx} = 2x\text{.}[/latex] This notation comes from the fact that the derivative is related to the slope of a line, and slope is measured by [latex]\frac{\Delta y}{\Delta x}\text{.}[/latex] Note that while we read [latex]\frac{\Delta y}{\Delta x}[/latex] as “change in [latex]y[/latex] over change in [latex]x\text{,}[/latex]” we view [latex]\frac{dy}{dx}[/latex] as a single symbol, not a quotient of two quantities.

We use a variant of this notation as the instruction to take the derivative. In particular,

\begin{equation*}
\frac{d}{dx}\left[\Box\right]
\end{equation*}

means “take the derivative of the quantity in [latex]\Box[/latex] with respect to [latex]x\text{.}[/latex]” For example, we may write [latex]\frac{d}{dx}[x^2] = 2x\text{.}[/latex]

It is important to note that the independent variable can be different from [latex]x\text{.}[/latex] If we have [latex]f(z) = z^2\text{,}[/latex] we then write [latex]f'(z) = 2z\text{.}[/latex] Similarly, if [latex]y = t^2\text{,}[/latex] we say [latex]\frac{dy}{dt} = 2t\text{.}[/latex] And it is also true that [latex]\frac{d}{dq}[q^2] = 2q\text{.}[/latex] This notation may also be used for second derivatives: [latex]f''(z) = \frac{d}{dz}\left[\frac{df}{dz}\right] = \frac{d^2 f}{dz^2}\text{.}[/latex]

In what follows, we'll build a repertoire of functions for which we can quickly compute the derivative.

Subsection 2.1.2 Constant, Power, and Exponential Functions

So far, we know the derivative formula for two important classes of functions: constant functions and power functions. If [latex]f(x) = c[/latex] is a constant function, its graph is a horizontal line with slope zero at every point. Thus, [latex]\frac{d}{dx}[c] = 0\text{.}[/latex] We summarize this with the following rule.

Constant Functions.

For any real number [latex]c\text{,}[/latex] if [latex]f(x) = c\text{,}[/latex] then [latex]f'(x) = 0\text{.}[/latex]

Example 2.1.1.

If [latex]f(x) = 7\text{,}[/latex] then [latex]f'(x) = 0\text{.}[/latex] Similarly, [latex]\frac{d}{dx} [\sqrt{3}] = 0\text{.}[/latex]

In your work in Preview Activity 2.1.1, you conjectured that for any positive integer [latex]n\text{,}[/latex] if [latex]f(x) = x^n\text{,}[/latex] then [latex]f'(x) = nx^{n-1}\text{.}[/latex] This rule can be formally proved for any positive integer [latex]n\text{,}[/latex] and also for any nonzero real number (positive or negative).

Power Functions.

For any nonzero real number [latex]n\text{,}[/latex] if [latex]f(x) = x^n\text{,}[/latex] then [latex]f'(x) = nx^{n-1}\text{.}[/latex]

Example 2.1.2.

Using the rule for power functions, we can compute the following derivatives. If [latex]g(z) = z^{-3}\text{,}[/latex] then [latex]g'(z) = -3z^{-4}\text{.}[/latex] Similarly, if [latex]h(t) = t^{7/5}\text{,}[/latex] then [latex]\frac{dh}{dt} = \frac{7}{5}t^{2/5}\text{,}[/latex] and [latex]\frac{d}{dq} [q^{\pi}] = \pi q^{\pi - 1}\text{.}[/latex]
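These formulas can be spot-checked with difference quotients. The sketch below (a supplementary numerical check, not part of the text) tests the rule for the negative and fractional exponents in this example at a single point:

```python
# Spot-checking the power rule for a negative and a fractional exponent
# by comparing difference quotients to the rule's predictions at z = t = 2.
def g(z):
    return z ** (-3)      # rule predicts g'(z) = -3 z^(-4)

def h(t):
    return t ** (7 / 5)   # rule predicts h'(t) = (7/5) t^(2/5)

step = 1e-6
num_g = (g(2.0 + step) - g(2.0)) / step
num_h = (h(2.0 + step) - h(2.0)) / step
print(num_g, -3 * 2.0 ** (-4))           # both near -3/16 = -0.1875
print(num_h, (7 / 5) * 2.0 ** (2 / 5))   # the two values agree closely
```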

It will be instructive to have a derivative formula for one more type of basic function. For now, we simply state this rule without explanation or justification; we will explore why this rule is true in one of the exercises. And we will encounter graphical reasoning for why the rule is plausible in Preview Activity 2.2.1.

Exponential Functions.

For any positive real number [latex]a\text{,}[/latex] if [latex]f(x) = a^x\text{,}[/latex] then [latex]f'(x) = a^x \ln(a)\text{.}[/latex]

Example 2.1.3.

If [latex]f(x) = 2^x\text{,}[/latex] then [latex]f'(x) = 2^x \ln(2)\text{.}[/latex] Similarly, for [latex]p(t) = 10^t\text{,}[/latex] [latex]p'(t) = 10^t \ln(10)\text{.}[/latex] It is especially important to note that when [latex]a = e\text{,}[/latex] where [latex]e[/latex] is the base of the natural logarithm function, we have that

\begin{equation*}
\frac{d}{dx}\left[e^x\right] = e^x \ln(e) = e^x
\end{equation*}

since [latex]\ln(e) = 1\text{.}[/latex] This is an extremely important property of the function [latex]e^x\text{:}[/latex] its derivative function is itself!

Note carefully the distinction between power functions and exponential functions: in power functions, the variable is in the base, as in [latex]x^2\text{,}[/latex] while in exponential functions, the variable is in the power, as in [latex]2^x\text{.}[/latex] As we can see from the rules, this makes a big difference in the form of the derivative.
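As with the power rule, the exponential rule can be checked numerically. The following sketch (illustrative, with [latex]a = 2[/latex] and a single point [latex]x = 1[/latex]) compares a difference quotient against [latex]2^x \ln(2)\text{:}[/latex]

```python
import math

# Spot-checking the exponential rule d/dx [2^x] = 2^x ln(2) at x = 1.
def f(x):
    return 2 ** x

x, h = 1.0, 1e-6
numerical = (f(x + h) - f(x)) / h
rule = 2 ** x * math.log(2)   # a^x ln(a) with a = 2
print(numerical, rule)        # both near 2 ln(2)
```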

Activity 2.1.2.

Use the three rules above to determine the derivative of each of the following functions. For each, state your answer using full and proper notation, labeling the derivative with its name. For example, if you are given a function [latex]h(z)\text{,}[/latex] you should write “[latex]h'(z) =[/latex]” or “[latex]\frac{dh}{dz} =[/latex]” as part of your response.

[latex]f(t) = \pi[/latex]

[latex]g(z) = 7^z[/latex]

[latex]h(w) = w^{3/4}[/latex]

[latex]p(x) = 3^{1/2}[/latex]

[latex]r(t) = (\sqrt{2})^t[/latex]

[latex]s(q) = q^{-1}[/latex]

[latex]m(t) = \frac{1}{t^3}[/latex]

Subsection 2.1.3 Constant Multiples and Sums of Functions

Next we will learn how to compute the derivative of a function constructed as an algebraic combination of basic functions. For instance, we'd like to be able to take the derivative of a polynomial function, that is, a sum of constant multiples of powers of [latex]t\text{.}[/latex] To that end, we develop two new rules: the Constant Multiple Rule and the Sum Rule.

How is the derivative of [latex]y = kf(x)[/latex] related to the derivative of [latex]y = f(x)\text{?}[/latex] Recall that when we multiply a function by a constant [latex]k\text{,}[/latex] we vertically stretch the graph by a factor of [latex]|k|[/latex] (and reflect the graph across [latex]y = 0[/latex] if [latex]k \lt 0[/latex]). This vertical stretch affects the slope of the graph, so the slope of the function [latex]y = kf(x)[/latex] is [latex]k[/latex] times as steep as the slope of [latex]y = f(x)\text{.}[/latex] Thus, when we multiply a function by a factor of [latex]k\text{,}[/latex] we change the value of its derivative by a factor of [latex]k[/latex] as well. (The Constant Multiple Rule can be formally proved as a consequence of properties of limits, using the limit definition of the derivative.)

The Constant Multiple Rule.

For any real number [latex]k\text{,}[/latex] if [latex]f(x)[/latex] is a differentiable function with derivative [latex]f'(x)\text{,}[/latex] then [latex]\frac{d}{dx}[k f(x)] = k f'(x)\text{.}[/latex]

In words, this rule says that “the derivative of a constant times a function is the constant times the derivative of the function.”

Example 2.1.4.

If [latex]g(t) = 3 \cdot 5^t\text{,}[/latex] we have [latex]g'(t) = 3 \cdot 5^t \ln(5)\text{.}[/latex] Similarly, [latex]\frac{d}{dz} [5z^{-2}] = 5 (-2z^{-3})\text{.}[/latex]

Next we examine a sum of two functions. If we have [latex]y = f(x)[/latex] and [latex]y = g(x)\text{,}[/latex] we can compute a new function [latex]y = (f+g)(x)[/latex] by adding the outputs of the two functions: [latex](f+g)(x) = f(x) + g(x)\text{.}[/latex] Not only is the value of the new function the sum of the values of the two known functions, but the slope of the new function is the sum of the slopes of the known functions. Therefore, we arrive at the following Sum Rule for derivatives. (Like the Constant Multiple Rule, the Sum Rule can be formally proved as a consequence of properties of limits, using the limit definition of the derivative.)

The Sum Rule.

If [latex]f(x)[/latex] and [latex]g(x)[/latex] are differentiable functions with derivatives [latex]f'(x)[/latex] and [latex]g'(x)[/latex] respectively, then [latex]\frac{d}{dx}[f(x) + g(x)] = f'(x) + g'(x)\text{.}[/latex]

In words, the Sum Rule tells us that “the derivative of a sum is the sum of the derivatives.” It also tells us that a sum of two differentiable functions is itself differentiable. Furthermore, because we can view the difference function [latex]y = (f-g)(x) = f(x) - g(x)[/latex] as [latex]y = f(x) + (-1 \cdot g(x))\text{,}[/latex] the Sum and Constant Multiple Rules together tell us that [latex]\frac{d}{dx}[f(x) + (-1 \cdot g(x))] = f'(x) - g'(x)\text{,}[/latex] or that “the derivative of a difference is the difference of the derivatives.” We can now compute derivatives of sums and differences of elementary functions.

Example 2.1.5.

Using the sum rule, [latex]\frac{d}{dw} (2^w + w^2) = 2^w \ln(2) + 2w\text{.}[/latex] Using both the sum and constant multiple rules, if [latex]h(q) = 3q^6 - 4q^{-3}\text{,}[/latex] then [latex]h'(q) = 3 (6q^5) - 4(-3q^{-4}) = 18q^5 + 12q^{-4}\text{.}[/latex]
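The formula for [latex]h'(q)[/latex] in this example can be spot-checked against a difference quotient. The sketch below (a supplementary numerical check at the illustrative point [latex]q = 1.5[/latex]) does exactly that:

```python
# Checking h'(q) = 18q^5 + 12q^(-4), obtained from the sum and
# constant multiple rules, against a difference quotient of h.
def h(q):
    return 3 * q ** 6 - 4 * q ** (-3)

def h_prime(q):
    return 18 * q ** 5 + 12 * q ** (-4)

q, step = 1.5, 1e-7
numerical = (h(q + step) - h(q)) / step
print(numerical, h_prime(q))   # the two values agree closely
```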

Activity 2.1.3.

Use only the rules for constant, power, and exponential functions, together with the Constant Multiple and Sum Rules, to compute the derivative of each function below with respect to the given independent variable. Note well that we do not yet know any rules for how to differentiate the product or quotient of functions. This means that you may have to do some algebra first on the functions below before you can actually use existing rules to compute the desired derivative formula. In each case, label the derivative you calculate with its name using proper notation such as [latex]f'(x)\text{,}[/latex] [latex]h'(z)\text{,}[/latex] [latex]dr/dt\text{,}[/latex] etc.

In the same way that we have shortcut rules to help us find derivatives, we introduce some language that is simpler and shorter. Often, rather than say “take the derivative of [latex]f\text{,}[/latex]” we'll instead say simply “differentiate [latex]f\text{.}[/latex]” Similarly, if the derivative exists at a point, we say “[latex]f[/latex] is differentiable at that point,” or that [latex]f[/latex] can be differentiated.

As we work with the algebraic structure of functions, it is important to develop a big picture view of what we are doing. Here, we make several general observations based on the rules we have so far.

The derivative of any polynomial function will be another polynomial function, and the degree of the derivative is one less than the degree of the original function. For instance, if [latex]p(t) = 7t^5 - 4t^3 + 8t\text{,}[/latex] [latex]p[/latex] is a degree 5 polynomial, and its derivative, [latex]p'(t) = 35t^4 - 12t^2 + 8\text{,}[/latex] is a degree 4 polynomial.

The derivative of any exponential function is another exponential function: for example, if [latex]g(z) = 7 \cdot 2^z\text{,}[/latex] then [latex]g'(z) = 7 \cdot 2^z \ln(2)\text{,}[/latex] which is also exponential.
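The first observation can be made concrete in code. The sketch below (a hypothetical helper, not part of the text) represents a polynomial by its list of coefficients and differentiates term by term using the power rule and linearity; the degree visibly drops by one.

```python
# A polynomial as coefficients [c0, c1, c2, ...] for c0 + c1 t + c2 t^2 + ...
# Term-by-term differentiation (power rule + linearity) shifts each
# coefficient down one slot, multiplied by its old exponent.
def differentiate(coeffs):
    return [k * c for k, c in enumerate(coeffs)][1:]

p = [0, 8, 0, -4, 0, 7]          # p(t) = 7t^5 - 4t^3 + 8t
dp = differentiate(p)
print(dp)                        # [8, 0, -12, 0, 35], i.e. 35t^4 - 12t^2 + 8
print(len(p) - 1, len(dp) - 1)   # degrees 5 and 4
```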

We should not lose sight of the fact that all of the meaning of the derivative that we developed in Chapter 1 still holds. The derivative measures the instantaneous rate of change of the original function, as well as the slope of the tangent line at any selected point on the curve.

Activity 2.1.4.

Each of the following questions asks you to use derivatives to answer key questions about functions. Be sure to think carefully about each question and to use proper notation in your responses.

Find the slope of the tangent line to [latex]h(z) = \sqrt{z} + \frac{1}{z}[/latex] at the point where [latex]z = 4\text{.}[/latex]

A population of cells is growing in such a way that its total number in millions is given by the function [latex]P(t) = 2(1.37)^t + 32\text{,}[/latex] where [latex]t[/latex] is measured in days.

Determine the instantaneous rate at which the population is growing on day 4, and include units on your answer.

Is the population growing at an increasing rate or growing at a decreasing rate on day 4? Explain.

Find an equation for the tangent line to the curve [latex]p(a) = 3a^4 - 2a^3 + 7a^2 - a + 12[/latex] at the point where [latex]a=-1\text{.}[/latex]

What is the difference between being asked to find the slope of the tangent line (asked in (a)) and the equation of the tangent line (asked in (c))?

Subsection 2.1.4 Summary

Given a differentiable function [latex]y = f(x)\text{,}[/latex] we can express the derivative of [latex]f[/latex] in several different notations: [latex]f'(x)\text{,}[/latex] [latex]\frac{df}{dx}\text{,}[/latex] [latex]\frac{dy}{dx}\text{,}[/latex] and [latex]\frac{d}{dx}[f(x)]\text{.}[/latex]

The limit definition of the derivative leads to patterns among certain families of functions that enable us to compute derivative formulas without resorting directly to the limit definition. For example, if [latex]f[/latex] is a power function of the form [latex]f(x) = x^n\text{,}[/latex] then [latex]f'(x) = nx^{n-1}[/latex] for any real number [latex]n[/latex] other than 0. This is called the Rule for Power Functions.

We have stated a rule for derivatives of exponential functions in the same spirit as the rule for power functions: for any positive real number [latex]a\text{,}[/latex] if [latex]f(x) = a^x\text{,}[/latex] then [latex]f'(x) = a^x \ln(a)\text{.}[/latex]

If we are given a constant multiple of a function whose derivative we know, or a sum of functions whose derivatives we know, the Constant Multiple and Sum Rules make it straightforward to compute the derivative of the overall function. More formally, if [latex]f(x)[/latex] and [latex]g(x)[/latex] are differentiable with derivatives [latex]f'(x)[/latex] and [latex]g'(x)[/latex] and [latex]a[/latex] and [latex]b[/latex] are constants, then

\begin{equation*}
\frac{d}{dx}\left[a f(x) + b g(x)\right] = a f'(x) + b g'(x)\text{.}
\end{equation*}
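This combined linearity property can be spot-checked numerically. The sketch below uses the illustrative choices [latex]f(x) = x^3\text{,}[/latex] [latex]g(x) = 2^x\text{,}[/latex] [latex]a = 3\text{,}[/latex] [latex]b = -4[/latex] (none of these come from the text):

```python
import math

# Comparing a difference quotient of a f(x) + b g(x) with a f'(x) + b g'(x).
def f(x):
    return x ** 3          # f'(x) = 3x^2

def g(x):
    return 2 ** x          # g'(x) = 2^x ln(2)

a, b, x, h = 3.0, -4.0, 1.0, 1e-6
combined = lambda u: a * f(u) + b * g(u)
numerical = (combined(x + h) - combined(x)) / h
by_rules = a * 3 * x ** 2 + b * 2 ** x * math.log(2)
print(numerical, by_rules)   # the two values agree closely
```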

Let [latex]f[/latex] and [latex]g[/latex] be differentiable functions for which the following information is known: [latex]f(2) = 5\text{,}[/latex] [latex]g(2) = -3\text{,}[/latex] [latex]f'(2) = -1/2\text{,}[/latex] [latex]g'(2) = 2\text{.}[/latex]

Let [latex]h[/latex] be the new function defined by the rule [latex]h(x) = 3f(x) - 4g(x)\text{.}[/latex] Determine [latex]h(2)[/latex] and [latex]h'(2)\text{.}[/latex]

Find an equation for the tangent line to [latex]y = h(x)[/latex] at the point [latex](2,h(2))\text{.}[/latex]

Let [latex]p[/latex] be the function defined by the rule [latex]p(x) = -2f(x) + \frac{1}{2}g(x)\text{.}[/latex] Is [latex]p[/latex] increasing, decreasing, or neither at [latex]a = 2\text{?}[/latex] Why?

Estimate the value of [latex]p(2.03)[/latex] by using the local linearization of [latex]p[/latex] at the point [latex](2,p(2))\text{.}[/latex]

11.

Let functions [latex]p[/latex] and [latex]q[/latex] be the piecewise linear functions given by their respective graphs in Figure 2.1.6. Use the graphs to answer the following questions.

At what values of [latex]x[/latex] is [latex]p[/latex] not differentiable? At what values of [latex]x[/latex] is [latex]q[/latex] not differentiable? Why?

Let [latex]r(x) = p(x) + 2q(x)\text{.}[/latex] At what values of [latex]x[/latex] is [latex]r[/latex] not differentiable? Why?

Determine [latex]r'(-2)[/latex] and [latex]r'(0)\text{.}[/latex]

Find an equation for the tangent line to [latex]y = r(x)[/latex] at the point [latex](2,r(2))\text{.}[/latex]

12.

Consider the functions [latex]r(t) = t^t[/latex] and [latex]s(t) = \arccos(t)\text{,}[/latex] for which you are given the facts that [latex]r'(t) = t^t(\ln(t) + 1)[/latex] and [latex]s'(t) = -\frac{1}{\sqrt{1-t^2}}\text{.}[/latex] Do not be concerned with where these derivative formulas come from. We restrict our interest in both functions to the domain [latex]0 \lt t \lt 1\text{.}[/latex]

Let [latex]w(t) = 3t^t - 2\arccos(t)\text{.}[/latex] Determine [latex]w'(t)\text{.}[/latex]

Find an equation for the tangent line to [latex]y = w(t)[/latex] at the point [latex](\frac{1}{2}, w(\frac{1}{2}))\text{.}[/latex]

Let [latex]v(t) = t^t + \arccos(t)\text{.}[/latex] Is [latex]v[/latex] increasing or decreasing at the instant [latex]t = \frac{1}{2}\text{?}[/latex] Why?

13.

Let [latex]f(x) = a^x\text{.}[/latex] The goal of this problem is to explore how the value of [latex]a[/latex] affects the derivative of [latex]f(x)\text{,}[/latex] without assuming we know the rule for [latex]\frac{d}{dx}[a^x][/latex] that we have stated and used in earlier work in this section.

Use the limit definition of the derivative to show that

\begin{equation*}
f'(x) = a^x \left( \lim_{h \to 0} \frac{a^h - 1}{h} \right)\text{.}
\end{equation*}

Use computing technology and small values of [latex]h[/latex] to estimate the value of

\begin{equation*}
L = \lim_{h \to 0} \frac{a^h - 1}{h}
\end{equation*}

when [latex]a = 2\text{.}[/latex] Do likewise when [latex]a = 3\text{.}[/latex]
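One possible way to run the experiment in part (b) is sketched below; it simply evaluates the quotient for progressively smaller values of [latex]h[/latex] (any computing technology you prefer will work just as well):

```python
# Estimating L = lim_{h -> 0} (a^h - 1)/h by evaluating the quotient
# at smaller and smaller h, for a = 2 and a = 3.
for a in (2, 3):
    for h in (0.1, 0.01, 0.001, 0.0001):
        print(a, h, (a ** h - 1) / h)
```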

Note that it would be ideal if the value of the limit [latex]L[/latex] were [latex]1\text{,}[/latex] for then [latex]f[/latex] would be a particularly special function: its derivative would be simply [latex]a^x\text{,}[/latex] which would mean that its derivative is itself. By experimenting with different values of [latex]a[/latex] between [latex]2[/latex] and [latex]3\text{,}[/latex] try to find a value for [latex]a[/latex] for which

\begin{equation*}
L = \lim_{h \to 0} \frac{a^h - 1}{h} = 1\text{.}
\end{equation*}

Compute [latex]\ln(2)[/latex] and [latex]\ln(3)\text{.}[/latex] What does your work in (b) and (c) suggest is true about [latex]\frac{d}{dx}[2^x][/latex] and [latex]\frac{d}{dx}[3^x]\text{?}[/latex]

How do your investigations in (d) lead to a particularly important fact about the function [latex]f(x) = e^x\text{?}[/latex]