Forrest Cameranesi: Geek of all Trades

On Logic and Mathematics

Logic is the study of the formal, structural relationships between ideas, or between the linguistic encodings thereof. As elaborated in my previous essay on rhetoric, I contrast logic with rhetoric as two complementary studies of the use of language: while rhetoric as I would characterize it is more artistic, concerning itself with the style and presentation of arguments and appealing more personally to passions and feelings, logic is instead more mathematical, concerning itself with the form and structure of arguments and appealing more impersonally to dispassionate thought.

In pursuit of that study of form and structure, logicians create logical or formal systems, which are idealized forms of language that allow the validity of arguments, and the relationships between the statements in those arguments or the ideas those statements express, to be examined independent of the truth or meaning of the specifics of those statements, looking only at their form and structure. Mathematics, meanwhile, is essentially just the application of such logic: a mathematical object is defined by fiat as whatever obeys some specified rules, and then the logical implications of that definition, and the relations of those kinds of objects to each other, are explored in the working practice of mathematics.

On Logic

In the first part of this essay, I will explain many of the standard aspects of existing logical systems, alongside my own original thoughts on what would make for a better logical system.

On Mood

The highest-level aspect of my proposed system of logic, and the most original thought I have on the topic, is a direct application of my thoughts on language that I have previously elaborated. I propose the use of a set of functions to indicate the kind of speech-act being made, especially distinguishing the direction-of-fit aspect of it, so that that part of an expression can be separated from the propositional content of the speech-act, the idea that the speech-act is about. This is primarily because all of the rest of logic is about the relationships between those ideas alone, independent of whatever we might be communicating about some attitudes toward those ideas.

A classic example of a formal logical inference is that from the propositions "all men are mortal" and "Socrates is a man" we can logically infer the proposition "Socrates is mortal". But, I hold, we could equally well infer from the propositions "all men ought to be mortal" and "Socrates ought to be a man" that "Socrates ought to be mortal".

I say that it is really just the ideas of "all men being mortal" and "Socrates being a man" that entail the idea of "Socrates being mortal", and whether we hold descriptive, mind-to-fit-world attitudes about those ideas, or prescriptive, world-to-fit-mind attitudes about them, whether we're impressing or expressing those attitudes, even whether we're making statements or asking questions about them, does not affect the logical relations between the ideas at all.

Structure of Moods

So I propose that rather than treating a statement like "All men are mortal" as one proposition and a statement like "All men ought to be mortal" as another, completely unrelated proposition, we instead take the idea that they have in common, "all men being mortal", and wrap that in a function that conveys what we wish to communicate about some attitude toward that idea.

For example we might write there-is(all men being mortal) to mean "all men are mortal", and be-there(all men being mortal) to mean "all men ought to be mortal"; and generally, write there-is(S) and be-there(S) for the equivalent descriptive and prescriptive statements about the idea of some state of affairs S, whatever S is.

(We might wish to use shorter names for the functions, like simply is() and be(), or some other names entirely; I am merely using the indicative and imperative moods of the copula verb "to be" to capture the descriptive and prescriptive natures of the respective functions.)

As I've written them so far, these are both implicitly statements and impressions, but we could also use a variety of similar functions to differentiate expressions from impressions and questions from statements: for example, prepending an exclamation mark or question mark to differentiate statements from questions, as in !there-is() and ?there-is(), and prepending, say, a right angle bracket for impression or a left angle bracket for expression, as in !>there-is() and !<there-is(). We can think of these punctuation marks themselves as unary functions for which we are simply not writing the parentheses, giving us a total of six functions, in three pairs:

  • !() and ?()
  • >() and <()
  • is() and be()

or a total of eight possible combinations thereof:

  • !>is() ("there is..." something)
  • !>be() ("there should be..." something)
  • !<is() ("I think there is..." something)
  • !<be() ("I think there should be..." something)
  • ?>is() ("is there...?" something)
  • ?>be() ("should there be...?" something)
  • ?<is() ("I wonder if there is..." something)
  • ?<be() ("I wonder if there should be..." something)

to indicate the various different things we might wish to communicate about different attitudes toward the same, single, idea. All of the rest of logic can then deal entirely with such ideas, and the relations between them, without concerning itself with what anyone might be communicating about which of the various possible attitudes toward them. I call these kinds of functions "mood" functions, after their similarity to the concept of linguistic moods, such as the indicative and the imperative.
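As a rough illustration, the eight combinations can be generated mechanically from the three pairs. This is only a sketch: the dictionary names and the glosses attached to each symbol are my own labels for illustration, and an idea is represented as a plain string.

```python
from itertools import product

# Hypothetical glosses for each mood component; the symbols follow the essay,
# the dictionary names are illustrative only.
FORCE = {"!": "statement", "?": "question"}
DIRECTION = {">": "impression", "<": "expression"}
FIT = {"is": "descriptive", "be": "prescriptive"}


def mood(force, direction, fit, idea):
    """Wrap an idea (any string) in a mood, e.g. '!' + '>' + 'is' around 'S'."""
    return f"{force}{direction}{fit}({idea})"


# Every combination of one symbol from each pair, applied to the same idea S.
ALL_MOODS = [mood(f, d, m, "S") for f, d, m in product(FORCE, DIRECTION, FIT)]
```

The point of the sketch is that the idea S stays fixed while only the wrapper around it varies.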


The use of these mood functions also facilitates something superficially resembling the motivations for non-classical types of logic such as paraconsistent logics and intuitionist logics, without actually abandoning the principle that differentiates classical logic from them: the principle of bivalence. The principle of bivalence is the principle that every statement must be assigned exactly one of two truth values, "true" or "false", no more and no less. Intuitionist logics allow for statements to be assigned neither of those truth values, while paraconsistent logics allow for statements to be assigned both of them at the same time.

With these mood functions, similar things can be constructed without actually violating the principle of bivalence, because there is nothing strictly logically prohibiting it being the case that neither is(P) nor is(not-P), if for example P were some kind of descriptively meaningless statement; it is merely necessary, to preserve bivalence, that either is(P) or not(is(P)), but not(is(P)) doesn't have to entail that is(not-P).

Similarly, there is nothing strictly prohibiting it being the case that be(P) and be(not-P), if for example there were some morally intractable situation where both P and not-P were required, and so any outcome was unacceptable; it is merely necessary, to preserve bivalence, that either be(P) or not(be(P)), and be(not-P) doesn't have to entail not(be(P)), so could be compatible with be(P).
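A toy model may make the point concrete. The sketch below assumes, purely for illustration, that each mood-wrapped claim is its own independent atom with an ordinary true-or-false value; the dictionary and helper names are hypothetical.

```python
# Assumption: each mood-wrapped claim is modelled as its own independent atom,
# keyed by (mood, idea). Nothing in this toy model ties is(P) to is(not-P).
valuation = {
    ("is", "P"): False,      # P is not descriptively the case...
    ("is", "not-P"): False,  # ...and neither is not-P (a "gap")
    ("be", "P"): True,       # P is required...
    ("be", "not-P"): True,   # ...and so is not-P (a "glut")
}


def holds(mood, idea):
    """Look up the single truth value assigned to a mood-wrapped claim."""
    return valuation[(mood, idea)]


# Bivalence is intact: every claim has exactly one of the two truth values.
bivalent = all(isinstance(v, bool) for v in valuation.values())
gap = not holds("is", "P") and not holds("is", "not-P")
glut = holds("be", "P") and holds("be", "not-P")
```

Every entry is plainly true or plainly false, yet the model contains both a "neither" case and a "both" case at the level of the wrapped idea.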

Fleshing out the philosophical implications of things like descriptively meaningless claims and morally intractable situations is beyond the scope of this particular essay on logic, other than to point out that a logic of this form is in principle capable of discussing things that are, in a sense, "both true and false" or "neither true nor false", without technically violating the principle of bivalence.

On Mode

The next-highest-level aspect of my proposed logic system is a standard part of many logic systems, but with my own twist on it. It is mode, as in modal logic (not to be confused with mood above). The most common type of modal logic, called alethic modal logic (where "alethic" is a Greek word meaning pertaining to truth), deals with the concepts of necessity and possibility, and their negations contingency and impossibility.

It usually uses two functions, □ and ◇, to represent necessity and possibility, respectively, and negations of them for their negations of course, but either one of those functions is sufficient to represent all of these different modes, because the main two functions of necessity and possibility bear a relationship to each other called De Morgan duality, which just means that each one is equivalent to the negation of the other upon the negation of its argument.

To clarify: necessity is the negation of the possibility of a negation; for something to be necessary means that its negation is not possible; it cannot possibly not be, it has to be, and so it definitely actually is. And conversely, possibility is the negation of the necessity of a negation; for something to be possible means that its negation is not necessary; it doesn't have to not be, it could be, even if in fact it actually is not.

Thus for something to be contingent, or not necessary, means that its negation is possible; it might not be, even if in fact it actually is. And for something to be impossible, or not possible, means that its negation is necessary; it cannot be, it has to not be, and so it definitely actually is not.

But besides that alethic modal logic, there are also other kinds of modal logic, of particular note, deontic modal logic (where "deontic" is a Greek word meaning pertaining to duty), which deals with obligation and permission instead of necessity and possibility. Obligation and permission bear the same relations to each other and to goodness as necessity and possibility bear to each other and to truth.

Obligation is the negation of the permission of a negation; for something to be obligatory means that its negation is not permissible; it may not not be, it morally must be, and so it definitely should be. Conversely, permission is the negation of the obligation of a negation; for something to be permissible means that its negation is not obligatory; it doesn't morally have to not be, even if it might be best if it weren't, and so it still should not be.

Thus for something to be omissible, not obligatory, means that its negation is permissible; it may not be, it's morally okay if it's not, even if it would be good if it were. And for something to be impermissible, not permissible, means that its negation is obligatory; it must not be, and so it definitely should not be.


Modality

Both alethic and deontic modal logics make possible the expression of subtler ideas than can be expressed with simple black-and-white concepts of truth or goodness, respectively.

While something being necessary entails its truth and something being impossible entails its falsehood, something being possible does not entail anything about its truth or falsehood, nor does something being contingent. Possible things might nevertheless be false, and false things nevertheless possible; contingent things might nevertheless be true, and true things nevertheless only contingent; and there are things that are only contingently possible, neither necessary nor impossible, that might be either true or false.

Likewise, while something being obligatory entails its goodness and something being impermissible entails its badness, something being permissible does not entail anything about its goodness or badness, nor does something being omissible. Permissible things might nevertheless be bad, and bad things nevertheless permissible; omissible things might nevertheless be good, and good things nevertheless only omissible; and there are things that are only omissibly permissible, neither obligatory nor impermissible, that might be either good or bad.

I think that failure to really understand or employ modal logic, as well as logical mood as I've defined it above, is behind a lot of the wrong opinions widely held on quite a lot of different philosophical topics.


In traditional deontic modal logic, these obligation and permission functions are usually written with the same □ and ◇ operators used in alethic modal logic, their meaning distinguished only by the surrounding context. But under my system of logic, with the mood operators described above, there is no need for that context, because once we have abstracted the descriptiveness or prescriptiveness of statements away into those mood operators, we are dealing only with the raw idea of whatever state of affairs being or not-being in some variety of contexts.

The usual semantics given to the alethic modal operators is that of "possible worlds": for something to be necessary is for it to be true in all possible worlds, for it to be possible is for it to be true in some possible worlds, for it to be impossible is for it to be true in no possible worlds, and for it to be contingent is for it to be true in not all possible worlds.
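The possible-worlds reading, and the duality between necessity and possibility, can be checked on a small finite model. This is a minimal sketch assuming a hypothetical set of three worlds, with a proposition represented as the set of worlds in which it is true; none of these names come from any standard library.

```python
# Hypothetical toy model: three worlds, and a proposition is the set of
# worlds in which it is true.
WORLDS = frozenset({"w1", "w2", "w3"})


def negation(p):
    """The proposition true in exactly the worlds where p is false."""
    return WORLDS - p


def necessary(p):
    """True in all possible worlds."""
    return all(w in p for w in WORLDS)


def possible(p):
    """True in some possible world."""
    return any(w in p for w in WORLDS)


def impossible(p):
    """True in no possible world."""
    return not possible(p)


def contingent(p):
    """True in not all possible worlds."""
    return not necessary(p)


def duality_holds(p):
    """Check both directions of the necessity/possibility duality for p."""
    return (necessary(p) == (not possible(negation(p)))
            and possible(p) == (not necessary(negation(p))))


rain = frozenset({"w1"})  # a proposition true in one world only
```

Here rain comes out possible and contingent but neither necessary nor impossible, which is the middle ground that a simple true/false split cannot express.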

If we take those □ and ◇ operators to mean not specifically anything about alethic necessity or possibility, nor deontic obligation or permission, but instead as just representing the idea of whatever they are applied to being the case in either all or merely some possible worlds, then when we wrap our descriptive or prescriptive mood functions around them, we automatically get an alethic or deontic logic, both with all the same internal structure.

The descriptive mood function asserts that whatever idea being the case in whatever set of possible worlds is true, yielding necessity, possibility, etc; while the prescriptive mood function asserts that it is good, yielding obligation, permission, etc. For something to be obligatory is for the idea of it being the case in all possible worlds to be good; and for something to be permissible is for the idea of it being the case in some possible worlds to be good. For something to be obligatory is for it to be good for that thing to be necessary; and for something to be permissible is for it to be good for that thing to be possible.


Furthermore, I propose that we can not only use a single operator to replace both of those and operators, but that that single operator can also serve a much broader logical function, and also yield a temporal modal logic, dealing with things being or not being the case differently at different times (which is another traditional kind of modal logic); as well as a spatial modal logic, dealing with things being or not being the case differently at different places. This single operator I propose establishes the scope of contexts wherein a state of affairs is considered to be, and so might be written as something like at().

This function takes two arguments: the first is a set of contexts, such as places, times, or possible worlds, where some state of affairs is considered to be; and the second is the state of affairs itself, e.g. the state of affairs of "all men being mortal". So we might want to talk about, for instance, the idea of all men in Greece specifically being mortal, and so write at(Greece,all men being mortal) to encode that idea. Or if we want to talk about the idea of all men in the past having been mortal, we might write at(the past,all men being mortal) to encode that idea.

But most usefully, if that first argument is just the empty set, what you end up encoding is the idea of that state of affairs never, anywhere, at any time, in any possible world, being the case, which is to say, the idea of it being impossible. With that way of talking about something being impossible, we automatically have a way of talking about it being possible, by negating that formula; and of talking about it being necessary, by applying that impossibility function to the negation of the original state of affairs; and of talking about it being contingent, if we negate that formula in turn. That in turn also yields functions for impermissibility, permissibility, obligatoriness, and omissibility, if we wrap that idea in a prescriptive mood function rather than a descriptive one.
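One way to make this derivation concrete is sketched below. It rests on two assumptions that are mine rather than anything fixed above: a state of affairs is modelled as the set of contexts in which it is the case, and at(contexts, state) is read as "the state is the case in no context outside the given set", so that the empty set yields "nowhere at all".

```python
# Assumptions (illustrative only): a state of affairs is modelled as the
# frozenset of contexts in which it is the case, and at(contexts, state)
# is read as "state is the case in no context outside `contexts`".
CONTEXTS = frozenset({"w1", "w2", "w3"})
NOWHERE = frozenset()


def at(contexts, state):
    """True when every context where `state` holds lies within `contexts`."""
    return state <= frozenset(contexts)


def negation(state):
    """The state of affairs that holds exactly where `state` does not."""
    return CONTEXTS - state


def impossible(state):
    return at(NOWHERE, state)


def possible(state):
    return not at(NOWHERE, state)


def necessary(state):
    return at(NOWHERE, negation(state))


def contingent(state):
    return not at(NOWHERE, negation(state))


snow = frozenset({"w2"})  # hypothetical state holding in one context only
```

All four alethic modes are defined from the single empty-set case plus negation, mirroring the construction in the paragraph above.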

By being more specific about the contexts specified, limiting ourselves to specifying sets of times, we can also say things about some state of affairs being the case at all times or some times or no times, without saying anything about all possible worlds, yielding a temporal modal logic (either an alethic temporal logic or a deontic one depending on what mood function we wrap the idea in). And if we instead limit ourselves to specifying sets of places, we can say things about some state of affairs being the case everywhere or somewhere or nowhere, likewise yielding a spatial modal logic (and again, either an alethic or deontic one, as we like).

With our mood functions allowing us to instead express rather than impress these ideas, we can also easily create things like a doxastic or epistemic modal logic, having to do with things like belief and certainty rather than truth and necessity.

And lastly, by specifying some fraction of possible worlds, or times, or places, this function can also serve to encode statements about probability, to talk about things being likely or unlikely (which is to say, them being the case in many or few possible worlds) rather than strictly necessary or impossible (the case in all or no possible worlds); or, combined with the temporal or spatial possibilities, to talk about things being the case most of the time, most places, etc. The possible modal ideas expressible with this one at() function (plus the mood functions above) vastly outnumber those expressible with the traditional □ and ◇ functions.
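The probabilistic reading can be sketched by counting contexts. The example assumes, hypothetically, ten equally weighted worlds and takes "likely" to mean "the case in more than half of them"; both choices are illustrative rather than part of the proposal itself.

```python
from fractions import Fraction

# Hypothetical model: ten equally weighted possible worlds.
WORLDS = frozenset(range(10))


def share(state):
    """Fraction of worlds in which `state` (a set of worlds) is the case."""
    return Fraction(len(state & WORLDS), len(WORLDS))


def likely(state):
    """Assumed threshold: the case in more than half of the worlds."""
    return share(state) > Fraction(1, 2)


sunny = frozenset({0, 1, 2, 3, 4, 5, 6})  # the case in seven of ten worlds
```

Necessity and impossibility then fall out as the limiting shares of one and zero.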

On Quantification

The next-highest-level aspect in my proposed logic system deals with the topic that the oldest of logic systems were created to deal with: quantification of variables in logical formulae. This was the topic dealt with in Aristotle's original logic system, but it has since been greatly refined.

Aristotle's logic laid out explicit lists of which forms of arguments, or syllogisms, each involving two statements either about all, some, none, or not all members of one category being in another category (e.g. "all men are mortal"), or about specific individuals being in categories (e.g. "Socrates is a man"), were valid or invalid. This captured much of the intuitive sense of logic people have in common discourse, but it had some major problems that have since been remedied.


One of them was an asymmetry in the way "all" statements were treated: in natural discourse, and in Aristotle's system of logic, to say something about all members of some set implies that there are some members of that set. But that asymmetry makes it impossible to completely translate statements about all (or not all) members of some set into statements about some (or not some, i.e. none) of the members of a set.

In natural discourse it's clear that if all A are B, then there must be no A that are non-B. Likewise, if not all A are B, then there must be some A that are non-B. And if some A are B, then not all A are non-B. But under Aristotelian logic, and in natural discourse, it does not follow that if no A are B, then all A are non-B, because it might just be the case that there are no A at all, in which case any "no A..." statement is true, but any "all A..." statement is false, because any "all A..." statement implies (by this old logic) a "some A..." statement that must be false if there are no A.

In contrast, modern systems of logic bite that counterintuitive bullet for the sake of clearer, more workable logical functions, and say that "all A are B" is just exactly equivalent to "no A are non-B", so if there are no A, then any "all A are B" statement is true, but in a trivial way that doesn't mean quite what we naturally want to take it to mean (that there are some A, all of which are B). This makes "all..." and "some..." functions once again De Morgan dual, just like the two traditional modal operators discussed above.
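Python's built-in all() and any() follow the modern convention, which makes the vacuous case easy to see. The domain and predicates below are hypothetical examples chosen only for illustration.

```python
def all_are(domain, is_a, is_b):
    """Modern reading of 'all A are B': no A in the domain is a non-B."""
    return all(is_b(x) for x in domain if is_a(x))


def some_are(domain, is_a, is_b):
    """'Some A are B': at least one A in the domain is a B."""
    return any(is_b(x) for x in domain if is_a(x))


def is_unicorn(x):
    return x == "unicorn"


def is_white(x):
    return x in {"swan", "unicorn"}


def is_not_white(x):
    return not is_white(x)


domain = ["swan", "crow"]  # hypothetical domain containing no unicorns

# With no unicorns in the domain, "all unicorns are white" is trivially true.
vacuous = all_are(domain, is_unicorn, is_white)
```

With no unicorns in the domain, both "all unicorns are white" and "all unicorns are non-white" come out true, while "some unicorns are white" comes out false.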


Another major problem with Aristotelian logic that has since been remedied is the problem of statements involving some mix of "all" (or "every") and "some" statements, such as "every mouse is afraid of some cat". There are two different things that that statement might mean, and Aristotelian logic is unable to distinguish between them: it might mean that for every mouse, there is some cat or another that that particular mouse is afraid of, maybe a different cat for every mouse; or it might mean that there is some particular cat of whom every mouse is afraid, that same one cat frightening every single mouse.

Treating these interpretations as equivalent is behind some major philosophical fallacies of the past: for instance, it would lead one to infer from the premise that everything comes from something (which is to say that for each thing, there is some thing or another from which that thing came, but maybe a different origin for each thing) to the conclusion that there is something from which everything comes (which is to say that there is one particular thing from which all other things came, a singular common origin to everything).

Thankfully that too has already been remedied in contemporary systems of logic, in which two functions are used, ∀ and ∃, which are usually read aloud as "for all..." (or sometimes "for every...") and "there exists some...". These functions turn statements that would otherwise be about individual things into statements about categories of things, by using a variable as the subject of an ordinary statement seemingly about an individual thing, and then quantifying how many values of that variable satisfy the truth of such a statement.

For example, "all men are mortal" would be written ∀m(if m is a man then m is mortal) and read as "for all m, if m is a man then m is mortal", while "some men are Greek" would be written as ∃m(m is a man and m is Greek) and read as "there exists some m such that m is a man and m is Greek".

Statements involving both "all" and "some" functions can then be clarified as to which of their two possible interpretations is meant, by the order in which these functions are used: ∀mouse∃cat(the mouse fears the cat) means "for every mouse there exists some cat such that the mouse fears the cat" (i.e. each mouse has some cat or another that it's afraid of, but they might all be different cats), whereas ∃cat∀mouse(the mouse fears the cat) means "there exists some cat such that for every mouse, the mouse fears the cat" (i.e. there is one particular cat of whom all mice are afraid).

Likewise, the sense of "everything comes from something" that means each thing has some origin or another would be written as ∀thing∃other-thing(the thing comes from the other-thing), while the sense that means there is some particular thing that is the origin of everything else would be written ∃other-thing∀thing(the thing comes from the other-thing), and the two are not logically equivalent.
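The difference that quantifier order makes can be seen on a small example. The mice, cats, and the "fears" relation below are hypothetical data in which each mouse fears a different cat.

```python
mice = {"Jerry", "Speedy"}
cats = {"Tom", "Sylvester"}
# Hypothetical data: each mouse fears a different cat.
fears = {("Jerry", "Tom"), ("Speedy", "Sylvester")}

# For every mouse there is some cat: every mouse fears some cat or other.
each_mouse_fears_some_cat = all(
    any((m, c) in fears for c in cats) for m in mice
)

# There is some cat such that for every mouse: one cat is feared by all mice.
some_cat_feared_by_all = any(
    all((m, c) in fears for m in mice) for c in cats
)
```

On this data the first reading is true and the second false, so the two orderings cannot be equivalent.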


The manner of reading the ∃ symbol aloud is the first change to this aspect of logic that I propose, because I think it implies unnecessary assumptions or at least raises unnecessary questions about the existence of things in a more robust sense than this logical function strictly implies, questions that I will address later in this essay. I think a much better reading of the function is simply "for some...", rather than "there exists some...".

Both functions, ∀ and ∃, only specify how many values of the variable they quantify make the statement that follows true, and the statement doesn't necessarily have to be asserting the existence of anything, so saying that there exists some thing goes beyond what this function really does; ∃ merely says that some value of the variable satisfies the following formula, just like ∀ merely says that every value of that variable satisfies the formula. Furthermore, with both functions being read as "for..." something, we can also more easily implement my next proposal, which is that we can once again make do with just one function to handle this entire aspect of logic and more besides that.

I propose a for() function that takes three arguments, the first being a set of values that some variable can take to satisfy some formula, the second being that variable, and the third being that formula. (This would then be read as "for [these values of] [this variable], [this statement involving that variable is true]").

This replicates some of the functionality of another function frequently used together with the traditional quantification operators, ∈, which properly indicates that whatever is on the left of it is a member of the set on the right of it, but together with the quantification operators is often used to write things like ∀x∈S..., meaning "for every x in set S...", meaning that only the members of S satisfy the formula to follow. Expressions like the usual ∃x∈S... (meaning "for some x in set S...") can also be formed, with this function, by using the equivalent of an "or" function, as I will describe below, on the set in the first argument of for(), to yield an expression meaning "some of this set".

And once again, like with the single operator I proposed for all modal logic above, if the first argument is the empty set, we are left with a special case of this function meaning "for no...", which we can then easily turn into "for some..." by negation, and then turn those two into "for all..." and "for not all..." by applying them to the negations of the formula in the third argument.

E.g. negating "for no m, m is a man and m is mortal", meaning "no men are mortal", gives us "for some m, m is a man and m is mortal", meaning "some men are mortal"; while "for no m is it not the case that if m is a man then m is mortal" means "for all m, if m is a man then m is mortal", or in other words "all men are mortal"; and "for some m, it is not the case that if m is a man then m is mortal", in other words "some men are not mortal", of course means the same thing as "for not all m, if m is a man then m is mortal", or in other words, "not all men are mortal".
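The derivation of the four quantifiers from the empty-set case can be sketched in code under some stated assumptions. Python has no free variables, so the "variable" argument is replaced here by an explicit finite domain; for_(values, domain, formula) is read as "only members of values satisfy the formula"; and the trailing underscore merely avoids Python's for keyword. These are my simplifications for illustration.

```python
# Assumptions (illustrative only): the "variable" argument is replaced by an
# explicit finite domain, and for_(values, domain, formula) is read as
# "only members of `values` satisfy `formula`".
def for_(values, domain, formula):
    return all(x in values for x in domain if formula(x))


def for_no(domain, formula):
    """The empty-set special case: nothing satisfies the formula."""
    return for_(set(), domain, formula)


def for_some(domain, formula):
    """Negation of 'for no'."""
    return not for_no(domain, formula)


def for_all(domain, formula):
    """'For no' applied to the negated formula."""
    return for_no(domain, lambda x: not formula(x))


def for_not_all(domain, formula):
    """Negation of 'for all'."""
    return not for_all(domain, formula)


# Hypothetical domain for the running example.
things = {"Socrates", "Plato", "Zeus"}
men = {"Socrates", "Plato"}
mortals = {"Socrates", "Plato"}

all_men_mortal = for_all(things, lambda x: x not in men or x in mortals)
some_men_mortal = for_some(things, lambda x: x in men and x in mortals)
```

Only for_ is primitive in this sketch; the other four are built from it by negation alone.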

On Predication

The contemporary quantification functions were introduced hand-in-hand with the next aspect of logic I am going to discuss, predication functions, more commonly called propositional functions. A predicate is basically the rest of a proposition after the subject. For example, in "all men are mortal", "are mortal" is the predicate, while "all men" is the subject; and in "Socrates is a man", "Socrates" is the subject, and "is a man" is the predicate. The predicate is basically what a proposition is saying about the subject. Predicating something of a subject is usually taken as equivalent to saying that that thing is a member of some set, the set of all things that predicate is true of.

In contemporary predicate logic, the predicate is treated as a logical function, called the propositional function, and the subject of that predicate treated as its argument: the function upon that argument then yields a specific proposition. For example, the proposition "Socrates is mortal" might be decomposed into the function is-mortal() which indicates that whatever is put into it is mortal, and the subject Socrates, such that is-mortal(Socrates) means "Socrates is mortal". This can then be combined with the quantification functions already discussed above, to encode a proposition like "all men are mortal" as ∀m(if is-man(m) then is-mortal(m)).


My proposal for improving this aspect of logic is the use of a single function to handle predicating membership in any set of any subject, a function that is also capable of predicating a fuzzy, non-binary degree of membership, thus allowing the expression of ideas appropriate to fuzzy logic, which deals with sets to which individuals can be members in degrees somewhere between fully members and not at all members.

This is similar to simply separating the is out from those is-something() functions described above, but because in my entire system of logic we are dealing with ideas independently of the different kinds of attitudes we might have toward them, we want to encode not the idea that e.g. Socrates is mortal, any more than we want to encode the idea that Socrates ought to be mortal, but rather just the idea of Socrates being mortal.

So the function I propose is being(), and it again takes three arguments: the first is a number from zero to one expressing the degree of membership in some set to be predicated of some subject, the second is the set to which that degree of membership is to be predicated, and the third is the subject of which it is to be predicated.

So for example to encode the idea of Socrates being entirely mortal (and noting for ease of reading here that x% is an equivalent way of writing x/100, so 100% = 1 and 50% = 0.5), we might write being(100%,mortal,Socrates); while if we wanted to instead encode the idea of e.g. Hercules being only half-mortal (whatever that might mean), we might instead write being(50%,mortal,Hercules).
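A minimal sketch of how such a function might be evaluated against a model follows. The membership table, the default degree of zero for unlisted pairs, and the choice to compare degrees for exact equality are all assumptions made for illustration; in the essay being() encodes an idea, whereas this sketch checks that idea against a hypothetical model.

```python
# Hypothetical membership table: degree (0.0 to 1.0) to which each subject
# belongs to each set. The values are illustrative only.
MEMBERSHIP = {
    ("mortal", "Socrates"): 1.0,
    ("mortal", "Hercules"): 0.5,
}


def degree_of(set_name, subject):
    """Degree of membership; unlisted pairs default to 0.0 (an assumption)."""
    return MEMBERSHIP.get((set_name, subject), 0.0)


def being(degree, set_name, subject):
    """Whether `subject` belongs to `set_name` to exactly `degree`."""
    return degree_of(set_name, subject) == degree
```

For instance, being(0.5, "mortal", "Hercules") holds in this model while being(1.0, "mortal", "Hercules") does not.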

(In an ideal constructed language, I think those sets would best be principally specified in terms of the output or input of functions, basically as either active or passive verbs, e.g. being(100%,verbing,subject) or being(100%,verbed,subject); with adjectives being formed by inflection to indicate propensity to verb or to be-verbed, e.g. being(100%,verby,subject) or being(100%,verbable,subject); and nouns formed similarly, e.g. being(100%,verber,subject) or being(100%,verbee,subject)).

On Truth Functions

Boolean Junctions

The final aspect of logic that I have yet to discuss is the most basic and fundamental aspect of it, the usual topic of any introductory course on logic: the truth functions like "and", "or", "not", "if-then", and so forth. It is already well-known in contemporary logic that these functions can be readily converted between each other; for example, conjunction ("and") and disjunction ("or") are once again De Morgan duals, where the negation of a conjunction ("not (A and B)") is equivalent to the disjunction of negations ("not-A or not-B"), and conversely the negation of a disjunction ("not (A or B)") is equivalent to the conjunction of negations ("not-A and not-B").

Implication ("if A then B" or "A only if B") in turn is equivalent to a certain kind of conjunction ("not (A and not-B)") which is likewise equivalent to another kind of disjunction ("not-A or B"); the reverse of implication ("A if B"), which I like to call "explication", is likewise equivalent to that conjunction and that disjunction with the negations of their terms likewise reversed ("not(not-A and B)" or "A or not-B").

Bi-implication ("A if and only if B"), which I like to call "complication" (meaning "bending together", as "implication" means "bending into" and "explication" means "bending out of"), is of course the conjunction of both implication and explication; and what's usually called "exclusive disjunction" ("A xor B"), which I prefer to call "displication" (meaning "bending apart"), is the negation of that.

There are still other, much less used, logical functions for saying which if either of two things must or must not be the case together for the entire state of affairs thus encoded to be the case, but the two most important ones are usually called "alternative denial" ("A nand B") and "joint denial" ("A nor B"), though I prefer to call them "disnegation" (meaning "negating apart") and "conegation" (meaning "negating together"). These two functions are important because either one of them can serve as a sole sufficient operator to build any of these functions I have just described, and the equally-many lesser-used ones I haven't even bothered to describe here.
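The claim that joint denial alone suffices can be checked by brute force over the truth table. In the sketch below every connective is rebuilt from nor, and a hypothetical helper named equivalent compares two-place functions on all four rows; the trailing underscores only avoid Python keywords.

```python
from itertools import product


def nor(a, b):
    """Joint denial: neither a nor b."""
    return not (a or b)


# The other connectives rebuilt from nor alone.
def not_(a):
    return nor(a, a)


def or_(a, b):
    return not_(nor(a, b))


def and_(a, b):
    return nor(not_(a), not_(b))


def nand(a, b):
    """Alternative denial, as the negation of the rebuilt conjunction."""
    return not_(and_(a, b))


def implies(a, b):
    return or_(not_(a), b)


def iff(a, b):
    return and_(implies(a, b), implies(b, a))


def xor(a, b):
    return not_(iff(a, b))


def equivalent(f, g):
    """True when f and g agree on every row of the two-variable truth table."""
    return all(f(a, b) == g(a, b) for a, b in product([False, True], repeat=2))
```

For example, equivalent(implies, lambda a, b: not (a and not b)) confirms the equivalence between implication and the negated conjunction described above.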


Besides a few new names and symbols above, my main proposal for improvement in this aspect of logic is the introduction of yet another single, broader function that can serve in place of all of these other functions. I call it the "junction" function, after functions like conjunction and disjunction, but I write it of(), because it takes two arguments, a number and a set, and returns that number of members of that set, and so can be used to mean things like "none of...", "some of...", "all of...", etc. If the number in the first argument is zero, it returns no members of that set, and so is equivalent to the conegation, or joint denial, of all members of that set. That can then be used to construct any of the other functions just like a traditional joint denial function can.

The conegation of a single item is just that item's negation, so this serves straightforwardly in place of "not". The negation of a conegation of several things is equivalent to the disjunction of those several things, i.e. "not (neither A nor B nor C ...)" just means "A or B or C ...", so we have replicated the functionality of "or".

The conegation of the negations of several things is equivalent to the conjunction of those several things, i.e. "neither not-A nor not-B nor not-C ..." just means "A and B and C ...", so we have replicated the functionality of "and". And the negation of such a conjunction is a disnegation, or alternative denial, i.e. "not (neither not-A nor not-B nor not-C ...)" just means "A nand B nand C ...", so we have replicated the functionality of "nand".

And so on: with all of those we can replicate the functionality of implication, explication, complication, displication, and the rest of the truth-functions.
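
The constructions above can be sketched in Python. This is only one possible reading of of(), adopted as an assumption for the sake of illustration: here of(n, items) is taken to be true when exactly n of the given items are true, so that of(0, ...) behaves as conegation. Only the n = 0 case is used below, and the names neg, disj, conj, and disneg are hypothetical.

```python
from itertools import product

def of(n, items):
    """Assumed reading: true when exactly n of the given truth values are true."""
    return sum(1 for item in items if item) == n

def neg(a):
    """Negation: the conegation of a single item."""
    return of(0, [a])

def disj(*xs):
    """Disjunction: the negation of a conegation."""
    return of(0, [of(0, xs)])

def conj(*xs):
    """Conjunction: the conegation of the negations."""
    return of(0, [of(0, [x]) for x in xs])

def disneg(*xs):
    """Disnegation (nand): the negation of that conjunction."""
    return of(0, [conj(*xs)])

def matches_builtins():
    """Compare each construction with Python's operators on all three-term inputs."""
    for a, b, c in product([False, True], repeat=3):
        if neg(a) != (not a):
            return False
        if disj(a, b, c) != (a or b or c):
            return False
        if conj(a, b, c) != (a and b and c):
            return False
        if disneg(a, b, c) != (not (a and b and c)):
            return False
    return True

print(matches_builtins())  # True
```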

With this of() function we can talk about complex sets of things, as all of the truth-functions are equivalent to set operations: disjunction is equivalent to the union of sets (the set of things that are A or B is the set of things in the union of set A and set B), conjunction is equivalent to the intersection of sets (the set of things that are A and B is the set of things in the intersection of set A and set B), and so on.
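
As a small illustration of that correspondence, the following Python sketch picks an arbitrary universe of numbers and two arbitrary sets within it (both are assumptions made up for this example), and checks that selecting members by a truth-function gives the same result as the matching set operation.

```python
# An arbitrary small universe and two arbitrary sets within it.
universe = set(range(1, 13))
A = {x for x in universe if x % 2 == 0}  # the even numbers
B = {x for x in universe if x % 3 == 0}  # the multiples of three

def members_where(predicate):
    """The set of things in the universe for which the predicate is true."""
    return {x for x in universe if predicate(x)}

# "A or B" picks out the union.
union_matches = members_where(lambda x: x in A or x in B) == A | B
# "A and B" picks out the intersection.
intersection_matches = members_where(lambda x: x in A and x in B) == A & B
# "not A" picks out the complement.
complement_matches = members_where(lambda x: x not in A) == universe - A
# "neither A nor B" picks out the complement of the union.
nor_matches = members_where(lambda x: not (x in A or x in B)) == universe - (A | B)

print(union_matches, intersection_matches, complement_matches, nor_matches)
# True True True True
```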

With the being() function above, we can then talk about things being (to some degree or another) members of those sets.

With the for() function above we can then talk about what set of things we're talking about being (to some degree or another) members of those sets.

With the at() function we can then talk about in what contexts we're talking about those sets of things being (to some degree or another) members of those sets.

And finally, with the mood functions discussed at the start of this essay, we can actually say things about whether there are, or whether there ought to be, in those contexts, those sets of things being (to that degree) members of those sets.

On Mathematics

All of this talk of sets segues into the topic of the rest of this essay: mathematics. To most lay people, mathematics is the study of numbers, but to actual mathematicians and philosophers of mathematics, mathematics is the study of a broad variety of things besides numbers including all manner of abstract structures, both in and of themselves and specifically in space and in time.

Mathematics is essentially just the application of pure logic: a mathematical object is defined by fiat as whatever obeys some specified rules, and then the logical implications of that definition, and the relations of those kinds of objects to each other, are explored in the working practice of mathematics. Numbers are just one such kind of objects, and there are many others, but in contemporary mathematics, all of those structures have since been grounded in sets.

The natural numbers, for instance, meaning the counting numbers {0, 1, 2, 3, ...}, are easily defined in terms of sets. First we define a series of sets, starting with the empty set, and then a set that only contains that one empty set, and then a set that only contains those two preceding sets, and then a set that contains only those three preceding sets, and so on, at each step of the series defining the next set as the union of the previous set and a set containing only that previous set. We can then define some set operations (which I won't detail here) that relate those sets in that series to each other in the same way that the arithmetic operations of addition and multiplication relate natural numbers to each other.
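
The series of sets described above can be built directly in Python using frozenset, which is Python's hashable kind of set and so can contain other sets. This is a minimal sketch; the names successor and build_number are made up for the illustration.

```python
def successor(n):
    """The next set in the series: the union of n and the set containing only n."""
    return n | frozenset([n])

def build_number(k):
    """Apply successor k times, starting from the empty set."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n

zero = build_number(0)   # the empty set
one = build_number(1)    # the set containing only the empty set
two = build_number(2)    # the set containing the two preceding sets
three = build_number(3)  # the set containing the three preceding sets

print(len(three))                             # 3
print(three == frozenset([zero, one, two]))   # True
```

Each set in the series has exactly as many members as the number it stands for, and those members are exactly the sets that came before it.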

We could name those sets and those operations however we like, but if we name the series of sets "zero", "one", "two", "three", and so on, and name those operations "addition" and "multiplication", then when we talk about those operations on that series of sets, there is no way to tell if we are just talking about some made-up operations on a made-up series of sets, or if we were talking about actual addition and multiplication on actual natural numbers: all of the same things would be necessarily true in both cases, e.g. doing the set operation we called "addition" on the set we called "two" and another copy of that set called "two" creates the set that we called "four".
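
Here is one way the "two plus two makes four" example might be sketched in Python. The recursive definitions of add and multiply below are the usual textbook ones, supplied here as an assumption since the essay does not spell them out, and the helper names are hypothetical.

```python
def successor(n):
    """The next set in the series: the union of n and the set containing only n."""
    return n | frozenset([n])

def predecessor(n):
    """The previous set in the series: the largest member of a non-empty n."""
    return max(n, key=len)

def add(m, n):
    """m + 0 = m; m + successor(k) = successor(m + k)."""
    if not n:
        return m
    return successor(add(m, predecessor(n)))

def multiply(m, n):
    """m * 0 = 0; m * successor(k) = (m * k) + m."""
    if not n:
        return frozenset()
    return add(multiply(m, predecessor(n)), m)

zero = frozenset()
one = successor(zero)
two = successor(one)
three = successor(two)
four = successor(three)
six = successor(successor(four))

print(add(two, two) == four)        # True
print(multiply(two, three) == six)  # True
```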

Because these sets and these operations on them are fundamentally indistinguishable from addition and multiplication on numbers, they are functionally identical: those operations on those sets just are the same thing as addition and multiplication on the natural numbers.


All kinds of mathematical structures, by which I don't just mean a whole lot of different mathematical structures but literally every mathematical structure studied in mathematics today, can be built up out of sets this way. The integers, or whole numbers, can be built out of the natural numbers (which are built out of sets) as equivalence classes (a kind of set) of ordered pairs (a kind of set) of natural numbers, meaning in short that each integer is identical to some set of equivalent ordered pairs of natural numbers, namely all those pairs that give the same result when the second number is subtracted from the first: the integers are all the things you can get by subtracting one natural number from another.
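
A rough Python sketch of that construction follows. As simplifying assumptions, ordinary non-negative Python integers stand in for the natural numbers, and each equivalence class is represented by a single reduced pair rather than by the whole infinite set of pairs; the function names are made up for this example.

```python
def same_integer(p, q):
    """(a, b) and (c, d) stand for the same integer when a - b equals c - d,
    which can be stated using only addition: a + d == b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c

def canonical(p):
    """Pick one representative of the class: the pair with a zero in it."""
    a, b = p
    m = min(a, b)
    return (a - m, b - m)

def int_add(p, q):
    """(a - b) + (c - d) = (a + c) - (b + d)."""
    (a, b), (c, d) = p, q
    return canonical((a + c, b + d))

def int_multiply(p, q):
    """(a - b) * (c - d) = (a*c + b*d) - (a*d + b*c)."""
    (a, b), (c, d) = p, q
    return canonical((a * c + b * d, a * d + b * c))

minus_three = (0, 3)
print(same_integer(minus_three, (2, 5)))   # True: both stand for "minus three"
print(int_add((2, 5), (7, 3)))             # (1, 0): minus three plus four is one
print(int_multiply((0, 3), (0, 2)))        # (6, 0): minus three times minus two is six
```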

Similarly, the rational numbers can be defined as equivalence classes of ordered pairs of integers in a way that means that the rationals are the things you can get by dividing one integer by another. The real numbers, including irrational numbers like pi and the square root of 2, can be constructed out of sets of rational numbers in a process too complicated to detail here (one standard method uses things called Dedekind cuts, and the result is something called a Dedekind-complete ordered field, where a field is itself a kind of set). The complex numbers, including things like the square root of negative one, can be constructed out of ordered pairs of real numbers.
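
Two of these constructions are simple enough to sketch in Python. As assumptions for the sake of illustration, Python integers stand in for the integers and Python floats stand in for the real numbers, and the function names are hypothetical. The sketch shows when two pairs of integers stand for the same rational, and how multiplying ordered pairs of reals gives a square root of negative one.

```python
def same_rational(p, q):
    """(a, b) and (c, d), with b and d non-zero, stand for the same rational
    when a / b equals c / d, stated using only multiplication: a * d == b * c."""
    (a, b), (c, d) = p, q
    return a * d == b * c

def complex_multiply(z, w):
    """Multiply complex numbers represented as ordered pairs (real, imaginary)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

i = (0.0, 1.0)  # the pair playing the role of the square root of negative one

print(same_rational((1, 2), (3, 6)))  # True: one half is three sixths
print(complex_multiply(i, i))         # (-1.0, 0.0)
```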

And further hypercomplex numbers, including things called quaternions and octonions, can be built out of larger ordered sets of real numbers, which are built out of complicated sets of rational numbers, which are built out of sets of integers, which are built out of sets of natural numbers, which are built out of sets built out of sets of just the empty set. So from nothing but the empty set, we can build up to all manner of complicated, fancy numbers.


But it is not just numbers that can be built out of sets. For example, all manner of geometric objects are built out of sets as well. All abstract geometric objects can be reduced to sets of abstract geometric points, and a kind of function called a coordinate system maps such sets of points onto sets of numbers in a one-to-one manner, which is hence reversible: a coordinate system can be seen as turning sets of numbers into sets of points as well.

For example, the set of real numbers can be mapped onto the usual kind of straight, continuous line considered in elementary geometry, and so the real numbers can be considered to form such a line; similarly, the complex numbers can be considered to form a flat, continuous plane. Different coordinate systems can map different numbers to different points without changing any features of the resulting geometric object, so the points, of which all geometric objects are built, can be considered the equivalence classes (a kind of set) of all the numbers (also made of sets) that any possible coordinate system could map to them.

Things like lines and planes are examples of the more general type of object called a space. Spaces can be very different in nature depending on exactly how they are constructed, but a space that locally resembles the usual kind of straight and flat spaces we intuitively speak of (called Euclidean spaces) is an object called a manifold, and such a space that, like the real number line and the complex number plane, is continuous in the way required to do calculus on it, is called a differentiable manifold. Such a differentiable manifold is basically just a slight generalization of the usual kind of flat, continuous space we intuitively think of space as being, and it, as shown, can be built entirely out of sets of sets of ultimately empty sets.

Meanwhile, a special type of set defined such that any two elements in it can be combined through some operation to produce a third element of it, in a way obeying a few rules that I won't detail here, constitutes a mathematical object called a group. A differentiable manifold, being a set, can also be a group, if it follows the rules that define a group, and when it does, that is called a Lie group.
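
For a small finite set, the rules that define a group can be checked by brute force. The Python sketch below assumes the standard textbook rules (closure, associativity, an identity element, and an inverse for every element), which the essay itself leaves undetailed; the function name is_group is made up for this example. A finite set like this is of course not a differentiable manifold, so it illustrates only the "group" half of a Lie group.

```python
from itertools import product

def is_group(elements, op):
    """Check the four standard group rules by trying every combination."""
    elements = list(elements)
    # Closure: combining any two elements gives an element of the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Associativity: the grouping of repeated combinations does not matter.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Identity: some element leaves every element unchanged.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # Inverses: every element can be combined with some element to give the identity.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

print(is_group(range(5), lambda a, b: (a + b) % 5))     # True
print(is_group(range(5), lambda a, b: (a * b) % 5))     # False: 0 has no inverse
print(is_group(range(1, 5), lambda a, b: (a * b) % 5))  # True
```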

Also meanwhile, another special kind of set whose members can be sorted into a two-dimensional array constitutes a mathematical object called a matrix, which can be treated in many ways like a fancy kind of number that can be added, multiplied, etc. A square matrix (one with its dimensions being of equal length) of complex numbers that obeys some other rules that I once again won't detail here is called a unitary matrix. Matrices can be the "numbers" that make up a geometric space, including a differentiable manifold, including a Lie group, and when a Lie group is made of unitary matrices, it constitutes a unitary group.

And lastly, a unitary group that obeys another rule I won't bother detailing here is called a special unitary group. This makes a special unitary group essentially a space of the kind we would intuitively expect a space to be like – locally flat-ish, smooth and continuous, etc – but where every point in that space is a particular kind of square matrix of complex numbers, that all obey certain rules under certain operations on them, with different kinds of special unitary groups being made of matrices of different sizes.
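
The rules left undetailed above can be made concrete for the smallest interesting case, matrices of size two by two. The Python sketch below assumes the standard definitions: a matrix is unitary when its conjugate transpose multiplied by itself gives the identity matrix, and it is special unitary when in addition its determinant is one. The function names and the example matrices are chosen only for illustration.

```python
def conjugate_transpose(m):
    """Swap rows with columns and take the complex conjugate of each entry."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Ordinary matrix multiplication for square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det2(m):
    """Determinant of a two-by-two matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_unitary(m, tol=1e-9):
    """True when the conjugate transpose times m is the identity, within tol."""
    p = matmul(conjugate_transpose(m), m)
    n = len(m)
    return all(abs(p[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

def is_special_unitary(m, tol=1e-9):
    """True when m is unitary and its determinant is one, within tol."""
    return is_unitary(m, tol) and abs(det2(m) - 1) < tol

u = [[0.6 + 0j, 0.8j], [0.8j, 0.6 + 0j]]
v = [[1j, 0j], [0j, -1j]]
w = [[1j, 0j], [0j, 1j]]  # unitary, but its determinant is -1

print(is_special_unitary(u))             # True
print(is_special_unitary(matmul(u, v)))  # True: products stay inside the group
print(is_unitary(w), is_special_unitary(w))  # True False
```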

On Abstract Objects

I have hastily recounted here the construction of this specific and complicated mathematical object, the special unitary group, out of bare, empty sets, because that special unitary group is considered by contemporary theories of physics to be the fundamental kind of thing that the most elementary physical objects, quantum fields, are literally made of. Excitations of those quantum fields, which is to say particular states of those special unitary groups, constitute the fundamental particles of physics, which combine to make atoms, molecules, stars, planets, living cells, and organisms, including us, so in a very distant way we can be said to be made of empty sets.

(And as all of the truth functions, and so all the set operations, and all the other functions built out of set operations, can be built out of just conegation, and the objects they act upon are built up out of empty sets, everything can in a sense be said to be made out of negations of nothing).

In the same way that we constructed a series of sets above that behave exactly like the natural numbers, and so are indistinguishable from and thus identical to them, so too can we construct complicated mathematical objects like this that behave indistinguishably from the fundamental constituents of reality and so are, for all intents and purposes, identical to them. And it is not a special feature of contemporary physics that says reality is made of mathematical objects; rather, it is a general feature of mathematics that whatever we find things in reality to be doing, we can always invent a mathematical structure that behaves exactly, indistinguishably like that, and so say that the things in reality are identical to that mathematical structure.

If we should find tomorrow that our contemporary theories of physics are wrong, it could not possibly prove that those features of reality are not identical to some mathematical structure or another; only that they are not identical to the structures we thought they were identical to, and we need to better figure out which of the infinitely many possible structures we could come up with they are identical to. We just need to identify the rules that reality is obeying, and then define mathematical objects by their obedience to those same rules. It may be hard to identify what those rules are, but as previously described in my essay against relativism, we can never conclusively say that reality simply does not obey rules, only that we have not figured out what rules it obeys, yet.

The mathematics is essentially just describing reality, and whatever reality should be like, we can always come up with some way of describing it. One may be tempted to say that that does not make the description identical to reality itself, as in the adage "the map is not the territory". In general that adage is true, and we should not arrogantly hold our current descriptions of reality to be certainly identical to reality itself. But a perfectly detailed, perfectly accurate map of any territory at 1:1 scale is just an exact replica of that territory, and so is itself a territory in its own right, indistinguishable from the original.

And likewise, whatever the perfectly detailed, perfectly accurate mathematical model of reality should turn out to be, that mathematical model is a reality: the features of it that are perfectly detailed, perfectly accurate models of people like us would find themselves experiencing it as their reality exactly like we experience our reality. Mathematics "merely models" reality in that we don't know exactly what reality is like and we're trying to make a map of it. But whatever model it is that would perfectly map reality in every detail, that would be identical to reality itself. We just don't know what model that is.

There necessarily must be some rigorous formal (i.e. mathematical) system or another that would be a perfect description of reality. The alternative to reality being describable by a formal language would be either that some phenomenon occurs, and we are somehow unable to even speak about it; or that we can speak about it, but only in vague poetic language using words and grammar that are not well-defined.

I struggle to imagine any possible phenomenon that could cause either of those problems. In fact, it seems to me that such a phenomenon is, in principle, literally unimaginable: I cannot picture in my head some definite image of something happening, yet at the same time not be able to describe it, as rigorously as I should feel like, not even by inventing new terminology if I need to. At best, I can just kind of... not really definitely imagine anything in particular.


All of this is building up to me addressing the central question in the philosophy of mathematics, which is about the existence of abstract objects, like numbers and everything else that I've been discussing above. There are two main answers to that question, and some positions intermediate to the two, but I want to offer a position that I consider to be off of that spectrum entirely.

One of the usual two positions is Platonism, sometimes called either Platonic realism or Platonic idealism, which holds that abstract objects, or as Plato called them "forms" or "ideas", are real in the same sense that concrete objects, like rocks and trees and tables and chairs, are real; but that they don't exist in our space and time, and instead live in some separate, spaceless, timeless realm, from which they are somehow connected with the things in our realm that "partake" of them, in the way that a triangular rock "partakes of the form of the triangle".

It is held by Platonists that the existence, in some way, of these abstract objects is necessary in order for mathematical and other abstract statements that seem nominally to be about them to be true: for instance, the Pythagorean theorem, which describes the relations of the legs of a right triangle to the length of its hypotenuse, is not made true by the existence of any particular triangular objects, but rather by facts about the form of triangles generally, even if no concrete triangular objects existed at all.

I am not very amenable to this position at all, holding it to fall heavily afoul of the principles I laid out in my previous essay against transcendentalism.

The second of the usual two positions is called nominalism, which holds that abstract objects are merely empty names, that do not refer to real things that exist at all, and are just names for the similar properties of, and collections of, particular concrete objects. I am much more amenable to that position generally, but I think that a kind of existence can nevertheless be applied to abstract objects after all, a kind of existence abstracted away from the more familiar notion of concrete existence.

In the most restricted sense, one could say "only what I am experiencing right here right now exists". Everything else that we talk about existing is some degree of inference and abstraction away from that. There is a position in the philosophy of time, called presentism, that holds that only the present exists, neither the past nor the future. I agree with them to the extent that in a sense only the present exists: only the present presently exists, right now.

But a part of what I'm experiencing right now in the present is memory, from which I infer (automatically, intuitively, without thinking about it) the existence of other times, having an experience of moving between different times, from those remembered past times and toward projected future times, and there is a perfectly serviceable sense in which I can say that those other times "exist" in a timeless sense of the word: they don't exist now, presently, for sure, but they still exist at other times.

And in that "movie", so to speak, of my past, present, and future experiences that I have now inferred, I have the experience of seeming to move around different places, so I further infer that other places exist too, besides just the here that I am experiencing now. Like with presentism, only the place I am at exists here, but those other places can still reasonably be said to exist elsewhere.

In this way, a spatiotemporal kind of existence is already abstracted away from the more primitive kind of existence relevant to my local, present experiences. But beyond that, some philosophers such as David Lewis hold, and I agree, that other possible worlds, like the kind that we use to make sense of talk of alethic modalities like necessity and possibility, really exist, and aren't just useful fictions, even though they don't actually exist, because "actual" is an indexical term like "present" or "local": it refers to things relative to the person using the word. Just as other times don't presently exist but are still real in a more abstract sense, so too, on this account, other possible worlds don't actually exist, because "actually" means "in the possible world I am a part of", but they are nevertheless still real in a still more abstract sense.

Likewise, to finally get on to my point about the existence of mathematical objects, since we can in principle equate our concrete universe with some mathematical structure or another, and that mathematical structure definitely concretely exists (because it just is the concrete universe), we can say that other mathematical structures, i.e. abstract objects, don't concretely exist – because "concretely" is indexical, like "actually", it means "as a part of the mathematical structure that is our universe" – but they can nevertheless be reasonably called "real" in some even broader sense, the most abstract sense possible: they abstractly exist.

This position is held by physicist Max Tegmark, and he calls it the "ultimate ensemble"; it is more broadly called the mathematical universe hypothesis, or mathematicism, and it has precursors tracing back to the Pythagorean philosophers of ancient Greece.

This kind of existence for abstract objects does not run afoul of my position against transcendentalism the way that Platonism does, because the abstract objects don't exist in some wholly different kind of way separate from the kind of concrete objects that we can empirically observe. They are just the loosest part of the broader framework of explanation for our empirical observations.

We cannot directly observe other times or places, only the local present, but postulating the existence of other times and places helps to explain the patterns in our local, present experiences. Those other times and places aren't held to be discontinuous or of a completely different nature than the local present, they are just postulated extensions of the here and now. Likewise, I hold, with postulating other possible worlds, continuous with the one we find ourselves in and of the same nature as it; and also likewise with other abstract objects besides whichever one is identical with the concrete universe, continuous with it and of the same nature as it.

But still, that last step into abstract rather than concrete existence is a significant one, comparable to the difference between being proficient and being beneficent explored in my previous essay on rhetoric and the arts. Pleasures may be the only intrinsic goods, but other things can be instrumentally good for their usefulness in achieving pleasures (comparable to unobservable other places, times, and possible worlds being indirectly observable for their effects on directly observable things). That last step, however, to just being good at achieving something without regard for whether that something is instrumental to attaining pleasures, is comparable to this last step into abstract existence finally being independent of any particular observables.

(Also comparable to my views on rhetoric and the arts is my answer to the question of whether we are discovering or merely inventing these abstract objects, for as you will recall in that preceding essay I argue that there is no clear distinction between invention and discovery of abstract possibilities, be they mathematical ones or artistic ones).

This view of the relation between the concrete and abstract also bears a similarity to what Immanuel Kant called the phenomenal and the noumenal, where on his account we cannot ever have direct experiential contact with noumena, but instead only project our ideas about them behind the world of phenomena that we experience, much like how on my account the truly abstract has no direct influence on the concrete world we experience, and we can only project our ideas of abstract objects behind that concrete world in an attempt to understand and explain it.


Continue to the next essay, On Ontology, Being, and the Objects of Reality.