Forrest Cameranesi Geek of all Trades

On Epistemology, Belief, and the Methods of Knowledge

Epistemology (from the Greek word episteme, meaning "knowledge") is the study of knowledge. It is about what it means to know something, and when and why beliefs are justified, or what the correct method of deciding what to believe is. This essay will mostly address knowledge as an attribute of a person, what it means for someone to be knowledgeable and how one can come to believe knowledgeably, rather than the institutional knowledge of an entire society, which will be addressed in a later essay on education.

Defining Knowledge

The traditional philosophical definition of knowledge, dating back at least to Plato, is that knowledge is justified true belief. That is to say that it is not enough merely to believe something to be the case, and it is not even enough for that belief to turn out to be true, but for someone to know something they must also have a justification for their belief, a reason to believe it, because it would not constitute knowledge to simply guess at an answer to a question (or otherwise come to believe it for insufficient reason) and just by luck turn out to be right.

Edmund Gettier has since argued that even justified true belief is not enough to constitute knowledge. Reasons to believe something can be imperfect: they can support beliefs that nevertheless turn out to be false, yet we still want to say that someone can be justified in believing something for such reasons. But if justification can be imperfect in that way, then someone could be justified in believing something that might well have been false and that, by an unrelated coincidence, happens to be true, just not for the reasons justifying the belief. In such cases we would not want to say that the person has knowledge, because they were merely misled by imperfect justification into a belief that turned out to be true by luck.

This problem can trivially be remedied by insisting that only perfect justification, the kind that guarantees the truth of something, is good enough to turn true belief into knowledge; but that would imply that knowledge of almost any substantial topic, where such certainty cannot be obtained, is thereby impossible.

My response to this problem is similar to that of Robert Nozick: I say that knowledge is believing something because it is true, such that not only does one believe it, and it is true, but if it weren't true one wouldn't believe it. This last condition can, I think, be considered a different sense of "justification" from the usual one, and so salvage the traditional definition of knowledge, albeit only by turning the concept of justification on its head, which I argue needs to be done anyway to have a workably rational method of deciding what to believe.

Critical Epistemology

Fideism vs Skepticism

As should be expected from the positions already argued for in my previous essays against dogmatism and against cynicism, and summarized in my previous essay about commensurablism, my general position on the methods of knowledge is critical liberalism. That is to say, I hold that rather than by default rejecting all beliefs until reasons can be found to justify them, any belief should be considered justified enough by default to be tentatively held until reason can be found to reject it.

It is only when one wishes to assert one belief over another that reasons need to be presented to show the other belief to be in some way wrong; and that alone does not in turn show that the proposed alternative is the one unique correct alternative, only that some alternative is needed, with the one put forth being merely one possibility. In this manner, knowledge-building is not, on my account, about starting from nothing and building up to grander and grander certainties piece by piece, but rather about starting with limitless possibilities, yet no certainty as to which of them is correct, and then embarking on a never-ending process of narrowing down the range of possibilities by eliminating those that can be shown to be incorrect.

The "liberal" aspect of this, at least initially granting warrant to hold any belief for any (or no) reason, rather than only beliefs positively justified in some particular way, is most of the way to the view called "epistemic anarchism", advocated by the likes of Paul Feyerabend. But on my account there is still the "critical" aspect counterbalancing that, the requirement that you be open to revising any beliefs, however you formed them, and not hold any of them beyond all question. This – and the empiricism that it entails, as explained in my essay against transcendentalism – is the only thing that I think distinguishes valid, scientific epistemologies from invalid, non-scientific ones; contra Feyerabend, who claims there can be no such demarcation at all.

This epistemological view is more generally known as criticism or critical rationalism, or, as applied to a narrower set of beliefs about empirical phenomena, as falsificationism; and it has been promoted by philosophers such as Immanuel Kant and Karl Popper.

Types of Knowledge

Philosophers make several distinctions between different kinds of knowledge, most notably the distinction between synthetic and analytic knowledge, and the distinction between a posteriori and a priori knowledge.

Analytic knowledge is knowledge about words or other abstract symbols and their meanings, independent of any investigation of the concrete world of experience. The classic example is the knowledge that all bachelors are unmarried, which depends only on knowing that the definition of a bachelor involves being unmarried, and so can be known just from that relationship between the words, even if you don't know what a bachelor is or what marriage is besides that marriage, whatever that means, is something bachelors, whatever those are, by definition haven't done. Synthetic knowledge, on the other hand, is knowledge about things besides just the meaning of words.

Similarly, a priori knowledge is knowledge that can be had before any particular experience of the world, knowledge that depends only on understanding concepts and their relationships to each other. Classical examples of a priori knowledge are mathematical facts, especially geometric ones, such as that the sum of the squares of the legs of a right triangle is equal to the square of that triangle's hypotenuse, which can be deduced just from thinking about imaginary triangles and squares and how they could or couldn't conceivably relate to each other, with no examination of actual triangular or square objects required. A posteriori knowledge, on the other hand, is knowledge that can only be had from some kind of experience of the world.

For much of the history of philosophy, these distinctions were considered synonymous, with all and only analytic knowledge, about the meaning of words, being held to be obtainable a priori, without any investigation of the world; and conversely, all and only synthetic knowledge, about things besides the meaning of words, being held to be obtainable a posteriori, via an investigation of the outside world, not just armchair thought experiments. But Immanuel Kant argued, as I agree, that these distinctions are orthogonal to each other, and that there is such a thing as synthetic a priori knowledge, into which category he held knowledge of mathematics and much of what was considered metaphysics to fall.

For example, in the geometric example given above of a priori knowledge about triangles and squares, there is no dependency on the meaning of the words "triangle" or "square", or any definition of one word in terms of the other; one need only be able to imagine the things that those words happen to refer to, without needing any words for them at all, to work out that that particular relationship between them is necessary and certain, and could not possibly be otherwise. (Although it is conversely possible to define triangles and squares in terms of more primitive mathematical objects in such a way that that relationship between them is logically entailed by their definitions alone, without needing to understand the synthetic meaning of any of the words in terms of imaginable geometric shapes.)

I am not aware of Kant or other philosophers discussing the implication of an analytic a posteriori kind of knowledge by that name, but in Naming and Necessity Saul Kripke addresses at least very similar topics (such as how the fact that two names refer to the same thing, and so the things they refer to are necessarily identical, is only known a posteriori). And I hold that examining such a category is necessary for a complete understanding of knowledge.

For while all a priori truths are necessary truths, since synthetic a priori truths are internal to the mind (hence a priori), but not in terms of publicly established relationships of words (hence synthetic), they are not interpersonally relatable, and so are only a matter of private knowledge, not public discourse. The only necessities that can be treated publicly are those phrased in terms of words with assigned meaning, analytic a priori truths; but those in turn depend on the analytic a posteriori (and thus contingent) assignment of such meaning.

On Synthetic Knowledge

Like many empiricist philosophers, such as David Hume, I hold that synthetic a posteriori knowledge is in a sense the primary kind of knowledge, both in that it is the kind that we are most concerned about in trying to understand what concrete reality is like, and in that it is where we get the basic ideas that we can then extrapolate further ideas from in our imaginations, which become the basis of synthetic a priori knowledge.

Against Confirmationism

An immediate consequence of my critical rationalist epistemology is the rejection of a view called confirmationism, or more specifically hypothetico-deductive confirmationism (as some critics of this view continue to call their alternatives "confirmationism" anyway). That is the common view that if a belief has implications about what else one should expect to find true, and those expectations are later borne out, that confirms the original belief, or in other words gives further reason to continue holding that belief.

Falsificationists and critical rationalists more generally, including myself, hold this to be straightforwardly a case of a logical fallacy called affirming the consequent: given a conditional statement of the form "if P then Q", it does not then follow that "if Q then P", so even if it's true that if P then Q, and you find that the consequent Q is indeed the case, that does not thereby imply that the antecedent P must be the case. It might be, but it just as easily might not be, and to suggest that it must be, just because the consequent Q was found to be true, is to commit the fallacy of affirming the consequent.
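The invalidity of that inference form can be checked mechanically with a truth table. The following is only a minimal sketch, assuming classical two-valued logic; the helper names `implies` and `is_valid` are hypothetical, introduced here purely for illustration. It contrasts affirming the consequent with modus tollens, the valid form on which falsification relies.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material conditional: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q


def is_valid(premises, conclusion) -> bool:
    """An argument form is valid if no truth assignment makes every
    premise true while making the conclusion false."""
    for p, q in product([True, False], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False  # found a counterexample
    return True


# Affirming the consequent: "if P then Q; and Q; therefore P".
affirming_the_consequent = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: q],
    conclusion=lambda p, q: p,
)

# Modus tollens: "if P then Q; and not-Q; therefore not-P".
modus_tollens = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: not q],
    conclusion=lambda p, q: not p,
)

print(affirming_the_consequent)  # False
print(modus_tollens)  # True
```

The counterexample that the check finds for affirming the consequent is the assignment where P is false and Q is true: both premises hold, yet the conclusion does not.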

The classical example of this is that if it were true that all swans were white, then any particular swan encountered would be found to be white; but encountering a particular white swan, or even many particular white swans, does not thereby prove that all swans are white, because it might still turn out to be the case (as in fact it is) that some swans are black, no matter how many white swans you've seen. (Indeed, as Carl Hempel points out, if that form of inference were valid, then because "all swans are white" and "all non-white things are non-swans" are logical equivalents, called contrapositives, the observation of any non-white non-swan, such as a green leaf or a red rock, would also count as evidence that all swans were white, which is intuitively absurd).

Thus one can never in any way positively confirm any beliefs to be true, just by finding that everything else so far seems to accord with those beliefs, because any new piece of evidence might always be the one to show those beliefs false. Beliefs can only be shown false, or not yet shown false; never positively shown true.

Note, however, that this distinction is not at all about whether the belief being tested is phrased as a negation of something else or not. You can always rephrase something as just a different term that doesn't involve negation: for example, "natural" and "artificial" can be taken as negations of each other, and either tested for without ever saying "not-" the other; one could tell that something was natural by seeing that the expected consequences of it being non-natural, in other words artificial, were false. The distinction between confirmation and falsification is entirely about whether you're deriving support for something from observing its expected consequences (confirmation), or from observing things contrary to the expected consequences of its negation (falsification).

Note as well that it is never merely one particular belief in isolation that is conclusively falsified, but only whole systems of belief, an insight that Willard Quine calls "confirmation holism". As Quine points out, what we think we have observed is contingent upon the broader web of beliefs through which we interpret that observation. E.g. if we find ourselves seeming to see a black swan, we could reject the hypothesis that all swans are white, or we could suppose that we are not really seeing a black swan, only some trick or illusion, or any other explanation for why we seem to see a black swan, but really don't. So all that the observation falsifies is the conjunction of all our beliefs together: that all swans are white, and that we are in fact seeing a black swan, and everything else that that implies. Exactly which of those beliefs to change, and so what different beliefs to adopt instead, always remains an open question.

On Probability

None of this is to imply that all beliefs not yet shown false are equal. Beliefs not yet shown false can still be more or less probable than others, as calculated by methods such as Bayes' theorem. Falsification itself can be considered just an extreme case of showing a belief to have zero probability: if you are frequently observing phenomena that your beliefs say should be improbable, then that suggests your beliefs are epistemically improbable (i.e. likely false), and if you ever observe something that your beliefs say should be impossible, then your beliefs are epistemically impossible (i.e. certainly false).
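To make the link between improbability and falsification concrete, here is a small sketch of a Bayesian update. The function name `posterior` and all of the numbers are hypothetical, chosen only for illustration, and the sketch assumes the evidence has nonzero overall probability.

```python
def posterior(prior_h: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E).

    P(E) comes from the law of total probability and is assumed to be nonzero.
    """
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    return p_e_given_h * prior_h / p_e


prior = 0.5  # hypothetical starting credence in the belief H

# H says E should be improbable (5%), while not-H makes E likely (60%).
after_improbable = posterior(prior, p_e_given_h=0.05, p_e_given_not_h=0.60)

# Observing such "improbable" evidence repeatedly keeps eroding the belief.
credence = prior
for _ in range(3):
    credence = posterior(credence, p_e_given_h=0.05, p_e_given_not_h=0.60)

# H says E should be impossible: a single observation of E takes H to zero.
after_impossible = posterior(prior, p_e_given_h=0.0, p_e_given_not_h=0.60)

print(after_improbable)
print(credence)
print(after_impossible)  # 0.0
```

In this toy setup the belief never quite reaches zero through merely improbable observations, however many there are; only an observation that the belief rules out entirely falsifies it outright.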

The traditional phrasing of a falsificationist inference from some evidence E to some hypothesis H is of the form "if not-H then not-E; and E; therefore H", or equivalently "H if E; and E; therefore H". A probabilistic rendering of that same direction of inference would be "the probability of H given E, times the probability of E, equals the probability of H". That is almost identical to a formula equivalent to Bayes' theorem: "the probability of H given E, times the probability of E, divided by the probability of E given H, equals the probability of H".

That additional term that Bayes' theorem divides by is the same as the probabilistic rendering of a confirmationist inference. The traditional phrasing of a confirmationist inference from some evidence E to some hypothesis H is of the form "if H then E; and E; therefore H", or equivalently "E if H; and E; therefore H". A probabilistic rendering of that same direction of inference would be "the probability of E given H, times the probability of E, equals the probability of H". The last two terms of that are the same as the probabilistic rendering of a falsificationist inference, which is identical to the aforementioned formulation of Bayes' theorem, except that Bayes' theorem also divides by the first term of this probabilistic rendering of a confirmationist inference.

In other words, not only are the results of Bayes' theorem proportional to the results that falsificationism would give, they are furthermore inversely proportional to the results that confirmationism would give.
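For reference, the rearranged form of Bayes' theorem quoted above follows directly from the definition of conditional probability. The notation here is an assumption introduced only for this aside: P(H | E) stands for "the probability of H given E", and the final step assumes that P(E | H) is nonzero.

```latex
% Definition of conditional probability, applied in both directions:
\[
  P(H \mid E)\,P(E) \;=\; P(H \land E) \;=\; P(E \mid H)\,P(H)
\]

% Dividing both sides by P(E \mid H), assuming P(E \mid H) > 0:
\[
  P(H) \;=\; \frac{P(H \mid E)\,P(E)}{P(E \mid H)}
\]
```

The numerator is the product used in the falsificationist rendering, and the denominator is the likelihood term that leads the confirmationist rendering.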

When it comes to practical decision-making, it is often most reasonable to act on the beliefs that have such a greater probability, to ensure a greater chance of success. But it is not epistemically wrong to believe something that is unlikely but not actually shown false yet, and as falsificationists like Popper have argued, it is in some ways even better to do so.

That is because beliefs that are more specific and detailed, having higher information content, are inherently less likely to be true – or conversely put, a belief that is so broad and general that it could not possibly be false accomplishes that by claiming nothing of substance at all, leaving no claims open to falsify. So such unlikely, high-information beliefs that, nevertheless, still have not been falsified, have automatically withstood much more testing than those that put forward nothing to test. And it is only by taking such risks, sticking our necks out and risking being wrong, that we can hope to find out more about what is wrong, and so narrow in further still on what in turn might still be right.

In general, I hold, we should tentatively adopt more specific and so risky beliefs when we can afford to risk being wrong, but when we cannot afford that risk, we should act in accordance with those beliefs that have the greatest probability of being true.

On Parsimony

This notion of the information density of beliefs dovetails into another prominent principle of belief-formation: the principle of parsimony, commonly called Ockham's Razor, after William of Ockham, who coined a well-known formulation of it. This principle states that if given multiple beliefs or theories or abstract models that all concord equally well with the evidence at hand, the simplest of them should be preferred. Casually phrased, the principle states that it's preferable if you can speak more truth in fewer words.

I agree with this principle in general, and find that it relates in more detail to what Thomas Kuhn described as the structure of scientific revolutions. Kuhn's account was a sociological description of how he observed scientific revolutions to have actually occurred in practice, which he attributed to non-rational social and personal factors more than to adherence to any kind of rational principles. But I think it can be adapted into a more advanced version of the principle of parsimony and used as a normative epistemological rule for how science, and belief-formation more generally, should rightly be conducted.

Kuhn observed science to advance in phases. The process begins with what he called "pre-science" where there are no prevailing theoretical frameworks or approaches, or "paradigms", uniting the research efforts, but rather many different competing approaches. Then it proceeds into what he called "normal science" once such a paradigm has been established by the widespread success of some theory. And then, as new evidence that cannot successfully be incorporated into that paradigm builds up, a period of what he called "revolutionary science" occurs wherein once again many different approaches compete in the effort to reconcile all those anomalies together with all of the older evidence into a new paradigm, re-establishing a new period of normal science.

I think that this general process can be turned into a normative principle for belief-formation by formulating it in terms of parsimony, or in terms of how complex a theory needs to be to account for the evidence at hand; because the entire point of coming up with theories, abstract models to believe, is to have an easier, simpler representation of reality to work with than just the whole body of raw observational evidence thus far accumulated. In essence, the point of a theory is to compress the data of the observational evidence into a more compact formula while retaining the same informational content, as in the field of algorithmic information theory; and a more parsimonious theory is precisely one that can better compress the same data with no greater loss of information.
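As a rough illustration of the compression analogy, the sketch below compares how well a general-purpose compressor handles "lawful" data generated by a simple rule versus data with no short rule behind it. It rests on loud assumptions: the ideal measure in algorithmic information theory (Kolmogorov complexity) is not computable, so Python's `zlib` is used only as a crude stand-in, and the data sets and variable names are hypothetical.

```python
import random
import zlib

n = 7000  # number of one-byte "observations" in each data set

# "Lawful" observations: every value follows from one simple repeating rule.
lawful = bytes(i % 7 for i in range(n))

# "Lawless" observations: pseudo-random bytes with no short rule behind them
# (seeded so that the run is repeatable).
rng = random.Random(0)
lawless = bytes(rng.randrange(256) for _ in range(n))

lawful_packed = zlib.compress(lawful, 9)
lawless_packed = zlib.compress(lawless, 9)

# The compression is lossless: the full original data can be recovered,
# so no informational content has been given up.
assert zlib.decompress(lawful_packed) == lawful
assert zlib.decompress(lawless_packed) == lawless

lawful_size = len(lawful_packed)
lawless_size = len(lawless_packed)

print(n, lawful_size, lawless_size)
```

The lawful data shrinks to a small fraction of its raw size, while the lawless data does not shrink at all. On this analogy, a more parsimonious theory plays the role of a better compressor for the same body of evidence.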

On my normative account, the period of pre-science is one in which no single theory has yet been devised that can account for all of the evidence at hand, and so there is no better, easier, simpler, more parsimonious way of describing all the evidence than many different theories used in a patchwork to account for each of the disjointed areas of evidence. Once a theory is devised that can account for all of that evidence, that then becomes the better, easier, simpler, more parsimonious way of describing it all, and so the patchwork of other theories is rationally, pragmatically discarded in favor of it. There may still be other theories that also account for all of that evidence, and so are equally unfalsified, but unless they are in turn even more parsimonious, there is no reason to use them instead, and pragmatic reason not to.

But as new evidence accumulates that cannot be reconciled with the existing paradigmatic theory, the best way to describe all the evidence at hand begins to grow again into an unwieldy patchwork of the main paradigmatic theory and all of the exceptions and special cases needed to be made and used to handle the anomalous evidence, until at some point that patchwork becomes so complex that other competing theories, previously rejected as less parsimonious than the paradigmatic one, are now more parsimonious than the old paradigm plus all of its exceptions, and it becomes rational to adopt the best of them instead of trying to cling to the old paradigm and its mess of special exceptions.

Though synthetic a posteriori knowledge is the kind of greatest concern, being about the concrete world we live in, it is only when we get to a priori knowledge that we can have our greatest certainty, for while all a posteriori knowledge is contingent, which is why we can only know it after investigating the world, all a priori knowledge is necessary, because it simply could not make any sense for things to be otherwise, there is no possible experience of the world that could contradict something that is a priori true.

On Analytic Knowledge

When we then apply words to the ideas in our imaginations, we get analytic a priori knowledge, where we no longer need to actually do the imagining and can just manipulate abstract, symbolic representations of things without even needing to know exactly what they are representative of. This is where we enter the realm of logical truths, things that are true just by definition, regardless of what is empirically observable. But this kind of knowledge in turn depends on another kind of a posteriori knowledge, just as synthetic a priori knowledge depends on synthetic a posteriori knowledge to form the ideas that it manipulates in the imagination.

Analytic a priori knowledge, knowledge about what is or isn't true by definition, depends on knowing what the definitions of the words in question are, and that is not something that is itself known a priori, but only a posteriori. Words themselves do not inherently mean anything, but rather, linguistic communities arbitrarily assign meaning to words, and could assign them differently. But we use language, with its conventionally assigned meanings, as a useful tool in our means of pursuing the actual truth, in terms of empiricism. Consequently, patterns in the assignment of meaning can still be better or worse for that use, even though the particulars are still arbitrary.

On the Assignment of Meaning

Initially, all words mean anything, and in doing so effectively mean nothing; it is the division of the world into those things the word means and those it doesn't that constitutes the assignment of meaning to it. Words mean what people mean them to mean, and so long as everyone involved agrees on the meaning of words, that is all that is necessary to know the truth of their meaning, the analytic a posteriori facts of what the words mean.

But when people disagree about what words rightly mean, we must have some method of deciding who is correct, if we are to salvage the possibility of any analytic knowledge at all. For if, for example, one person in a discourse insists that to be a bachelor only means to live a carefree life of alcohol, sex, and music (à la the god Bacchus, from whose name that person supposes the term to be derived), with no implications for marital status; while another person insists that to be a bachelor only means to be a human male of marriageable age who is nevertheless not married, with no implications for lifestyle besides that; then they will find no agreement on whether or not it is analytically, a priori, necessarily true that all bachelors are unmarried.

Such a conflict could be resolved in a creative and cooperative way by the use of qualifying terms to specify which sense of the word is meant: for example, the aforementioned disagreement might be resolved by the creation of the terms "lifestyle-bachelor" and "marriage-bachelor" to differentiate the two senses of the word "bachelor" in use. Or the same word can have multiple meanings, so long as the uses of the word in those multiple meanings do not conflict in context.

But if no such cooperative resolution is to be found, and an answer must be found as to which party to the conflict actually has the correct definition of the word in question, I propose that that answer be found by looking back through the history of the word's usage until the most recent uncontested usage can be found: the most recent definition of the word that was accepted by the entire linguistic community. That is then to be held as the correct definition of the word, the analytic a posteriori fact of its meaning, in much the same way that observations common to the experience of all observers constitute the synthetic a posteriori facts of the concrete world.

In cases where that most recent uncontested usage cannot be determined or is lost to history, or where all the original contestants are dead and the current contestants have been contesting whose usage is rightful their entire lives, then a new convention of use must be established to settle the question of meaning. Until the new convention is established, the words in question can rightly mean anything anyone involved is using them to mean.

And for that new convention to constitute knowledge, it must be one that everyone would agree to if they were blinded to their own place in that convention, e.g. which of the possible uses was their own current one. That in turn is usually tantamount to the usage that best fits with the patterns of usage of other words in the language, centered on the current usage of related words.

On Epistemic Rights

From these two types of analytic knowledge, we can derive a set of principles of discourse analogous to the principles called "rights" when applied to acts rather than speech, principles that we might call "epistemic rights". These really introduce nothing new beyond the above epistemology, but only reformulate much of it in an interpersonal context rather than an individual one. Rights in the traditional sense can be formulated in terms of obligations and their negations, and since, as discussed in my previous essay on logic and mathematics, the descriptive equivalent of prescriptive obligation is necessity, these epistemic rights can similarly be formulated in terms of necessities and their negations.

Rights in the traditional sense have been categorized, following the work of authors such as Wesley Newcomb Hohfeld, into four groups organized across two different distinctions: the distinction between active and passive rights (not to be confused with positive and negative rights; either an active or a passive right can be either positive or negative), and the distinction between first-order and second-order rights. The epistemic rights I will discuss can be similarly organized into the same four categories of active first-order rights or "liberties", passive first-order rights or "claims", passive second-order rights or "immunities", and active second-order rights or "powers", but with each of the epistemic versions defined in terms of necessity rather than obligation.

On Epistemic Claims and Liberties

An epistemic liberty is something that it is not necessarily wrong to assert or to believe. It is the negation of the necessity of a negation, and so as discussed already in my previous essay on logic, it is equivalent to a possibility. Something you have the epistemic liberty to assert or believe may nevertheless be false, but you are not necessarily wrong in doing so; it is a position that might be defensible, and you are within your epistemic rights, so to speak, to try to defend it.

An epistemic claim, conversely, is a limit on others' epistemic liberty: it is something that it is necessarily wrong to disagree with, which is just to say that it is logically necessary. Per the epistemological principles already spelled out above, I hold that people have maximal epistemic liberty, limited only by the epistemic claim to non-contradiction: the only things necessarily wrong to assert or believe, the only indefensible positions to hold, are ones that flatly contradict the meaning of the very words used to assert or defend them.

But there is in turn an exception to that epistemic claim: there is still epistemic liberty to contradict the meanings given to words by those contradicting the rightful meanings of words, analogous to how in many common conceptions of rights in the traditional sense, it may be considered generally wrong to act upon another person's body, but an exception is made when acting upon someone who was already wrongly acting upon someone else.

For a concrete example using the word "bachelor" as discussed above: one person may have the epistemic liberty to assert that all bachelors regularly drink alcohol, inasmuch as that is something that might or might not be true, it is not necessarily false, and so it is a position that could feasibly be defended. Other people may have the epistemic claim to assert that no bachelor is married, and so the first person's epistemic liberty to assert the contrary, that some bachelors are married, is curtailed, given that "bachelor" just means "unmarried man". But those other people may in turn have the epistemic liberty to assert that not all bachelors regularly drink alcohol, even if that contradicts the meaning of the word "bachelor" used by the first person (say, "someone who lives the lifestyle of Bacchus"), if that first person is contradicting the true meaning of the word "bachelor".

On Epistemic Powers and Immunities

That all depends, of course, on what the true meaning of the word "bachelor" actually is, which is where the second-order rights come into play, which have to do with changing what is or isn't logically necessary, by changing the meanings of words. An epistemic power is the liberty to do so, to assign or reassign meanings to words. An epistemic immunity, conversely, is a limit on others' epistemic power, just as an epistemic claim is a limit on others' epistemic liberty.

Per the epistemological principles already spelled out above, I hold that people have maximal epistemic immunity – nobody gets to redefine words out from under others, words continue to rightfully mean whatever they have meant before – limited only by the epistemic power to mutually agree to definitions (analogous to the power to form contracts in the domain of traditional rights), which power is how those words got their initial meanings to begin with.

That power, in turn, has its own exception, in the reflexive case just as with the exception to the epistemic claim to non-contradiction above: nobody has the power to agree to agree to someone else's definitions, such as by agreeing that some word means whatever some supposed authority says it means (à la "an X is whatever Oxford says an X is"), or that it means whatever it is popularly used to mean (à la "an X is whatever people call an X"). The mutual agreement that establishes a rightful meaning has to be an agreement to something specific, not a reflexive agreement to agree with someone.

At first glance, one would think a maximally freethinking discourse would be one in which there were no epistemic claims at all (because every claim is a limit on someone else's epistemic liberty), and no epistemic powers at all (because powers at that point could only serve to increase epistemic claims, and so to limit epistemic liberties). But that would leave nobody with any epistemic claims against others using fallacious arguments to establish epistemic authority in practice even if not in the abstract rules of knowledge, and no epistemic power to hold people to their definitions either, making tractable discourse nigh impossible. So it is necessary that epistemic liberties be limited at least by claims against such fallacies, and that people not be immune from the epistemic power to establish mutually agreed-upon definitions between each other.

But those epistemic claims and powers could themselves be abused, with those who violate the claim against such fallacies using that claim to protect themselves from those who would refute them, and those who would like for definitions not to require mutual agreement leveraging practical power over others to establish broader epistemic power over them. So too those epistemic claims to consistency and powers to define, which limit the unrestricted epistemic liberty and immunity that one would at first think would prevail in a maximally freethinking discourse, must themselves be limited as described above in order to better preserve that liberty.

But these epistemic rights really only establish side-constraints on discourse, laying out only the broadest boundaries of what is discursively acceptable. Within those broad boundaries, each party to the discourse has much freedom to use the rest of the epistemological principles laid out above to figure out to the best of their own individual ability what seems most likely to be true in accordance with their experience of the world. When multiple such parties come together to compare and contrast their experiences and mutually attempt to narrow in on the truth, building between themselves institutional knowledge in common for their whole society, entirely new principles are needed to see that that endeavor proves fruitful, which will be the topic of my next essay.

Continue to the next essay, On Academics, Education, and the Institutions of Knowledge.