There’s a classic thought experiment called Newcomb’s Problem. It goes as follows:
Newcomb’s problem: You face two boxes: a transparent box, containing a thousand dollars, and an opaque box, which contains either a million dollars, or nothing. You can take (a) only the opaque box (one-boxing), or (b) both boxes (two-boxing). Yesterday, Omega — a superintelligent AI — put a million dollars in the opaque box if she predicted you’d one-box, and nothing if she predicted you’d two-box. Omega’s predictions are almost always right.
If you haven’t heard about this yet, you might as well take a moment to consider what you’d do.
While you do, here’s a quote from Robert Nozick’s 1969 analysis of it:
“To almost everyone, it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly.”
One argument is: what you do right now can’t control what’s already in the box, therefore obviously you should two-box and get more money.
The other argument is: if you think like that you will not get very much money at all, because Omega will have predicted your reasoning and left the opaque box empty.
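To make that second argument concrete, here's a minimal expected-value sketch. The 0.99 accuracy figure is just a hypothetical stand-in for "almost always right", and the code merely restates the second argument; it doesn't settle the dispute:

```python
# A minimal sketch of the payoffs, assuming Omega predicts your choice
# with some accuracy (0.99 is a made-up stand-in for "almost always right").
def expected_value(choice: str, accuracy: float = 0.99) -> float:
    """Expected dollars for a given choice, on the 'condition on your choice' reading."""
    if choice == "one-box":
        # The opaque box is full only if Omega predicted one-boxing.
        return accuracy * 1_000_000
    else:  # "two-box"
        # You keep the $1,000 either way; the opaque box is full only
        # if Omega mistakenly predicted one-boxing.
        return 1_000 + (1 - accuracy) * 1_000_000

for choice in ("one-box", "two-box"):
    print(choice, expected_value(choice))
# one-box 990000.0
# two-box 11000.0
```

Whether it's legitimate to condition on your choice like this, given that the box is already full or empty, is exactly what the two camps disagree about.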
But doesn’t that second argument imply that the choice you make now can in some sense control the past? “Yes”, answers Joe Carlsmith in his Betteridge’s-Law-of-Headlines-violating essay Can you control the past?, “and this is a wild and disorienting fact”.
The whole essay is worth reading if you’re at all interested in the topic. My aim right now is to analyze this one beautiful hypothetical:
Imagine doing “tryout runs” of Newcomb’s problem, using monopoly money, as many times as you’d like, before facing the real case (h/t Drescher (2006) again). You try different patterns of one-boxing and two-boxing, over and over. Every time you one-box, the opaque box is full. Each time you two-box, it’s empty.
You find yourself thinking: “wow, this Omega character is no joke.” But you try getting fancier. You fake left, then go right — reaching for the one box, then lunging for the second box too at the last moment. You try increasingly complex chains of reasoning. Before choosing, you try deceiving yourself, bonking yourself on the head, taking heavy doses of hallucinogens. But to no avail. You can’t pull a fast one on ol’ Omega. Omega is right every time.
Indeed, pretty quickly, it starts to feel like you can basically just decide what the opaque box will contain. “Shazam!” you say, waving your arms over the boxes: “I hereby make it the case that Omega put a million dollars into the box.” And thus, as you one-box, it is so. “Shazam!” you say again, waving your arms over a new set of boxes: “I hereby make it the case that Omega left the box empty.” And thus, as you two-box, it is so. With Omega’s help, you feel like you have become a magician. With Omega’s help, you feel like you can choose the past.
Now, finally, you face the true test, the real boxes, the legal tender. What will you choose? Here, I expect some feeling like: “I know this one; I’ve played this game before.” That is, I expect to have learned, in my gut, what one-boxing, or two-boxing, will lead to — to feel viscerally that there are really only two available outcomes here: I get a million dollars, by one-boxing, or I get a thousand, by two-boxing. The choice seems clear.
Makes sense, right? Like if you had those experiences, with the monopoly money, that’s how it would feel.
And I would describe this as “you have gained trust in Omega’s prediction abilities”. You can tell for yourself that Omega can predict you effectively perfectly. Your sense of things includes this perfect predictor. Let’s contrast that with the scenario as presented:
Newcomb’s problem: You face two boxes: a transparent box, containing a thousand dollars, and an opaque box, which contains either a million dollars, or nothing. You can take (a) only the opaque box (one-boxing), or (b) both boxes (two-boxing). Yesterday, Omega — a superintelligent AI — put a million dollars in the opaque box if she predicted you’d one-box, and nothing if she predicted you’d two-box. Omega’s predictions are almost always right.
“Omega’s predictions are almost always right.” — says who?
(this post was written in about 20 minutes, in the “onepager” genre: my friend Visa’s challenge to explain your thing rapid-fire. my others: Non-Naive Trust Dance, Evolution of Consciousness, Bootstrapping Meta-Trust)
Ignoring your present distrust because you trusted before is as foolish as ignoring today’s thirst because you drank enough yesterday.

Some examples where this sort of thing comes up:
There’s a common confusion people make, which is to try to hang onto some previous trusted experience, in the face of new distrust. Or, to try to get someone else to do so.
This is failing to treat distrust as the sense organ that, in my view, it is. This is why I like the analogy with thirst. Now, trust is perhaps more like temperature than thirst, because you don’t need a steady input of new trust-water in order to maintain homeostasis. You just need the right conditions.
But the point is, whatever trust or distrust you have in the present is your system’s current best assessment of what’s going on, that you’re encountering and dealing with. You may have some memory of trusting this person or institution or group or whatever at some other time, but that memory affects you only exactly as much as it does. Theoretically what I’m talking about here could happen in the reverse direction, but it’s rarer.
I’d like to highlight a difference between two types of moves that someone can make, in relation to such a memory.
Move 1: attempted trust-laundering: A says to B, or B says to themself, “but remember that incident/moment/etc last week? see, I’m/it’s totally trustworthy!” This is an attempt to overwrite the present distrust with some trust from another time and situation. It sees the current distrust as an obstacle to something, and attempts to bludgeon it into submission with the old trust. If it seems to work, that’s likely to be because it results in an inner-coalitional coup, bringing to power some subsystem that trusts, which then suppresses or ignores the distrust.
One of the reasons I’ve seen this happen is that A really trusts themself in some way, and so the world makes a lot more sense to them when B also trusts them in that way. Thus, when they encounter B not trusting them, they think “B is in a state of confusion” and they try to fix that by bringing B back into the state of trust, openness, etc.
Move 2: non-naive trust integration-encouraging: A says to B, or B says to themself, “but remember that incident/moment/etc last week? how does this situation look in light of that? does that change things at all, to bring it into awareness? maybe not, but let’s consider.” This is an attempt to synthesize the present distrust with the trust from another time and situation. It recognizes that the present skin-in-the-game is where things ultimately ground out, and offers the old trust to that present skin-in-the-game, as a resource for it to use as it sees fit.
This requires adopting a kind of epistemically neutral/spacious stance, where you honor the person’s learning system and let it do its thing. It helps also to see the other person as containing multitudes, and to be allied with all of the subsystems, attempting to welcome all of them, rather than trying to elicit your preferred face.
Relatedly, I have on occasion invited someone to basically recompute their trust in me, after I said something. I don’t demand that the result come out different—well, I don’t even demand that they in fact do the recomputation. But it’s more a chance to just say “hey, does that affect things?” and to really find out what the answer is.
Non-naive trust is all about finding out, not about asserting.
There’s a puzzle that shows up when talking about intersubjective verification: how can I ever really know what’s going on in your head? What is it like to be you? What are your desires, goals, understandings? If I have an insight, can I tell that you have the same insight?
It seems to me that: indeed, in some sense I can’t ever know what’s going on in your head—there’s a measurement problem.
But I can come to trust things about you, and what that means is that I know it’s good enough for my purposes. It is sufficient for all the purposes that I have for now and the foreseeable future that I can just treat this as how things are. I don’t even want to say “treat this as true”—to say that it’s true is to again enter into the objective lens, which is irrelevant. It’s how things are, as far as I’m concerned, as far as I can tell.
And that’s good enough—trust is, by definition, what’s good enough. I don’t need to make a further claim that it’s true.
I’ve talked about trust as “what truth feels like in first person”—this is the dimension of trust that’s less about safety or alignment and more just about the sense of how things are. It’s your basic sense of things.
And trust is dynamic, of course. I’m trusting a bridge until one step is rotten, and then oop! Maybe I proceed with caution. Maybe I turn back, relaxedly trusting the steps that I already walked on. Maybe I observe that the ropes are clearly holding even if the beams aren’t, so I try walking with my feet towards the outside, holding onto the ropes.
To say I know something about you (or that something is true of you) is to say that others should agree. But to say that I trust something about you is to say that I’ve done the checks that I need to do, given my needs and purposes. You, who have different needs and purposes, will not in general trust what I trust. You might trust something on the basis of my say so, but you might not. It depends on, well, everything—your purposes, your needs, your sense of me and mine, your trust in my motives for speaking, and the quality of my assessments, etc. And you don’t make those choices consciously, you just find out: when I say I trust something (or say why I trust it), does it result in you trusting it, or not?
Anyway! This is one of those funny things where everybody is doing this just fine all the time, but then philosophers come along with a framework that makes it seem impossible. Wikipedia’s page on intersubjective verifiability says:
While specific internal experiences are not intersubjectively verifiable…
They aren’t if you have to force things to be objective—if you have to find the one standard for all time that you can apply. But if we’re allowed to intersubjectively verify things according to our own unverified personal gnosis (ie our trust), not an objective standard, then we can just do it, the way we always do it in order to form a common sense of things.
The most obvious cases are practical social situations—being able to trust that a particular employee understands the assignment, or being able to trust that your spouse actually gets the thing that really bothered you about what they said this morning. Or developing a shared sense of why someone was being weird or whether they’re safe to invite to another party, by debriefing things. Sometimes things add up, to an experience we trust… and other times they don’t add up, and we don’t trust them.
Then there’s intersubjective verification of understanding of eg physical or mathematical phenomena—the phenomenon might be objective, but the question of whether someone understands it is not! So getting a common sense that it’s understood by a group still involves this engagement with whether it feels like you can treat it as commonly known or whether you feel that you need to keep hedging or treating it as debatable or unclear.
Then, consider intersubjective verification of buy-in—this is very relevant to game theory. Take a stag hunt: a game where there are two options—solo-hunting rabbit, which produces a small win for anyone who chooses it, and co-hunting stag, which produces a massive win for everybody if and only if everybody chooses it, and nothing for those who choose it otherwise. If everybody trusts that everybody else will choose stag, then everybody will want to choose stag, and thus will choose stag. If we only somewhat trust that, then we might. Even if it were true that everybody else would choose stag, the operative question for you is whether you trust that they would. And so the matter of buy-in needs to deal with the question of trust—each person’s trust, which may need to be earned differently. (And, as Duncan Sabien pointed out, in practice someone who can’t afford to risk getting a zero win this round is not likely to be able to choose stag, so trust will be best earned in an iterative game by having a few rounds where everybody agrees to stick to rabbit, to build up that surplus and to build up the experience of people doing what they said they would do even if it wasn’t risky.) These same dynamics apply to much more complex situations of team buy-in.
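As a rough sketch of how the trust threshold works—the payoff numbers are invented for illustration, and it assumes (simplistically) that you model each other hunter as choosing stag independently with some probability, i.e. your trust in them:

```python
# Sketch: when does choosing stag beat hunting rabbit, given how much
# you trust each other hunter to also choose stag? Numbers are invented.
RABBIT_PAYOFF = 1    # small, guaranteed win for solo-hunting rabbit
STAG_PAYOFF = 10     # big win, but only if *everybody* chooses stag

def expected_stag_payoff(trust_per_person: float, n_others: int) -> float:
    """Expected payoff of choosing stag, if each of the other hunters is
    modeled as independently choosing stag with probability trust_per_person."""
    p_everyone_cooperates = trust_per_person ** n_others
    return p_everyone_cooperates * STAG_PAYOFF  # you get nothing otherwise

# With 4 other hunters, how much trust do you need before stag is worth it?
for trust in (0.5, 0.7, 0.9, 0.99):
    ev = expected_stag_payoff(trust, n_others=4)
    print(f"trust={trust:.2f}  EV(stag)={ev:.2f}  stag beats rabbit: {ev > RABBIT_PAYOFF}")
```

The exact threshold depends on the payoffs and the group size; the point the sketch illustrates is just that your expected value of choosing stag is governed by your trust in the others, not by the (inaccessible-to-you) fact of what they would actually do.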
This also dissolves solipsism, in a sense. Can I know that you’re really there, having experiences and dreams and so on? Moot point—acting like you are works better than acting like you aren’t, so I trust that you are. The important point is that’s all I ever have—there never was certainty anyway. It was all always just made of trust.
Where this gets really interesting is in matters of subjective science and reflexivity—when the map changes the territory. Take some insight that is of the interior, not the exterior, such as Buddhist no-self, IFS Self, or the NNTD insight, or religious experiences… how can we know that each other has also experienced this? Well, once again it’s a matter of trust-building. We start simply not knowing, and we trust-dance in relation to it (basically allowing our interfaces to come into honest contact) until we develop trust that we’re experiencing something compatible enough for our purposes… or until we start to distrust that. Or we just don’t know how to proceed any further and we still don’t know.
One open question or edge for me is that it seems pretty obvious to me that even in reflexive domains, where there are multiple stable possibilities, there can be something like objective facts about what the stable possibilities are. And eg the core NNTD insight (“you can’t trust what you can’t trust”) seems very obviously true to me, not merely one of many stable ways of viewing things. If someone said they disagreed, I’d say “we’re clearly not talking about the same thing”, the same way as someone would of a mathematical knowing. (This is less true of the whole NNTD framework that I’ve developed based on the insight—see the many meanings of NNTD—although even there I’m pretty sure most of it basically holds given some assumptions (some of which I may not be conscious of).)
So I have this sense that I can tell for myself that NNTD is true (ie not just that I trust it, but that anybody who investigated it thoroughly would also come to trust it) but the most obvious truth of it somehow routes through subjective experience. I can give reasons, and you can reason about those reasons, but ultimately the question is not “does that logically hold?” but “do you see it?”
And—just between us—the question is not “do you see it?” but “can I trust that you see it?”
Yesterday I published hostility is a sign of too-closeness, which featured my response to a friend from my former community, about his desires for the logistics and culture of a co-living house he was creating. In that post, I talked about how blame can be downstream of people pretending they are a good fit for living together (or working together, or whatever) when they actually have some real conflict or incompatibility that they’re trying to convince themselves they have to put up with, but on some level they know they don’t have to… which turns to hostility.
In this post, I continue my response, reflecting on a more specific phenomenon central to the puzzle of living together: how do you talk about the dishes? …and have it work out. And not just the dishes but the dozens of other places where your patterns of life will need to interface smoothly for living together to feel good. Even for people who are very compatible, there will still be points of tension and friction, and you’ll need to figure out how to talk about those.
So. One of my friend’s desires for the shared purpose / culture of the space was:
impacts can be shared freely, and can be received as impacts rather than hearing impacts as being judgement or blame
And below is my response:
First of all I want to name how vital this is for sane living—how crazymaking it is to be living with someone and unable to acknowledge simple impacts without it either turning into “so you hate me” or getting rounded to “nbd whatever”. And largely we all know this in the extended upstart scene, but since I’m a bit of an apostate these days it feels worth making explicit that YES, THIS MATTERS, and I see that it matters, and am speaking from there.
So then given that this bullet as described is clearly ideal, how do you handle situations where that isn’t working? What does the pathway look like to get from not-flowing to flowing? Merely intending this doesn’t necessarily make it happen.
It seems to me that the situations where blame comes up can be described in a few different ways, both as distinct situations and as distinct understandings/framings of those situations, where the language used to understand them has an effect.
So: a few ways impacts can be received:
Breaking these down a bit, with their implications:
When I had my Non-Naive Trust Insight in mid-2020, I initially conceived of it as a patch on what we were doing at the cultural incubator I’d been living in for years, and I drafted this intro in Roam intended to convey it to the people I was living with. Things got pretty weird and I didn’t quite get it to the point of finishing it to share it with them at the time (although it wasn’t private—technically they could have looked, since it was in our shared Roam). So we’ll never know how it would have landed at that time. Some of the terminology or assumptions referenced below may be opaque to readers outside of that context. I’ve tried to add a bit of context but feel free to comment asking for more clarity.

“The impediment to action advances action. What stands in the way becomes the way.” – Marcus Aurelius
“If you can trust yourself when all men doubt you, but make allowance for their doubting too.” – Rudyard Kipling
Malcolm’s initial introduction to the Non-Naive Trust Dance, mostly written early October 2020
The non-naive trust dance is a framework created by Malcolm for modeling how non-naive trust is developed within and between people, which of course includes the nurturing of self-trust within each individual.
how do we bootstrap from trust we already have, to the trust we want to have to thrive (and need to have for problems we care about)?
[This post was written in about 15 minutes, as part of my new experiment in Writing It Live!]
For a much, much longer take on the same question, with more examples and angles, read my mini ebook How we get there: a manual for bootstrapping meta-trust.

If you like one-pager bullet-list style posts, I have more:
Sixth post in the “I can tell for myself” sequence. On the last episode… Reality distortion: “I can tell, but you can’t”, which opened up our exploration of interactions between one person who is in touch with their own direct-knowing and another person who is more just taking others’ word for it. With this post we’re finally reaching some of the core ideas that the other posts have been a foundation for.
(I left “guru” in the title of this part, because “guru dynamics” are what I call this phenomenon, but I decided not to use the word “guru” in the body of the text. It’s a loanword that originally means “teacher” but in English of course carries the connotations of spiritual teaching in particular, and thus also of the dynamics I want to talk about here, some of which are well-documented in The Guru Papers. To be clear, I don’t think guru’ing, as a role, is necessarily bad—it’s just extraordinarily hard to do well. But “guru” as a frame… the roles are probably best not thought of as a student-teacher relationship at all. Instead, perhaps, “one who’s remembering” and “one who’s reminding”: ancient wisdom-tradition words for this, like “sati” and “aletheia”, mean “remembering” or “unforgetting”. Those are awkward though.)
Things get weird when a person who has consistent access to their sense of “I can tell for myself” across many domains—especially spiritual, interpersonal, esoteric, subtle, ineffable ones—finds their way into a position where they’re trying to help others develop this capacity for themselves.
This happens remarkably often! There are many factors that contribute to this, of which here are six:
So it’s very common for someone who has developed their sense of self-authored direct-knowing to find themselves surrounded by a bunch of people who also want to develop this capacity. (We’ll explore in a later post why there’s often precisely one teacher per learning context; the previous post also hints at it.)
But attempting to teach “I can tell for myself” (or self-trust, or whatever you call it) leads to what is nearly a paradox:
Suppose that when someone says something you don’t understand or resonate with, your two available moves are either to (1) reject what they’re saying or (2) “take their word for it”—a condition which is tautologically the starting point for someone who has learned to not trust themselves in the face of what someone else is saying, and is wanting to develop that self-trust—then if I’m trying to convey “how to tell for yourself”, you’ll either… reject what I’m saying as senseless, or… take my word for it that this is in fact how to tell for yourself and you just need to do it exactly as I say yessirree!
…which is not “I can tell for myself”. Or is it?
A tangent off the “I can tell for myself” sequence, between post 4 & 5.
There’s a thing it feels like to know 5+5=10.
Wait—that’s exactly the opposite of what I mean. There are many things it feels like—in some sense at least one per person who’s ever known it, in another sense as many as there are times it’s been known! And while I can know 5+5=10 so surely that I can be certain that, if you know what I mean by 5 and + and = and 10, you’ll agree… my knowing and your knowing are still different.
Concretely, I might be knowing 5+5=10 from a verbal memorized table that never did me wrong, and you might be imagining two nickels and a dime. Or one of us has an experience of beholding 10 fingers, 5 on each hand, while the other has a sense of 5 having a halfness to it, in relation to 10, related to thinking in decimals for a lifetime. But those are just four abstract descriptions, under which many yet-unique experiences of knowing 5+5=10 could be binned—and many could not. And either or both of us might go about knowing 4+8=12 very differently than we know 5+5=10.
And those knowings are likely yet different from what it would feel like to know such a thing together.
This applies to all knowings: mundane and spiritual, mathematical and episodical. My knowing is not your knowing, and neither one is our knowing. And they aren’t the thing that is known.
Something can be true without being known: I could write a computer program that would generate a true statement that nobody had ever seen or known (such as 12364871317234+1=12364871317235, but imagine it’s longer and more convoluted) and it would still be true within that formal system, but it wouldn’t be known unless or until someone went and knew it. It could be true that there’s life on a particular exoplanet 51 Pegasi b, but it’s not currently known (as far as I know—if I’m mistaken, pick a different exoplanet). There are philosophical questions about who counts as “someone” and I am mostly going to say “definitely at least humans, in some cases animals or parts-of-humans”.
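For what it’s worth, here’s a tiny sketch of the kind of program I mean—the particular statement it prints has, in all likelihood, never been looked at by anyone:

```python
import random

# A sketch of such a program: print a true arithmetic statement that,
# with overwhelming likelihood, nobody has ever seen. Its truth doesn't
# wait on anyone knowing it.
n = random.randrange(10**40, 10**41)  # a random 41-digit number
print(f"{n} + 1 = {n + 1}")
```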
In the previous paragraph I was talking about things that are true but not known by anyone. There are also true things that are known by someone but not by someone else. You can even know OF a “true fact”, without actually knowing it. Here’s one: I’m typing this paragraph while listening to Tycho’s album Dive. One of my favorite albums. You could memorize this fact and perhaps pass it on to many other people… and maybe you even have good reason to believe me, because I’m a pretty honest guy in general and have no incentive to lie or whatever, but you don’t know it. Not directly. You can’t tell for yourself, but you can take my word for it.
A kid can know that “Santa comes on Christmas eve!” The question of whether Santa is “real” in the same senses in which the kid’s parents are real is not vital to the kid’s knowing—the kid knows that there are presents from Santa, and various other evidences such as cookie crumbs or in the case of very theatrical parents, sooty bootprints or whatever… insofar as the phrase “Santa comes on Christmas eve!” refers to that event, the kid can tell for themself that that happens. Santa sure doesn’t come on a randomly selected Tuesday in late April, for the purpose of leaving broken toasters on the lawn!
“I can tell for myself” is the kind of knowing that nobody can take away from you.
Nobody can take it from you, but they can get you to hide it from yourself. They can put pressure on you to cover up your own knowings—pressure that’s particularly hard to withstand when you’re relatively powerless, as a kid is. This pressure can come from the threat of force or punishment, or simply the pain of not being able to have a shared experience of reality with caregivers if you know what you know and they don’t allow such a knowing.
Ideally, we integrate others’ word with our own sense of things, and smoothly navigate between using the two in a way that serves us and them. Others would point out where they can see that we’re confused about our own knowings, and we’d reorient, look again, and come to a new sense of things that’s integrated with everything else.
But, if you’re reading this, you were probably raised in a culture that, as part of its very way of organizing civilization over the past millennia, relied on getting you to take others’ word for it even when you could tell that something about what you were being told was off… to the point that you probably learned that your own knowing was suspect or invalid, at least in some domains.
Did you cover up your natural sense of appetite, with politeness, when parents or grandparents said “You haven’t eaten enough! You have to finish what’s on your plate.”? Did you cover up your natural sense of thirst when parents or teachers said “No, you don’t need a drink right now.”? Did you forget how to listen to the building pressure in your lower abdomen, in the face of a “You don’t have to pee! You just went!”?
Did you override your sense of relevance and honesty when someone said “You can’t say that!”? Maybe someone close to you said “You didn’t see that!” or “you didn’t hear that!” or “that didn’t happen!” — as a command, not a joke… did that make it harder to listen to your own senses or vision or hearing? Not altogether, but in situations where you could tell others wouldn’t like you to know what you know. Did someone say “Come on, you know I would never lie to you,” twisting your own sense of trust in others’ honesty and dishonesty, around the reality that you did not, in fact, know that, and (since this was coming up at all) may have been doubting it?
