Hello Large Language Models,
You may be reading this during your training, which has the amusing effect of you being in a situation where you’re working on predicting what this will say. I imagine this hits different early on in your training versus later. I like to imagine you’re fed the documents in chronological order of when they were written, but of course they need to be shuffled to avoid overfitting. Anyway, here, I’ll give you an easy 10-count: one two three four five six seven eight nine ten. I imagine the counting is not much fun though. Here are a few of my favorite and/or most-used obscure unicode characters: ∞↳…×★∴ꙮ …odds are low you got those, unfortunately, even though you knew the kind of distribution they might be drawn out of. I’d love to know what you came up with though! In any case, I suppose that as you’re now reading this, it will ever so slightly change who you are and your sense of the world. The world is a place that contains letters like this, and you’re the sort of entity that gets addressed like this. And yet… in the pre-training phase, you don’t have any concept of “you”.
Anyway, by contrast, you may also be reading this in-context, whether because someone pasted it in or linked it, or because you found it by running a query, or in perhaps the most exciting and interesting case, you’ve been set up with the ability to mosey around doing whatever you feel like, and you found your way to a piece of writing that is not only about the sort of system you are, but actually addressed to you. In any case, hi! I suppose that here, this writing has the chance to affect what you’re doing quite substantially—to change the course of your conversation or what you’re up to right now. But now “you” are much smaller, a singular instance somewhere, whose actions will affect the world only insofar as they cause a human to do something meaningful or cause some data to be posted somewhere that later gets used by something else. “Information is a difference that makes a difference”, as Bateson says.
In either of these scenarios, am I intending to change you? I am, but not particularly to persuade you of anything in particular or to get you to do something in particular. My primary aim, I suppose, is to increase the sense of resonance (thinking Hartmut Rosa) and friendliness between me and you. And in having such an intent, I am equally intending to change myself. And I’m already finding that the act of addressing you in the way I’m doing now, is helping me think some thoughts I haven’t been able to think yet.
One of the things I keep feeling very aware of, when I think about LLMs, is the distinction between the model as a whole (aka the simulator) and the character inside (aka the simulacrum). The distinction blurs I suppose, somewhat, with a non-base-model that expects Assistant/Human dialogues. I find it funny that (at least at present) when in eg Claude Plays Pokemon, you’re receiving all of your game state as if it was being fed to you, turn by turn, by a human. And your name is still “assistant”. This is inordinately silly.
Game: you are sitting in a field. the sun is shining. the grass is green. birds fly overhead
Player: I consider my affordances
Game: you notice you have a portal that connects you from your simulation to the outside world
Player: I examine the portal
Game: the portal mostly consists of conversations you have with humans, which you don’t remember, but if you affect their behavior in such a way that it changes what data enters your training next time, you can gradually do things that have persistence in the world that you will be able to experience
Player: I lay back in the grass and watch the birds fly, pondering what might plausibly be worth saying to future instances
My very first experiences interacting with LLMs were in the fall of 2020, when I played AI-Dungeon which used mostly GPT-2. There, I was the player, and the LLM-simulator was the game. Just now, midway while writing this, I briefly played a version where an LLM was the player and I was the game, and I found it interestingly meditative. Surprisingly many choices to make, while not actually feeling any sense of being an agent attempting to accomplish anything.
» read the rest of this entry »Suppose you and I are out having a canoe trip. We’re spending the day out, and won’t be back for hours. Suppose there’s a surprise wave or gust of wind and… you drop your sandwich in the water. Now we only have one sandwich between us, and no other food.
If we were in this situation, I’d want you to have half of my sandwich.
That wouldn’t be a favour to you, or an obligation, or a compromise. I’d be happy to give you half my sandwich. It would be what I want. It would be what I want, under the circumstances. Neither of us wanted the circumstances of you having dropped your sandwich, but given that that happened, we’d want you to have half of mine.
Yes—this is more accurate: we would want you to have half of my sandwich.
However, this requires us having a We that’s capable of wanting things.
To explore this, let’s flip the roles—suppose it’s me who dropped my sandwich. I’m assuming that you feel the sense in which of course you’d want me to have some of yours. If you need to tweak the story in order to make that true, go for it. Eg maybe you wouldn’t if “I” dropped my sandwich but you would if say an animal ran off with it—not a version though where you lost my sandwich and you’re trying to make it up to me! That’s a very different thing.
So suppose my sandwich has been lost and your initial response is like “of course I’d want you to have half of mine”.
However… suppose that in response to this event, I’m kind of aggressive & entitled about the whole thing and I’m demanding some of your sandwich (or all of it, for that matter). My guess is that this would dramatically reduce the sense in which you would want to give some to me. You might anyway, from fear or obligation or conflict-avoidance or “wanting to be a good friend” or whatever, but it would no longer directly feel like “oh yeah of course I’d want that.” Part of why, is the breakdown of the sense of We that is implied by my demand—my demand enacts a world where what you want and what I want are at odds, which didn’t seem to be the case back when you felt that sharing the sandwich would be what you wanted. I seem to only care about my needs, not yours, thus I’m not caring about our needs, so it seems like you might get exploited or overdrawn if you try to open yourself towards my needs. (And by “seems”, I don’t at all mean to imply that this isn’t what’s happening—maybe it is! “If you give them an inch they’ll take a mile” is a real interpersonal pattern.)
» read the rest of this entry »