Any sufficiently advanced technology is indistinguishable from magic.
~Arthur C. Clarke, Profiles of the Future: An Inquiry into the Limits of the Possible
Ready, Set, Action! (Protocols)
In this essay I introduce the notion of Action Protocol as a way to understand AI, ChatGPT, and other Large Language Models. The term “action protocol” has the power to integrate questions across diverse fields of inquiry: physics and molecular biology, ecology and sociology, cognitive science and ethics, and, in this case, linguistics as well as “artificial epistemology.”
First, an action protocol is part “code” (the protocol piece) and part “procedure” (the action piece). The protocol-code that I write into the software triggers a sequence of procedures in the electronic device. When I knit a wool sweater with a cable design, or use a loom to make a carpet, I am employing both protocols and actions. Furthermore, the intention I have to move my hands and fingers in specific ways is a kind of cognitive protocol that triggers procedural activity at the muscular-nervous system interface. At yet another, deeper level, the responses of my muscles can be seen as a protocol for the actions of biological molecules that perform the procedures necessary for the muscles to contract and recover in smoothly functioning fashion. Still deeper, you might imagine that those molecules are passing protocols for bioelectric information down to the physical properties of matter. This shows that the term “action protocol” is useful at any scale and across categories of reality.
Secondly, action protocols can describe entire fields of study: Science can be seen as a “set of action protocols to achieve results predicted by theory.” In terms of empirical science, this means that things in the physical world are acted upon according to the specified protocols. On the other hand, whereas mathematics and logic are easily understood as protocols for manipulating symbols or concepts, it might not be so obvious where the action happens. That the action happens mostly “in our minds” simply means that it consists of the biological and physical procedures that body-brain-minds perform when they are manipulating symbols. This shows that the term “action protocol” can integrate across different epistemological domains.
Thirdly, action protocols can refer to individual agents as well as compound agents and collective agency.[1] For example, we can think of finance capitalism as the action protocol that governs most of the procedures that are associated with the movement of natural resources in and out of various input-throughput systems around the world. Democracy can be viewed as an action protocol in a similar way.
In this essay I introduce the notion of language as an action protocol. It is with this lens that I will talk about the potential benefits and harms of large language models, specifically ChatGPT4, the latest version I have been working with. (As a shorthand, I will write CG4 for ChatGPT4.) It is easy to map CG4 onto the protocol part, and human behavior onto the action part. Language is a powerful protocol for human action. Sometimes the protocols are short and direct, like “STOP!” Sometimes their power is derived from ideological allegiance, or memetic warfare. Sometimes the power of language is more sophisticated and nuanced, infiltrating the human psycho-cognitive system across longer time frames. It is this kind of language that produces a certain kind of subjectivity, or as Vicki Hearne said, “shapes the social relationships between individuals.” The production of subjectivity is itself the creation of an individual’s “potential state,” whereas the shape of society constitutes society’s own potential state.
Before going further, we have to understand another key feature of the theory of action protocols. Action protocols generate both internal and external relations, simultaneously. Using the above example, we can say that the potential state of the society as a whole “ingresses” into the individual as their subjectivity. This is how the individual is internally related to society. On the other hand, the actions that individuals take (including speech acts and other communicative technologies) shape the potential state of the society, constituting the individual’s external relations to it. I use the term “habitas” to refer to the internal relations of the individual (the ingress of the outer world as subjectivity in the individual) and “habitat” as the external environment in which the individual agent acts (the expression as arena).
It can be argued that language is the primary environment in which the human animal swims. As such, it functions much like Gibson’s perceptual array, in which the particular medium in which the organism lives (water, earth, or air) is a prior condition for, and an inseparable part of, the animal’s perceptual organ. In other words, the eye of the fish is not its organ of perception. Rather, its organ of perception extends into the sea itself. For humans, it is the optical array of light in the sky that constitutes our “perceptual sea.” When we shift from sensori-motor perception to virtual perception “in the mind,” language functions as our perceptual sea.[2]
Now let’s put these ideas together with an extended metaphor. The beaver moves through its habitat, the pond. When the water breaches the dam in the pond, the beaver has an internal protocol to bring up mud and take down trees to patch it up. As the beaver takes more actions in this particular habitat (the pond and the surrounding ecological niche that supports it), the habitat is ingressed into the beaver’s habitas, creating a more precise mapping between its subjective potential and the external affordance. In other words, the beaver knows where to go to get things, what it needs to do the job, and how to get things done in this particular arena. Over time, the one-to-one correspondence between the beaver’s habitat and habitas, between protocols that are served up by arousal energy and actions that satisfy them, becomes more precise.[3]
Now make the shift from beaver swimming in pond to human swimming in language. Like a pond, language is an ecological composition with a primary pond and secondary pools. The primary pond of the larger society is more influential, but the local, smaller pools operate in the same way, albeit at lower, often subthreshold levels, where subthreshold means that secondary, minor protocols are being created but do not lead to action[4] in the arena.
The particular language game that you are swimming in ingresses into you, shaping your subjectivity. This is especially important when you live in the information and knowledge economy of late-stage capitalism, wherein language becomes both the means of production and the product that is most demanded by consumers.[5] We are swimming in language, in “media,” and the pond keeps getting larger and larger. The question is, will CG4 breach the metaphorical dam in our metaphorical beaver pond?
The Agent in the Machine
Large language models like CG4 slowly but surely recompose protocols for human action. Every interaction with CG4 shapes our subjectivity, perhaps in tiny ways. Yet when billions of people couple themselves to the same “protocol generator,” society as we know it gets revised in the process. The agent in the machine is not in the machine, but is ingressed into the subjectivity of the billions who immerse themselves in it and swarm around the globe, fulfilling new expectations that are shaped by it. If you are among the few who were looking for a new social imaginary, you need not look any further: it now lives in the machine.
As arena, as our habitat, however, the machine doesn’t get a free lunch. It too is shaped by the recursive interactions between language-loving humans with problems and language-generating machines with answers that humans find suitable to either 1) assuage their uncertainty, or 2) present them with actionable choices. Either way, the machine carves out new grooves in the actionable protocols of people, eventually transforming the potential state of the planet. That is, it makes some things more likely to happen than others: some thoughts (and not others) are mimetically reproduced at viral speeds; some actions (and not others) are socially affirmed, and some actions (and not others) are socially condemned. As more and more people adopt the same platform, it will become as ubiquitous as capitalism, and constitute the single Supreme Action Protocol of everyone (and by implication of the Anthropocene, everything on the planet). It could usher in the end of capitalism, replacing it with either a more benign version of mass action, or even a more generative, perhaps beneficent one.
What might tip the scale between hyper-capitalism at one end, and a genuine transformation in human subjectivity and the global social spirit at the other? That is what I will address in the next essay.
[1] The difference between compound agent and collective agency may be merely a linguistic convention. I can easily understand the human being as a compound agent, a society of cells, as Whitehead would say, just as I can see a single cell as a compound agent, a society of sub-cellular organs. Zooming in this way, at a certain threshold I find myself reticent to assign individual agency. Zooming out, I can see that a society itself has real collective agency, but am I willing to assign to society the agency of a compound individual? Whereas collective agency is an emergent outcome of the self-organized actions of many individuals, the compound agent navigates its own intentional space through protocols and passes those down to trigger procedures at the lower (individual member) level. A compound agent is sometimes referred to as a hypersubject in the frame of hyperobjects, or, as Whitehead proposed, a supersubject.
[2] This is true conventionally for most of us, but it is possible to liberate oneself from language. This, however, is beyond the scope of this essay.
[3] Note that this theory of action protocols offers an alternative framework to predictive processing.
[4] Note, however, that all protocols themselves lead to procedural actions in the body, only some of which cross a threshold such that the micro-actions of the sub-routines in the body are expressed as actions in the arena. There is an energetic signature here, wherein arousal energy must be discharged either as micro-actions in the body or as behaviors in the environment.
[5] Notice that the information-knowledge economy comes the closest ever to a Maxwell’s-demon perpetual-motion machine, where production costs fall to near zero, while consumer demand approaches infinity, and there is no cap on the production of “new knowledge.” Whether this knowledge is actually new is in doubt, but the compounding of knowledge through social media mash-ups and meta-analytic approaches seems to satisfy our appetites for the time being. The fact that compounded knowledge doesn’t solve real-world problems will eventually become too significant to avoid. In the meantime, this equation between the near-zero cost of production and the near-infinite supply and demand, coupled with the inability of any of this to solve real-world problems, is, I believe, the main reason why there is so much “money” in the system and, at the same time, so much “debt.” The money is put up by the perpetual-motion machine (what we “gain” in the short run), while the debt is the scorecard of the deterioration of social and physical infrastructure (what we must pay in the long run).
This is fascinating and strange.
When I read action protocol, I hear (please correct if I misunderstand), structure - process and/or framework - flow.
The beavers, with their bidirectional inhabiting of river, are held by structures of muscle and fur, wood, water and kits, which submit to processes of instinct and survival. The action protocols of mating, of dam construction, of swimming, of catching fish flow in and from the mutuality between habitas and habitat.
The cgpt utilizes language of a limited pool, a fraction of human experience, (yet it has been celebrated as all knowing). It’s like a research study that fails to acknowledge its limitations/bias, the potential flaws in its method or the risks to participants.
Imagine the beavers losing the capacity to swim in other rivers...confined to the only river the other beavers agree exists.
Recently, I wrote a grant proposal, then asked cgpt to write one. Mine had a personal story, humor and an invitation to hope. In contrast, chat gpt’s proposal read like something general and formulaic and full of facts (though I wondered whether they were all true).
The social constructions of language, by collective consent, can (and do) lie, misconstrue, oversimplify and dumb down. Nuance is often embedded in tone of voice, historical context, gesture or footnote. If the protocol is flawed, it leads to flawed action.
Although fluent in multiple languages, chatgpt doesn’t have the capacity to value the vast world of language, of action protocols, universe to atom. The guise is that language encapsulates life (as you say, it’s the water within which we swim), but life is languages far beyond the bit and byte: birdsong, or how a surfer reads waves, a goodbye kiss or the rumbling of a belly.
Imagine beavers sitting in dens staring at videos of other beavers swimming.
It seems like a brilliant action protocol with which to perpetuate a coup of the human collective imagination. But perhaps if the structural flaws in the protocol are exposed, through the poverty of limited language, human quirkiness can rise, resist, and disrupt, crafting new frameworks that return to us the intimate vastness of our world-habitat and therefore habitas.
Is AI primarily the mind seeking itself?