Art + Technics | A Polytechnic Odyssey
Compression, Paradox, and the Limits of Machine Understanding
We don’t give enough credit to the problem of naming, or to the fact that our machines have already inherited our answers to it. A great deal of modern machine learning rests on intuitions about how the mind makes a number, a thing, a fate feel graspable, and how it gives a world a handle so that it might stand still long enough to be worked on. Much of it also leans on an old and sensible intuition that a good explanation is an economical one. Whichever model best fits the data and costs the fewest bits to describe has, in some meaningful sense, earned its place among our clever solutions. But this economy started with a prior. Observers decide what deserves a name, what gets ignored, and what counts as signal worth compressing. So our machines don’t only learn patterns; they quietly absorb a human bias toward simplicity, and then hand it back to us. We re-acquire a desire for names that fit inside our models, and together, mind and machine drape assumptions over a shared posture and hide them inside a growing appetite. There is an insatiable desire, in the modern (by which I mean machine-augmented) observer, to be nowhere in particular, looking out at the world as if we’d climbed a mountain or crossed a sea, and then to call that altitude “understanding.” But our very capacity to call anything true already depends on triangulation with other minds and a shared world, so the fantasy of a view from nowhere tends to demote the social conditions that make understanding possible at all. As our tools and attendant skills for compression and prediction grow more powerful, we begin to demand the shortest effective description from more and more of what we encounter. All the while, the most enduring problem remains.
Homer told us millennia ago that, if anything, that which cannot fit inside a name would be the most wicked problem of them all, so he gave us a poem, and a story, and a myth, and in all of that, a reminder that the things we tackle with cleverness can never be fully understood with cleverness alone.
It helps to remember where this appetite for short descriptions came from, and what kind of trouble it was built to avoid. In the early twentieth century logicians noticed a strange puzzle about how we describe numbers in ordinary language—now called Berry’s paradox. Imagine the number “one million, one hundred one thousand, one hundred twenty-one.” Now notice that the phrase “the first number not nameable in under ten words” appears to pick out a number by describing it—except the description itself is only nine words long. So is the number named, or is it not? By defining the number this way, we contradict the definition itself. This is where minds start to hurt. Notice though that the bottleneck for a good explanation, at least according to a human observer, is language’s ability to refer to all possible descriptions, including itself. Even as our sentences try to hold a stable order, language can behave like a mirror reflecting a mirror, where meaning starts to fold back on itself. Once natural language is allowed to quantify over all possible descriptions, what is “definable in under N symbols” becomes an unstable ordering principle. This is a case where our attempt to treat semantic norms as if they could, by themselves, underwrite epistemic norms collapses. The very device we use to organize reasons can no longer be stably applied to itself.
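The counting at the heart of the paradox can be checked literally. A minimal sketch in Python; the phrase is the one quoted above, and splitting on whitespace is an assumption about what counts as a “word”:

```python
# The Berry-style description quoted above.
phrase = "the first number not nameable in under ten words"

# The phrase forbids any name of ten words or more, yet is itself shorter.
word_count = len(phrase.split())
print(word_count)       # 9
print(word_count < 10)  # True: the description violates its own criterion
```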
That paradox sat for a long time in the foundations-of-mathematics world like a warning flare about the power and peril of naming. The problem-solving mind has been a particular concern of a long-standing Zen tradition, which offers its own solvent by confronting the issue like this: “Who are you? In your answer you cannot talk about yourself!” Berry works on me in that way. The more I sit with it, the more it feels like a clue to how much we rely on intelligence not just to state what is true, but to command a pervading sense of what to do next—how to proceed. Berry exposes a structural limit in the very act of explanation, a limit that shows up precisely when explanation tries to become complete.
Because this is a paradox that depends on a fantasy that naming is a purely propositional act, it appears the moment we try to build a description that ranges over all descriptions—when naming becomes self-referential and starts to grade its own performance. It tells us that the best description is not merely something we fail to find; it is, in certain senses, an impossible object to demand in the first place. Which matters, because it quietly reframes what the later computational tradition will attempt. If the mirror collapses in ordinary language, the method-based approach orients to the idea that we can fence it by turning description into procedure, replacing sentences with programs, and words with bits, and demanding coherence as an output of relentless efficiency. We see it in formal semantics where we replace loose talk about ‘length’ or ‘description’ with a regimented mapping from strings to structures, taming our paradoxes at the cost of hiding their normative residue.
Thing is, in human cognition, propositional knowledge is rarely first to come to mind. This sort of knowledge rides on top of the living, dynamic reality that myth has always been pointing to: a living, time-bound reality for an agent, which is inseparable from the agent’s capacity to change. That capacity is bound up with how the observer metabolizes information, and learns to distinguish value within information. It’s here that the much-popularized “return of the hero” (Joseph Campbell’s name for a deep narrative pattern) is useful as more than a story technique for Hollywood hits. It names a cycle we also compress into the idea of recognition, in which one departs from a rule set in order to return to it with a changed perception, and therefore a changed ability to live within it, to know. Paradox exposes what happens when we try to make (by way of naming) a global, context-free object in a domain quietly governed by situated, perspectival constraints. The same mistake reappears when we try to name what understanding itself is, as if it were an object that can be reduced to its shortest description. Understanding, as we’ll come to find out, cannot be treated as a single achievement. It has layers we can gesture toward through experiential semantics—momentary glimpses, reliable re-entry, ready access, pervading orientation. But even more importantly, when we recognize understanding, we’re ultimately talking about an ability to wrap experience in words without flattening the experience itself.
By the 1960s and 70s, a handful of mathematicians and computer scientists—Solomonoff, Kolmogorov, Chaitin, Levin—began to converge on a new yardstick for what it even means to describe something. Instead of asking how many words it takes to name an object, they asked how many bits are required for the shortest computer program that can output it. This is Kolmogorov complexity, and it carries a quiet but radical shift in posture: language is no longer the medium of naming; procedure is. A highly structured sequence can often be generated by a short loop. A truly random-looking one cannot be appreciably shortened, because the most economical program is little more than a blunt instruction: “print the sequence.”
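Kolmogorov complexity itself cannot be computed, but any off-the-shelf compressor gives a crude, computable upper bound on description length, which is enough to see the contrast the theory draws. A sketch using zlib as a stand-in for “shortest program” (an illustrative proxy, not the real quantity):

```python
import os
import zlib

# A highly structured sequence: a short loop could generate it.
structured = b"ab" * 5000          # 10,000 bytes of pure repetition

# A sequence with no structure a compressor can exploit.
random_bytes = os.urandom(10_000)  # 10,000 bytes from the OS entropy pool

# Compressed length upper-bounds description length within this scheme.
print(len(zlib.compress(structured)))    # a few dozen bytes
print(len(zlib.compress(random_bytes)))  # close to 10,000: "print the sequence"
```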
This move was meant, explicitly, to close the trapdoor that Berry-style paradoxes pry open. Once a description is a program written in a fixed formal language, a great deal of the slipperiness of natural-language “definability” disappears; the mirror is framed, the recursion is fenced. And yet the foundational sting remains. Within a few years it became clear that Kolmogorov complexity itself is not computable in general, and the proof carries the same self-referential aroma as Berry’s paradox. Socrates’ daimonion re-enters the story as a governing intermediary, allotting what can be said and known, and by what course. Even if our methods are massively truth‑conducive, there is no procedure available from within the system that will certify, case‑by‑case, that we have reached the true minimum, only a virtuous but circular reliance on methods whose reliability cannot be non‑circularly proven. If you could compute exact minimal description lengths for all strings, you could construct a string designed to defeat your own procedure by referring to it—something like “the first string whose complexity is at least N.” The act of naming reaches back into the space of names and breaks the ordering; self-reference slips in through the formal door like a Trojan horse. In the same breath, we gain a clean, machine-based notion of description length, and we meet a hard limit on what can ever be mechanized about that notion.
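The self-referential proof the paragraph gestures at can be written out. Suppose, for contradiction, that K were computable. A fixed program P could then, given N, enumerate strings in order and return the first string s_N with K(s_N) ≥ N. But that procedure is itself a description of s_N, so

```latex
K(s_N) \le |P| + O(\log N) < N \qquad \text{for all sufficiently large } N,
```

which contradicts K(s_N) ≥ N. Hence no such P exists, and exact minimal description lengths cannot be computed in general.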
By the late 1970s, the shape of the terrain feels almost like a lesson in intellectual humility. We have found a way to discipline “description” by translating it into programs and bits, and in doing so we tame a whole class of natural-language paradox. But the floor of the idea—the true minimum—cannot, in general, be computed. There is an irreducible boundary here, and the old problem of self-reference still patiently waits at that boundary.
Meanwhile, in a neighboring room, statistics and information theory were worrying at a problem that is practical and, in its own way, moral: how do we choose between models when our data is finite? You can always build a model ornate enough to swallow the evidence whole; but then it becomes less an explanation than a kind of decorated concealment—perfect fit, brittle world. Overfitting as a failure of humility.
Jorma Rissanen’s Minimum Description Length (MDL) principle, introduced in 1978, is one way the computational lineage metabolized the uncomputable ideal into something we can actually practice. MDL does not promise the absolute shortest description. It offers a disciplined proxy: treat both the model and the data (given the model) as code that must be transmitted, and prefer the model that yields the shortest total message—bits(model) plus bits(data | model). What matters is not mystical minimality, but a computable comparison inside a restricted formal setup.
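The two-part score can be made concrete in a toy setting. A sketch, assuming binary data, two candidate models, and the conventional (1/2) log2 n bits for transmitting one fitted parameter; the coding choices are illustrative rather than Rissanen’s exact construction, and these models only see symbol frequencies (the patterned sequences below merely pin those frequencies exactly):

```python
import math

def two_part_code_length(bits: list[int], model: str) -> float:
    """bits(model) + bits(data | model), in bits."""
    n = len(bits)
    if model == "fair":
        # No parameters to transmit; each symbol costs exactly 1 bit.
        return n * 1.0
    # "biased": transmit a fitted p to ~sqrt(n) precision, costing about
    # (1/2) log2 n bits, then code the data at n * H(p_hat) bits.
    p = min(max(sum(bits) / n, 1e-9), 1 - 1e-9)
    entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return 0.5 * math.log2(n) + n * entropy

def prefer(bits: list[int]) -> str:
    """MDL choice: the model with the shorter total message."""
    return min(("fair", "biased"), key=lambda m: two_part_code_length(bits, m))

balanced = [0, 1] * 500          # half ones: the extra parameter buys nothing
skewed = [1, 0, 0, 0, 0] * 200   # one fifth ones: the parameter pays for itself

print(prefer(balanced))  # fair
print(prefer(skewed))    # biased
```

The balanced sequence is best sent under the parameter-free code, because the fitted model’s entropy saving is zero and never repays its (1/2) log2 n overhead; the skewed sequence repays that overhead many times over.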
And there’s something psychologically revealing about this, if you pause with it. MDL avoids the Berry trap not by defeating it, but by refusing to step onto the floorboards that collapse. It turns away from the intoxicating universal—all possible descriptions, all imaginable namings—and chooses a bounded world: a model class, an explicit coding scheme, a finite procedure for deciding. The Berry sentence—“the first number not describable in under N words”—isn’t refuted so much as rendered inadmissible, like a kind of speech we have learned not to utter if we want the system to hold. Thus MDL is a vindicational strategy within a framework of agency and prediction. These coding principles are rational to adopt because they sustain effective action, even though they cannot be justified by a framework‑independent proof that they track the true structure of reality.
This is also, in broad outline, how classical computers—and even AI systems running on them—navigate the problem today. They do not resolve paradox by brilliance; they resolve it by refusing certain kinds of speech. They make description relative to fixed languages and model classes, they use approximations and penalties rather than exact minimality, and they avoid unrestricted self-reference that quantifies over “all descriptions expressible in this very language” while simultaneously turning that totality into an object of computation. Large language models, for their part, can talk about Berry’s paradox fluently, but they are not, in any strong sense, enumerating all possible definitions and then computing minimal code-lengths for them; they are producing plausible continuations of text. The paradox remains at the edges of theory, and the machines do their work inside the safe subset.
So when MDL appears in modern AI, it is carrying more than a preference for simplicity. It is carrying a century-old lesson about the hazards of unconstrained naming, a disciplined substitution of programs for prose, and a quiet acceptance that the shortest description—the true floor of compression—cannot, in general, be computed. Within its proper domain MDL is a powerful instrument. The question that remains, and that our ancient ancestors, should we allow ourselves to consult with them, will force us to ask, is what happens when we mistake that instrument for a map of the whole of knowledge.
But there is a test case that makes the boundary of this intuition visible. So, we turn to the epic of all epics, Homer’s Odyssey, to grasp how the smooth geometry of our engineered alignment begins to slip. Alignment between a system’s model and its objective may seem computable, but any time we invoke lines, we might acknowledge that we’re talking about form. And one of the first principles of form is that it is contingent on what is excluded, not what is included. Though this point may be debatable in art theory, the point is that the structure of knowledge depends on the kind of knowing we allow inside the structure. If our definition of knowledge includes the nonlinear, recursive patterns of story, and the way meaning refracts through consciousness, biology, and desire, then alignment itself necessarily grapples with what is local, embodied, and irreducibly human. The Odyssey reminds us that when alignment is treated as a purely mechanical property, it amounts to a category error, mistaking our competence at grappling with algorithmic complexity theory for coherence of mind itself. Why we conflate the objective and the intersubjective, treating a normatively loaded, socially negotiated space of reasons and values as if it were just another space of causal regularities is a discussion for another day, another article. But for now, we can do well to remember that the very space within which alignment can be evaluated is not reducible to anything other than what we point to with mind. So it’s colored with mind’s ongoing constructions of value and purpose. If we treat the Odyssey as data and feed it to a machine as training material for a predictive model, then we are, in effect, asking the machine a very specific kind of question: what regularities in this poem are sufficient to reproduce it (or to generate convincing continuations of it) with maximal economy?
In MDL’s framing, “understanding” is approached through proxy, so whatever captures the most structure while paying the smallest price in bits counts as the best explanation shared between mind and machine. And in this case, the machinery has obvious footholds. Meter, formulaic epithets, recurring narrative turns, and the statistical habits of vocabulary are all learnable constraints; they offer handleable patterns that make prediction cheaper. With enough text and the right inductive biases, the system can exploit those regularities to forecast whole strings of words, even to produce a plausible imitation of the entire poem. So, we will indeed obtain compression. We may even obtain a competent counterfeit.
But rewarded competence at capturing form is not the same thing as knowledge about what the form is for. Recent evidence makes that gap harder to ignore. A few months ago, a group of European AI researchers reported that merely rewriting prompts into verse can bypass many models’ constraint heuristics, revealing that even a surface shift is enough to move the same intent outside the forms where the guardrails reliably engage. Philosophically, that suggests the system’s appearance of understanding is often a locally stable order, tuned to familiar forms, rather than a principled understanding of intent that survives when form itself becomes adversarial. The evidence lands like a contemporary footnote to Berry. Modern computing only makes sense if it can fence the mirror (fixed formal languages, constrained model classes, penalties instead of absolutes). But poetry reminds us that meaning lives where minds can engage directly with those fences. So if verse can break machine alignment by changing only form, there’s no reason to think a machine can hold all of human knowledge at scale. Better aims for computational modeling, it seems, are of an economic and epistemic kind, but never totalizing.
Still the central and tacit matter of intent remains outside the frame. The Odyssey, being one of our most epic poems, has endured in part because it transcends every model we have come up with to contain it. That alone is telling, because it seems to be reflecting back to us aspects of our mind that we can only gesture towards with metaphysical techniques and practices. It is a world where story itself becomes a technology of self-transcendence, self-formation, and fate. Odysseus is the πολύτροπος of archetypes—a man of twists and turns, a figure who can be captain and suppliant, king and beggar, marauder and victim; the poem renders him from so many angles that it becomes, almost by design, an inexhaustible engine of reinterpretation. This many-mindedness is a cognitive structure that reminds us that life is held together by mutually-correcting orientations, each one limiting the blindness of the others. Even divine omniscience makes an appearance, in the form of Sirens who make an offer to our hero that is laced with the seductive promise of a view from nowhere (something we idolize today as objectivity). But the epic’s final wager is that no “self” singularly secured as observer or judge can do the work of staying sane inside the paradox; it takes feedback constrained by form, to let irrelevant patterns die without collapsing the world. The epic is therefore not simply a string of events but a model—complex, recursive, self-reflexive—about the peril and power of description: what happens when a life is turned into a tellable form, and what is lost (or gained) in the telling.
So, MDL’s clarity also becomes its limitation. MDL measures regularity in the text; it tells us which patterns can be exploited for shorter codes and better predictions. What happens, though, when prediction is mistaken for a substitute for the principle of presence? In physics, we can model what a system will do next; but the question the Odyssey keeps opening is not so much about what information does next, so much as it asks a more combinatorial set of questions, about intent and orientation, about the dynamic alignment between self, context, and action, and in some grander sense, those conditions by which consciousness becomes legible to itself. So, what makes the Odyssey matter still is largely what it tells us about how life moves across time and space: how it reorganizes attention, how it initiates a reader into grief and cunning and homecoming and hospitality, how it changes the internal sense of human telos. Even Odysseus is unsettled by his own story. In one scene, the hero weeps while listening to the Trojan story and is compared, in an extended and troubling simile, to a woman crying over her husband as she is driven into slavery; the comparison refuses to let us keep Odysseus neatly bound to our model of a hero, because it implicates him as both mourner and the kind of soldier who causes such mourning. Hardly compressible information, the meaning of that narrative device is widely regarded as containing moral and psychological complexity. It’s the kind of information that perturbs our perception and forces us to change its contours.
From the MDL point of view, the best explanation is the one that predicts with the fewest bits. From the human point of view, the best account of why this work matters is often longer, denser, and more context-laden than the work itself, because what we are trying to explain is not the production of a string but the transformation of a being. The Odyssey survives centuries of progress in the sciences precisely because it keeps generating new Odysseuses in the form of Dante’s transgressor of limits, Joyce’s modern wanderer, Walcott’s Caribbean epic re-mapping, Glück’s Penelope. From mythos, Odysseus is braided into contemporary desire and fidelity, and becomes an entire civilization’s ethos. A criterion for what we call an archetype is reiteration, and in the Odyssey, archetypal reiterations are not reducible to a single minimal description; they are a record of how different ages use the figure to think with, to suffer with, and to reimagine themselves—because what reiterates is not merely a motif in a text, but a property of mind: an intrinsic tendency of what exists, showing up again and again across the changing furniture of history. Even if the human mind could, in theory, let go of story, or fantasy, for a time, we do not return to some fantasy-free ground filled with facts that fit the model, for the very medium through which anything becomes salient to consciousness is meaning.
This tension, you see, between mind and machine, is a clue. It suggests that “information-as-compressibility” is a powerful slice of reality, but not the master key to the human mind. Even within the hard sciences, the discovery of algorithmic randomness forced an uncomfortable recognition for scientists. In chaotic systems, unless initial conditions are specified with infinite precision, the smallest uncertainty can cascade into macroscopic divergence—so unpredictability can be inevitable even when the underlying dynamics are deterministic. From a human point of view, “infinite information” is functionally indistinguishable from total incomprehensibility; like the whirlpool of Charybdis that can swallow the very ground from under our feet if we come too close, we, like Odysseus, are at risk of being swallowed by data that never condenses into understanding. One response, as chaos theory pioneer Joseph Ford suggested decades ago, is to prune the continuum down to humanly meaningful numbers and forms, eliminating those with maximal complexity or infinite information content. Another, closer to modern accounts of nonlinear science and computer visualization, is to cultivate a new kind of intuition at the interface layer, which allows machines to stage phenomena so that an embodied observer can iterate, watch, and develop a felt sense for recursive symmetries and strange attractors. This intuition lives on in strands of machine learning visualization, model steering, human‑in‑the‑loop interpretability, VR-based dynamical systems exploration, and embodied/interactive machine learning—places where the interface is designed so a human can literally feel their way through the model’s phase space. But obviously, that’s not how we are popularly coming to understand the computational shifts happening today. So, MDL lives on, and is brilliant where we care about structure in data and where “better” means more efficient codes, tighter predictions, fewer bits.
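The cascade from microscopic uncertainty to macroscopic divergence is easy to stage. A sketch using the logistic map at r = 4, a standard chaotic system (my choice of example; the essay names none):

```python
def logistic(x: float, r: float = 4.0) -> float:
    """One step of the logistic map, fully chaotic at r = 4."""
    return r * x * (1.0 - x)

# Two trajectories whose initial conditions differ by one part in ten billion.
a, b = 0.3, 0.3 + 1e-10
max_gap = 0.0
for _ in range(100):
    a, b = logistic(a), logistic(b)
    max_gap = max(max_gap, abs(a - b))

# The tiny disagreement is amplified by many orders of magnitude: determinism
# without predictability, absent infinite-precision initial conditions.
print(max_gap > 1e-4)  # True
```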
But the kinds of information that persist in epics do not cooperate with that metric. They are answering to another axis of value—one closer to the mind’s demand for meaning, orientation, and transformation than to any computable minimum of description length.
There are certain human experiences that reliably convince mind to contract. The field narrows; the world becomes small. And in that narrowing, our objects of grasping—numbers, concepts, explanations—can inflate until they seem to occupy the whole of perception. In such moments, the temptation is not (at first) to relax back into spaciousness, but to reach for instruments: computers, models, interfaces, techniques that promise to augment attention—so we can ingest more, see more, and, by scaling the field, force new patterns and truths to reveal themselves. I think there is something quite remarkable about the way the psyche seeks a larger container than the one that is given to it almost effortlessly by nature. But widening is something mind can already do.
We often outsource to machines what the mind can do better—not because the machine is wiser, but because we forgot (or were never taught) that perception can re-orient itself. One can open to the totality of sound, widen the field of vision, and notice the space around and between objects. What is required is a genuine change in the geometry that holds experience. And when that shift happens, the vicious circle loosens: the compulsive need to name, to solve, to compress reality into something manageable, begins to dissolve from the inside. The Odyssey is an instrument in this sense—an ancient technology of de-contraction—and, in that way, a gift. It interrupts our tendency to hold ideas as singular, captive objects. It trains the imagination to exhale: to hold paradox without snapping shut, to remember that “space” is not emptiness and emptiness is not a space. Like Odysseus’s, our suffering so often arises from the mind’s habit of giving solidity to experience—by the way it grips, narrates, and names. The epic does not merely tell us this; it re-stages it, until mind recognizes its own tightening and, just as importantly, its own capacity to open.
If we let the MDL worldview generalize the substrate of human understanding by default—if we let “whatever can be fruitfully compressed and optimized by machines” define the frontier of exploration—we smuggle in a cap on what counts as knowable. The tools we built as an outcome of one lineage of questions, tools that reveal particular layers of informational pattern, are so effective that they are being allowed to colonize the entire stack of meaning. If you go back to the foundational tenets of modern information theory, you’ll see that scientists first separated information from meaning so that chaos could be revalued as a rich source of “maximum information.” So, the paradigm we draw from is double edged. We’ve largely chosen one path, ignoring that another path is also possible. Should we situate MDL locally where it shines, and recognize, with equal seriousness, that some domains of human understanding require a different kind of instrument—one that can register not only patterns in strings, but the ways stories make and unmake the self, we may begin to acknowledge, in more formalistic terms, that two epistemic processes must, as a matter of necessity, run in parallel. One is computational, which refines our animal grip on structure and prediction. The other is contemplative, aesthetic, and relational, refining the reflective standards by which we evaluate what counts as a good picture of the world and of ourselves. The Odyssey is possibly one of the most powerful (and therefore ideal) creative artifacts for us to stress test our information theories. If a definition of “information” cannot explain why this poem, with its nested tales, seductive songs, and endlessly interpretable hero continues to reorganize its readers, then the definition may be useful, but it is not complete. That incompleteness is the portal to the most important frontiers of understanding we were always looking for.


