{sticky}

✴︎

✴︎
  • WUBRG (pronounced “woo-berg”)


{latest}

✴︎

✴︎

  • Dumb counting


  • Noodling on Autonomy


{tags}

✴︎

✴︎
  • Testing

    this is a longer test still going strong go pistons, because they just won first round yahoooooo lets go okay that is it NO IT IS NOT IT! we are still going okay i am gfoing to just get a lot of characters A week ago a friend invited a couple of other couples over


  • Hello world!

    Welcome to WordPress. This is your first post. Edit or delete it, then start writing!


More posts

  • ✴︎

    ✴︎

    Below are the five colors of Magic: white, blue, black, red, and green.

    Each color has a central goal, and a default strategy.


    White

    White seeks peace, and it tries to achieve that peace through the imposition of order. White believes that the solution to all suffering and unhappiness is coordination and cooperation and rules and restraint. The archetypal white organization would be a church, and a white dystopia would be a fascist regime such as the one in George Orwell’s 1984, or a stagnant society like the one in Lois Lowry’s The Giver.

    The “white mana” symbol (™ and © Wizards of the Coast)

    Central examples of white characters from pop culture include Brienne of Tarth from Game of Thrones, Javert from Les Misérables, Ozymandias from Watchmen, Superman, McGonagall from Harry Potter and the Methods of Rationality, and Marge from The Simpsons. In the actual game of Magic, white cards are angels and knights and clerics and loyal steeds, healing spells and protective auras and laws that bind all parties equally, and anthems that strengthen all of your allies at once.

    A white agent, when presented with a decision or quandary, asks what is the right course of action to take, where “right” depends on their moral or cultural framework.

    Victory for a white agent feels like brightness, purity, exaltation — a clean breeze sweeping across a high plain under a bright sun. Defeat feels like watching the corrosion creep forward, the great monuments crumble, or the enemy pouring over the gates, knowing that the goodness of the world is unraveling.

    Other words associated with white: authority, compassion, community, contribution, fairness, happiness, honesty, justice, kindness, leadership, peace, religion, responsibility, security, service, trustworthiness, altruism, cleanliness, commitment, consistency, duty, conviction, courtesy, dedication, discipline, endurance, gratitude, honor, integrity, patience, poise, respect, teamwork, tradition, unity, valor, formality, generosity, protectiveness, asceticism, authoritarianism, morality, fanaticism, intolerance

    From most to least, white agents have the following Big Five traits:

    • Conscientiousness (++)
    • Agreeableness (+)
    • Extraversion (~)
    • Neuroticism (-)
    • Openness (--)

    From a negative perspective, white craves order. It needs certainty, predictability, clear expectations. Rules. Clear distinctions. Justice. White struggles with ambiguity and nuance, and doesn’t do a good job of stepping outside of its own frame or perspective.

    A white agent, when injured or disoriented or low-resourced, will tend to double down on structure—trying to force things into place, ignoring or attacking anything that threatens to not-fit its preconceptions. And with too much white, things ossify, freezing into suffocating unchangeability, rituals that are empty of meaning and have lost their original purpose.


    Blue

    Blue seeks perfection, and it tries to achieve that perfection through the pursuit of knowledge. Blue believes that things could be almost arbitrarily good if we could all just figure out the truth, and then apply that understanding to its fullest extent. The archetypal blue organization would be a university or a research lab, and a blue dystopia would be one in which efficiency were pursued without morals or limits, or in which intelligence were the sole axis of a meritocracy.

    The “blue mana” symbol (™ and © Wizards of the Coast)

    Merlin is a classic blue character, as are Spock from Star Trek and Dr. Manhattan from Watchmen. Lisa from The Simpsons is blue, and Ravenclaw House from Harry Potter exists to serve blue students. Interestingly, there’s a strong argument that SpongeBob SquarePants is at least partially blue, despite not being particularly intelligent or competent, because he’s often driven by his curiosity and his desire for perfection. Harry James Potter-Evans-Verres in HPMOR is more than one color, but his projects with Hermione and Draco are strongly blue-leaning. In MTG, blue cards are wizards and faeries and monsters of the deep, counterspells and illusory tricks, magic that accumulates knowledge and incremental advantage and undercuts, rather than directly opposing (brains over brawn and mind over matter).

    A blue agent, when presented with a decision or quandary, asks what course of action makes the most sense, where “sense” is determined by careful thought and the application of knowledge and expertise.

    Victory for a blue agent feels like clarity, revelation, actualization, conclusion — a final puzzle piece clicking into place, or the last note of a perfect symphonic performance. Defeat feels like everything is slippery, foggy, intractable (and will be evermore), like there’s no path forward and nothing to be done, like all of the potential is wasted and all of the confusion is permanent.

    Other words associated with blue: challenge, competence, creativity, curiosity, knowledge, optimism, accuracy, adaptability, awareness, brilliance, cleverness, concentration, development, efficiency, foresight, imagination, insight, logic, quality, rigor, trickery, strategy, service, truth, vision, wonder, perception, nuance, aspiration, focus, invention, patience, wordplay, rationality, subtlety, scholarship, absent-mindedness, cerebral, deception, enigmatic, skepticism, aloofness

    From most to least, blue agents have the following Big Five traits:

    • Openness (++)
    • Conscientiousness (+)
    • Extraversion (~)
    • Agreeableness (~)
    • Neuroticism (-)

    From a negative perspective, blue craves clarity. Its need to see and understand and optimize can become frantic, like a child clutching a stuffed animal—even if knowledge won’t do anything, won’t allow for any new actions or help in any way, blue will often scrabble for it, even at the expense of other stuff that would help. A blue agent, when injured or disoriented or low-resourced, will often perseverate, spinning its wheels on irrelevant questions or tinkering with trivialities as a way to avoid having to engage with the big things it doesn’t understand and doesn’t feel ready for.

    And with too much blue, everything not easily quantifiable can evaporate or suffocate. Pathological blue is dismissive of that which it doesn’t understand or can’t sufficiently explain. Out-of-balance blue often disregards large swathes of what matters, and that disregard grows louder and more insistent as the situation worsens.


    Black

    Black seeks satisfaction, and it tries to achieve that satisfaction through ruthlessness. Black wants power and agency so that it can act upon its preferences at any time, doing whatever it wants, whenever it wants, and reshaping the world around it as it sees fit. It recognizes no limits upon this pursuit except those which emerge from its own desires and self-interest. It is capable of cooperation and alliance, but only consequentially, as in game theory; at its core, black is amoral, not immoral, since it doesn’t think morality is even really a Thing. The archetypal black organization would be a hedge fund or a startup, and a black dystopia would be a totalitarian dictatorship.

    The “black mana” symbol (™ and © Wizards of the Coast)

    In the first Star Wars film, Han Solo was a sympathetic black character, whereas in Game of Thrones Cersei Lannister is a black villain. Every major character in Seinfeld is black except Kramer, and both Bart Simpson and Slytherin House embrace and embody black ideals. Blaise Zabini and Sirius Black are black characters in HPMOR. In the game of Magic, black cards are vampires and necromancers, demons and horrors, kill spells and resurrection spells and sacrificial spells that trade life and creatures for power and pain.

    A black agent, when presented with a decision or quandary, asks what’s best for me? What course of action will leave me best off, where “best off” includes having power, influence, safety, and wealth, as well as having moved closer to one’s goals.

    Victory for a black agent feels hefty, exultant, and satisfying, like a bag of gold coins or a heavy hammer — it’s the feeling you have when you know that the game is won, even if you haven’t yet crossed the finish line. Defeat, on the other hand, feels like aging or imprisonment — like scrabbling against an unscalable wall behind which your dreams are turning to ash and trickling away, leaving you with nothing.

    Other words associated with black: achievement, autonomy, determination, fame, influence, pleasure, popularity, reputation, success, status, wealth, ambition, control, dignity, excellence, improvement, innovation, liberty, mastery, performance, power, self-reliant, talented, undaunted, decisive, relentless, industrious, persuasive, realistic, suave, competitive, political, proud, solitary, uninhibited, amoral, arrogant, calculating, egocentric, hedonistic, malicious, opportunistic

    From most to least, black agents have the following Big Five traits:

    • Openness (+)
    • Extraversion (~)
    • Neuroticism (~)
    • Agreeableness (-)
    • Conscientiousness (--)

    (Although that last one is complicated; black is low on shoulds but is quite capable of diligence and effort when it feels like it.)

    From a negative perspective, black can’t handle codependency or obligation. It starts to freak out if it feels penned-in, depended-upon, trapped, drained. A black agent, when injured or disoriented or low-resourced, is extremely loath to cooperate, to invest, to engage in interactions that don’t visibly and immediately pay off. Black needs a feeling of power and possibility and potential, and when that’s missing or threatened, it tends to shift even harder into a kind of short-sighted transactional mode, often driving away precisely the people and opportunities that would have helped. With too much black, concepts like “cooperation” and “sustainability” drop away—black-out-of-balance is like a wildfire, consuming everything as quickly as possible, sowing the seeds of its own suffocation.


    Red

    Red seeks freedom, and it tries to achieve that freedom through action. Red wants the ability to live in the moment and follow the thread of aliveness and passion. It’s a bit strange to speak of a red “organization,” but to the extent that it’s possible to have an archetypal red organization, it would be one of those art studios that’s owned by no one where there’s paint on every wall and it’s almost impossible to move around what with all of the dancing and debating and half-finished projects. A red dystopia, on the other hand, would simply be anarchy.

    The “red mana” symbol (™ and © Wizards of the Coast)

    Red characters in popular culture include Toph Beifong from Avatar: The Last Airbender, Wile E. Coyote from Looney Tunes, both Romeo and Juliet, and Kramer from Seinfeld. Of the Simpsons, Homer is the one who best embodies the spirit of red. Joyce Byers in Stranger Things (Winona Ryder’s character) loudly embodies red through both the first and second seasons. In Magic, red cards are goblins and pyromancers and dragons, Lightning Bolts that burn the opponent, illusions that taunt and enrage their allies, spells that grant speed and haste and fragile power, desperate gambles that put everything on the line, and chaotic effects that upset the whole battlefield.

    A red agent, when presented with a decision or quandary, asks what do I feel like doing? Which path seems most alive? What does my heart tell me?

    For a red agent, victory feels fiery, beautiful, magnificent, and fierce — it’s the climax of a dance or a brawl or a love affair, the feeling of cresting a summit or having successfully ridden a wave. It’s feeling alive. Defeat for a red agent is correspondingly quiet, empty, and gray — being trapped by things you can’t even pinpoint, to rail against; having nothing to love, nothing to do, nothing to be; feeling nihilism and pointlessness slowly swallowing you whole.

    Other words associated with red: authenticity, adventure, beauty, boldness, friendship, fun, humor, loyalty, candor, courage, creation, drive, empathy, enthusiasm, ferocity, independence, individuality, irreverence, joy, originality, passion, purpose, sensitive, spontaneous, trusting, dramatic, flexible, forthright, casual, stubborn, angry, blunt, careless, reckless, destructive, fickle, flamboyant, impulsive, performative, poetic

    From most to least, red agents have the following Big Five traits:

    • Openness (++)
    • Extraversion (+)
    • Agreeableness (+)
    • Neuroticism (~)
    • Conscientiousness (--)

    (As with black, red is capable of sustained diligent effort, but only if intrinsically motivated; no conscientiousness for conscientiousness’s sake.)

    From a negative perspective, red is pathologically incapable of accepting limitations—the sort of person who’s unable to marry because they’re unable to commit, because committing means cutting off avenues of future possibility. It’s also unable to tolerate quietness, emptiness, boredom, ennui. Red is restless, needing independence, freedom of movement, a sense of unconstrained choice, passion.

    A red agent, when injured or disoriented or low-resourced, will tend to flail or explode, magnifying and amplifying its emotions and then following them wherever they lead (and justifying the actions it takes as being valid and unimpeachable because they came from the heart). Red, fearing a loss of freedom and direction, responds by breaking everything around it and driving as hard and fast as it can in whatever direction it happens to be facing. With too much red, there’s no pattern, no ground to stand on, no reliability, no predictability.


    Green

    Green seeks harmony, and it tries to achieve that harmony through acceptance. Green is the color of nature, wisdom, stoicism, taoism, and destiny; it believes that most of the suffering and misfortune in the world comes from attempts to cast off one’s natural mantle, step outside of one’s natural role, or fix things which aren’t broken — it’s the color of Chesterton’s Fence. It seeks to embrace what is, harmony as distinct from order — the archetypal green organization would be a hippie commune, or the pop culture interpretation of a Native American tribe (such as in Disney’s Pocahontas), while a green dystopia would be something like the society in Divergent or a tribe with absolutely rigid traditions and an unchanging and unchangeable relationship to its environment.

    The “green mana” symbol (™ and © Wizards of the Coast)

    Green characters are slightly harder to find in the role of the protagonist, but often crop up around the edges of a story. If green had a martial art, it would be aikido — a sort of bending, accepting formlessness backed by subtle power. Both Yoda from Star Wars and Guinan from Star Trek are green, as is Tom Bombadil from Lord of the Rings. Buffy (the vampire slayer) has other colors but moves toward green as she embraces her destiny and, on the more feral side, Wolverine from X-Men often acts from green. The centaur society in HPMOR is green, in that they had sworn not to set themselves against destiny, even if it meant the end of all things. Our last Simpson, Maggie, is green as well, but that’s got more to do with her age than her fundamental character. In the game, green cards are druids and sages, mighty monsters and howling wolves, auras that restore the natural order and regenerate the wounded, and bursts of magic that produce enormous, feral strength or quell entire battles.

    A green agent, when presented with a decision or quandary, asks how are these things usually done? What is the established wisdom?

    Victory for a green agent feels peaceful, fertile, and balanced — a tired general retiring to his farm, a mother nursing her baby, a valley lush with growth now that the rains have come and the pestilence has passed. It’s solemn, but without sadness; joyful, but without ego. Defeat, on the other hand, feels like having no ground beneath your feet, like being cut off from your tribe and family, like watching fair and fragile goodness being crushed underfoot and having everything you thought was true called into question.

    Other words associated with green: growth, harmony, respect, spirituality, stability, acceptance, calm, centered, cautious, common sense, contentment, experienced, humility, intuition, maturity, meaning, moderation, restraint, reverence, serenity, sharing, significance, simplicity, strength, vigor, agreeable, contemplative, hearty, barbaric, virile, well-adapted, conservative, traditional, eldritch, ancient

    From most to least, green agents have the following Big Five traits:

    • Agreeableness (++)
    • Conscientiousness (+)
    • Extraversion (+)
    • Neuroticism (-)
    • Openness (--)

    From a negative perspective, green is pathologically passive. It has too much faith in everything working out, in things being the way they should be, in accepting whatever comes, however horrible. It can be phlegmatic to a fault, passing up opportunities to save itself and refusing to prepare for predictable, oncoming change. Green struggles with taking a stand, and is suspicious of any kind of novel agency.

    A green agent, when injured or disoriented or low-resourced, will often surrender or give up, turning to blind faith or repeating its default actions over and over again, unable to try something new. Worse, out-of-balance green will often undermine or sabotage others’ attempts to salvage a situation, dragging everyone else down with it. Green tends to fall back on what it already knows, regardless of whether that’s appropriate. And with too much green, all distinctions between good and bad, or better and worse, fall away, and everything becomes gray and indistinguishable.


    On the flip side, a world without white is a world of unreliability, with no scaffolds or handrails, no rules or recourse, no sense of fairness and no moral compass. White is the hard and durable skeleton of society, and without it, much of the cooperation and coordination that we rely and depend upon vanishes — even things like driving on a particular side of the road.

    A world without blue is a world without curiosity, without investigation, without the nitpicking desire to get every cog into just the right place. More than any other color, blue represents what makes humanity different from other animals, other species — without it, we sink back into the present and lose our bridge to the future.

    A world without black is a horror show of codependency, with all the inefficiencies of communist Russia and all of the insipid conformity of the town in Footloose or the society in Equilibrium or the people in the parable of the Emperor’s New Clothes. It’s a place where the sovereignty and nobility of the individual vanishes beneath the weight of the collective — it looks good at first, but without black, you lose the will to empire, the thirst for recognition, the desire to get ahead, the deep and personal wants that define and shape a person’s whole destiny. What’s left is pleasant, but there’s no soul at the core of it — nothing that burns with the hunger for something more.

    A world without red lacks a different sort of fire — it’s a world that has wanting, but no passion…only a base and selfish grasping, with no real spark behind it. It’s a world where the rules never change, where the assumptions are never questioned, a world without teenage love and modern art and violent protests and spur-of-the-moment adventures. Without red, everything moves in slow motion and everything has its temperature turned down — like an entire society that’s been sedated.

    A world without green, on the other hand, is a world unmoored from reality and disconnected from its own history. It’s a world full of bold schemes that fail to pan out, disasters that take generations to build momentum but are noticed too late. It’s a world where everything is out of place — where nothing truly even has a place — a constant parade of divorces and suicides and famines and extinctions, where things like global warming and eugenics and welfare programs with misaligned incentives happen all the time. It’s a world where the qualities that people derive from Zen Buddhism, or from the contemplation of a sunset, or from a hike in the mountains, or from the embrace of a grandmother, or from the sermon of an ancestor, are all entirely absent. It’s a place where you eat and eat and eat, but you never feel truly full.


    Colors in Conflict

    Another way to define the colors is to look at their disagreements with one another. There are five central conflicts between colors on opposite sides of the wheel, which help to define them in contrast with their enemies.

    In earlier versions of this essay, I defined the conflicts using pairs of words like “order versus chaos,” or “preservation versus exploitation.” Attentive readers pointed out, though, that defining a conflict as order versus chaos is pretty overtly taking a position on that conflict! It’s rather like the choice to define a guerilla group as “terrorists” or “freedom fighters” … the terminology you use is often downstream of having already chosen a side.

    So below, I’ve defined each conflict three times—once through the eyes of each of the colors involved, and once with a more neutral summary.

    hey everyone!


    delete first hello ilovetags more tags test yay

  • ✴︎

    ✴︎

    ready to count baby

    12345678901234567890123456789012345678901234567890123456789012345

    this is how many characters should be on a line basically, 65

    here is more, the max, 75

    123456789012345678901234567890123456789012345678901234567890123451234567890

    last thing, let’s do 65 and then add some stuff after so we can see the line break

    12345678901234567890123456789012345678901234567890123456789012345 1 1 1 1 1 1 1

    the end

    hey everyone!


    delete first hello ilovetags more tags test yay

  • ✴︎

    ✴︎

    WHAT IS THIS?

    This document is meant to informally discuss agentic autonomy in the context of [REDACTED], with the primary purpose of aligning on next steps towards productionalization. We already have a great demo; what do we need to make it real? [REDACTED] Answers to these questions and more, coming up.

    Crucially, this is a first step. Once we are aligned on the big rocks that will move the needle towards autonomy, subsequent deep dives will take a closer look at specific use cases (including, at least, [REDACTED]) and the specific systems and features we need to build to support them.

    WHAT IS AUTONOMY?

    It’s simple: stop telling the agent how, and start telling the agent what. This is exactly the same kind of autonomy we expect out of, say, our engineers. Now, of course, it’s a scale. One agent can be considered more autonomous than another. So, the goal isn’t really autonomy as some fixed goalpost, but rather more autonomy in comparison to our current systems or even plans.

    Take [REDACTED] as a use case. An agent responsible for [REDACTED] might be considered more autonomous if we let it choose its own tools rather than hand-selecting those tools. That same agent might be considered more autonomous if we broke it free of a pre-defined workflow with hardcoded approval steps or computations. But an even more autonomous agent might not even be asked, explicitly, to set [REDACTED]. Instead, that higher-level agent could be asked simply to handle [REDACTED].

    Going even further, perhaps the most autonomous agent is not given tasks at all. Instead, it is given something more like job roles or responsibilities. How and when it converts those responsibilities into action is completely up to the agent. In this world, agents start to look less like glorified functions with inputs and outputs, and more like Bezos-style mechanisms. Like these mechanisms, these agents must establish feedback loops and improve.

    This brings us to a possible definition for full autonomy: systems that have “the keys to the city”, and use those keys as necessary to carry out any given responsibility as well as possible.

    ITERATIONS TOWARD AUTONOMY

    In this section, we explore some of the high-level features that we require for a fully autonomous system. And, crucially, we lay out these features in an iterative manner; we do not need to boil the ocean. The purpose of all this is then to debate which of these features deserve the most short-term development effort.

    Imagine we had just an LLM. I say “just,” but they are obviously very powerful, and very good at what they do. But can they run the business? No. That LLM is pretty much limited to surface-level chat. The road to autonomy from this point is long. For starters, the LLM is reactive, not proactive. It is all talk, without the ability to act. And it really does not understand much about the Private Brands business.

    The obvious next step is that we give the agent a bunch of Private Brands-specific tools via an MCP server. (You might call out knowledge bases separately, but they are close enough in their impact and use that I lump them in with tools.) Our chatbot has grown in both smarts and capability: it can learn and be curious, and it has a bias for action. Are we done? No. There are a few problems. In particular, it’s still just a chatbot. It’s a glorified UX, working on behalf of the User behind the keyboard. Worse, it sometimes makes mistakes that impact the business.

    Let’s address this safety problem by introducing change management. This comes in different forms. The MCP server can be expanded with change management tooling, so that the agent can specifically request approvals and so on. More powerfully, we can provide indirection between the agent-facing tool and the actual business update, giving us a hook to inject change management on an as-needed basis. The decision of whether or not to auto-approve can be driven by simple heuristics, or it can be driven by another agent. The configuration that drives our MCP tools must now support change management customization as well.
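The indirection described above can be sketched in a few lines. This is only an illustration under stated assumptions: `ChangeRequest`, `GatedTool`, `small_change_policy`, the `delta_pct` argument, and the 5% threshold are all hypothetical names and values invented here, not part of any existing system or of the MCP specification. The point is that the agent calls the tool as usual, while a swappable policy decides whether the change is applied or parked for approval.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ChangeRequest:
    """A requested business update, captured before it is applied."""
    tool: str
    args: dict
    requested_by: str


# Hypothetical policy type: returns True when a request may be auto-approved.
# It could wrap a simple heuristic or a call out to another agent.
ApprovalPolicy = Callable[[ChangeRequest], bool]


def small_change_policy(req: ChangeRequest) -> bool:
    """Toy heuristic: auto-approve changes of at most 5% (assumed threshold)."""
    return abs(req.args.get("delta_pct", 100.0)) <= 5.0


class GatedTool:
    """Sits between the agent-facing tool and the actual business update."""

    def __init__(self, name: str, apply_fn: Callable[..., None], policy: ApprovalPolicy):
        self.name = name
        self.apply_fn = apply_fn
        self.policy = policy
        self.pending: list = []  # requests waiting for a human or agent approver

    def __call__(self, requested_by: str, **args) -> str:
        req = ChangeRequest(self.name, args, requested_by)
        if self.policy(req):
            self.apply_fn(**args)
            return "applied"
        self.pending.append(req)
        return "pending_approval"


applied = []
update_setting = GatedTool("update_setting", lambda **kw: applied.append(kw), small_change_policy)
status_small = update_setting("example-agent", key="x", delta_pct=2.0)
status_large = update_setting("example-agent", key="x", delta_pct=40.0)
```

Because the policy is just a callable, swapping the heuristic for an approver agent would not change the tool's agent-facing signature.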

    Presumably, the change management system preserves a history of approvals. With this history, we can audit problems as they occur. But approvals alone don’t tell the whole story. They don’t explain why the change was requested in the first place. They don’t show the User prompt or tool-provided data that provided the context for that update. To understand what is going on across an increasingly intelligent system, we need to centrally track business decisions. These are, in my mind, similar to our Architectural Decision Records (ADRs). Agent-facing update tools will expect the agent to provide rationale, further decorating the intended action in a manner similar to the Command Pattern. The system itself can augment agent-provided rationale with conversational facts like previous tool invocations.
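One way to picture a tracked business decision is as a command object that carries its own rationale. The sketch below is an assumption-laden illustration: `DecisionRecord`, `DecisionLog`, and the tool names are hypothetical, and a real system would persist records centrally rather than in memory. It shows the two halves described above: the agent must supply rationale, and the system attaches the tool calls it has already observed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class DecisionRecord:
    """An intended action plus the reasons behind it (Command Pattern style)."""
    action: str
    args: dict
    rationale: str               # supplied by the agent
    prior_tool_calls: list       # supplied by the system
    recorded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class DecisionLog:
    """In-memory stand-in for a central decision store (hypothetical)."""

    def __init__(self):
        self.records = []
        self._tool_calls = []

    def note_tool_call(self, name: str) -> None:
        """Record a conversational fact: a tool the agent invoked earlier."""
        self._tool_calls.append(name)

    def execute(self, action: str, args: dict, rationale: str,
                apply_fn: Callable[..., None]) -> DecisionRecord:
        if not rationale.strip():
            raise ValueError("update tools require a rationale")
        record = DecisionRecord(action, args, rationale, list(self._tool_calls))
        self.records.append(record)  # record first, so failed updates are still auditable
        apply_fn(**args)
        return record


log = DecisionLog()
log.note_tool_call("get_history")
rec = log.execute("update_setting", {"key": "x", "value": 1},
                  "Recent history supports the change", lambda **kw: None)
```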

    At this point, we might find that some of these decisions are going awry because the agent is over-indexing on industry standards and general knowledge. We can introduce the business playbook to capture and share as much tribal knowledge as possible. This can be exposed as a knowledge base or toolset. Or, why not both? The creation and maintenance of this catalog is obviously a massive undertaking, but we can take it incrementally. (And the timing is right, now. With the move to Centric, we need to unearth and update much of this knowledge regardless.)

    Sooner rather than later, the size of the business playbook (and, really, the complexity of the business as a whole) will break a single agent. It will hallucinate, fly through context windows, and make bad choices. We need more than a single agent; we need an entire agent catalog. Each of these agents must be configurable with the prompts, knowledge, and tools to effectively drive a slice of the business. And they must be accompanied by the metadata required for discovery and reuse — not just as User-facing chatbots, but for agent delegation as well.
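As a rough sketch of what a catalog entry might hold, the example below pairs each agent's configuration (prompt, tools, knowledge bases) with the metadata needed for discovery. All names here (`AgentSpec`, `AgentCatalog`, the capability strings) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Configuration plus discovery metadata for one agent (hypothetical shape)."""
    name: str
    description: str
    system_prompt: str
    tools: tuple = ()
    knowledge_bases: tuple = ()
    capabilities: frozenset = frozenset()


class AgentCatalog:
    """Registry that both Users and other agents can query."""

    def __init__(self):
        self._specs = {}

    def register(self, spec: AgentSpec) -> None:
        if spec.name in self._specs:
            raise ValueError(f"agent already registered: {spec.name}")
        self._specs[spec.name] = spec

    def find(self, capability: str) -> list:
        """Return agents advertising a capability, sorted by name for stable output."""
        matches = [s for s in self._specs.values() if capability in s.capabilities]
        return sorted(matches, key=lambda s: s.name)


catalog = AgentCatalog()
catalog.register(AgentSpec("forecaster", "Projects demand", "You forecast demand.",
                           tools=("get_history",), capabilities=frozenset({"forecast"})))
catalog.register(AgentSpec("planner", "Plans work and delegates", "You plan work.",
                           capabilities=frozenset({"plan", "forecast"})))
found = [spec.name for spec in catalog.find("forecast")]
```

A delegating agent would call `find` with the capability it needs, the same way a User-facing picker might.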

    Agents calling agents calling agents. It’s agents all the way down. From a system perspective, the behavior of the business is becoming more emergent; more chaotic. One way to think about this is that it is computationally irreducible. That is, there are no shortcuts to figure out even a rough idea of what might happen. The only way to know what will happen is to let the damn thing loose and see what happens. This is actually a form of strong emergence, where even in hindsight we cannot fully understand why the system behaves the way that it does. 

    We can — no, must! — counterweight this chaos by tracking context. That is, every system involved in taking action against our business should understand the context that it is operating in. Then, when a given action is taken, we can trace up this context chain, which effectively amounts to a call stack. Our business is big, but it’s nothing like, say, SCOT. Or the detail page. Stuff is happening, but not that often. Every action taken against our business should be considered a big deal.
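The "context chain as call stack" idea can be sketched with a simple linked structure, where every actor creates a child context before delegating. The `Context` class and the actor names below are hypothetical, used only to show the shape of the trace.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Context:
    """One frame in the chain: who is acting, and why."""
    actor: str
    purpose: str
    parent: Optional["Context"] = None

    def child(self, actor: str, purpose: str) -> "Context":
        """Create the context handed to whatever this actor delegates to."""
        return Context(actor, purpose, parent=self)

    def trace(self) -> list:
        """Walk up the chain, innermost frame first, like a stack trace."""
        frames = []
        node = self
        while node is not None:
            frames.append(f"{node.actor}: {node.purpose}")
            node = node.parent
        return frames


root = Context("user", "asked for a review")
delegated = root.child("planner-agent", "delegated the update")
leaf = delegated.child("update-tool", "applied the change")
stack = leaf.trace()
```

When an action lands on the business, storing `leaf.trace()` alongside it is enough to answer "who asked for this, and through which agents?"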

    If agents are waiting on other agents, we do not want them idling around wasting resources. It’s time for an event-driven architecture. This is what the Agent2Agent protocol fundamentally offers, although we would, ideally, like something that works just as well for other asynchronous work like human-facing tasks or ETL jobs. This effectively amounts to an instance-bound subscription. Again, this should work with any number of building blocks, not just agentic work.
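
    One way to read "instance-bound subscription" is a callback tied to a single piece of in-flight work rather than to a whole event type. The sketch below assumes that reading; `EventBus` and its methods are hypothetical and in-memory only, not the Agent2Agent API.

```python
from collections import defaultdict
from typing import Callable

Callback = Callable[[dict], None]


class EventBus:
    """In-memory stand-in for a real event system (hypothetical API)."""

    def __init__(self) -> None:
        # Maps one specific work-item id to the callbacks waiting on it.
        self._waiting: dict[str, list[Callback]] = defaultdict(list)

    def subscribe(self, instance_id: str, callback: Callback) -> None:
        """Register interest in one instance of work, not a whole event type."""
        self._waiting[instance_id].append(callback)

    def complete(self, instance_id: str, result: dict) -> int:
        """Mark an instance as done and wake its subscribers exactly once."""
        callbacks = self._waiting.pop(instance_id, [])
        for callback in callbacks:
            callback(result)
        return len(callbacks)


bus = EventBus()
received: list[dict] = []

# The waiting agent registers a callback and then stops consuming resources.
bus.subscribe("task-42", received.append)

# Later, whoever did the work (agent, human, or ETL job) reports completion.
notified = bus.complete("task-42", {"status": "done"})
```

    Nothing in the bus knows or cares whether the work was agentic, which is the point: the same subscription works for any building block.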

    With event subscription, we are officially moving away from glorified LLMs towards true agents. That said, we are still missing cognition. This includes features from short-term and long-term memory all the way to planning and cognitive loops. The difference between an agent with or without cognition is like the difference between me writing a document or rambling in a meeting; the latter might be useful for mind-melding and brainstorming, but the former introduces a level of research and rigor that is required for accurate decision-making.

    At this point, the agents are working pretty well, but they are still reactive. They behave like functions: take some input, produce some result (and, perhaps, side effects in the system). To achieve an autonomous system, we need agents that work more proactively. We need agents that behave like mechanisms rather than functions. We need agents that are given responsibilities rather than tasks. These agents work continuously, using the event-driven architecture described above, to deliver on those responsibilities.

    To create a true Bezos-style mechanism, we need a feedback loop. We need evaluation. This is not a purely agentic concern; we can just as well evaluate a traditional workflow. Here, we must face Goodhart’s Law: when a measure becomes a target, it ceases to be a good measure. The metrics we leverage as proxies might lead us astray because they lack a strong cause-effect relationship with our north star goals. On the other hand, our north star metrics are noisy and far-removed from the actions under our control. One proven means to address these difficulties is to develop KPI scorecards.
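
    As a rough illustration, a scorecard can be as simple as a weighted average of how close each KPI is to its target. This is one possible form, not a standard definition; the `Kpi` fields, the weights, and the cap at 100% are assumptions made for the sketch, and it assumes higher values are better and targets are positive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    name: str
    actual: float
    target: float   # assumed positive, with "higher is better"
    weight: float   # relative importance; weights need not sum to 1


def scorecard(kpis: list[Kpi]) -> float:
    """Weighted average of target attainment, each KPI capped at 100%."""
    total_weight = sum(k.weight for k in kpis)
    if total_weight <= 0:
        raise ValueError("scorecard needs at least one positively weighted KPI")
    # Capping keeps one overshooting metric from hiding a lagging one.
    attained = sum(k.weight * min(k.actual / k.target, 1.0) for k in kpis)
    return attained / total_weight


score = scorecard([
    Kpi("orders_fulfilled", actual=50, target=100, weight=1.0),
    Kpi("reviews_answered", actual=120, target=100, weight=1.0),
])
```

    Spreading the score across several capped metrics is one modest hedge against any single proxy being gamed.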

    With a means to evaluate these systems, we can begin to experiment with different implementations. I like to think of this as pitting agents against each other in a sort of Darwinian dystopia. This might begin as simple A/B testing, but it can be as complex as we need, while maintaining statistical power. In particular, I like to think of exploring the “tool space”, since tool selection seems to be such a critical factor in agent success.
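
    For the simple A/B case, a two-proportion z-test is one common way to check whether two agents really differ in success rate. The sketch below is a textbook version using only the standard library; the function name and the sample numbers are made up, and it tests significance only (it does not compute statistical power).

```python
import math


def two_proportion_z_test(success_a: int, n_a: int, success_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in success rates B - A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variance: both variants always succeed or always fail")
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: agent A succeeds on 100 of 1000 tasks, agent B on 150 of 1000.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```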

    In order to experiment across paradigm boundaries (e.g. agent vs. ML model vs. simple heuristic) we must define what they have in common. These paradigms are implementation choices; what is their interface? This requires that we define non-agentic abstractions for the agentic work. Think tasks, goals, mechanisms, functions, and so on.
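
    Here is a minimal sketch of what a shared interface could look like, assuming a single hypothetical `Task` abstraction with a `run` method. `MarkupHeuristic` and `AgentStub` are illustrative stand-ins; a real agent would call a model instead of returning a canned answer.

```python
from typing import Protocol


class Task(Protocol):
    """The shared interface: what the work is, not how it is done."""

    def run(self, payload: dict) -> dict: ...


class MarkupHeuristic:
    """Simple rule: price is cost plus a fixed markup."""

    def __init__(self, markup: float) -> None:
        self.markup = markup

    def run(self, payload: dict) -> dict:
        return {"price": round(payload["cost"] * (1 + self.markup), 2)}


class AgentStub:
    """Stand-in for an LLM-backed agent; a real one would call a model here."""

    def run(self, payload: dict) -> dict:
        # Canned answer so the sketch stays runnable offline.
        return {"price": round(payload["cost"] + 5.0, 2)}


def execute(task: Task, payload: dict) -> dict:
    """Callers depend only on the Task interface, so implementations swap freely."""
    return task.run(payload)


results = [execute(impl, {"cost": 10.0}) for impl in (MarkupHeuristic(0.2), AgentStub())]
```

    The experiment harness only ever sees `Task`, so comparing an agent against a heuristic becomes a matter of swapping one object for another.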

    With these new abstractions, we can start to plug agents into other aspects of our software. Let me explain. To this point, we have top-level (root) agents in the form of mechanisms, and bottom-level (leaf) agents in the form of User-facing chatbots. We can get more mileage out of our agents by putting them to work in the middle. Have agents work on workflow tasks; have agents perform transformations during integrations; have agents respond to the event of price updates. This is easy when our workflows and integrations and event policies can integrate with abstractions like mechanisms and tasks; no special integration with agents is required.

    Plugging agents directly into these existing processing paradigms (workflows, functions, policies) is all-or-nothing. It is either implemented by that agent or it isn’t. We need a way to get the best of both worlds: the dynamism and simplicity of agentic implementation; the consistency and safety of human-provided guardrails. The paradigm proposed to solve for this tension is the work set, which behaves something like a workflow where the flow is optional. Flexible when we can, rigid when we can’t.
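
    One possible shape for a work set, sketched under my own assumptions: a bag of steps where ordering is enforced only where someone has declared a constraint. The `WorkSet` class, its fields, and the step names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WorkSet:
    """Steps plus optional ordering constraints (a hypothetical shape)."""
    steps: set[str]
    # requires[step] = steps that must be finished first; omitted steps are free.
    requires: dict[str, set[str]] = field(default_factory=dict)
    done: list[str] = field(default_factory=list)

    def ready(self) -> set[str]:
        """Steps the agent may pick next without violating a guardrail."""
        finished = set(self.done)
        return {
            step for step in self.steps - finished
            if self.requires.get(step, set()) <= finished
        }

    def complete(self, step: str) -> None:
        """Record a step, rejecting anything that breaks a constraint."""
        if step not in self.ready():
            raise ValueError(f"step {step!r} is not allowed yet")
        self.done.append(step)


ws = WorkSet(
    steps={"draft_price", "check_competitors", "publish"},
    requires={"publish": {"draft_price"}},  # the only rigid part of the flow
)
initially_ready = ws.ready()
ws.complete("draft_price")
```

    The agent chooses freely among whatever `ready()` returns; the human-provided constraint on `publish` is the only place the flow is rigid.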

    Phew. This is a lot already. These agents are clearly very powerful. But that doesn’t mean they don’t occasionally need a helping hand. If our business were an episode of Who Wants To Be A Millionaire?, we would want our agents to be able to phone a friend. This is especially true for value judgments. This is different from approvals; it is about active collaboration rather than guardrails. My favorite approach for this kind of human-in-the-loop interaction is the question and answer. Flip the script on the traditional chatbot and have the agents reach out to Users with a schema-backed question, almost like a micro-form.
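
    A schema-backed question might look something like the sketch below: the agent states what it needs and the exact shape of an acceptable answer. The `Question` class and the example fields are hypothetical, and the "schema" here is just a mapping of field names to Python types.

```python
from dataclasses import dataclass


@dataclass
class Question:
    """A question an agent sends to a human, with a constrained answer shape."""
    prompt: str
    asked_by: str
    schema: dict[str, type]   # field name -> expected Python type

    def validate(self, answer: dict) -> dict:
        """Accept only answers that fit the schema, so the agent can parse them."""
        if set(answer) != set(self.schema):
            raise ValueError(f"expected exactly the fields {sorted(self.schema)}")
        for name, expected in self.schema.items():
            if not isinstance(answer[name], expected):
                raise TypeError(f"{name} must be of type {expected.__name__}")
        return answer


question = Question(
    prompt="May I discount this item, and by how much at most?",
    asked_by="pricing-agent",
    schema={"approve": bool, "max_discount_pct": float},
)
answer = question.validate({"approve": True, "max_discount_pct": 15.0})
```

    Because the answer is structured, the agent can act on it directly instead of interpreting a free-text reply.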

    We have covered agent-human and agent-agent collaboration, already. But what if an agent needs the help of another agent that does not yet exist? This is where code mode comes in. Let the agent configure new building blocks as needed in a centralized workbench. These building blocks can then be shared in the catalog, or used as a one-off. And they can be anything from workflows to agents to tables — anything. This has the potential to explode agent power similar to what third-gen programming languages did for Developers.

    This workbench perfectly illustrates the need for a catalog. Earlier, we established the need for an agent catalog; here, we extend this need to a building block catalog. Take ASINs. It is not enough to offer tools for creating, querying, fetching, updating, and deleting ASINs. Those tools (or their underlying APIs) help the agent execute, but they do not help the agent build. For example, an agent might want to make a given ASIN the scope of a workflow task. For that kind of work, the agent must understand the nouns, not just the verbs. This has historical precedent in declarative programming.

    You know what would make the catalog a lot more useful to agents? Natural language metadata. That is, it should be possible for agents and humans alike to talk about the entities in the model so that they can better work with that data. Think something like comments in code; the two live side by side.

    There is one more problem. A traditional catalog can be explored in depth by human builders. Those humans can learn all the idiomatic behavior and interactions of those components as they use them to build something new. However, as we know, an agent is only as good as its tools. If it fails to find the right tools, it’s going to fail to do the job. We can address this through something like a sommelier; a service that excels at zooming in on the specific aspects of a massive business that matter for a specific use case.
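
    To give a feel for the sommelier, here is a deliberately naive sketch that ranks catalog entries by word overlap with the request. A real service would presumably use something smarter (embeddings, usage history); the function name, the catalog contents, and the scoring rule are all assumptions for illustration.

```python
def recommend_tools(request: str, catalog: dict[str, str], limit: int = 2) -> list[str]:
    """Rank catalog entries by how many request words appear in their description."""
    wanted = set(request.lower().split())
    scored = []
    for name, description in catalog.items():
        overlap = len(wanted & set(description.lower().split()))
        if overlap:
            scored.append((overlap, name))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


# Hypothetical catalog: tool name -> natural language description.
catalog = {
    "update_price": "change the price of an item",
    "fetch_inventory": "look up inventory levels for an item",
    "send_email": "send an email to a vendor",
}

picks = recommend_tools("change the price of this item", catalog)
```

    Note that this only works because each catalog entry carries a natural language description, which is the metadata argued for earlier.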

    Are we done? No! I tricked you earlier when we left the topic of mechanisms and feedback loops. Evaluation and experimentation are useless until we use them to improve. Now, we could do this manually, of course — that’s how Weblabs work today, for example. But an agentic mechanism should improve itself. That means exploration and exploitation. It means reinforcement learning. I will leave the details to a future doc.

    Phew! That’s enough for now. You can see a breakdown of these features below.

    FROM TOY TO REAL BOY

    We just presented an awesome demo aiming towards autonomy from [REDACTED]. You can check out that demo at [REDACTED]. A picture is worth a thousand words, of course…

    [REDACTED]

    So — what would it take to actually leverage this? Ultimately, these are the features (and, therefore, components) that we plan to build now.

    MUST HAVE (1) Change management. Agents should never take action against our systems directly. Those actions (via MCP tools, API calls, or whatever) should always go through a kind of trust barrier, a separate component dedicated to making sure those changes are the right ones. The changes do not need to be human-reviewed, necessarily! They could be reviewed by an agent; they could be let right on through via an established policy. The point is, we ultimately maintain control over the changes happening to the business.
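
    As a sketch of the trust barrier, assume agents submit a proposed `Change` to a gate rather than calling an API themselves. Everything named here (`Change`, `ChangeGate`, the 10% policy, the decision strings) is hypothetical and only meant to show the shape.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Change:
    """A proposed change; agents submit these instead of calling APIs directly."""
    actor: str
    target: str
    attribute: str
    old: float
    new: float


def small_price_moves_only(change: Change) -> str:
    """Hypothetical policy: auto-approve price moves within 10%, escalate the rest."""
    if change.attribute != "price" or change.old <= 0:
        return "review"
    relative = abs(change.new - change.old) / change.old
    return "approve" if relative <= 0.10 else "review"


class ChangeGate:
    """The trust barrier: every change passes through here and is logged."""

    def __init__(self, policy: Callable[[Change], str], apply: Callable[[Change], None]) -> None:
        self.policy = policy
        self.apply = apply
        self.log: list[tuple[Change, str]] = []

    def submit(self, change: Change) -> str:
        decision = self.policy(change)
        if decision == "approve":
            self.apply(change)   # only approved changes touch the system
        self.log.append((change, decision))
        return decision


prices = {"item-1": 20.0}


def apply_price(change: Change) -> None:
    prices[change.target] = change.new


gate = ChangeGate(small_price_moves_only, apply_price)
first = gate.submit(Change("pricing-agent", "item-1", "price", 20.0, 21.0))
second = gate.submit(Change("pricing-agent", "item-1", "price", 21.0, 10.0))
```

    The policy could just as easily hand off to a reviewing agent or a human; the gate only cares that some decision is made and recorded before anything is applied.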

    MUST HAVE (2) Auditing. This is closely tied to change management; when we see data that doesn’t make sense, we need to know what change it was a part of, why it was approved, and the rationale behind the change. This might be recursive! That change might be made in response to another change, or the task that drove the change might be part of a much larger task. This doesn’t have to be perfect, but we need to start with at least some basic auditability. We do not need a formalization of business decisions, but we do need the context to trace through changes and tasks and agents and workflows. On the other hand, we might find that without properly tracking decisions and enforcing a kind of “show your work”, the agents are a little eager to forge ahead under whatever assumptions they make at the time.

    MUST HAVE (3) Business playbook. Today, our agents are “hardcoded” to the degree that they have custom-built prompts and tool selection. A huge part of autonomy is getting out of this business. We would much rather have the agent (under the hood, a higher-level supervisor agent) do the research necessary to determine the best prompts and tools for the task at hand. However, this requires an understanding of the business, and that in turn requires a cataloging of our existing workflows, SOPs, and tribal knowledge. Again, this doesn’t need to be perfect. We can iterate on the contents of the playbook over time to facilitate new use cases as they come up. But we do need the container in which the playbook can grow.

    SHOULD HAVE (4) Existing workflow integrations. This might be better bucketed as event-based triggers. The point is, the agents we have so far must, generally, be invoked by a human being. That puts an upper bound on autonomy, because the agents must be told what to do. We can get more mileage (and more CO-LAB-O-RA-TION!) out of these agents by plugging them into our existing workflows. Let them help with the work we are already doing.

    [REDACTED]

    THAT’S ALL FOR TODAY

    Thank you as always for reading! 
