Chapter 3 — The Tools Were Already There

STATUS: v0 (all six parts drafted)
Documentary parallel: chapters/03-the-tools.md (Beats 3.1–3.12)
Last updated: 2026-05-22

What this chapter does

Chapter 1 put a bush in your head. Many origins of multicellularity, scattered across all three branches of the tree of life, each lineage using different molecular parts. This chapter is about those parts: where they came from, and why most of them are older than the multicellular bodies they now build.

The chapter does two things at once. First, case by case, it shows that the molecular toolkit later used to build animal bodies (the proteins that stick cells together, the switches that let cells talk to each other, the regulators that make one cell different from its neighbour, the enzymes that execute a cell’s controlled death) was already present in single-celled organisms hundreds of millions of years before animals existed. Each piece was assembled in unicellular life and redeployed when multicellular lineages arose.

Second, it pulls back to ask a question the case studies prompt: if the parts were lying around, are they reused to the same depth across every origin of multicellularity? They are not, and the pattern of how the reuse varies is the chapter’s analytical payoff. Adhesion machinery is lineage-specific; every multicellular origin built it from scratch. Differentiation machinery partly draws on regulator families already shared across eukaryotes. Cell-death machinery is shared deeper still, predating multicellularity and, in some of its parts, the origin of eukaryotes. The age of the molecular reuse tracks the age of the underlying problem.

By the end of the chapter, you should be able to articulate:

  1. What the 2008 Monosiga brevicollis genome paper changed.
  2. What a cadherin and a tyrosine kinase are, and why finding both in a single-celled organism was a surprise.
  3. What co-option means, with the Volvox clock-to-switch story as the worked example.
  4. The three-tier reuse pattern (adhesion solved from scratch, differentiation reusing partly-shared regulators, cell death using deeply shared machinery) and why that pattern makes sense.
  5. Why the toolkit story is co-option plus invention: Wnt, TGF-β, the Hox cluster, and most neuron-specific gene families really are animal-stem additions.
  6. Why horizontal gene transfer complicates strict “fully independent” claims (brown-algal alginate has a partial bacterial origin) without undermining polyphyly.

The chapter is in six parts.

Part | Title | Approx. length
1 | The Monosiga genome moment | ~1100 words
2 | What a cadherin is, what a kinase is, and what they were doing in a single cell | ~1700 words
3 | Co-option, named and shown — Capsaspora, Salpingoeca, and the Volvox clock that became a switch | ~1600 words
4 | The three-tier hierarchy — adhesion, differentiation, death | ~3000 words
5 | Co-option plus invention — what was not already there | ~900 words
6 | What the chapter answers, what it doesn’t, and what comes next | ~700 words

Part 1 — The Monosiga genome moment

In 2008, a paper came out of Nicole King’s group at UC Berkeley that changed how biologists think about the origin of animals [King et al. 2008]. It was a genome sequence (a complete read of every gene) of a single-celled organism called Monosiga brevicollis. There was no press conference. The journal was Nature and the headline was technical. But the implication kept turning up in subsequent work, year after year, and a decade and a half later the field is still absorbing it.

Before the implication, two pieces of vocabulary.

What it means to sequence a genome

Every cell on Earth contains a string of instructions written in a four-letter chemical alphabet. That string is the cell’s DNA. The instructions are organised into discrete sections called genes, each of which can be read out by the cell’s protein-making machinery to produce a particular protein. A protein is a long molecule that does some particular job: sticking the cell to its neighbour, catching prey, copying DNA, photosynthesizing. Most of what a cell does is done by proteins, and which proteins a cell can make is set by which genes it carries.

To sequence a genome is to read off, letter by letter, the entire string of DNA in a single organism and identify, from the patterns in that string, the catalogue of genes it carries. You get the parts list. Sequencing tells you what is present, not what is in use, and certainly not why. But you know what is there.

By 2008, hundreds of organisms had been sequenced: bacteria, archaea, yeasts, fruit flies, Caenorhabditis elegans, humans, the mouse, Arabidopsis. Most of those genomes belonged to multicellular organisms, or to single-celled organisms whose multicellular cousins had already been read. What was missing, until King’s paper, was a careful read of a single-celled organism that sits very close to animals on the tree, close enough to tell you what was present in the toolkit before animals existed.

Who Monosiga is, and the cousin-not-ancestor point

Monosiga brevicollis is a choanoflagellate, a microscopic single-celled organism with a particular shape that turns out to matter. It has one long whip-like extension at the front of the cell (the flagellum), and around the base of that whip a collar of finger-like extensions of the cell membrane. The flagellum beats and drives a current of water past the collar; bacteria and other particles in the water get caught by the collar and consumed. A choanoflagellate is a single-celled filter feeder that eats bacteria.

The shape is familiar to anyone who has ever looked at a sponge under a microscope. Sponges line their internal water channels with cells called choanocytes: cells with one flagellum surrounded by a microvillar collar, drawing water through the sponge and filtering bacteria out of it. Same architecture, same trick. Choanoflagellates and sponge choanocytes have been compared since the nineteenth century, and modern phylogenetic work confirms that choanoflagellates are the closest living single-celled relatives of animals [Carr et al. 2008; King et al. 2008; Brunet & King 2017].

The phrasing matters. Closest living single-celled relatives. Not ancestors. The choanoflagellate lineage and the animal lineage diverged from a shared single-celled ancestor somewhere around 700 to 800 million years ago; molecular-clock estimates have wide credible intervals, and the field is careful about treating any particular date as precise [dos Reis et al. 2015]. Since the split, both lineages have been evolving. The animal lineage’s descendants grew bodies. The choanoflagellate lineage’s descendants stayed single-celled and have been refining the filter-feeding life ever since. Monosiga is the modern endpoint of one of those refinements; it has been evolving for the same several hundred million years you have. It is your cousin, in the precise sense that you and a chimpanzee are cousins back to seven million years ago, except this cousin is a hundred times older and made of one cell.

Chapter 1 Part 7 made this point, and the chapter you are now in will make it again. Monosiga is not the urmetazoan, the ancestor of all animals dressed up for a 21st-century microscope. It is what 700–800 million years of independent evolution as a single-celled filter feeder produced on the cousin branch.

What was in the genome

What King’s group read out, when they sequenced Monosiga brevicollis, was a genome of roughly 41 megabases (41 million letters of DNA) with about 9,200 protein-coding genes [King et al. 2008]. Those are the kind of numbers you get from a moderately compact single-celled eukaryote: bigger than a yeast, smaller than a human. Nothing about the size of the genome was the surprise.

The surprise was the parts list.

Genes that biologists had assumed were inventions of the animal stem — genes for sticking cells together, for signalling between cells, for the kind of regulated communication a developing embryo needs — were already there. Not partial precursors, not vague homologs. Recognisable members of the same protein families animals use to build tissues. In a single cell, in a lineage that has never been multicellular in any sustained way, hundreds of millions of years before animals existed.

The two cleanest cases are the ones Part 2 will unpack: cadherins, which in animals hold neighbouring cells together at junctions, and tyrosine kinases, which work as molecular on/off switches inside the cell’s signalling networks. Both classes were present in Monosiga, in numbers comparable to or higher than what animals carry [Abedin & King 2008; Manning et al. 2008; King et al. 2008]. The honest reaction in 2008 was bafflement. What were those genes doing in a single cell with no neighbours to stick to and no developing embryo to coordinate? Whatever they were doing, it was not the multicellular job we had assumed they evolved for.

The toolkit was older than the body it would later build.

That is what changed in 2008. Biologists do not call it a revolution and neither does the documentary, but it is structural. It changes what you have to say about the origin of animals. The question stops being “how did evolution invent the parts?” and becomes “what changed in how the parts were used?”

→ Continue with Part 2 — What a cadherin is, what a kinase is, and what they were doing in a single cell.

What this part draws on:

  • Choanoflagellates as the closest single-celled relatives of animals, cousin-not-ancestor framing, Monosiga genome content: content/02-model-systems/choanoflagellates.md; [King et al. 2008]; [Carr et al. 2008]; [Brunet & King 2017].
  • Wide-credible-interval date for the animal–choanoflagellate split: [dos Reis et al. 2015].
  • The “toolkit predates multicellularity” headline: BIG-PICTURE.md §“What we are confident about” item 2.

Part 2 — What a cadherin is, what a kinase is, and what they were doing in a single cell

To say “the toolkit predates the body” is a claim about specific proteins. Two of them carry most of the chapter’s argument, and you need to know what they actually are before the surprise can land. The numbers along the way are approximate; the orders of magnitude matter, the exact figures less so.

A protein that sticks animal cells to their neighbours

Take any patch of your skin. Run a fingertip across it. The patch is made of cells: squarish, packed tight against each other in sheets, with very little space between them. The cells are not glued there; nothing on the outside is holding them in place. They are stuck to each other, directly, cell to cell, by proteins that span their membranes and reach out and grip the membrane of the next cell over.

The most important class of those proteins is called the cadherin family.

Mechanically, a cadherin works like this. The protein is anchored in the cell’s outer membrane, with most of its body sticking out into the space between this cell and the next one. The part that sticks out is shaped into a series of rigid little segments (cadherin domains, biologists call them) held in a specific geometry by tiny amounts of bound calcium. That dependence is why cadherins fail when their environment runs out of calcium: pull it away with a chelating chemical, and the segments lose their shape and let go. It is also what gives the family its name: cadherin is short for calcium-dependent adhesion. The outer end of one cell’s cadherin reaches across the gap and grips, in a specific geometry, the matching outer end of a cadherin on the next cell. The handshake holds the cells together. The metaphor is just a metaphor; the actual molecular interaction is a precise calcium-dependent interlocking of protein surfaces, not a clasp of two hands. On the inside of each cell, the cadherin’s tail is hooked into the cell’s internal scaffolding, called the cytoskeleton, by a small set of partner proteins (β-catenin, α-catenin) acting as the linkage [Brunet & King 2017].

The result is that the two cells, on either side of the cadherin handshake, are mechanically coupled all the way down to the structural skeleton of each cell. Pull one, and the other comes with it. Multiply that across millions of cell-cell contacts and you have something that holds together as a sheet, the way your skin does. Cadherins are how an animal stays an animal-shaped object instead of dissolving into a slurry.

The discovery that animals make heavy use of cadherins is old; vertebrate E-cadherin was characterised in 1984. What was not expected, until twenty-four years later, was that Monosiga brevicollis, a single cell with no neighbouring cells to stick to, would turn out to have about twenty-three different cadherin genes [Abedin & King 2008], a count comparable to a complex animal. Subsequent work showed that the partner proteins (β-catenin, α-catenin) that hook the cadherin tail to the cytoskeleton in animals were also present in choanoflagellates and in another close single-celled relative we will meet in Part 3 [Nichols et al. 2012]. The full classical cadherin/β-catenin complex, the unit that holds an animal tissue together, has components already in place in unicellular lineages that diverged from the animal lineage hundreds of millions of years ago.

What were those cadherins doing in a single cell? We don’t know exactly. Two of the cadherins in Monosiga localise to the cell’s feeding collar, the structure that catches bacteria from the water current [Abedin & King 2008]. The best current guess is that the ancestral function of the cadherin family was prey capture (grabbing a passing bacterium and holding on long enough to engulf it) and possibly substrate adhesion (sticking a free-swimming cell to a rock or a sand grain). Catching food. Holding on. There was no multicellular body to hold together yet. The chapter does not need that guess to be settled. The point is the dissociation: the parts were assembled and refined in single cells, doing other work, before any lineage put them together to build tissue.

A molecular switch

The second protein family is more striking.

A complex animal is an enormous network of cells, each of which needs to know, moment by moment, what to be doing: skin cell making more skin or holding still, muscle cell contracting or relaxed, immune cell circulating or attacking. Coordination at that scale runs on signalling. One cell secretes a small molecule, another receives it and changes what it is doing. For the system to work, the receiving cell needs switches: parts of its internal machinery that can be flipped from off to on when a signal arrives, and back again later.

The most important class of those switches in animal cells is a family of enzymes called kinases.

A kinase is, mechanically, an enzyme that takes a small chemical group called a phosphate and sticks it onto another protein at a specific spot. Adding a phosphate usually changes the target protein’s shape just enough to switch its activity: turn it from off to on, or sometimes from on to off, or make it bind a different partner. Phosphorylation is the cell’s universal toggle. Almost every interesting decision an animal cell makes runs through kinase activity at some point.

The family this chapter cares about is the tyrosine kinases, a sub-family that adds phosphate specifically to tyrosine residues, one of the building-block amino acids proteins are made of. Animal cells are heavily decorated with tyrosine-kinase signalling: receptor proteins on the cell surface that, when a signal molecule binds them outside, switch on a tyrosine-kinase activity on the inside, which phosphorylates downstream targets and changes what the cell is doing. Tyrosine kinases are the workhorses of animal cell-to-cell communication.

How many does a human have? In the relevant subclass, the part of the family most directly responsible for cell-to-cell signalling, about ninety [Manning et al. 2008].

How many does Monosiga brevicollis have? Roughly one hundred and twenty-eight [Manning et al. 2008; King et al. 2008].

Pause on that. A single cell, in a lineage that never built a body, has more tyrosine kinases of this class than you do. The precise statement matters: Monosiga has more genes of one specific signalling-kinase category than the human genome does. It does not have more genes overall; you have something like 20,000 protein-coding genes, Monosiga has about 9,200. But this one category of switches is, in a single-celled choanoflagellate, more diverse than in any sequenced animal [Manning et al. 2008].
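The comparison is easier to hold onto as a proportion. The short sketch below uses only the rounded counts quoted in this part (128 and 90 tyrosine kinases, about 9,200 and 20,000 protein-coding genes); the variable and function names are illustrative, and the output is only as precise as those approximate inputs.

```python
# Rounded gene counts as quoted in this chapter (approximate figures).
counts = {
    "Monosiga brevicollis": {"tyrosine_kinases": 128, "protein_coding_genes": 9_200},
    "Homo sapiens": {"tyrosine_kinases": 90, "protein_coding_genes": 20_000},
}


def kinase_share(species: str) -> float:
    """Fraction of a species' protein-coding genes that are tyrosine kinases."""
    c = counts[species]
    return c["tyrosine_kinases"] / c["protein_coding_genes"]


monosiga_share = kinase_share("Monosiga brevicollis")
human_share = kinase_share("Homo sapiens")
enrichment = monosiga_share / human_share

for species in counts:
    print(f"{species}: {kinase_share(species):.2%} of genes are tyrosine kinases")
print(f"Monosiga's share is about {enrichment:.1f}x the human share")
```

On these figures, tyrosine kinases make up a little over one percent of the Monosiga gene catalogue and under half a percent of the human one, so the single cell leads both in absolute count and, by roughly a factor of three, in proportion.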

This was not the first hint of the pattern. A receptor tyrosine kinase had already been found in a choanoflagellate in 2001 [King & Carroll 2001], a single example pushing the origin of receptor tyrosine kinases out of animals and into their unicellular relatives. The 2008 genome paper showed that the entire family had been built out, in the single-celled lineage, on a scale that exceeds what animals carry.

What are those 128 tyrosine kinases doing in Monosiga? The honest answer is we don’t fully know. The diversity suggests a complex single-celled signalling lifestyle: integrating environmental cues, regulating internal state, timing behaviour. A free-living single cell, alone in the water, has to do all of its own state-regulation by itself; a tissue cell can outsource most of it to its neighbours. Whatever those 128 kinases are doing, the consequence for the chapter’s argument matches what cadherins showed. The switches were built and elaborated in unicellular life. The signalling machinery was already there, at scale, before there were multicellular bodies to signal across.

The thesis line, the first time

Adhesion proteins and signalling switches both follow the same pattern. The toolkit was assembled in single-celled relatives of animals, hundreds of millions of years before any of these molecules had a multicellular body to coordinate. The toolkit was redeployed.

The documentary has a line for it:

Multicellularity didn’t invent the toolkit. Multicellularity rewired it.

The rewiring has a technical name in evolutionary biology, and the next part is where the reader earns the vocabulary to use it.

→ Continue with Part 3 — Co-option, named and shown — Capsaspora, Salpingoeca, and the Volvox clock that became a switch.

What this part draws on:


Part 3 — Co-option, named and shown — Capsaspora, Salpingoeca, and the Volvox clock that became a switch

The pattern Part 2 set up needs a name. Evolutionary biologists call it co-option, and the definition is exactly the pattern you have already seen.

Co-option is when an old part, evolved to do one job, gets used for a different job by a later lineage, without a new part being invented. The cadherin handshake that probably caught prey in a single-celled ancestor is the cadherin handshake that, in the animal descendants of that ancestor, holds tissue together. Same protein family, same chemistry. Different role in the cell, different role in the body. The ancestor was not anticipating the animal’s needs; the animal lineage was working with whatever its single-celled ancestors happened to have lying around.

Co-option matters for the rest of this chapter because it predicts something specific. If multicellularity arose many times independently, in many different lineages, each starting from a different single-celled ancestor with a different set of parts available, then each multicellular lineage should have co-opted a different subset of whatever its ancestor was carrying. Different parts available, different parts reused. The shared pattern would be the move itself, not the parts.

This part shows that prediction holding, in two close relatives of animals that are not on the line to animals, and then in a lineage on the opposite side of the eukaryotic tree.

Capsaspora: same pantry, different recipe

Capsaspora owczarzaki is a second close single-celled relative of animals. It sits inside the same broad clade (the Holozoa) that contains both animals and choanoflagellates, but on a separate branch from the choanoflagellates. Its lineage is called Filasterea.

Under a microscope, Capsaspora does not look like a choanoflagellate. It is amoeboid, irregular in shape, sending out long thin membrane extensions (filopodia) to feel its way along surfaces and capture prey. Under most conditions it lives as single cells. Under others, the cells aggregate, build a small multicellular cluster, behave for a while as a coordinated group, then go back to being single cells [Sebé-Pedrós et al. 2013].

That aggregation event has been read out at the level of which genes turn on and off as the cell shifts from solitary to grouped. The result is what the co-option prediction calls for: a specific subset of the shared animal-style toolkit is switched on at aggregation. The integrins (a different family of adhesion proteins that animals also use heavily) show up, along with the partner proteins anchoring them to the cytoskeleton. So do a set of cytoplasmic tyrosine kinases. So do components of the extracellular matrix [Sebé-Pedrós et al. 2010; Sebé-Pedrós et al. 2013; Sebé-Pedrós et al. 2017]. All of it animal-style; all of it already there in the single-celled state.

And critically: it is a different subset from what shows up in choanoflagellates. Salpingoeca rosetta (we will come back to it in a moment) deploys cadherins and ECM modifiers and a Hippo signalling pathway component when it builds a rosette; it does not lead with integrins. Capsaspora leads with integrins and makes less use of cadherins. Different lineage, different multicellular phase, different slice of the shared toolkit.

The documentary’s image is worth borrowing: the pantry. The shared toolkit is a pantry, a stocked single-celled inheritance of adhesion proteins, signalling kinases, transcription factors, and cytoskeletal regulators, all already there in the common ancestor. (Pantry is a metaphor. The actual thing is a set of genes the common ancestor carried that descendant lineages inherited and elaborated.) Each lineage that becomes multicellular reaches into the pantry and pulls down whatever it needs. The pantry is shared. The recipes are different.

Salpingoeca rosetta — a clonal multicellular phase that is not on the road to animals

Salpingoeca rosetta is a modern choanoflagellate. Its rosette colonies form clonally: one founder cell divides, the daughters fail to separate, and the result is a sphere of four to fifty cells, all descended from the same starting cell, sharing a common extracellular matrix [Fairclough et al. 2010]. That is exactly the same developmental mode (cells stay together after dividing, rather than dispersing or aggregating) that animals use. It is the cleanest extant example of clonal multicellular development in a single-celled relative of animals.

It is also not a step on the way to animals. Salpingoeca rosetta is a present-day species at the tip of its own branch; its rosettes are induced by specific bacterial molecules its ancestors evolved to recognise, and they serve choanoflagellate functions (probably mating and feeding) rather than building any sustained body. The rosette mode is informative for what the common ancestor of choanoflagellates and animals might have been capable of, but the modern rosette is not the ancestor’s rosette. Both lineages have been evolving for ~700–800 million years since the split [dos Reis et al. 2015].

Volvox: a clock rewired into a switch

The cleanest single illustration of co-option is not in the unicellular relatives of animals at all. It is in a green alga.

Chapter 1 Part 4 introduced Volvox: a microscopic green ball, hundreds to thousands of small flagellated cells embedded in a transparent extracellular matrix, swimming together. Volvox’s signature trick is that not all of its cells do the same thing. Most of the cells (a couple of thousand of them in Volvox carteri, the lab species) are small, flagellated, somatic; they propel the colony, but they cannot divide. A much smaller number, sixteen or so, are large, non-flagellated germ cells; they are the only cells that make the next generation. Once a cell is committed to the somatic fate, it stays somatic, lives a few days, and dies. The germ cells live on.

The mechanism that locks somatic cells into the somatic fate is, surprisingly, a single regulatory gene called regA. The first several rounds of cell division in a Volvox embryo are roughly symmetric. At the sixth division, some cells divide asymmetrically, producing a large daughter and a small daughter. The small daughters (cell size is the trigger) activate regA. regA then represses the genes the cell would need to grow, photosynthesise vigorously, and reproduce. The cell is locked into a non-reproductive, soon-to-die somatic identity. The large daughters, with regA off, develop as germ cells [Kirk 2005].

Now the co-option part. Volvox descended from a single-celled green-algal ancestor very similar to a modern lab species called Chlamydomonas reinhardtii. The split between the lineage leading to Volvox and the lineage leading to Chlamydomonas is much more recent than the animal-choanoflagellate split, somewhere in the range of 250 to 140 million years ago [Herron et al. 2009; Ma et al. 2023; Lindsey et al. 2024]. Volvox’s germ-soma split is, by evolutionary standards, young. Younger than the dinosaurs.

Comparing the two genomes turned out to be one of the cleaner natural experiments of the last twenty years. They are extraordinarily similar in gene content: about 14,500 protein-coding genes in Volvox, about 15,100 in Chlamydomonas, with very little net difference in what is encoded [Prochnik et al. 2010]. The multicellular Volvox did not arrive at germ-soma differentiation by inventing a flood of new genes. It arrived there by rewiring the genes its single-celled ancestor already had.

The detective work on regA itself made the rewiring concrete. In the single-celled ancestor of the lineage, regA’s precursor was tied to the daily light-dark cycle, part of how the single cell regulated its behaviour across day and night, switching between metabolically active and rest states [Nedelcu & Michod 2006]. When the lineage leading to Volvox built its germ-soma split, the same gene was rewired. The signal that activated it was no longer time of day; it was cell size at the end of embryonic cleavage. And what it switched off was no longer the cell’s nighttime metabolism; it was the cell’s reproductive program. The gene that, in a single cell, said “now it is night, slow down” became the gene that, in a developing Volvox embryo, says “you are small, you are somatic, you will not reproduce.”

A subsequent comparison of cell-type-specific gene expression in Volvox (what the germ cells make, versus what the somatic cells make) extended the pattern beyond regA itself. Volvox somatic cells preferentially express the orthologs of Chlamydomonas night-active genes; Volvox germ cells preferentially express the orthologs of Chlamydomonas day-active genes [Matt & Umen 2018]. The diurnal cycle of a single-celled ancestor (alternating between two metabolic states across a 24-hour cycle) was spatialised. What used to be a temporal alternation in one cell became a stable difference between two co-existing cell types.

A clock became a switch. Same parts, different connections.

(It is tempting to say the Chlamydomonas version of regA “became” the Volvox version. Both are modern. The shared ancestor’s gene was used one way; in the lineage leading to Volvox, the gene was rewired into a different job. Neither modern species is an ancestor of the other.)
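“Same parts, different connections” can be made concrete with a toy model. The sketch below is an illustration, not a model of real regA biochemistry: one unchanged piece of logic (a switch that shuts its targets down when triggered) is wired to two different inputs and two different outputs. Every name in it is hypothetical, the target labels paraphrase this part’s description, and the size threshold is an arbitrary illustrative number rather than a measured value.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Cell:
    is_night: bool
    diameter_um: float  # illustrative units; values used here are made up


@dataclass(frozen=True)
class Wiring:
    """What the switch listens to (trigger) and what it shuts down (targets)."""
    trigger: Callable[[Cell], bool]
    targets: tuple[str, ...]


def switched_off(wiring: Wiring, cell: Cell) -> tuple[str, ...]:
    """The unchanged part: if the trigger fires, the targets are repressed."""
    return wiring.targets if wiring.trigger(cell) else ()


SIZE_THRESHOLD_UM = 8.0  # arbitrary cutoff chosen for the illustration

# Ancestral wiring: the input is time of day, the output is metabolic activity.
ancestral = Wiring(trigger=lambda c: c.is_night,
                   targets=("active metabolism",))

# Volvox-lineage wiring: the input is cell size, the output is reproduction.
volvox = Wiring(trigger=lambda c: c.diameter_um < SIZE_THRESHOLD_UM,
                targets=("reproductive program",))

small_daytime_cell = Cell(is_night=False, diameter_um=5.0)
print("ancestral wiring represses:", switched_off(ancestral, small_daytime_cell))
print("Volvox wiring represses:", switched_off(volvox, small_daytime_cell))
```

The function `switched_off` is identical in both cases; only the `Wiring` handed to it differs. That is the sense in which co-option changes connections rather than parts.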

Three different lineages have now demonstrated the move (Capsaspora on its branch, choanoflagellates on theirs, Volvox on the green-algal side) and the pattern is the same in all three. The pantry was already stocked. Each lineage that went multicellular reached into it and pulled out a particular slice. The toolkit is shared by inheritance. The recipes are independent.

The next part asks a different question. Across all the multicellular lineages we know about, at each of the major problems multicellular life has to solve (adhesion, differentiation, cell death), how deeply is the toolkit actually shared?

→ Continue with Part 4 — The three-tier hierarchy — adhesion, differentiation, death.

What this part draws on:


Part 4 — The three-tier hierarchy — adhesion, differentiation, death

We can now ask the question this chapter has been building toward.

Part 2 showed that the parts predate the bodies. Part 3 showed that the same shared pantry is being drawn on by different lineages, each in its own way. The corpus this chapter sits on top of has been assembling, over a couple of decades of comparative genomics, a particular pattern in that reuse, across the major problems any multicellular life has to solve.

Before delivering the pattern, a flag on what kind of claim it is. Each individual piece (the lineage-by-lineage diversity of adhesion molecules, the partly-shared regulator families behind differentiation, the deeply conserved machinery of programmed cell death) is well-established science. The corpus’s contribution is the integration: pulling those three layers together and noticing that they form a hierarchy. As far as we can tell, the data assemble into a particular shape when you read them across all three layers at once, and the shape is informative.

Multicellular life reuses old parts, but how much it reuses them depends on the age of the underlying biological problem. Adhesion is a multicellular-only problem; every lineage solved it from scratch. Differentiation is an older problem, and the regulator families are partly shared. Cell death is older still, and the molecular machinery is deeply shared. The depth of molecular reuse tracks the age of the underlying problem.

Tier 1 — Adhesion is fully independent

The first problem any multicellular life form has to solve is the one Chapter 1 Part 1 flagged in passing: how do you stick cells of common ancestry together so they stay together?

Walk the row.

Animals stick their cells together with the cadherin family Part 2 introduced (plus integrins, immunoglobulin-superfamily adhesion molecules, and selectins; cadherins are the load-bearing family). These are transmembrane proteins, with cytoplasmic tails that hook into the cytoskeleton through partner proteins [Abedin & King 2008; Nichols et al. 2012; Brunet & King 2017]. Cells in your skin are held to each other by direct protein-protein contact at junctions.

Land plants use completely different chemistry. After a plant cell divides, the two daughter cells are joined by a layer of pectin, a structural sugar polymer, laid down between them and cross-linked by calcium [Daher & Braybrook 2015]. Pectin is in the cell wall, outside the membrane; the cells themselves are mortared into the sugar matrix. No cadherin, no integrin. The chemistry of plant adhesion is not the chemistry of animal adhesion.

Brown algae (kelps and their relatives) stick their cells together with yet another sugar polymer, alginate, plus some related sulfated sugars [Cock et al. 2010; Michel et al. 2010]. Alginate is not pectin. The two are biosynthesised by different enzymes, polymerised from different sugars, cross-linked in different geometries. A kelp is not held together the way a tree is.

Fungi, when they go multicellular, stick their cells together with a wall composed of chitin plus β-(1,3)-glucan, two more structural polymers that are not pectin and not alginate [Gow et al. 2017]. Fungal hyphae are tubular cells joined end-to-end inside chitinous walls, and they bond to each other by anastomosis: two growing tips locate each other chemotropically, breach their walls, and merge cytoplasms. Different toolkit again.

Bacteria, in the lineages that go multicellular (cyanobacterial filaments, Streptomyces, myxobacterial fruiting bodies from Chapter 1 Part 2), stick their cells together by retaining shared peptidoglycan cell-wall material after division, supplemented by lineage-specific surface proteins, pili, and secreted matrix [Mariscal et al. 2007; Claessen et al. 2014]. Peptidoglycan is another polymer entirely, found nowhere outside the bacterial lineage.

That makes at least five non-homologous adhesion systems across the major complex multicellular origins: cadherins, pectin, alginate, chitin/glucan, peptidoglycan. Nothing matches from one lineage to the next: a protein family in animals, and a chemically distinct wall polymer in each of the other four. Adhesion was a new problem each time multicellularity arose, and each lineage solved it from scratch.

This is the toolkit thesis in its most precise form. Each lineage’s single-celled ancestors had different parts available: animal-stem ancestors had cadherins and integrins, brown-algal ancestors had alginate biosynthesis, plant ancestors had pectin biosynthesis. When selection pushed each lineage toward holding sister cells together, what it pulled down from the pantry was whichever adhesion-relevant molecules it already had. The “shared toolkit” is what each lineage’s particular single-celled ancestor happened to be carrying, not a single universal multicellular kit.

A pause on horizontal gene transfer — Chapter 1 Part 4 promised an unpacking

Chapter 1 Part 4 flagged that the alginate story complicates the clean adhesion-is-independent picture, and promised it would be unpacked here. This is where.

Horizontal gene transfer (HGT) is the phenomenon where, occasionally, a gene moves sideways between unrelated lineages rather than only being passed down from parent to offspring. It happens often in bacteria, rarely between distantly related eukaryotes, sometimes from bacteria into eukaryotes. Some of the genes brown algae use to synthesise alginate appear to have been acquired this way, from a bacterial donor, somewhere deep in brown-algal evolutionary history [Michel et al. 2010]. So a chunk of the brown-algal adhesion machinery is not strictly the brown algae’s own invention from their own single-celled ancestors. It was borrowed from elsewhere on the tree.

Does this break the toolkit thesis? It qualifies it without breaking it. The brown algae did not import a working multicellular wall; they imported biosynthetic genes and wired them into their own life cycle. The multicellular body is still their own. HGT complicates very strict claims about full toolkit independence; the broader pattern (different lineages reach for different molecules) holds.

Back to the spine.

Tier 2 — Differentiation reuses partly-shared parts

The second problem any multicellular life form has to solve is more subtle. Once you have stuck a lot of cells together, they cannot all be doing the same thing. Some need to be on the outside, some on the inside; some on feeding, some on dispersal; some on growth, some on reproduction. The cells need to be different from each other, even though they all carry the same genome. The technical name is cell differentiation.

To make the same genome produce different cell types in different places, the cell needs proteins that bind to its own DNA and switch particular genes on or off. Those proteins are called transcription factors, and they are the load-bearing molecules of cell identity. A transcription factor sits on a specific DNA sequence near a gene’s start; depending on which other factors are present, it pushes the gene’s transcription up or down. Networks of transcription factors regulating each other are what give a liver cell its liverness and a skin cell its skinness from the same DNA.
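A toy sketch may make the "same genome, different cell types" logic concrete. It assumes a deliberately simplified rule: a gene is on when all of its activating factors are present and none of its repressing factors are. The gene and factor names are hypothetical placeholders, not real loci, and real regulation is quantitative rather than all-or-nothing.

```python
# Toy model: one shared "genome" of regulatory rules, read differently
# depending on which transcription factors (TFs) are active in a cell.
# All gene and TF names below are hypothetical placeholders.

# Each target gene lists the TFs that must be present (activators)
# and the TFs that must be absent (repressors) for it to be switched on.
GENOME = {
    "gene_albumin_like": {"activators": {"TF_A", "TF_B"}, "repressors": set()},
    "gene_keratin_like": {"activators": {"TF_C"}, "repressors": {"TF_A"}},
    "gene_housekeeping": {"activators": set(), "repressors": set()},
}


def expressed_genes(active_tfs):
    """Return the sorted list of genes switched on for a given set of active TFs."""
    on = []
    for gene, rule in GENOME.items():
        has_activators = rule["activators"] <= active_tfs      # all activators present
        no_repressors = not (rule["repressors"] & active_tfs)  # no repressor present
        if has_activators and no_repressors:
            on.append(gene)
    return sorted(on)


# Same GENOME, two different sets of active factors, two expression profiles.
liver_like = expressed_genes({"TF_A", "TF_B"})
skin_like = expressed_genes({"TF_C"})
print(liver_like)
print(skin_like)
```

The rule table never changes between the two calls; only the set of active factors does. That is the sense in which the factors, not the DNA, carry cell identity.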

The transcription-factor families animals use have particular names: homeodomain proteins, bHLH proteins, T-box, NF-κB, p53, Runx, and others. Each family is defined by a protein domain (a specific 3D shape that recognises DNA in a particular way), and each has specific roles in animal development.

The comparative-genomics picture looks different from Tier 1.

Some of the major animal transcription-factor families turn out to be partly shared outside animals. The homeodomain family is present across eukaryotes. The bHLH family is broadly distributed. T-box, Runx, NF-κB, and p53-family proteins were thought to be metazoan-specific until Capsaspora and choanoflagellates were sequenced; they were there, in unicellular relatives of animals, long before animals existed [Sebé-Pedrós et al. 2011; de Mendoza et al. 2013; Sebé-Pedrós et al. 2017]. The Hippo signalling pathway, a regulatory network that controls organ size in animals, is also present in Capsaspora and choanoflagellates, where it appears to regulate cell morphology rather than tissue growth [Phillips et al. 2022; Combredet et al. 2025; Sebé-Pedrós et al. 2012].

So some of the differentiation toolkit is shared at the family level. Not all of it. Plants have transcription-factor families animals don’t (the AP2/ERF family, for one). Plants and animals share the homeodomain family but use different sub-classes for different jobs: animals use the Hox cluster to pattern the head-to-tail axis; plants use KNOX-class members of the same broad family to maintain the growing tip of the shoot [Hake et al. 2004]. Plants and animals also share the MADS-box family; plants recruited it to specify the parts of a flower, the system described by the ABC model of floral organ identity, with one combination of factors specifying sepals and another specifying carpels [Coen & Meyerowitz 1991]. Same actors, different scripts.

Fungi went a different way again. The mushroom-forming Dikarya have their own dominant transcription-factor families (Zn(II)2Cys6 zinc-cluster proteins, mating-type loci with their own logic) that do most of the developmental work in fruiting bodies [Nagy et al. 2018; Krizsán et al. 2019]. Brown algae use a TALE-class homeodomain pair (OUROBOROS and SAMSARA) to specify their sporophyte program [Arun et al. 2019]; the TALE class itself is ancient, but its use as a multicellular-life-cycle toggle is a brown-algal-specific recruitment.

The pattern at Tier 2: some regulator families partly shared by descent, some lineage-specific. Even where the families are shared, the networks (what regulates what, with what feedback, to specify what cell type) were wired independently in each multicellular lineage [Davidson & Erwin 2006; Sebé-Pedrós et al. 2017]. The parts are partly the same. The wiring is not.

Why is this less independent than adhesion? Cell-type regulation is older than multicellularity. Single-celled organisms already differentiate between life-cycle stages. A Capsaspora cell can be amoeboid one day and aggregated the next, with different transcription factors active in each state [Sebé-Pedrós et al. 2013; Sebé-Pedrós et al. 2016]. An ichthyosporean called Sphaeroforma arctica can be a multinucleate growing cell at one stage and a cellularised colony at another, with about 17% of its genes shifting transcription between stages [Dudin et al. 2019]. The choanoflagellate Salpingoeca rosetta can be a single swimmer or a clustered rosette, with stage-specific transcription [Fairclough et al. 2013]. Single cells already differentiate, temporally, between alternative states of themselves. So the toolkit for regulating cell-type-like differences was already partly assembled before any lineage had multiple cell types co-existing in the same body.

When a lineage goes multicellular, what it does to its differentiation machinery is the move you already met in Volvox: it takes a temporal differentiation system one cell ran across time and spatialises it. The temporal toolkit was older; what multicellularity invented was the spatial arrangement.

Older problem, older parts.

Tier 3 — Programmed cell death is deeply shared

The third problem is older still.

Single cells can die in two distinct ways. They can be killed from outside (physically lysed, starved, poisoned), which is just death. They can also kill themselves from inside, by activating a genetic program that demolishes the cell in a controlled, regulated way. The internal program is called programmed cell death (or apoptosis, in animals), and its molecular machinery is ancient.

The executioners in animal cells are a class of enzymes called caspases. Caspases are proteases (enzymes that cut other proteins) and they cut at very specific spots. Once a cell commits to dying, caspases are activated and the cell is dismantled rapidly: DNA cut up, organelles disassembled, membrane signalling immune cells to clear the debris. The sequence is regulated by other proteins, chief among them the Bcl-2 family, which acts on the cell’s mitochondria (its energy organelles) to decide whether the death program is triggered or held in check [Kale et al. 2018]. When the Bcl-2 balance tips toward death, the mitochondria release a small protein called cytochrome c, which triggers the cascade activating the caspases [Kale et al. 2018; Banjara et al. 2020]. That mitochondrial pathway is animal-specific in some details. Cytochrome-c-dependent triggering is mostly a mammal-and-echinoderm feature; the wider animal pattern is more variable than textbooks suggest [Coates et al. 2024]. The broader machinery is much older than animals.

Plants have their own programmed cell death: in their immune response when they wall off pathogens (the hypersensitive response), and developmentally when they build water-conducting xylem cells by programming the cells’ deaths and leaving the hollow tubes [van Doorn et al. 2011; Hara-Nishimura & Hatsugai 2011]. Where animals have caspases, plants have a distantly related family called metacaspases, plus vacuolar enzymes that do similar protein-cutting work. The metacaspase and caspase families share a deep evolutionary origin; the two families split before plants and animals did. Fungi also have metacaspases, deployed during heterokaryon incompatibility, when hyphae of genetically unlike strains fuse and kill themselves rather than continue [Saupe 2011]. Brown algae have metacaspases too.

The mitochondria themselves carry shared death-trigger machinery in every eukaryote: AIF-like flavoproteins, EndoG, and related factors that, when released, contribute to the death program [Klim et al. 2018; Koonin & Aravind 2002]. There is a real argument about whether these shared factors are truly homologous between, say, animal apoptosis and Dictyostelium stalk-cell death, or whether their shared presence reflects the mitochondrion’s natural role as the choke point any death program would monitor [Aouacheria et al. 2013 vs Klim et al. 2018]. Either way, mitochondrial death machinery is shared at a depth no other tier comes close to.

Bacteria, finally, have something death-program-like: toxin-antitoxin pairs, where a stable toxin is kept in check by an unstable antitoxin, and if antitoxin synthesis fails the cell dies. Whether this is genuine “programmed cell death” in any adaptive sense is contested. The original proposal that the mazEF pair in E. coli mediates a population-level death program [Engelberg-Kulka et al. 2006] was challenged by replication work suggesting the population-killing effect may have been an artifact of plasmid-loss selection in the original setup [Ramisetty et al. 2016]. The honest version: TA modules can cause cell death; whether the death is selected for a colony-level benefit is system-specific and contested.
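The kinetic logic of a toxin-antitoxin pair can be sketched in a few lines. This is a toy discrete-time model under stated assumptions: the starting levels and decay rates are invented for illustration, synthesis is tuned so both proteins hold steady while it runs, and "death" is simplified to the moment toxin outnumbers antitoxin. It illustrates only why a stable toxin paired with an unstable antitoxin behaves like a dead-man's switch; it says nothing about whether the resulting death is adaptive, which is the contested part.

```python
# Toy discrete-time model of a toxin-antitoxin (TA) pair.
# Every number here is an illustrative assumption, not a measured rate.

def simulate_ta(steps, stop_at, toxin=10.0, antitoxin=15.0,
                toxin_decay=0.02, antitoxin_decay=0.30):
    """Return the first step at which toxin exceeds antitoxin, or None.

    Synthesis of both proteins runs for steps 0 .. stop_at - 1 and then halts.
    The antitoxin is degraded much faster than the toxin.
    """
    # Synthesis rates chosen so that production exactly balances decay
    # at the starting levels, i.e. the cell begins at steady state.
    toxin_synth = toxin_decay * toxin
    antitoxin_synth = antitoxin_decay * antitoxin

    for t in range(steps):
        if toxin > antitoxin:
            return t  # toxin is no longer fully neutralised
        making = t < stop_at
        toxin += (toxin_synth if making else 0.0) - toxin_decay * toxin
        antitoxin += (antitoxin_synth if making else 0.0) - antitoxin_decay * antitoxin
    return None


healthy = simulate_ta(steps=50, stop_at=50)   # synthesis never stops
stressed = simulate_ta(steps=50, stop_at=5)   # synthesis halts at step 5
print(healthy, stressed)
```

With synthesis running, the two levels never move and the function returns None. Once synthesis halts, the antitoxin pool drains far faster than the toxin pool, so the toxin is unmasked within a few steps even though no new toxin has been made.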

The same machinery has been retooled, across the major multicellular lineages, for very different jobs. Stalk cells in Dictyostelium slime moulds use programmed death to lift their spores skyward [Cornillon et al. 1994]. Animal embryos use it to sculpt fingers from a webbed paw, by killing the cells in the webbing. Plants use it to drop leaves in autumn and to build the dead, hollow tubes of their plumbing. Volvox uses it in the somatic cells whose mortality the regA program from Part 3 enforces. Myxobacterial fruiting bodies use it in the roughly 80% of cells that lyse to feed the survivors [Wireman & Dworkin 1977; Nedelcu et al. 2011]. Deeply conserved machinery, lineage-specific deployment. The killing apparatus is old; the uses of it are new.

Why is the machinery so deeply shared? Because programmed cell death is not a multicellular problem. Single cells were dying in a programmed way (under stress, under starvation, under DNA damage) long before there was anything multicellular to use it for. The machinery was already there. The multicellular lineages inherited it and put it to a new use.

Pre-multicellular problem. Pre-eukaryotic, in some of its parts. Deepest reuse.

The pattern, named

Adhesion is a new problem, brought into existence by becoming multicellular: every lineage built it from scratch, with at least five non-homologous solutions across the major origins. Differentiation is older: single cells already differentiate between life-cycle stages, so multicellular lineages had partly-assembled regulator families to reach for. Cell death is older still: single cells were already running internal demolition programs, so multicellular lineages inherited the executioner rather than inventing it.

The depth of molecular reuse tracks the age of the underlying problem.

The way to read that sentence is as a synthesis the corpus assembles, not a published finding. Each piece (programmed-cell-death conservation across eukaryotes, the partial sharing of transcription-factor families across multicellular lineages, the strict lineage-specificity of adhesion molecules) is individually well-established in the primary literature. The project’s contribution is the integration: assembling those three layers into one hierarchy and noticing that the depth of reuse at each layer tracks the age of the underlying biological problem. No published paper proposes the hierarchy in quite this form. The corpus suggests it; the individual papers support each piece.

The reason the pattern has the shape it does, once you see it, is almost mechanical. Selection cannot reuse what was not already there. Multicellular lineages reused the parts they inherited. Older problems left older inherited machinery. Newer problems had to be solved with whatever was lying around, which often was not enough, so the lineages built different lineage-specific solutions. The hierarchy is what you get when evolution can only work with what is available, run across problems of different ages.
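For readers who find it easier to see a claim's shape in code, the hierarchy can be written down as a small table and a one-line check. The ordinal ranks below are a hand-assigned coding of this chapter's qualitative claims (an assumption of the sketch, not measured quantities), so the check is a restatement rather than evidence. What it makes explicit is the logical form of the claim and what a counterexample would look like.

```python
# The three tiers from this part, encoded as ordinal ranks.
# Ranks are an illustrative coding of the chapter's qualitative claims
# (1 = youngest / shallowest, 3 = oldest / deepest), not measurements.
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    problem: str
    problem_age_rank: int   # how old the underlying problem is
    reuse_depth_rank: int   # how deeply the machinery is shared across lineages
    summary: str


TIERS = [
    Tier("adhesion", 1, 1, "lineage-specific; built separately at each origin"),
    Tier("differentiation", 2, 2, "regulator families partly shared; wiring independent"),
    Tier("programmed cell death", 3, 3, "machinery deeply shared, partly pre-eukaryotic"),
]


def reuse_tracks_age(tiers):
    """True if ordering the tiers by problem age also orders them by reuse depth."""
    by_age = sorted(tiers, key=lambda t: t.problem_age_rank)
    depths = [t.reuse_depth_rank for t in by_age]
    return depths == sorted(depths)


print(reuse_tracks_age(TIERS))

# A hypothetical counterexample: an old problem solved with shallow, new parts.
broken = TIERS[:2] + [Tier("hypothetical old problem", 3, 1, "built from scratch")]
print(reuse_tracks_age(broken))
```

The second call shows what would falsify the pattern: a problem that predates multicellularity but whose machinery turns out to be lineage-specific everywhere it has been examined.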

The chapter has been doing one half of the work. Part 5 does the other half.

→ Continue with Part 5 — Co-option plus invention — what was not already there.

What this part draws on:


Part 5 — Co-option plus invention — what was not already there

The chapter has been making one half of a claim. Now the other half.

If the chapter stopped here, you might leave with the impression that everything in the animal toolkit had been borrowed from single-celled ancestors and rewired. That impression would be misleading. Some genes are genuinely animal, in the sense that they cannot be found in choanoflagellates, cannot be found in Capsaspora, cannot be found in any of the close single-celled relatives whose genomes have been read. They appeared on the animal stem after the split from those unicellular relatives, somewhere in the long window from roughly 700 to 800 million years ago [Richter et al. 2018; dos Reis et al. 2015].

The cleanest comparison is the one done by Daniel Richter and colleagues across 21 choanoflagellate species, quantifying how many gene families animals have gained, kept, and lost relative to their unicellular sister groups [Richter et al. 2018]. The result is the mixed picture the toolkit thesis predicts: most of what animals use, they kept from their unicellular ancestors; some they elaborated (the cadherin family expanded, the homeodomain family expanded); some they invented at the animal stem.

The major animal-stem inventions:

  • The full Wnt signalling pathway. Wnts are secreted lipid-modified ligand proteins that bind receptors on neighbouring cells and pattern body axes, telling cells which side of the body they are on, when to grow a structure, how to organise a tissue. Bits and pieces of the pathway turn up outside animals, but the full canonical pathway (secreted ligand, receptor, downstream transducer, transcription-factor output) is an animal-stem assembly [Sebé-Pedrós et al. 2017; Gazave et al. 2009].
  • The full TGF-β signalling pathway. TGF-β ligands (transforming growth factor beta, and the related BMP family) are another set of secreted signalling proteins. Animal embryos use them to lay out the body’s basic geometry: dorsal versus ventral, anterior versus posterior, which tissues become bone, which become skin. Like Wnt, the full pathway is an animal-stem innovation [Sebé-Pedrós et al. 2017].
  • Most neuron-specific gene families. The voltage-gated ion channels that animals use specifically for neuronal firing, the neurotransmitter receptors, the synaptic-vesicle machinery: all of it is largely animal. Some components have deeper roots (the ion-channel superfamily is ancient), but the constellation of neural-specific genes was assembled on the animal stem [Sebé-Pedrós et al. 2017].
  • The Hox cluster. Hox genes are a set of homeodomain-class transcription factors arranged in a tight cluster on a chromosome, where their order along the chromosome corresponds to their domain of activity along the head-to-tail body axis. Which Hox gene is active in which region tells cells whether they are head, thorax, or tail. The broader homeodomain class is ancient; the cluster organisation, with its strict colinearity between gene order and body-axis position, was assembled at the animal stem [Sebé-Pedrós et al. 2017].

There are others (protocadherins expanded specifically in vertebrate nervous systems, certain Notch pathway components elaborated in metazoans, vertebrate immune-system genes along the vertebrate stem) but Wnt, TGF-β, neural-specific genes, and Hox carry the weight of the animal-stem additions. They are not in the unicellular relatives. They appeared after the split.

The correct framing of what this chapter has shown, then, is that the toolkit is co-option plus invention.

The two words do unequal work. The plus matters. Most of the animal molecular toolkit (adhesion molecules, signalling kinases, differentiation regulators, cell-death machinery) was co-opted from unicellular ancestors. Some of it (the full developmental signalling pathways, the body-plan-patterning gene clusters, the neural-specific families) was assembled on the animal stem after the split. The unicellular relatives let us reconstruct the common toolkit the ancestor was carrying. The animal stem added what makes animals specifically animal: the signalling logic that lets an embryo lay out a complex body plan, the cell types that build a nervous system, the regulatory cluster that turns a ball of cells into something with a front and a back.

This framing avoids two failures at once. The overclaim that everything was already there misses that the unicellular relatives carry most of the kit but not all of it. The older popular framing, that evolution invented the animal toolkit in a burst at the animal origin, misses that the bulk of the kit predates animals by a long way.

Co-option plus invention.

→ Continue with Part 6 — What the chapter answers, what it doesn’t, and what comes next.

What this part draws on:

  • Gene-family innovation, conservation, and loss on the animal stem: content/02-model-systems/close-animal-relatives.md; content/02-model-systems/choanoflagellates.md (“The pre-animal molecular toolkit”); [Richter et al. 2018].
  • Wnt/TGF-β/Notch/Hox as animal-stem additions, neural-specific gene families: content/03-molecular-toolkit/cell-signaling-and-quorum-sensing.md; [Sebé-Pedrós et al. 2017]; [Gazave et al. 2009].
  • Animal–choanoflagellate split with wide credible intervals: [dos Reis et al. 2015].

Part 6 — What the chapter answers, what it doesn’t, and what comes next

The chapter began with a paper from 2008 and ends with a question the paper didn’t ask.

What it answers is where the molecular toolkit came from. It was already there. Different parts of it at different depths (adhesion machinery built from scratch by every lineage, differentiation regulators partly shared, cell-death machinery deeply shared), but in nearly every case the parts predate the bodies. Multicellular lineages reused what they inherited. They did some inventing (Wnt, TGF-β, the Hox cluster, the neural-specific families), but the bulk of the kit was older than the multicellular body it now builds. Multicellularity rewired what existed, more than it invented from scratch.

What the chapter does not answer is the next question, and the gap should feel like a gap.

If the parts were lying around in single-celled organisms across the tree of life, if many lineages had access to the cadherins or the integrins or the metacaspases or the equivalent, then why didn’t every lineage become multicellular? Why did some lineages reach into the pantry and put together a body, while others kept the same parts in stock and never built anything? And once a lineage did go multicellular, what made the change stick? Why did some lineages get permanently locked into multicellular life while others (choanoflagellates, Dictyostelium, Streptomyces) kept the multicellular state as a facultative phase that they could fall back out of into single-celled life?

That is the question Chapter 4 is built to answer.

The answer involves something that, in the corpus and the documentary, gets called a ratchet. A ratchet, in this evolutionary sense, is any change that makes going back harder than going forward: an evolutionary modification that is selectively favoured in the multicellular state and that makes a return to single-celled life costly. Once a lineage has accumulated enough of those changes, the multicellular state stops being a phase the lineage can drop in and out of. It becomes the only state the lineage can live in. The door, having swung open into multicellularity, has acquired traps and counterweights that prevent it from swinging back.

The next chapter unpacks what ratchets are, how they accumulate, and where they have been documented, including, for the first time since the cyanobacterial timeline in Chapter 1, a piece of footage-worthy science: a long-running yeast experiment in which the ratchets have been watched closing in real time, generation by generation.

That is Chapter 4. Traps. Doors that only open one way.

→ Continue with Chapter 4 — The Trap. (Not yet written.)

What this part draws on:

  • The unanswered question raised by the toolkit story (why did some lineages become multicellular and stick, while others didn’t?): content/00-framework/ratchet-mechanisms.md.
  • The Volvox regA example as a bridge between Chapters 3 and 4 (a co-option in Chapter 3 that is also a ratchet in Chapter 4): content/02-model-systems/volvox-and-volvocines.md.
  • The “toolkit predates multicellularity” synthesis bringing the chapter to rest: BIG-PICTURE.md §“What we are confident about” item 2.

End of Chapter 3 (draft state).

When complete, this chapter should be readable in one sitting (~9,000 words across six parts) by someone who has read Chapter 1, and should leave them with the three-tier picture as an image they can hold without consulting any other document. The chapter is concept-led where Chapter 1 was case-led, and the longest part (the three-tier hierarchy) is positioned exactly where the documentary positions Beats 3.7–3.10: as the analytical payoff, undivided, with the framing hedged on the page rather than only in tone. The next chapter, 04-the-trap.md, picks up the question of why some lineages locked their multicellularity in and others did not, and brings the yeast experiment promised in Part 6.