Federation Cannot Anchor Itself – GRUAN and the Hidden Stewardship Layer in Global Observing Systems
Status: Pattern/Provisional
SCENE: I went looking for the hole in my own argument.
The stewardship essay I have been working on makes a claim that feels right from operational experience but needs to survive scrutiny from people who know this domain from the inside: that federation is necessary but not sufficient for governing systems that require long-term calibration integrity. Federation coordinates contribution. It cannot audit itself against drift. Someone has to hold that obligation permanently, outside the federated structure, or the drift accumulates invisibly until the record is compromised.
The obvious challenge is the Global Observing System. If you want a counterexample to “federation cannot govern calibration,” GOS is the one to reach for. All 193 WMO Members (states and territories) contributing observations under WMO coordination, no single controlling authority, voluntary participation, running for decades. It is one of the most ambitious and durable examples of international federation in the history of science. And it works.
My first reaction, sitting with that, was straightforward: oh shoot. How could I have overlooked something that obvious?
So I dug deeper rather than deflecting. (I did deflect it first, I will admit. But not for long.)
A note on how I stress-test arguments. Years spent working in national critical infrastructure protection conditioned a particular habit of mind around failure analysis. The question is never “can I construct a scenario in which this breaks?” Almost anything breaks under sufficiently contrived conditions. The useful questions are different: How likely is this failure mode? Under what conditions does it obtain? How often would those conditions recur? What is the pattern, not just the edge case? That framing keeps stress-testing from becoming an exercise in pedantry: the hunt for the mathematically hack-proof system, the invulnerable infrastructure. Those do not exist, and pursuing them is a distraction. What exists are systems with well-understood failure modes and systems with poorly understood ones. The goal is to move systems from the second category to the first. When I found the GOS counterexample, the infrastructure protection instinct kicked in automatically: not “does this break my argument?” but “how likely is this, and what conditions would have to obtain for it to represent a genuine refutation rather than an edge case?”
The first crack appears when you separate the timescales…
GOS is an operational system feeding numerical weather prediction models that run every six hours. Quality requirements are calibrated to “good enough to improve the forecast.” If a radiosonde drifts slightly, the data assimilation system partially corrects for it, and tomorrow’s forecast is verified against reality within days. Errors are self-correcting because the feedback loop is fast. The federation is robust to imperfection in a way that actually makes sense given what it is trying to do.
GCOS (the Global Climate Observing System) operates under a categorically different obligation. It exists to detect signals that emerge over decades. A calibration drift invisible in a weather forecast becomes a false climate trend signal when you integrate it over thirty years. You cannot go back and re-fly a 1995 radiosonde. The tolerance for imperfection is orders of magnitude smaller, and the feedback loop is measured in generations, not days.
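A toy simulation makes the timescale point concrete. This is only a sketch under stated assumptions: the drift rate, the noise level, and the flat "true" climate are all invented for illustration and do not describe any real radiosonde or station.

```python
import random

# All numbers here are illustrative assumptions, not measured values.
YEARS = 30
DRIFT_PER_YEAR = 0.01   # K per year of uncorrected sensor drift (hypothetical)
DAILY_NOISE = 1.0       # K, day-to-day weather variability (hypothetical)


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(0)  # fixed seed so the sketch is repeatable
t_years = [day / 365 for day in range(YEARS * 365)]

# The "true" climate is flat by construction; the only trend is instrument drift.
observed = [rng.gauss(0.0, DAILY_NOISE) + DRIFT_PER_YEAR * t for t in t_years]

# Forecast timescale: drift accrued in one week, tiny next to daily noise.
drift_in_one_week = DRIFT_PER_YEAR * 7 / 365

# Climate timescale: the same drift recovered as an apparent trend.
trend_per_decade = ols_slope(t_years, observed) * 10

print(f"drift accrued in one week: {drift_in_one_week:.5f} K")
print(f"apparent 30-year trend:    {trend_per_decade:.3f} K per decade")
```

Under these assumed values, the weekly drift is far below the daily noise, yet the thirty-year fit should land near 0.1 K per decade in a simulated climate that never changed. The fit cannot tell instrument drift from signal; only an independent reference can.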
Same atmosphere. Fundamentally different quality obligation. The counterexample was never about climate-scale calibration. It only sounded that way because GOS and GCOS share physical infrastructure; the quality regime governing climate records is categorically different from the one governing operational forecasting.
Then I found GRUAN, and the oh-shoot feeling reversed entirely (mostly).
The GCOS Reference Upper-Air Network is a small, deliberately non-federated network of roughly 30 stations built specifically to produce reference-quality upper-air observations for the climate record. Centralized data processing. Mandatory uncertainty characterization traceable to SI units. Strict protocols that individual stations cannot negotiate away. It exists because the WMO community looked at the federated operational network and concluded it could not anchor itself against long-term drift. So they built a separate non-federated structure to do it.
Nobody theorized this first. They built the network. GRUAN is an engineering solution to a governance problem, implemented in hardware before anyone wrote down the governance theory underneath it.
The North American analog makes the pattern harder to dismiss as a peculiarity of international governance. NOAA built the US Climate Reference Network starting in the early 2000s for exactly the same reason: the existing surface observing networks (the Cooperative Observer Program and the Automated Surface Observing System) were adequate for operational weather but not for long-term climate trend detection. COOP stations are sited wherever volunteers agree to host them. I know this from the inside: my neighbor, a professor of meteorology and former Mount Washington professional meteorologist, maintains one a few miles away, and I cover for him when he travels, which includes standing in his yard in sub-zero temperatures, melting frozen precipitation to take the reading. It is a meaningful addition to a winter morning. It is also a direct encounter with what a volunteer-dependent, distributed network actually looks like at the instrument level, which is not nothing when you are making claims about reference quality and calibration traceability.
ASOS stations are sited for aviation safety. Neither network was designed with the stability and traceability requirements that climate records demand. So NOAA built a separate network of roughly 114 stations, with pristine siting criteria, triple-redundant instrumentation, and centralized data processing, alongside the existing federation rather than waiting for the federated network to develop that capability internally.
The instructive detail is that NOAA owns the entire US surface observing enterprise. No sovereignty negotiations required. No member state councils. And they still concluded that the operational network could not serve as its own reference baseline. If the stewardship layer is necessary even when a single institution holds full authority over the whole system, the case for it in a genuinely multi-sovereign federated environment, like the upper-air network GRUAN anchors or the satellite calibration chain, is considerably stronger, not weaker.
The counterexample had become the evidence.
BREAK: Two things clarified that are worth holding separately.
The GOS/GCOS distinction is not a technical footnote. It is the load-bearing structure of the stewardship argument, and any version of the argument that does not make it explicit deserves the GOS objection it will receive. Federation’s self-correcting properties depend on a fast feedback loop between observation, output, and verification. Remove that loop (as you must when the signal takes decades to emerge) and the self-correcting mechanism fails. That is not a difference of degree. It is a difference in kind.
The second thing is what GRUAN establishes about the posture of the argument itself. The stewardship essay was never designed as a proof in the mathematical sense. It began the way most of my thinking does: from noticing that systems I had worked in, or adjacent to, seemed to require something federation alone could not provide. A fixed reference point outside the federated structure that the federation could be measured against.
Ground control points work this way in remote sensing. A GCP does not participate in the mosaic it anchors. It sits outside it with independently verified coordinates, and everything else is registered to it. Remove the GCPs and the mosaic is internally consistent but externally adrift. The calibration chain problem felt structurally similar when I was writing the essay. GRUAN turned out to be not just an analogy but the actual instantiation of the same principle in observing system architecture, arrived at by engineers solving a practical problem rather than by anyone theorizing governance.
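The "internally consistent but externally adrift" condition can be shown with a deliberately tiny sketch. It assumes a one-dimensional mosaic with made-up tile positions and a single hypothetical control point; real registration works in two or three dimensions and also handles rotation and scale, which this ignores.

```python
# Hypothetical 1-D mosaic. The stitcher only ever sees tile-to-tile offsets.
true_positions = [100.0, 110.0, 120.0, 130.0]  # metres (made-up values)
relative_offsets = [b - a for a, b in zip(true_positions, true_positions[1:])]


def stitch(offsets, origin=0.0):
    """Chain relative offsets outward from an arbitrary origin."""
    positions = [origin]
    for offset in offsets:
        positions.append(positions[-1] + offset)
    return positions


def spacings(positions):
    """Tile-to-tile spacings, the only thing an internal check can compare."""
    return [b - a for a, b in zip(positions, positions[1:])]


mosaic = stitch(relative_offsets)

# Internal check: spacings agree with the truth, so the mosaic looks fine.
internally_consistent = spacings(mosaic) == spacings(true_positions)

# External check: every tile is off by the same amount, invisible from inside.
external_error = [m - t for m, t in zip(mosaic, true_positions)]

# One ground control point: tile 2 surveyed independently (assumed coordinate).
gcp_index, gcp_coordinate = 2, 120.0
shift = gcp_coordinate - mosaic[gcp_index]
anchored = [m + shift for m in mosaic]

print(internally_consistent, external_error, anchored == true_positions)
```

Every internal comparison passes while the whole mosaic sits a constant distance from where it should be. A single independently surveyed point is enough to remove that shift in this toy case, and no amount of additional tile-to-tile checking would have found it.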
That convergence matters to me. Remote sensing field practice and atmospheric reference network design arrived at the same structural solution independently. The essay is an attempt to name what GRUAN already is. Whether that naming is useful to the community is a legitimate question. Whether the underlying observation is correct feels considerably less open.
SCHEMA (Provisional)
The stewardship layer in a federated observing system is not optional infrastructure the federation can eventually develop internally. It is a precondition for the federation’s long-term validity at climate timescales.
Federation coordinates contribution. It cannot audit itself. A federation of autonomous entities can agree on data formats, exchange protocols, and minimum quality standards. It cannot independently verify whether any member’s contribution has drifted relative to an external reference, because doing so requires a reference that sits outside the federation and is not subject to the same institutional pressures that cause drift in the first place.
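A minimal sketch of why peer comparison is not the same as an audit. It assumes a four-station network whose members share a common drift (for instance, because they all adopted the same sensor model); the station names, bias values, and tolerance are hypothetical.

```python
import statistics

TRUE_VALUE = 250.0    # K, the quantity every station is trying to measure
COMMON_DRIFT = 0.30   # K, bias shared by all members (hypothetical)
TOLERANCE = 0.10      # K, agreed quality threshold (hypothetical)

# Per-station readings: shared drift plus a small local component.
readings = {
    "A": TRUE_VALUE + COMMON_DRIFT + 0.02,
    "B": TRUE_VALUE + COMMON_DRIFT - 0.01,
    "C": TRUE_VALUE + COMMON_DRIFT,
    "D": TRUE_VALUE + COMMON_DRIFT + 0.25,  # one station with extra local drift
}


def peer_audit(readings, tolerance):
    """Flag stations that disagree with the network's own median."""
    consensus = statistics.median(readings.values())
    return {name for name, value in readings.items()
            if abs(value - consensus) > tolerance}


def reference_audit(readings, reference, tolerance):
    """Flag stations that disagree with an independent external reference."""
    return {name for name, value in readings.items()
            if abs(value - reference) > tolerance}


flagged_by_peers = peer_audit(readings, TOLERANCE)
flagged_by_reference = reference_audit(readings, TRUE_VALUE, TOLERANCE)

print("peer audit flags:     ", sorted(flagged_by_peers))
print("reference audit flags:", sorted(flagged_by_reference))
```

The peer audit catches only the station that disagrees with its neighbors. The shared drift moves the consensus along with the members, so it passes unnoticed until the readings are compared against something that did not drift with them.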
GRUAN is that reference for the upper-air climate record. The GCOS Surface Reference Network is being built to serve the same function for surface observations. Their existence is institutional acknowledgment that the community recognized this structural gap and built non-federated infrastructure to fill it. The satellite calibration chain presents the same structural requirement. The question is not whether a stewardship layer is needed. The question is whether it exists and whether it is adequately resourced.
Held provisionally pending engagement with practitioners who work inside GRUAN or GSICS, and pending peer review response on this dimension of the argument.
ADDENDUM: Anticipated Objections
The GSICS objection
The Global Space-based Inter-Calibration System is the most technically sophisticated counterexample available. GSICS inter-calibrates operational satellite sensors against reference instruments, producing correction coefficients that allow sensors from different agencies to be harmonized. A reviewer could reasonably argue this is exactly the cross-agency calibration coordination the essay claims is missing.
The distinction is between coordination and stewardship. GSICS harmonizes current observations across currently operating sensors. It does not hold the long-term climate data record. It does not maintain methodological continuity across instrument generations (the continuity that allows a 1985 Meteosat observation and a 2025 observation to sit on the same calibrated baseline). It produces correction coefficients. It does not hold the obligation to ensure those coefficients remain traceable across decades of sensor succession. That obligation sits inside individual satellite operators. Whether any given operator treats it as a permanent stewardship commitment or as a current operational task is precisely the governance question the essay raises. GSICS provides the math to align the federation, but it is not the physical anchor point. It assumes that the anchor, and the operator commitment behind it, will always be there. The essay questions whether that assumption is warranted.
The “it already works” objection
A softer version: reanalysis products exist, and C3S (the Copernicus Climate Change Service) produces long-term climate data records, so where exactly is the governance gap?
The existence of products does not confirm the existence of governance architecture needed to sustain them. The Meteosat climate data record exists because EUMETSAT chose to invest in its reprocessing and maintenance. That choice reflects institutional culture and current leadership priorities.
The stewardship argument is not that the record does not exist today. It is that its continued existence depends on an institutional commitment not formally embedded in the governance structure the way GRUAN’s protocols are embedded in its station requirements. Culture is not architecture. When culture changes, architecture holds. When culture changes and architecture is absent, the record is at risk.
Last Updated on April 3, 2026





