Evaluation’s Journey towards the future, Part 1: Ancient tributaries
<p><img src="https://www.zendaofir.com/content/images/2025/11/picture1.jpg" alt width="800" height="468"></p>
As we chart the future, the earliest beginnings of evaluation will continue to shape the field. This is something we need to honour and use.
Imagine a river …
"A river that forgets its source will dry up" (Akan in Ghana). "When drinking water, think of its source" (Zhou dynasty China). "A tree that forgets its roots will not grow" (Andean Quechua). "The river's flow is our kinship" (Indigenous Australian).
These proverbs stress recognising and respecting what has come before. In many cultures, water metaphors symbolise the bond between past and present. They all reflect the same concept: to thrive, one must honour the source, whether a spring, a river or the ancestors who sustain life's flow.
The field of evaluation, too, must remember its source.
Let us take a moment to imagine evaluation's journey through history as a vast river system, winding through time on its way to the ocean, fed by diverse tributaries that merge and flow into the grand river of evaluation.
Each of these tributaries is shaped by unique aspirational, cultural, ecological and political landscapes. The river's journey therefore reflects not just geographic (that is, technical, methodological) shifts, but deeper currents: how societies define progress, success, values, power, ethics and more.
This image emphasises that the origins of what we know today as formal, systematic evaluation practice are deep, varied and interconnected. Long before the river of modern evaluation carved its course, humanity's earliest civilisations nurtured trickling springs, brooks, streams, canals and aquifers that flowed into the currents of evaluation we recognise today. They differ from today's formalised field, but they reveal humanity's universal instincts to assess, to adapt, to govern and to strive for success.
The Middle East and North Africa (MENA): Accountability springs
These earliest trickles towards evaluation tributaries were source-driven and rooted in ancestry, land and cosmology, often resulting in oases. Here, evaluation arose organically from deep, often unseen cultural systems.
More than 6,000 years ago in Mesopotamia (the area that is now eastern Syria, southeastern Türkiye and most of Iraq), clay tablets recorded harvests and taxes, forming early springs of accountability that merged record-keeping with rudimentary practical assessments of "what works".
Around 4,500-5,000 years ago, ancient Egypt's Nile governance channelled labour quotas and worker welfare into currents of resource management, balancing pyramid ambitions with human endurance. Scribes meticulously recorded grain yields, labour quotas and tribute payments, evaluating compliance with pharaonic decrees. The Maat principle (truth, balance, order) judged rulers and citizens by ethical standards, reflecting a cultural evaluation framework.
Indigenous societies: Harmonious, reciprocal aquifers and brooks
These ancient practices reflect underground aquifers, sustaining societies through symbiosis with the land and cosmos. They nourished the river invisibly; colonial erasure obscured these contributions, relegating them to aquifers and some hidden natural brooks that were flowing, adaptive and localised, shaped by terrain and emphasising the balance between humans and nature.
More than 4,000 years ago, ayni ("today for you, tomorrow for me", a concept fundamental to Andean cultures) wove mutual support and reciprocal exchange into communal well-being, merging data with relationality and establishing a guiding principle for communal labour, resource sharing and social obligations: a way of life promoting harmony with others, community and nature. It required an implicit, ongoing evaluative process in which social relations and communal effectiveness were continually assessed and recalibrated to maintain balance and reciprocity.
For more than 3,000 years, Polynesian navigators have read stars, ocean swells and other environmental signals, accumulating empirical knowledge passed down orally and refined over generations. This was oceanic evaluation in real time, adapting voyages to unseen tides.
Around 3,000 years ago the Maya developed one of the most sophisticated calendar systems in the ancient world. Their deep understanding of astronomy was tightly intertwined with agricultural cycles and ritual practices, with calendars and celestial observations guiding crop cycles and resource distribution. The periodic adjustments and calendar refinements reflected a continuous evaluation of natural cycles to harmonise cosmic events with agricultural and social needs.
More than 2,000 years ago, the existence of Amazonian terra preta, an anthropogenic soil created through Indigenous methods, indicates long-term, sustainable agricultural practices and ecological management. This ecological evaluation involved ongoing practices of observing, testing and adjusting soil management techniques to maintain land fertility.
Around 900 years ago, the Iroquois Confederacy used a sophisticated consensus-based decision-making process encapsulated in the Great Law of Peace. It required continuous communal reflection and consensus-building, evaluating community needs and intertribal relations, a dynamic process that guided the political and social life of the Confederacy over centuries.
Around 600 years ago, the Incas' quipu was a sophisticated information system of record-keeping that involved knotted cords. It quantified and recorded harvests, census data and labour obligations. It provided a means for the Inca Empire to maintain administrative control, track economic activity and manage state resources, blending administrative precision with communal accountability.
Sub-Saharan Africa: Relational aquifers
Long suppressed by colonisation, these evaluation systems reflect intergenerational reservoirs of wisdom, rooted in place: hidden systems sustaining life over the ages.
From around 3,000-4,000 years ago, precolonial West African societies evaluated leaders based on moral character and communal well-being, emphasising wisdom and harmony. Leadership was not solely about power; it was relational and ethical, often rooted in consensus, restorative justice, and collective care. These empires understood governance holistically: Military power maintained borders. Economic systems ensured resource flows. Legal and spiritual codes grounded authority in ethics and divine legitimacy. And public consultation and scrutiny, often oral and face-to-face, meant rulers were not above critique. A good leader was expected to embody wisdom, self-control and fairness, promote communal harmony and be accountable to the people. Their legitimacy was sustained by communal validation.
From around 1,500 years ago, the powerful and highly organised precolonial Ghana and Mali empires, far from being "stateless" or informal societies, as colonial narratives often claimed, had complex bureaucracies, legal systems, and mechanisms of accountability. Their multi-layered governance systems were based on a mix of councils, public consultation, legal codes and economic audits. Councils and the public in the Mali Empire assessed leaders on their governance, military leadership and management of resources. The dina (legal code) evaluated regional governors' and local administrators' loyalty to central authority, justice and fairness in applying law, and resource stewardship. Trade audits assessed gold and salt flows, blending economic and administrative oversight. The Ghana Empire had similar practices in place, even earlier. Its rulers were also custodians of the land and spiritual authorities, which added a moral and symbolic dimension to their evaluations.
For more than 2,000 years in Southern Africa, the Ubuntu philosophy ("I am because we are") fostered a collective approach to conflict resolution, social justice and governance, and to the assessment of community wellbeing. Elders played a key role in evaluation, guiding decisions that balanced the interests of the community as a whole, often through dialogue and restorative practices. Instead of relying on individual metrics or rigid laws, evaluation was therefore relational, focusing on how actions affected the community's interconnected fabric.
Around 800 years ago in the powerful Kingdom of Great Zimbabwe, trade evaluation was necessary for ensuring fair barter and quality control in the exchange of goods like gold, ivory and salt. Merchants and traders engaged in systems of mutual trust, and decisions about trade were often made on the basis of personal reputation and community verification, an early form of supply-chain evaluation in which quality and fairness were key criteria. Trade was not just about economic exchange but about maintaining trust and social relationships. The evaluation process ensured that transactions were equitable and that the exchange benefitted the broader community, not just individual traders.
The East: Interconnected irrigation systems
The East's practices were strategic, engineered, interconnected, carefully constructed and state-regulated, with direct flows to specific outcomes (e.g. talent selection, policy monitoring), requiring maintenance and coordination that reflect centralised order and societal harmony.
In India, around 2,300 years ago, the Arthashastra treatise by the scholar Chanakya systematised the evaluation of economic policy, intelligence gathering, taxation, public welfare and ethics. It introduced metrics for taxation, crop yields and bureaucratic performance, blending pragmatic governance with ethical accountability, a precursor to modern policy evaluation. Like a strategic irrigation network, it balanced resource allocation with state control.
From around 2,200 years ago until 1905, starting with early recruitment systems, China's Imperial Examination System was one of the most sophisticated examples in ancient times of what can be considered a systematic, formal form of evaluation. It channelled talent into governance like a structured canal system, selecting government officials based on merit rather than on hereditary or aristocratic status, prioritising knowledge and judgment over birthright, and embedding evaluation into governance. After 1905 it evolved into the highly competitive National Civil Service Examination (guokao), which remains China's primary means of recruiting new civil servants today.
Around 1,300 years ago the Khmer in Cambodia evaluated hydrological data to build vast and sophisticated canal and reservoir networks. Labour quotas, resource allocations and environmental flows were tracked to sustain Angkor Wat's construction. By directing dynamic responses to monsoon variability, this system demonstrated early adaptive evaluation of ecological and human systems.
Around 1,000 years ago, influenced by China, Korea adopted its own merit-based system. The gwageo examinations blended knowledge of Confucian classics, law and military strategy with local governance needs. They formalised meritocratic evaluation for bureaucratic roles, ensuring state efficiency and indirectly shaping Korea's modern emphasis on standardised testing.
The West: Hierarchical aqueducts and streams
Ancient Western systems reflected legalistic, formalised structures built to transport and control public resources. They were often segregated from natural systems and infrastructure-like, with clear roles, accountability and visibility. They leaned toward written documentation, legal accountability and hierarchical oversight, laying the groundwork for modern bureaucratic evaluation.
Around 2,500 years ago, in democratic Athens, leaders were evaluated through early civic evaluations. In public audits (euthyna), assemblies debated policies and military campaigns, and ostracism served as an early risk assessment tool. Citizens assessed officials before and after service, with mismanagement tried in public courts.
Around 2,300 years ago in the Roman Republic, formal offices like the censor tracked citizens' conduct and moral standing, tax obligations, military readiness and official expenditures, shaping tax and conscription policies. Governors were subject to post-service legal scrutiny. Engineers assessed aqueducts and roads for public utility and durability.
Around 1,600 years ago, by the High Middle Ages, medieval guilds in Europe assessed apprentices through peer-reviewed skill tests, and the Catholic Church managed agricultural tithes through detailed fiscal records, linking spiritual duty to economic contribution.
Tributaries converge, flowing over the centuries
These flows converged into tributaries that fed the river of evaluation we know today. They gifted us with instincts (identifying, measuring, adapting, judging, governing in line with rules and expectations) that are still in use today. They remind us that evaluation is not a modern invention but an ancient human reflex. Although the distance between then and now is vast, like a river that begins as mountain springs and grows into a delta meeting the ocean, traces of these primordial waters still ripple in our instinct to seek and understand what works where, why and how; for whom, and under what timeframes and conditions.
MENA's Accountability Springs: Guided by ethics and locally grounded, influencing both administrative, structured governance and moral accountability.
Indigenous Harmonious, Reciprocal Aquifers and Brooks: Ecological, ceremonial, adaptive and communal, blending data with reciprocity, directly and indirectly inspiring participatory methods and ecological evaluation, challenging extractive frameworks.
Africa's Relational Aquifers: Subterranean and communal (e.g., Ubuntu, Mali's dina), they continue to emphasise harmony, localisation, participatory evaluation, and oral dialogue and storytelling that foreground the balance between humans and nature. Passed down through oral traditions and based on community approval, these concepts are now reflected especially in participatory and decolonial evaluation.
The East's Interconnected Irrigation Systems: Engineered, state-directed, order-maintaining, merging meritocracy with governance, their legacy rippled into Enlightenment-era European bureaucracies and technocratic governance. They remain visible in today's emphasis on standardised metrics, as well as on the need for connected systems for integrated management, adaptation and holistic approaches.
The West's Hierarchical Aqueducts and Streams: The West's aqueducts and dominant streams of information became the longest tributaries of technocratic evaluation. They continue to prioritise hierarchy, control and quantification, shaping modern evaluation's obsession with metrics, audits and evidence-based policy.
Colonisation and industrialisation dammed the river, privileging Western streams and leaving few direct traces of others. Yet their legacies resurface and nourish today's decolonial currents. They illustrate that non-Western frameworks of evaluation were not separate from life; evaluative thinking and practices were woven into the fabric of relationships, economy and governance. They were qualitative rather than purely quantitative, rooted in values like harmony, justice, fairness and collective wellbeing, and conducted through interpersonal processes, not impersonal reporting. Rather than written reports or numerical metrics, evaluation was often conducted through oral dialogue, observation, ritual and consensus, led by elders, councils or spiritual authorities. These processes emphasised moral character, social harmony, ecological balance and accountability to the community and ancestors. Standards of success were rooted in local cosmologies and cultural values, and decisions were made with consideration for intergenerational impact. Evaluation was not a separate function, but an integrated practice within systems of governance, trade, and social life, one that ensured stability, reciprocity, and coherence among the collective.
These practices have led to the increasing emphasis today on community-led evaluations, participatory development, ethical trade and social auditing, and restorative justice models.
The River's Lesson: Diversity as strength for the future
The river of evaluation flows onward, but it thrives because its tributaries reflect great diversity:
Springs: Ethical and ecological depth. Original, vital cultural wisdoms, signifying intrinsic, natural accountability, local groundedness, and ethical and ecological depth.
Aquifers and brooks: Relational, harmonious, reciprocal accountability. Hidden, often unacknowledged reservoirs and flows of wisdom and practice which, while not in plain view, are essential for sustaining the overall system of evaluation.
Irrigation systems: Structure, connectedness and meritocracy. Engineered, directed, strategic and interconnected mechanisms that channel talent or information systematically for quality statecraft.
Aqueducts and streams: Bureaucracy and scale. Continuous, visible flows over long distances, working to represent practices that are documented and regularly flowing within formal systems.
Today, as the river swells with the polycrisis (climate and biodiversity collapse, AI risks, severe inequality and many more), it must draw from all its sources.
The ocean ahead demands not just navigation tools but a reimagined voyage, where ancient aquifers, springs, irrigation systems, aqueducts and streams flow together to shape a river that feeds the ocean, where we can navigate guided by the stars of the dispositions and values that define each of our societies.