Our vision is of a world where efforts to prevent and alleviate animal suffering are as efficient and cost-effective as possible. Where decisions across various domains (economy, environment, investment, purchasing) are also informed by their impact on the welfare of animals. Achieving this vision depends critically on properly measuring, and mapping, the magnitude of animal suffering across conditions, systems and species.
Ultimately, we would like to help transform the understanding of animal and human suffering, shifting it from an abstract concept to a scientifically measurable and extensively mapped phenomenon across all sentient beings. In doing so, we seek to help empower society (legislators, environmentalists, economists, advocates, funders, investors, consumers) to make decisions and take actions more closely aligned with ethical values that aim to minimize harm.
To inform efforts and decisions for the prevention and alleviation of animal suffering. To this end, our research is aimed at providing consumers, funders, researchers, producers, investors, advocates and other decision-makers with a comparative, meaningful and easily interpretable metric (1) of the magnitude of negative and positive welfare embedded in animal-sourced foods produced from different species and under different conditions and (2) of the welfare impact of interventions, standards, reforms, policies and purchasing choices. As a side-product of our research efforts, we also aim to promote more targeted research and funding in the animal welfare sciences.
The assessment of the time spent in affective states (negative and positive, physical and psychological) of different intensities in any scenario or species. The use of a universal metric with biological meaning (time in pain and pleasure) enables comparing and combining the cumulative load of affective experiences emerging from different environments, industry practices, production systems or interventions, and provides the basis for the development of welfare footprints of animal-sourced products.
For measurement purposes, the terms pain and pleasure are used as shorthand, respectively, for any felt ‘negative (unpleasant) affective state’ or ‘positive (pleasant) affective state’. More simply, anything that ‘feels bad’ or ‘feels good’. States with a somatic origin are referred to as ‘physical pain’ (e.g., aches, hunger, injuries, thermal stress) and ‘physical pleasure’ (e.g., the taste of a preferred food, the pleasantness of a sexual activity), while those related to the primary emotional systems are referred to as ‘psychological pain’ (e.g., fear, frustration, boredom) or ‘psychological pleasure’ (e.g., the pleasantness associated with positive social interactions, play bouts).
Research and decisions aimed at improving the quality of life of animals are already guided by assumptions about their affective states and how they are modulated. For example, requirements for standards such as minimum space allowance, enrichment, outdoor access and stunning prior to slaughter (inputs) necessarily assume that these practices lead to improved well-being (i.e., reduce the intensity, frequency and/or duration of negative affective experiences). Other welfare assessment models also rely on indirect indicators of affective states, such as behavior, neurophysiology and responses to pain-relieving drugs. Ultimately, any form of welfare assessment or attempt to improve animal welfare is inherently based on an assumed knowledge of what positively or negatively affects their inner experiences.
The Welfare Footprint Framework makes these comparisons possible by measuring the impact of experiences rather than categorizing them by origin. For both physical pain (like lameness) and psychological distress (like frustration from inability to nest), the framework applies the same universal criteria to estimate intensity. For example:
By answering these questions through behavioral, physiological, and pharmacological evidence, the framework places diverse negative experiences on the same intensity scale (Annoying, Hurtful, Disabling, or Excruciating), regardless of their source. This approach recognizes that what matters to the animal is how the experience affects its life, not whether the suffering originates from a physical injury, social deprivation, or thwarted motivation.
A flowchart with the main analytical steps is provided in our methods page and in this video.
Subjectivity in the categorization of pain intensity is a problem that permeates all intensity scales, both numerical and nominal. The advantage of nominal scales lies in removing one layer of complexity in the attribution of intensity to the pain experience: ‘mild’, with all its problems, is self-explanatory to most individuals, whereas ‘5’ must be explained and externally referenced. Use of discrete categories also avoids the problem of forcing numerical ratings that may not necessarily be linear into a linear scale, and prevents the ceiling effects that are frequently observed with numerical ratings of pain (where ratings close to the limits of the scale are constrained, increasing in intensity by smaller amounts than the intensity effectively experienced).
We sought to reduce the ambiguity of widely used terminologies (mild, moderate, severe) by using instead terms which, in addition to being semantically more specific, are qualified by specific references and criteria. Use of these criteria enables establishing more specific thresholds in the attribution of pain intensity that reduce the likelihood of misclassification, facilitating comparisons across conditions and individuals. For example, the pain of a fracture at the time of rupture matches the definition of a Disabling pain, not because this is the verbal description preferred by patients, but because it matches the definition of this intensity category: it captures nearly all the individual’s attention, sufferers are unable to perform other activities and strong analgesia is commonly required. Additionally, in this process the intensity and duration of the pain experience are completely disentangled (in some other approaches, the pain associated with a condition may be considered ‘severe’ because it lasts longer), further reducing variability in the classification process.
The framework evaluates Affective Experiences across four discrete categories of intensity. Periods where the animal is in a neutral state are recorded as ‘No Pain’ or ‘No Pleasure’, representing the absence of these affective states rather than a level of intensity.
The four active categories (Annoying, Hurtful, Disabling, and Excruciating for Pain; Satisfaction, Joy, Euphoria, and Bliss for Pleasure) were not devised by arbitrarily splitting a numerical continuum. Instead, they are operationally defined based on the degree of disruption or engagement they cause to the animal’s attention and normal functioning. This approach is rooted in the evolutionary expectation that greater threats or rewards require stronger signals to capture the animal’s attention and prioritize adaptive responses over competing demands.
While a specific category might encompass a range of subtle subjective variations, these four active levels represent an optimal balance between precision and clarity. By anchoring each category to observable, empirically testable criteria—such as behavioral disruption, cognitive impairment, and pharmacological responses—the framework minimizes the risk of misclassification and avoids the ceiling effects and ambiguities common in continuous numerical scoring systems.
Although pain inherently concerns individuals, we operationally accept that the collective welfare of the members of a population can also be determined. Measuring cumulative pain at the population level is also necessary to account for the heterogeneity in the exposure of population members to different challenges. For example, while lameness is experienced by a large fraction of broiler chickens, fatal cases of ascites are only experienced by a few. Therefore, measurement efforts must consider the prevalence of each welfare challenge, so that pain is determined for the average member of the population (which may not necessarily correspond to any real organism). At the population level, the time spent at each level of pain intensity by the ‘average population member’ as a result of each challenge is determined by multiplying the time an affected individual spends at that intensity by the prevalence of the challenge. For example, if a condition causes 10 hours of Disabling pain and 70% of the population is affected, then the average member of this population could be said to experience 7 hours of Disabling pain due to the condition. Measurements at the population level enable comparing the impact of different practices and conditions across demographics, geographies, and time.
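The prevalence weighting described above can be sketched in a few lines of Python. This is only an illustration: the function name `population_average_hours` is hypothetical and not part of any published Welfare Footprint tooling, and the figures are those of the example in the text.

```python
def population_average_hours(hours_if_affected: float, prevalence: float) -> float:
    """Hours at one pain intensity attributed to the 'average population member'.

    hours_if_affected: time an affected individual spends at that intensity.
    prevalence: fraction of the population affected (0 to 1).
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be a fraction between 0 and 1")
    return hours_if_affected * prevalence


# Example from the text: 10 hours of Disabling pain, 70% of the population affected.
average_disabling = population_average_hours(10.0, 0.70)
print(round(average_disabling, 2))  # 7.0
```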
To estimate the intensity of pain (or pleasure) animals experience, the Welfare Footprint Framework uses a systematic evidence synthesis approach that examines multiple indicators across disciplines. For each challenge, evidence from behavioral observations, neurophysiological measurements, pharmacological responses (like effects of pain-relieving drugs), and evolutionary considerations is collected and evaluated against defined intensity criteria. Rather than forcing a single intensity category, the framework assigns probability distributions across intensity categories (Annoying, Hurtful, Disabling, or Excruciating for pain; Satisfaction, Joy, Euphoria, or Bliss for pleasure) to reflect both assessment uncertainty and natural variation in how different individuals experience the same condition. This approach makes all evidence and assumptions explicit, highlighting knowledge gaps while providing transparent, scientifically-grounded estimates that can be updated as new evidence emerges.
In the Welfare Footprint Framework, translating evidence into probabilities for pain or pleasure intensity categories follows a structured evaluation process. First, each piece of evidence (behavioral, neurophysiological, pharmacological, or evolutionary) is systematically rated for its consistency with the criteria defining each intensity category (Annoying, Hurtful, Disabling, or Excruciating for pain; Satisfaction, Joy, Euphoria, or Bliss for pleasure). For example, evidence that an animal performs vigorous play behavior might be rated as consistent with higher pleasure intensities but inconsistent with mere Satisfaction. Each piece of evidence is marked as consistent (+), inconsistent (-), rejecting (R), or unclear (?) for each category. When evidence supports multiple categories equally, probabilities are divided between them (e.g., 40% each). When evidence more strongly supports certain categories but cannot definitively exclude others, higher probabilities are assigned to better-supported categories while maintaining lower probabilities for plausible alternatives. These probability distributions explicitly acknowledge both uncertainty in assessment and natural variation in affective experiences across individuals.
The intensity probabilities express how strongly the available evidence supports each level of affective intensity for a given segment of an experience. They represent both uncertainty in the evidence (how confidently we can identify the correct intensity) and variation among individuals (how differently animals may experience the same condition).
In practice, these two factors cannot be separated. Most welfare data come from population-level studies that already combine both—differences in individual sensitivity, coping ability, or healing rates are intertwined with the limits of available evidence, such as small sample sizes or incomplete indicators. When observations show that some animals react more intensely than others, we cannot tell whether this difference reflects real biological variation or measurement uncertainty.
Moreover, uncertainty about which intensity best fits a segment often arises from genuine variability within the population. If some animals experience more intense pain or pleasure than others under identical conditions, observers cannot assign a single fixed value without representing that variability probabilistically.
Separating these dimensions would require assumptions about the distribution of individual-level responses that are rarely supported by empirical data and could give a misleading sense of precision. For this reason, the Welfare Footprint Framework combines both into a single, transparent probability distribution—one that summarizes the expected proportion of individuals likely to experience each intensity given the available evidence.
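As a rough sketch of how such a distribution might be used, the Python below splits the duration of one segment of an experience across intensity categories in proportion to the assigned probabilities. Multiplying duration by probability is our assumed reading of "expected proportion of individuals likely to experience each intensity"; the function name, the 20-hour segment and the probabilities are hypothetical values chosen only for illustration.

```python
PAIN_CATEGORIES = ("No Pain", "Annoying", "Hurtful", "Disabling", "Excruciating")


def expected_hours_by_intensity(duration_hours, probabilities):
    """Split a segment's duration across intensities according to its probabilities.

    probabilities: mapping from category name to probability; categories
    that are left out are treated as having probability 0.
    """
    unknown = set(probabilities) - set(PAIN_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    if abs(sum(probabilities.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return {c: duration_hours * probabilities.get(c, 0.0) for c in PAIN_CATEGORIES}


# Hypothetical segment: 20 hours, with the evidence supporting Hurtful and
# Disabling equally (40% each) and Annoying as a less likely alternative (20%).
segment = expected_hours_by_intensity(
    20.0, {"Annoying": 0.2, "Hurtful": 0.4, "Disabling": 0.4}
)
for category, hours in segment.items():
    print(f"{category}: {hours:.1f} h")
```

Keeping the result as one value per category, rather than a single number, matches the framework's choice to report time at each intensity separately.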
In the unlikely event that no information was available, maximum uncertainty would be assumed, with the same probability (20%) attributed to each of the intensity categories. So far, however, we have not come across any scenarios in which information is completely unavailable.
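The maximum-uncertainty case can be written down directly. The sketch below assumes that the 20% figure corresponds to five categories, namely the four pain intensities plus ‘No Pain’; that reading, and the function name `maximum_uncertainty`, are assumptions made for illustration.

```python
def maximum_uncertainty(categories):
    """Assign the same probability to every category (no information available)."""
    if not categories:
        raise ValueError("at least one category is required")
    share = 1.0 / len(categories)
    return {category: share for category in categories}


# Assumed category list: the four pain intensities plus 'No Pain'.
prior = maximum_uncertainty(
    ["No Pain", "Annoying", "Hurtful", "Disabling", "Excruciating"]
)
print(prior["Disabling"])  # 0.2
```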
Validation (or falsification, in Karl Popper’s sense) of estimates of the intensity and duration of affective states using direct empirical evidence is not yet possible for any welfare assessment framework (constraints imposed on the validation of hypotheses empirically are also present in other scientific areas, such as astronomy). Because the assessment of affective states still relies on indirect evidence, the attribution of probabilities of intensity and duration of an experience is an (always open) exercise of presenting evidence, from as many relevant lines of enquiry and sources as possible, to justify the values proposed. In every welfare assessment framework, hypotheses on parameter values are provided and/or reviewed by specialists in the field. In the Welfare Footprint Framework, estimated parameter values are always treated as hypotheses, each associated with an uncertainty level (represented by subjective confidence intervals), which may be reduced (1) as more evidence becomes available and/or (2) greater consensus is reached towards proposed values by independent evaluators. In this way, proposed figures are continuously updated and strengthened.
The Welfare Footprint Framework acknowledges inherent uncertainty in its estimates by using probability distributions and duration ranges rather than single values. This approach transparently represents both scientific uncertainty and natural biological variation. All estimates are based on synthesis of existing research (behavioral, neurophysiological, and pharmacological evidence), with assumptions and evidence gaps explicitly documented. Estimates of Cumulative Pain and Cumulative Pleasure take full advantage of relevant existing evidence. In other words, they are as accurate as the integration of existing practical and theoretical knowledge of animal welfare allows. While perfect accuracy in measuring subjective experiences remains unattainable, the transparent approach adopted enables meaningful welfare comparisons while allowing for sensitivity analysis and updates as new evidence emerges.
A single, aggregate metric of welfare would be more convenient for practical purposes. However, evidence to support inferences on the mathematical equivalence between the intensity categories is not yet available, so any integration exercise would still require relying only on subjective assessments of such an equivalence. Additionally, the validity of balancing different levels of pain or pleasure in comparisons involving different individuals or populations is not clear. For example, estimates of Cumulative Pain in a population with individuals enduring intense pain could be similar to that of a population where no individual suffers intense pain, but a larger fraction of individuals experience milder pain for a sufficiently long time. We believe that, so far, analyzing estimates of the total time spent in each intensity represents a more accurate and transparent approach, grounded in explicit parameters with biological meaning. It also enables the pursuit of different priorities, such as relieving the worst kinds of preventable suffering (those associated with the Disabling and Excruciating categories of pain). Moreover, presentation of the disaggregated estimates enables their integration by audiences with different views on the weight that should be given to each level of intensity (e.g. here and here).
A time-based equivalence between levels of pain intensity, based on scientific evidence, is not yet available. So far, the answer to this question is more of a personal, moral or philosophical nature. Because the focus of the Welfare Footprint Institute is the production of scientific and evidence-based knowledge, we have opted for keeping the results of the analytical process untransformed, and to leave the subjective weighting of intensity categories to the individuals and institutions that will make use of them, and which may hold different views on what those weights should be. Alternative approaches are also possible. For example, users of the metrics can determine how much worse a given pain level would need to be for two situations to be considered equivalent. Say that a typical bird kept in system ‘A’ experiences about 1,000 hours of Hurtful pain, whereas another living in a system ‘B’ experiences about 5 hours of Disabling pain. For the systems to be considered equivalent in terms of suffering, the Disabling intensity would need to be 200 times worse than the Hurtful intensity during all the time it is felt (1,000/5). If one finds this figure (200) too high, then system ‘A’ is judged to be worse than ‘B’. Otherwise, system ‘B’ is judged to be worse.
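The break-even reasoning in this example can be sketched as follows. The function names and the weights of 50 and 500 are hypothetical, chosen only to show both outcomes; the framework itself does not prescribe any weight.

```python
def break_even_ratio(hours_milder: float, hours_more_intense: float) -> float:
    """How many times worse the more intense pain must be for the two burdens to match."""
    if hours_more_intense <= 0:
        raise ValueError("hours_more_intense must be positive")
    return hours_milder / hours_more_intense


def worse_system(hours_hurtful_a: float, hours_disabling_b: float,
                 disabling_weight: float) -> str:
    """Compare systems A and B given a user-chosen weight of Disabling relative to Hurtful."""
    burden_a = hours_hurtful_a                       # in Hurtful-equivalent hours
    burden_b = hours_disabling_b * disabling_weight  # converted with the user's weight
    if burden_a > burden_b:
        return "A"
    if burden_b > burden_a:
        return "B"
    return "equivalent"


print(break_even_ratio(1000, 5))                     # 200.0
print(worse_system(1000, 5, disabling_weight=50))    # A
print(worse_system(1000, 5, disabling_weight=500))   # B
```

A user who believes Disabling pain is less than 200 times worse than Hurtful pain ends up judging system A as worse; a user who believes it is more than 200 times worse ends up judging system B as worse.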
Animals often experience multiple conditions simultaneously, such as physical pain from a toothache alongside psychological frustration. Because monitoring the exact concurrence of every condition across a population is highly complex, the WFF evaluates each affective experience independently as a baseline.
We create separate Pain-Tracks or Pleasure-Tracks for each condition, and the resulting Cumulative Affect metrics are initially treated as additive. However, the WFF acknowledges the ‘attention effect’: the biological reality that an intensely painful state can block an animal’s awareness of weaker, concurrent states (a thorough discussion is available here). Applying this attention effect is an available, advanced refinement of the analysis that can be utilized when higher-resolution data on comorbidity patterns is available. To ensure complete transparency, all WFF results must explicitly state whether the final calculations are ‘strictly additive’ or ‘adjusted for concurrent attention effects’.
Estimates of Cumulative Pain and Pleasure can exceed an individual’s lifespan because the framework adds up time spent in different affective states that can occur simultaneously. For example, an animal experiencing both chronic lameness and respiratory disease (both causing Hurtful pain) for its entire life would have cumulative suffering that is twice its chronological lifespan. This approach reflects the total burden of welfare challenges rather than elapsed time. The framework currently uses a simplified additive model where one hour of exposure to two simultaneous Hurtful pain sources counts as two hours of Hurtful pain. Future refinements may incorporate attention allocation to different pain sources, which would reduce these estimates, but such changes await better evidence on attention dynamics during overlapping experiences. A full discussion of the share of attention that may be dedicated to pain, in each intensity category, is available here.
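The simplified additive model can be illustrated with a short Python sketch. The data layout (one dictionary of hours per intensity for each condition), the function name and the 1,000-hour lifespan are assumptions made for this example only.

```python
from collections import defaultdict


def cumulative_pain(tracks):
    """Strictly additive model: sum hours per intensity across all conditions."""
    totals = defaultdict(float)
    for track in tracks:
        for intensity, hours in track.items():
            totals[intensity] += hours
    return dict(totals)


lifespan_hours = 1000.0  # hypothetical lifespan

# Two conditions, each causing Hurtful pain for the entire life.
lameness = {"Hurtful": lifespan_hours}
respiratory_disease = {"Hurtful": lifespan_hours}

totals = cumulative_pain([lameness, respiratory_disease])
print(totals["Hurtful"] / lifespan_hours)  # 2.0
```

Because overlapping hours are counted once per condition, the total for an intensity can exceed the lifespan, as in the lameness and respiratory disease example.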
At any time, the subjective well-being of individuals depends on an integration between positive and negative affective states, namely a yet undetermined subjective calculation of pleasure relative to suffering that may be positive or negative. Integration of positive and negative states over a longer time frame is also at the core of the notion of a “life worth living” or “good life”, or one where the surplus of pleasure over suffering is positive. Although such an integration exercise would be more truthful to the range of experiences an animal endures, it is still hindered by the challenges of establishing a mathematical equivalence between positively and negatively valenced states. For example, the extent to which positive and negative experiences are morally symmetrical, or pleasure can compensate for time in pain, is far from consensual (see previous question). Even among humans, who can verbalize their preferences, whether time in pain of a certain intensity can be compensated by time of healthy and joyful life will often come down to a personal and moral choice. What a person (or group) may consider an acceptable trade-off, others might condemn. Additionally, compensation between positive and negative experiences is particularly difficult in cases of extreme suffering: would there be any magnitude of pleasure that would compensate for torture-like experiences? Another drawback of the integration of positive and negative effects is its implications for the analysis at the population level. If welfare is measured as a surplus of pleasure over suffering, would severe suffering endured by a few individuals be compensated by barely positive welfare experienced by a sufficiently large number of individuals?
The framework can be applied to any sentient species, including humans. Because the metric used is based on a phenomenon relevant to all sentient beings (time in pain of different intensities), it can be applied to virtually any sentient species to assess any context of relevance.
The Welfare Footprint Framework can measure wild animal welfare by quantifying both negative experiences (Cumulative Pain) and positive experiences (Cumulative Pleasure) that wild animals experience. This approach helps identify which natural circumstances cause the most suffering or enable the most positive experiences. By analyzing welfare challenges systematically across different species and habitats, researchers can better understand suffering hotspots in nature and raise awareness about wild animal suffering.
Do three hours of Disabling pain in mammals represent the same burden of pain or welfare loss as three hours of Disabling pain in fish, or shrimp? The answer to this question is not simple.
The extent to which different species differ in their capacity for affective experiences is one of the biggest questions of present and past times. Yet, it is one that must be addressed if Welfare Footprints are to be compared across species. Currently, the Welfare Footprint Framework is intentionally agnostic about these differences: core welfare estimates are produced without interspecific corrections. Any assumptions about differences in affective capacity must be applied explicitly and transparently as optional post-quantification adjustments (Module Ψ) when particular comparisons require them. The Welfare Footprint Institute team is also actively working on this line of enquiry to investigate what might be at least workable solutions to this question.
When transforming Cumulative Pain or Pleasure estimates into percentages of lifetime, several challenges emerge. While this conversion can be applied, we focus on absolute time in affective states because: (1) it’s unclear whether identical percentages across different lifespans (e.g., 10% of a 100-hour life vs. 10% of a 10,000-hour life) represent equivalent welfare impacts; (2) such conversion requires comprehensive accounting of all affective experiences throughout an entire lifetime; and (3) when multiple experiences occur simultaneously, cumulative estimates can mathematically exceed total lifespan. One alternative approach is to express welfare impacts as time in Pain or Pleasure during a typical day, which provides standardization while avoiding some of these complications.
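The first point and the per-day alternative can be made concrete with a small sketch. The function names are hypothetical, and the two lifespans are the ones used in the example above.

```python
def percent_of_lifetime(cumulative_hours: float, lifespan_hours: float) -> float:
    """Cumulative time in an affective state as a percentage of lifespan."""
    return 100.0 * cumulative_hours / lifespan_hours


def hours_per_typical_day(cumulative_hours: float, lifespan_hours: float) -> float:
    """Cumulative time in an affective state averaged over each 24-hour day of life."""
    return cumulative_hours * 24.0 / lifespan_hours


# The same 10% corresponds to very different absolute amounts of time.
short_life = {"lifespan": 100.0, "pain_hours": 10.0}
long_life = {"lifespan": 10_000.0, "pain_hours": 1_000.0}

for life in (short_life, long_life):
    pct = percent_of_lifetime(life["pain_hours"], life["lifespan"])
    per_day = hours_per_typical_day(life["pain_hours"], life["lifespan"])
    print(f"{life['pain_hours']:.0f} h total = {pct:.0f}% of life = {per_day:.1f} h per day")
```

Both lives show the same percentage and the same hours per day, yet the absolute burden differs a hundredfold, which is why absolute time is reported alongside any standardized figure.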
Despite these limitations, the WFF’s structured approach helps by breaking down assessments into specific parts, allowing the use of various evidence types (including logical reasoning when data is absent). It highlights knowledge gaps to guide future research and makes all assumptions transparent. Advances in AI and automated monitoring technologies are also making it easier to gather and analyze data, reducing some barriers over time.
No. The Cumulative Pain framework does not assign a negative value to death, only to negative affective experiences (anything that ‘feels bad’).
A meaningful and universal metric of positive and negative welfare embedded in different animal-sourced products, practices and production contexts has different practical uses for different audiences:
The Welfare Footprint Framework was not designed to promote any one political or ideological outcome. Its role is to provide governments, regulators, and other decision-makers with a clear, quantitative language for animal welfare—one that allows welfare impacts to be compared, traded off, and integrated into policy just as environmental and health impacts already are.
At the policy level, that means three things. First, setting baselines and thresholds. Regulators can establish minimum welfare standards expressed in time-based units (for example, “no more than X hours of disabling pain per bird”). That makes standards transparent, enforceable, and adaptable across species and production systems.
Second, comparing systems and interventions. Because the Welfare Footprint expresses welfare in a common currency (time in affective states, i.e. pain or pleasure), it allows direct comparisons—between different production systems (for example, cage-free versus caged), different interventions, or even between welfare, environmental, and economic outcomes. This enables more rational policy debates, where trade-offs are explicit rather than hidden behind abstract scores.
Third, guiding long-term choices. By embedding welfare impacts into the same space as cost and carbon, policymakers can assess whether incremental reforms suffice or whether broader structural changes, including reductions in animal product reliance, deliver the greatest welfare gains per unit of cost or environmental burden. The Welfare Footprint itself is neutral: it does not prescribe whether society should improve current systems or transition away from them. What it does is make the consequences of each path transparent and comparable. So in short: the Welfare Footprint is a decision-support tool. It gives policymakers the ability to ground welfare debates in measurable evidence, set clear baselines, evaluate reforms, and understand the ethical cost of different food system trajectories. Whether this leads to higher-welfare production, stricter regulation, or broader shifts in how we use animals is ultimately a societal and political choice—but one that can now be made with clarity rather than guesswork.
What we hope is that people will use this information as a transparent basis for choice. When you tell someone that an extra dollar per kilo prevents 15–100 hours of intense pain, you are translating something abstract into a metric anyone can grasp and weigh against other considerations. For consumers, that might mean choosing products that demonstrably reduce negative experiences for animals. For companies, it provides a way to evaluate the cost-effectiveness of welfare standards alongside other sustainability goals. And for the broader public, it reframes animal welfare from a matter of subjective opinion into something that can be measured, compared, and improved. The Welfare Footprint doesn’t prescribe what people should value, but it gives them a clear, quantitative picture of what different choices mean for animals’ lived experiences. That clarity is the starting point for more informed decisions — whether at the level of an individual shopper, a corporate standard, or public policy.
In the Welfare Footprint Framework, these terms are closely related but represent different levels of analysis:
In short, the framework breaks down a single Affective Experience into its temporal segments to measure the total time an animal spends in different Affective States of varying intensities.
