FREQUENTLY ASKED QUESTIONS

What is your vision?

Our vision is of a world where efforts to prevent and alleviate animal suffering are as efficient and cost-effective as possible, and where decisions across various domains (economy, environment, investment, purchasing) are also informed by their impact on the welfare of animals. Achieving this vision depends critically on properly measuring and mapping the magnitude of animal suffering across conditions, systems and species. Ultimately, we would like to help transform the understanding of animal and human suffering, shifting it from an abstract concept to a scientifically measurable and extensively mapped phenomenon across all sentient beings. In doing so, we seek to help empower society (legislators, environmentalists, economists, advocates, funders, investors, consumers) to make decisions and take actions more closely aligned with ethical values that aim to minimize harm.

What are the main goals of the Welfare Footprint Institute?

To inform efforts and decisions for the prevention and alleviation of animal suffering. To this end, our research is aimed at providing consumers, funders, researchers, producers, investors, advocates and other decision-makers with a comparative, meaningful and easily interpretable metric (1) of the magnitude of negative and positive welfare embedded in animal-sourced foods produced from different species and under different conditions and (2) of the welfare impact of interventions, standards, reforms, policies and purchasing choices. As a by-product of our research efforts, we also aim to promote more targeted research and funding in the animal welfare sciences.

What is the essence of the Welfare Footprint Framework?

The assessment of the time spent in affective states (negative and positive, physical and psychological) of different intensities in any scenario or species. The use of a universal metric with biological meaning (time in pain and pleasure) enables comparing and combining the cumulative load of affective experiences emerging from different environments, industry practices, production systems or interventions, and provides the basis for the development of welfare footprints of animal-sourced products.

What are the key elements of the method?

  • Use of a universal (comparable) metric of welfare with real-world meaning (time in positive and negative affective states)
  • Use of a time-based measure of welfare (as experiences unfold over time)
  • Assessment of cumulative load of negative and positive affective experiences over the period of interest
  • Transparency in each analytical stage and provision of uncertainty associated with estimates
  • Use of multiple lines of evidence to reduce uncertainty around estimates
  • Continuous empirical updating of estimates

What are the main analytical steps of the Welfare Footprint Framework?

The framework evaluates animal welfare through seven interconnected analytical modules:

    1. Zootechnical Description (Module I): Mapping the specific physical, social, environmental, genetic, and management Circumstances that animals encounter, the Species-Specific Needs that determine how those Circumstances affect them, and the Productivity metrics of the system.
    2. Veterinary Inventory (Module II): Identifying the Biological Consequences, physical and perceptual, that result from the interaction between Circumstances and Species-Specific Needs.
    3. Affective Quantification (Module III): Evaluating the consciously felt Affective Experiences triggered by those Biological Consequences, assessing their intensity and duration using Pain-Tracks and Pleasure-Tracks.
    4. Epidemiological Review (Module IV): Estimating the population-level burden of each Affective Experience by combining individual-level estimates with prevalence and occurrence data across Plausible Scenarios.
    5. Econometric Calculation (Module V): Aggregating estimates across Life-Fates and standardizing by Productivity metrics to produce a Welfare Footprint per unit of output.
    6. Welfare Footprint Expression and Notation (Module VI): Presenting all results with explicit Analytical Boundaries, Scope Qualifiers, and declaration of whether estimates are Baseline or Attention-Adjusted.
    7. Interspecific Scaling (Module Ψ, optional): Applying transparent adjustments for potential differences in Hedonic Capacity and subjective time perception when comparing welfare burdens across species.

A flowchart with the main analytical steps is provided on our methods page.
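As a rough illustration of the Econometric Calculation step (Module V), the sketch below aggregates hours per intensity across Life-Fates and then divides by a Productivity figure. It is a minimal sketch under stated assumptions, not the Institute's implementation: weighting each Life-Fate by the share of animals that follow it, the function name `footprint_per_unit`, the fate labels, and every number are hypothetical.

```python
from typing import Dict

PAIN_LEVELS = ("Annoying", "Hurtful", "Disabling", "Excruciating")


def footprint_per_unit(fates: Dict[str, dict], output_per_animal: float) -> Dict[str, float]:
    """Average hours per intensity across life-fates, divided by output per animal.

    Assumption: each life-fate is weighted by the share of animals following it.
    """
    total_share = sum(fate["share"] for fate in fates.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("life-fate shares must sum to 1")
    if output_per_animal <= 0:
        raise ValueError("output per animal must be positive")
    per_animal = {
        level: sum(fate["share"] * fate["hours"].get(level, 0.0) for fate in fates.values())
        for level in PAIN_LEVELS
    }
    return {level: hours / output_per_animal for level, hours in per_animal.items()}


# Hypothetical life-fates and hours, invented for illustration only.
fates = {
    "reaches_slaughter": {"share": 0.95, "hours": {"Hurtful": 100.0, "Disabling": 10.0}},
    "dies_on_farm": {"share": 0.05, "hours": {"Hurtful": 60.0, "Disabling": 50.0}},
}
# Hypothetical productivity: 2 units of output (e.g. kg) per animal placed.
footprint = footprint_per_unit(fates, output_per_animal=2.0)
```

With these invented inputs the result is roughly 49 hours of Hurtful pain and 6 hours of Disabling pain per unit of output. The intensities stay separate throughout, in line with the framework's choice not to collapse them into one score.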

Can the affective experiences of non-verbal beings be actually measured?

There is currently no method to directly access or measure subjective affective states or feelings. Instead, the Welfare Footprint Framework provides a transparent, evidence-based, and auditable method to make explicit inferences about an animal’s lived experience. It does this by synthesizing indirect empirical evidence—spanning behavioral disruption, physiological responses, pharmacological effects (like responses to painkillers), and evolutionary reasoning. While perfect accuracy in measuring subjective feelings remains unattainable, this framework ensures that all inferences are grounded in visible evidence, with all assumptions and uncertainties explicitly documented and open to revision.

What do you mean by ‘pain’ and 'pleasure'?

For measurement purposes, the terms pain and pleasure are used as shorthand for any felt ‘negative (unpleasant) affective state’ and ‘positive (pleasant) affective state’, respectively. More simply, they refer to anything that ‘feels bad’ or ‘feels good’. States with a somatic origin are referred to as ‘physical pain’ (e.g., aches, hunger, injuries, thermal stress) and ‘physical pleasure’ (e.g., the taste of a preferred food, the pleasantness of a sexual activity), while those related to the primary emotional systems are referred to as ‘psychological pain’ (e.g., fear, frustration, boredom) or ‘psychological pleasure’ (e.g., the pleasantness associated with positive social interactions, play bouts).

How does the framework measure positive welfare or pleasure?

The framework measures positive welfare using the Pleasure-Track notation system, calculating Cumulative Pleasure as the total time spent in positive affective states of different intensities. Just as Pain intensities are mostly defined operationally by the degree to which they disrupt an animal, Pleasure intensities (Satisfaction, Joy, Euphoria, Bliss) are mostly defined by the degree of the animal’s engagement with the positive experience. Importantly, the framework does not assume that pain and pleasure sit on a simple, common scale; they are driven by different cognitive processes and contribute differently to overall well-being. (https://welfarefootprint.org/2024/03/12/positive-animal-welfare/)

Why four categories of intensity? It seems some categories can encompass a wide range of intensities.

The framework evaluates Affective Experiences across four discrete categories of intensity. Periods where the animal is in a neutral state are recorded as ‘No Pain’ or ‘No Pleasure’, representing the absence of these affective states rather than a level of intensity. The four active categories (Annoying, Hurtful, Disabling, and Excruciating for Pain; Satisfaction, Joy, Euphoria, and Bliss for Pleasure) were not devised by arbitrarily splitting a numerical continuum. Instead, they are operationally defined based on the degree of disruption or engagement they cause to the animal’s attention and normal functioning. This approach is rooted in the evolutionary expectation that greater threats or rewards require stronger signals to capture the animal’s attention and prioritize adaptive responses over competing demands. While a specific category might encompass a range of subtle subjective variations, these four active levels represent an optimal balance between precision and clarity. By anchoring each category to observable, empirically testable criteria—such as behavioral disruption, cognitive impairment, and pharmacological responses—the framework minimizes the risk of misclassification and avoids the ceiling effects and ambiguities common in continuous numerical scoring systems.

Why not use terms such as 'mild', 'moderate' and 'severe' pain intensity?

We sought to reduce the ambiguity of widely used terminologies (mild, moderate, severe) by using instead terms which, in addition to being semantically more specific, are qualified by specific references and criteria. Use of these criteria enables establishing more specific thresholds in the attribution of pain intensity that reduce the likelihood of misclassification, facilitating comparisons across conditions and individuals. For example, the pain of a fracture at the time of rupture matches the definition of Disabling Pain, not because this is the verbal description preferred by patients, but because it matches the definition of this intensity category: it captures nearly all the individual’s attention, sufferers are unable to perform other activities, and strong analgesia is commonly required.

A further problem with terms like “severe” is that they can conflate intensity with other dimensions of welfare, particularly duration and prevalence. A condition might be called “severe” because it is extremely intense, but equally because it lasts a very long time or affects a large proportion of animals, even if the intensity itself is moderate. By defining categories that refer exclusively to the subjective magnitude of the experience at a given point in time, the WFF keeps intensity analytically separate from duration and prevalence, which are estimated independently. This separation reduces a significant source of variability and inconsistency in welfare classification, and ensures that each dimension of welfare burden can be examined and compared on its own terms.

Why not use previously existing numerical scales of pain intensity (where intensity ranges from e.g. 0 to 10)?

Subjectivity in the categorization of pain intensity is a problem that permeates all intensity scales, both numerical and nominal. The advantage of nominal scales lies in removing one layer of complexity in the attribution of intensity to the pain experience: ‘mild’, with all its problems, is self-explanatory to most individuals, whereas ‘5’ must be explained and externally referenced. Use of discrete categories also avoids the problem of forcing numerical ratings that may not necessarily be linear into a linear scale, and prevents the ceiling effects that are frequently observed with numerical ratings of pain (where ratings close to the limits of the scale are constrained, increasing in intensity by smaller amounts than the intensity effectively experienced).

How do you estimate the intensity of the pain (or pleasure) of an affective experience?

To estimate the intensity of pain (or pleasure) animals experience, the Welfare Footprint Framework uses a systematic evidence synthesis approach that examines multiple indicators across disciplines. For each challenge, evidence from behavioral observations, neurophysiological measurements, pharmacological responses (like effects of pain-relieving drugs), and evolutionary considerations is collected and evaluated against defined intensity criteria. Rather than forcing a single intensity category, the framework assigns probability distributions across intensity categories (Annoying, Hurtful, Disabling, or Excruciating for pain; Satisfaction, Joy, Euphoria, or Bliss for pleasure) to reflect both assessment uncertainty and natural variation in how different individuals experience the same condition. This approach makes all evidence and assumptions explicit, highlighting knowledge gaps while providing transparent, scientifically-grounded estimates that can be updated as new evidence emerges.

In a Pain or Pleasure track, what do the intensity probabilities mean? Uncertainty about intensity, or variability in how intensity is felt?

The intensity probabilities express how strongly the available evidence supports each level of affective intensity for a given segment of an experience. They represent both uncertainty in the evidence (how confidently we can identify the correct intensity) and variation among individuals (how differently animals may experience the same condition).

In practice, these two factors cannot be separated. Most welfare data come from population-level studies that already combine both—differences in individual sensitivity, coping ability, or healing rates are intertwined with the limits of available evidence, such as small sample sizes or incomplete indicators. When observations show that some animals react more intensely than others, we cannot tell whether this difference reflects real biological variation or measurement uncertainty.

Moreover, uncertainty about which intensity best fits a segment often arises from genuine variability within the population. If some animals experience more intense pain or pleasure than others under identical conditions, observers cannot assign a single fixed value without representing that variability probabilistically.

Separating these dimensions would require assumptions about the distribution of individual-level responses that are rarely supported by empirical data and could give a misleading sense of precision. For this reason, the Welfare Footprint Framework combines both into a single, transparent probability distribution—one that summarizes the expected proportion of individuals likely to experience each intensity given the available evidence.
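One way to picture how such probability distributions feed a time-based metric is sketched below. It assumes that each segment of a Pain-Track has a duration and a probability per intensity category, and that expected time in a category is duration multiplied by probability. That reading, the function name `expected_hours_by_intensity`, and all values are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Intensity categories named in the text; "No Pain" is the absence of pain.
PAIN_LEVELS = ("Annoying", "Hurtful", "Disabling", "Excruciating")

# A segment is (duration in hours, probability assigned to each category).
Segment = Tuple[float, Dict[str, float]]


def expected_hours_by_intensity(track: List[Segment]) -> Dict[str, float]:
    """Sum duration x probability over all segments, per intensity category."""
    totals = {level: 0.0 for level in PAIN_LEVELS}
    for duration_h, probs in track:
        if duration_h < 0:
            raise ValueError("segment duration must be non-negative")
        if abs(sum(probs.values()) - 1.0) > 1e-9:
            raise ValueError("segment probabilities must sum to 1")
        for level in PAIN_LEVELS:
            # Probability placed on "No Pain" contributes no time in pain.
            totals[level] += duration_h * probs.get(level, 0.0)
    return totals


# Hypothetical two-segment track (values invented for illustration).
track = [
    (2.0, {"Disabling": 0.5, "Hurtful": 0.5}),
    (48.0, {"Hurtful": 0.25, "Annoying": 0.5, "No Pain": 0.25}),
]
cumulative = expected_hours_by_intensity(track)
```

For this invented track the totals are 24 hours Annoying, 13 hours Hurtful, 1 hour Disabling and 0 hours Excruciating. The single distribution per segment carries both evidential uncertainty and individual variation, as described above, so the sketch makes no attempt to separate them.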

What is the difference between welfare indicators and welfare metrics?

It is essential to distinguish between these two concepts. Welfare indicators are specific empirical measurements of traits or states—such as behavioral changes, physiological responses, vocalizations, or neurological activity—that correlate with an animal’s affective state. They are pieces of evidence that are typically specific to a certain species and context. Welfare metrics, on the other hand, are the standardized, universal quantitative constructs derived from those indicators. Metrics, such as Cumulative Pain and Cumulative Pleasure, integrate the indicator data to evaluate and compare the overall magnitude of welfare loss or gain across different situations and species (https://welfarefootprint.org/2023/05/31/welfare-metrics-vs-welfare-indicators/).

Where welfare indicators of intensity are unavailable, how to estimate the intensity of a welfare experience?

In the unlikely event that no information was available, maximum uncertainty would be assumed, with the same probability (20%) attributed to each of the intensity categories. So far, however, we have not come across any scenarios in which information is completely unavailable.

How do you go from evidence to a probability that pain or pleasure belongs to an intensity category?

In the Welfare Footprint Framework, translating evidence into probabilities for pain or pleasure intensity categories follows a structured evaluation process. First, each piece of evidence (behavioral, neurophysiological, pharmacological, or evolutionary) is systematically rated for its consistency with the criteria defining each intensity category (Annoying, Hurtful, Disabling, or Excruciating for Pain; Satisfaction, Joy, Euphoria, or Bliss for Pleasure). For example, evidence that an animal performs vigorous play behavior might be rated as consistent with higher pleasure intensities but inconsistent with mere Satisfaction. Each piece of evidence is marked as consistent (+), inconsistent (-), rejecting (R), or unclear (?) for each category. When evidence supports multiple categories equally, probabilities are divided between them (e.g., 40% each). When evidence more strongly supports certain categories but cannot definitively exclude others, higher probabilities are assigned to better-supported categories while maintaining lower probabilities for plausible alternatives. These probability distributions explicitly acknowledge both uncertainty in assessment and natural variation in affective experiences across individuals.

Why don’t you aggregate all categories of intensity into a single one?

The framework deliberately avoids collapsing welfare into a single aggregate score because there is currently no empirical basis to establish a definitive mathematical equivalence between different intensities of pain, nor a way to seamlessly balance positive experiences against negative ones. For example, deciding if a brief moment of extreme agony is worse than months of moderate, chronic aching ultimately requires subjective value judgments. By keeping the estimates disaggregated (reporting exact time spent in each distinct category, such as Hurtful vs. Disabling pain), the data remains transparent and empirically grounded. This allows different stakeholders, policymakers, and advocates to apply their own moral weightings and priorities to the evidence without the framework imposing hidden subjective trade-offs.

So how to decide which is worse: less intense pain for a longer time or more intense pain for a shorter time?

A time-based equivalence between levels of pain intensity, based on scientific evidence, is not yet available. So far, the answer to this question is more of a personal, moral or philosophical nature. Because the focus of the Welfare Footprint Institute is the production of scientific and evidence-based knowledge, we have opted to keep the results of the analytical process untransformed and to leave the subjective weighting of intensity categories to the individuals and institutions that will make use of them, and which may hold different views on what those weights should be. Alternative approaches are also possible. For example, users of the metrics can determine how much worse a given pain level would need to be for two situations to be considered equivalent. Say that a typical bird kept in system ‘A’ experiences about 1,000 hours of Hurtful pain, whereas another living in system ‘B’ experiences about 5 hours of Disabling pain. For the systems to be considered equivalent in terms of suffering, the Disabling intensity would need to be 200 times worse than the Hurtful intensity during all the time it is felt (1,000/5). If one finds this figure (200) too high, then system ‘A’ is judged to be worse than ‘B’. Otherwise, system ‘B’ is judged to be worse.
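The break-even reasoning in the bird example can be written out as a short calculation. This is a minimal sketch: the function names and the `user_weight` parameter (how many times worse the user judges an hour of Disabling pain than an hour of Hurtful pain) are hypothetical, and the weight is a value judgment supplied by the user, not an output of the framework.

```python
def break_even_ratio(hours_hurtful_a: float, hours_disabling_b: float) -> float:
    """How many times worse Disabling must be than Hurtful for A and B to tie."""
    if hours_hurtful_a < 0 or hours_disabling_b <= 0:
        raise ValueError("hours must be non-negative, and positive for system B")
    return hours_hurtful_a / hours_disabling_b


def worse_system(hours_hurtful_a: float, hours_disabling_b: float, user_weight: float) -> str:
    """Return which system a user holding the given weight would judge worse."""
    threshold = break_even_ratio(hours_hurtful_a, hours_disabling_b)
    if user_weight < threshold:
        return "A"  # the break-even figure is too high for this user
    if user_weight > threshold:
        return "B"
    return "equivalent"


ratio = break_even_ratio(1000, 5)             # figures from the example in the text
low_weight_view = worse_system(1000, 5, 50)   # user thinks Disabling is 50x worse
high_weight_view = worse_system(1000, 5, 500) # user thinks Disabling is 500x worse
```

Here `ratio` is 200, a user who holds a weight of 50 judges system A worse, and a user who holds a weight of 500 judges system B worse.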

How would positive experiences compensate for negative ones?

At any time, the subjective well-being of individuals depends on an integration of positive and negative affective states, namely a yet undetermined subjective calculation of pleasure relative to suffering that may be positive or negative. Integration of positive and negative states over a longer time frame is also at the core of the notion of a “life worth living” or “good life”, that is, one where the surplus of pleasure over suffering is positive. Although such an integration exercise would be more faithful to the range of experiences an animal endures, it is still hindered by the difficulty of establishing a mathematical equivalence between positively and negatively valenced states. For example, the extent to which positive and negative experiences are morally symmetrical, or to which pleasure can compensate for time in pain, is far from settled (see previous question). Even among humans, who can verbalize their preferences, whether time in pain of a certain intensity can be compensated by time of healthy and joyful life will often come down to a personal and moral choice. What one person (or group) may consider an acceptable trade-off, others might condemn. Additionally, compensation between positive and negative experiences is particularly difficult in cases of extreme suffering: would there be any magnitude of pleasure that would compensate for torture-like experiences? Another drawback of integrating positive and negative effects lies in the implications for analysis at the population level. If welfare is measured as a surplus of pleasure over suffering, would severe suffering endured by a few individuals be compensated by barely positive welfare experienced by a sufficiently large number of individuals?

Why does the WFF separate 'Perceptual Biological Consequences' from 'Affective Experiences'?

The WFF deliberately separates an animal’s cognitive perception of an event from the resulting emotion to ensure accurate welfare tracking. For example, when a calf is separated from its mother, the physical act of moving the calf to a different pen is merely an external Circumstance. The negative affective experience does not come from the physical separation itself, but from the calf’s perception that the mother is no longer there. In fact, if the calf does not realize the mother is gone, the negative affective state is not triggered.
Therefore, the WFF maps this as a three-step causal chain: the physical separation (Circumstance) leads to the cognitive realization of maternal absence (perceptual Biological Consequence). This perception then directly engages the neurological mechanisms in the brain that generate the consciously felt psychological pain, such as distress or sadness (Affective Experience).

What do you mean by 'average population member'?

Although pain inherently concerns individuals, we operationally accept that the collective welfare of the members of a population can also be determined. Measuring cumulative pain at the population level is also necessary to account for the heterogeneity in the exposure of population members to different challenges. For example, while lameness is experienced by a large fraction of broiler chickens, fatal cases of ascites are only experienced by a few. Therefore, measurement efforts must consider the prevalence of each welfare challenge, so that pain is determined for the average member of the population (which may not necessarily correspond to any real organism). At the population level, the time spent at each level of pain intensity by the ‘average population member’ as a result of each challenge is determined by multiplying the time that affected individuals spend at that intensity by the prevalence of the challenge. For example, if a condition causes 10 hours of Disabling pain and 70% of the population is affected, then the average member of this population could be said to experience 7 hours of Disabling pain due to the condition. Measurements at the population level enable comparing the impact of different practices and conditions across demographics, geographies, and time.
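The prevalence weighting in the example above amounts to a single multiplication, sketched here. The function name `average_member_hours` is hypothetical; the figures are the ones used in the text.

```python
def average_member_hours(hours_if_affected: float, prevalence: float) -> float:
    """Hours attributed to the average population member for one condition."""
    if hours_if_affected < 0:
        raise ValueError("hours must be non-negative")
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be a fraction between 0 and 1")
    return hours_if_affected * prevalence


# Figures from the text: 10 hours of Disabling pain, 70% of the population affected.
disabling_hours = average_member_hours(10.0, 0.70)  # about 7 hours
```

The same multiplication would be repeated for each intensity category and each condition, keeping the categories separate.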

What do you do when prevalence data are not available?

Direct epidemiological data from commercial populations are unavailable for most welfare conditions in most farmed species. This is not a limitation specific to the WFF; it is the normal situation across animal welfare science. When direct data are absent, prevalence is estimated by drawing on biological plausibility, knowledge of the mechanisms giving rise to the condition, empirical values from related conditions or species, and operational parameters from certification schemes. These are not arbitrary assumptions; they are forms of knowledge that constrain what counts as a plausible scenario.

Estimates constructed this way are expressed as ranges reflecting this uncertainty and the likely natural variability in the parameter being estimated, and the basis for each range is explicitly stated. Critically, the framework then tests whether welfare conclusions hold across the full range of those estimates, from Best Practice to Failure scenarios, using sensitivity analysis. When conclusions are robust across the full range, the missing data do not affect the decision at hand. When they are not, the analysis identifies precisely what data would need to be collected, at what resolution, and why. This is always more informative than attributing a qualitative label to a condition, like “common” or “uncommon”, which rests on the same uncertain evidence base but makes its assumptions invisible.
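A simple form of such a sensitivity check is sketched below. It compares two hypothetical systems on one intensity category and asks whether their ranking is the same at every combination of the endpoints of the prevalence ranges. Checking only the endpoints is enough here because average-member hours grow linearly with prevalence. The function name `ranking_is_robust` and all numbers are invented for illustration.

```python
from itertools import product
from typing import Tuple

PrevalenceRange = Tuple[float, float]  # (low, high), as fractions of the population


def ranking_is_robust(
    hours_a: float,
    prevalence_a: PrevalenceRange,
    hours_b: float,
    prevalence_b: PrevalenceRange,
) -> bool:
    """True if 'A has more average-member hours than B' has the same answer
    at every combination of range endpoints."""
    outcomes = [
        hours_a * pa > hours_b * pb
        for pa, pb in product(prevalence_a, prevalence_b)
    ]
    return all(outcomes) or not any(outcomes)


# Hypothetical case 1: A ranges over 20 to 60 hours, B over 0.5 to 5 hours.
robust_case = ranking_is_robust(100.0, (0.2, 0.6), 10.0, (0.05, 0.5))

# Hypothetical case 2: A ranges over 20 to 60 hours, B over 5 to 50 hours.
fragile_case = ranking_is_robust(100.0, (0.2, 0.6), 100.0, (0.05, 0.5))
```

In the first case the ranking holds everywhere, so the uncertainty in prevalence does not matter for the comparison. In the second case the ranges overlap, which signals that narrowing one of the prevalence estimates would be needed before drawing a conclusion.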

How does the WFF handle animals experiencing multiple welfare issues at the same time?

Animals often experience multiple conditions simultaneously, such as physical pain from a toothache alongside psychological frustration. Because monitoring the exact concurrence of every condition across a population is highly complex, the WFF evaluates each affective experience independently as a baseline. We create separate Pain-Tracks or Pleasure-Tracks for each condition, and the resulting Cumulative Affect metrics are initially treated as additive. However, the WFF acknowledges the ‘attention effect’, the biological reality that an intensely painful state can block an animal’s awareness of weaker, concurrent states (a thorough discussion is available here). Applying this attention effect is an available, advanced refinement of the analysis that can be used when higher-resolution data on comorbidity patterns are available. To ensure complete transparency, all WFF results must explicitly state whether the final calculations are ‘strictly additive’ or ‘adjusted for concurrent attention effects’.

How does the framework handle the complexity of aggregating many welfare conditions that may interact with each other?

When a welfare assessment covers multiple conditions simultaneously, such as fear, respiratory distress, and metabolic exhaustion occurring together during a handling procedure, those conditions are not independent. The experience of one may affect the intensity or duration of another, and their combined effect on the animal may differ from the sum of their individual effects. A further challenge is that the number of potentially relevant conditions, interactions, Life-Phases, and Life-Fates in any real system is in principle unbounded. No assessment can represent all of them explicitly, and the choice of what to include is itself a methodological decision that shapes the output.

The WFF addresses this in three ways. First, it defaults to the Baseline approach, in which each condition is analyzed independently and results are treated as additive. This is an explicit simplification, not an unexamined one. It is maintained as the practical standard because the data needed to characterize interaction structures across multiple concurrent conditions at the population level are rarely available. The Baseline is therefore a transparent upper bound, not a claim that interactions do not exist.

Second, the framework uses Plausible Scenarios and sensitivity analysis to test whether welfare conclusions hold across the full range of plausible conditions. In a system with many interacting variables, this approach shifts the question from “what exactly is happening?” to “does the conclusion change across the space of plausible combinations?” When the answer is no, the interaction structure is not decision-relevant. When the answer is yes, the analysis identifies which specific conditions or interactions are important for the conclusion, directing research to where it would most reduce uncertainty.

Third, conditions that are excluded from an assessment are not thereby assumed to be absent. Excluding a condition fixes it at an implicit value of zero, which is itself a strong assumption and often a stronger one than any explicit uncertain estimate would be. The WFF requires that analytical boundaries be stated explicitly, so that readers can see what is and is not included and assess what the exclusions imply. This makes the dimensionality problem visible rather than hiding it.

Would qualitative assessments not be more honest and cautious when evidence is thin?

Qualitative assessment is sometimes proposed as a more epistemically humble alternative to numerical estimation when evidence is limited. The intuition is that describing a harm as “severe” or “of major concern” makes fewer claims than assigning it a numerical range, and is therefore less likely to mislead.

This intuition does not hold on examination. Qualitative labels do not avoid quantitative assumptions; they embed them invisibly. To describe a harm as “severe” or “of major concern,” a person has implicitly weighted some combination of intensity, duration, and prevalence to arrive at that judgment. But different people weight these parameters differently: one expert may call a condition severe because it affects a large proportion of animals, another because it reaches extreme intensity in a smaller number, and a third because it persists for a very long time even at moderate intensity. The label looks like a shared conclusion but conceals these differences entirely. When Bracke et al. (2008) asked 24 expert scientists to independently score the same welfare hazards, concordance was extremely low. This is not because experts disagreed about the evidence; it is because qualitative labels allow each expert to embed different implicit weightings of the same underlying parameters without those weightings becoming visible or comparable.

Qualitative assessments also cannot tell you whether the missing data matters for the decision at hand. A numerical range combined with sensitivity analysis can reveal that a conclusion holds whether prevalence is 5% or 50%, making the uncertainty in that parameter irrelevant to the decision. A qualitative label cannot do this: it provides no mechanism for determining which parameters are most uncertain, which comparisons are sensitive to unknown values, or what data would most change the conclusion. Research priorities flagged qualitatively as “important” cannot be ranked with enough precision to direct specific data collection at a defined resolution and sample size.

Qualitative uncertainty statements are also uniquely fragile in practice. A label like “probably common but uncertain” tends to be shortened to “common” as it moves through organizations and into policy decisions, stripping away the caveat entirely. A numerical range does not suffer from this in the same way: “5% to 50%” cannot be summarized as “50%” without visible distortion. The uncertainty is built into the form of the result, not added as a qualifier that can be dropped.

Finally, qualitative assessments share every evidential limitation of quantitative ones, and add a further one: they provide no roadmap for improvement. A stated range of 5% to 50% for a given condition invites specific questions about what data would narrow it, at what sample size, and in what geographies. A qualitative label that a condition is “of concern” rests on the same uncertain evidence but offers no equivalent specification of what would resolve the uncertainty or how close the field is to resolving it.

How do you validate your estimates?

Validation (or falsification, in Karl Popper’s sense) of estimates of the intensity and duration of affective states using direct empirical evidence is not yet possible for any welfare assessment framework (constraints imposed on the validation of hypotheses empirically are also present in other scientific areas, such as astronomy). Because the assessment of affective states still relies on indirect evidence, the attribution of probabilities of intensity and duration of an experience is an (always open) exercise of presenting evidence, from as many relevant lines of enquiry and sources as possible, to justify the values proposed. In every welfare assessment framework, hypotheses on parameter values are provided and/or reviewed by specialists in the field. In the Welfare Footprint Framework, estimated parameter values are always treated as hypotheses, each associated with an uncertainty level (represented by subjective confidence intervals), which may be reduced (1) as more evidence becomes available and/or (2) greater consensus is reached towards proposed values by independent evaluators. In this way, proposed figures are continuously updated and strengthened.

How accurate are the estimates?

The Welfare Footprint Framework acknowledges inherent uncertainty in its estimates by using probability distributions and duration ranges rather than single values. This approach transparently represents both scientific uncertainty and natural biological variation. All estimates are based on synthesis of existing research (behavioral, neurophysiological, and pharmacological evidence), with assumptions and evidence gaps explicitly documented. Estimates of Cumulative Pain and Cumulative Pleasure take full advantage of relevant existing evidence. In other words, they are as accurate as the integration of existing practical and theoretical knowledge of animal welfare allows. While perfect accuracy in measuring subjective experiences remains unattainable, the transparent approach adopted enables meaningful welfare comparisons while allowing for sensitivity analysis and updates as new evidence emerges.

What are the main limitations of this analytical approach?

The Welfare Footprint Framework (WFF), like any tool for assessing animal welfare, faces challenges due to the complexity and subjectivity of measuring affective states. Here are the key limitations:

  • Data gaps and quality issues: Estimates of the intensity, duration, and prevalence of affective experiences rely on empirical evidence, which is often sparse, incomplete, or missing. Research has largely overlooked the timing of welfare issues, such as when they start, how long they last, or how they change over time. Prevalence studies are typically limited to specific locations and may not reflect real-world commercial conditions.
  • Challenges with low-arousal states: It’s harder to assess subtle, long-lasting feelings like mood, which lack clear triggers and are influenced by many small events. These states may be underrepresented in evaluations.
  • Greater certainty for negative states: Most validated indicators focus on harms (e.g., injuries or diseases), while indicators for positive welfare are scarcer. This leads to greater uncertainty in estimating benefits.
  • Resource demands: Applying the WFF requires detailed system descriptions and expert knowledge, which can be difficult in settings with limited data.

Despite these limitations, the WFF’s structured approach helps by breaking down assessments into specific parts, allowing the use of various evidence types (including logical reasoning when data is absent). It highlights knowledge gaps to guide future research and makes all assumptions transparent. Advances in AI and automated monitoring technologies are also making it easier to gather and analyze data, reducing some barriers over time.

Can estimates of Cumulative Pain and Pleasure be longer than the typical lifespan of individuals?

Estimates of Cumulative Pain and Pleasure can exceed an individual’s lifespan because the framework adds up time spent in different affective states that can occur simultaneously. For example, an animal experiencing both chronic lameness and respiratory disease (both causing Hurtful pain) for its entire life would have cumulative suffering that’s twice its chronological lifespan. This approach reflects the total burden of welfare challenges rather than elapsed time. The framework currently uses a simplified additive model where one hour of exposure to two simultaneous Hurtful pain sources counts as two hours of Hurtful pain. Future refinements may incorporate attention allocation to different pain sources, which would reduce these estimates, but such changes await better evidence on attention dynamics during overlapping experiences. A full discussion of the share of attention that may be dedicated to pain, in each intensity category, is available here.
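
The additive model described above can be sketched in a few lines of Python. The 1,000-hour lifespan, the record layout and the function name cumulative_hours are hypothetical, chosen only to reproduce the lameness and respiratory disease example; they are not Welfare Footprint estimates.

```python
from collections import defaultdict

# Hypothetical lifespan and records: (source, intensity category, hours in that state).
LIFESPAN_HOURS = 1000.0
experiences = [
    ("chronic lameness", "Hurtful", 1000.0),
    ("respiratory disease", "Hurtful", 1000.0),
]

def cumulative_hours(records):
    """Sum hours per intensity category, counting simultaneous sources separately."""
    totals = defaultdict(float)
    for _source, intensity, hours in records:
        totals[intensity] += hours
    return dict(totals)

totals = cumulative_hours(experiences)
print(totals["Hurtful"] / LIFESPAN_HOURS)  # 2.0, i.e. twice the lifespan
```

Because overlapping hours are never merged, the total measures burden rather than elapsed time, which is why it can exceed the lifespan.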

Why not transform estimates of Cumulative Time in Pain or Pleasure into a percentage of lifetime, to facilitate comparisons among species with different lifespans?

When transforming Cumulative Pain or Pleasure estimates into percentages of lifetime, several challenges emerge. While this conversion can be applied, we focus on absolute time in affective states because: (1) it’s unclear whether identical percentages across different lifespans (e.g., 10% of a 100-hour life vs. 10% of a 10,000-hour life) represent equivalent welfare impacts; (2) such conversion requires comprehensive accounting of all affective experiences throughout an entire lifetime; and (3) when multiple experiences occur simultaneously, cumulative estimates can mathematically exceed total lifespan. One alternative approach is to express welfare impacts as time in Pain or Pleasure during a typical day, which provides standardization while avoiding some of these complications.
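
A small worked example, using made-up numbers, shows two of these complications: identical percentages can hide very different absolute durations, and simultaneous experiences can push the percentage above 100. The function name percent_of_lifetime is an assumption for this sketch.

```python
def percent_of_lifetime(cumulative_hours, lifespan_hours):
    """Express cumulative hours in an affective state as a percentage of lifespan."""
    return 100.0 * cumulative_hours / lifespan_hours

# Illustrative values only.
short_lived = percent_of_lifetime(10.0, 100.0)      # 10 hours in a 100-hour life
long_lived = percent_of_lifetime(1000.0, 10000.0)   # 1,000 hours in a 10,000-hour life
overlapping = percent_of_lifetime(2000.0, 1000.0)   # two lifelong simultaneous sources

print(short_lived, long_lived, overlapping)  # 10.0 10.0 200.0
```

The first two cases both yield 10%, although one involves 100 times more hours in pain than the other.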

What is the difference between an "Affective Experience" and an "Affective State"?

In the Welfare Footprint Framework, these terms are closely related but represent different levels of analysis:

  • Affective Experience: This is the complete, dynamic emotional event or episode that arises from a Biological Consequence. It is the overarching phenomenon being evaluated (for example, the overall experience of tissue damage from tail docking, or the experience of a play bout).
  • Affective State: This refers to the specific condition or level of intensity the animal occupies at a given point in time during that experience (e.g., Annoying, Hurtful, Joy, Euphoria).

In short, the framework breaks down a single Affective Experience into its temporal segments to measure the total time an animal spends in different Affective States of varying intensities.
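
One way to picture this breakdown is the minimal sketch below, which assumes each temporal segment has a duration in hours and a set of probabilities over Affective States. The data layout, the function name time_in_states and the numbers are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical temporal segments of a single Affective Experience.
# "hours" is the segment duration; "p" gives the assumed probability of each state.
segments = [
    {"hours": 2.0, "p": {"Disabling": 0.5, "Hurtful": 0.5}},
    {"hours": 10.0, "p": {"Hurtful": 0.25, "Annoying": 0.75}},
]

def time_in_states(segments):
    """Return expected hours per Affective State, summed over all segments."""
    totals = {}
    for segment in segments:
        for state, probability in segment["p"].items():
            totals[state] = totals.get(state, 0.0) + segment["hours"] * probability
    return totals

result = time_in_states(segments)
print(result)  # {'Disabling': 1.0, 'Hurtful': 3.5, 'Annoying': 7.5}
```

The per-state totals add up to the 12 hours covered by the two segments, so no time is lost or double-counted within a single experience.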

Can we compare estimates among different species?

The same assessment methodology can be applied across species because it relies on universal parameters — duration and intensity — and on shared functional criteria, such as the degree to which an experience disrupts normal biological functioning.

Nevertheless, a major unresolved challenge remains: the intensity of affective experiences (hedonic capacity), and even the subjective perception of time, can vary significantly across sentient beings, and this variation remains one of science’s most profound unknowns. This difficulty is not unique to the WFF; it applies to any welfare metric.

To address this issue transparently, the WFF includes an optional module called Ψ. Interspecific Scaling. This module does not impose a single rigid assumption. Instead, it provides a flexible mechanism that allows users to apply explicit, post-quantification corrections when interspecific comparisons are needed. These corrections prioritize the “ceiling question” — the plausible upper bound of human-anchored intensity (e.g., whether a species can reach levels comparable to Excruciating(h) or is more plausibly capped at lower categories such as Disabling(h) or Hurtful(h)) — while treating adjustments to subjective time as secondary and, by default, conservative.

Importantly, the framework preserves the integrity of species-specific analyses by distinguishing between species-internal intensity categories and human-anchored reference levels [Annoying(h), Hurtful(h), Disabling(h), Excruciating(h)]. Any interspecific adjustments must be explicitly reported, including the assumptions used to map each species’ internal categories to these reference points. 

Does the method apply only to a few species?

The framework can be applied to any sentient species, including humans. Because the metric used is based on a phenomenon relevant to all sentient beings (time in pain of different intensities), it can be applied to virtually any sentient species to assess any context of relevance.

How can one apply this method to measure wild animal welfare?

The Welfare Footprint Framework can measure wild animal welfare by quantifying both the negative experiences (Cumulative Pain) and the positive experiences (Cumulative Pleasure) that wild animals undergo. This approach helps identify which natural circumstances cause the most suffering or enable the most positive experiences. By analyzing welfare challenges systematically across different species and habitats, researchers can better understand suffering hotspots in nature and raise awareness about wild animal suffering.

Can we compare estimates among different species?

Do three hours of Disabling pain in mammals represent the same burden of pain or welfare loss as three hours of Disabling pain in fish, or shrimp? The answer to this question is not simple.

The extent to which different species differ in their capacity for affective experiences is one of the biggest questions of present and past times. Yet, it is one that must be addressed if Welfare Footprints are to be compared across species. Currently, the Welfare Footprint Framework is intentionally agnostic about these differences: core welfare estimates are produced without interspecific corrections. Any assumptions about differences in affective capacity must be applied explicitly and transparently as optional post-quantification adjustments (Module Ψ) when particular comparisons require them. The Welfare Footprint Institute team is also actively working on this line of enquiry to investigate what might be at least workable solutions to this question.

Do you attribute any negative value to death, or to premature death?

No. The Cumulative Pain framework does not assign a negative value to death, only to negative affective experiences (anything that ‘feels bad’).

Can you give examples of practical uses of this method?

A meaningful and universal metric of positive and negative welfare embedded in different animal-sourced products, practices and production contexts has different practical uses for different audiences:

  • The possibility to estimate, quantitatively, the impact of different welfare campaigns, laws and standards enables animal protection organizations, funders, advocates and the EA community to be more effective, per unit of resource invested, in their efforts to reduce animal suffering.
  • Welfare scientists can estimate how much suffering is associated with different welfare challenges and production conditions and identify key research gaps, hence focus research efforts on hotspots of suffering and neglected research areas.
  • Veterinarians can describe, estimate and compare the pain from different diseases and injuries with a friendly notation method (the Pain-Track) and universal metric, so they can study and treat animal pain more effectively.
  • Animal scientists can compare quantitatively the welfare impact of different nutritional, genetic and management practices, so they can conduct cost-benefit analysis of animal welfare improvements.
  • Environmental analysts can estimate the animal welfare costs of different environmental policies, and therefore establish standards and legislation that do not harm animals as a consequence.
  • Certification bodies can establish objective thresholds on how much loss of welfare is acceptable for animal-sourced products associated with different industry practices and production systems.
  • Legislators can compare objectively the suffering associated with different challenges and systems, so they can establish appropriate standards and legislation.
  • Consumers can understand the suffering embedded in different animal-sourced products so they can effectively align their purchasing choices with their ethical values.

How do you envision this tool being used at the policy level?

The Welfare Footprint Framework was not designed to promote any one political or ideological outcome. Its role is to provide governments, regulators, and other decision-makers with a clear, quantitative language for animal welfare—one that allows welfare impacts to be compared, traded off, and integrated into policy just as environmental and health impacts already are.

At the policy level, that means three things. First, setting baselines and thresholds. Regulators can establish minimum welfare standards expressed in time-based units (for example, “no more than X hours of disabling pain per bird”). That makes standards transparent, enforceable, and adaptable across species and production systems.

Second, comparing systems and interventions. Because the Welfare Footprint expresses welfare in a common currency (time in affective states, i.e. pain or pleasure), it allows direct comparisons—between different production systems (for example, cage-free versus caged), different interventions, or even between welfare, environmental, and economic outcomes. This enables more rational policy debates, where trade-offs are explicit rather than hidden behind abstract scores.

Third, guiding long-term choices. By embedding welfare impacts into the same space as cost and carbon, policymakers can assess whether incremental reforms suffice or whether broader structural changes, including reductions in animal product reliance, deliver the greatest welfare gains per unit of cost or environmental burden. The Welfare Footprint itself is neutral: it does not prescribe whether society should improve current systems or transition away from them. What it does is make the consequences of each path transparent and comparable.

In short, the Welfare Footprint is a decision-support tool. It gives policymakers the ability to ground welfare debates in measurable evidence, set clear baselines, evaluate reforms, and understand the ethical cost of different food system trajectories. Whether this leads to higher-welfare production, stricter regulation, or broader shifts in how we use animals is ultimately a societal and political choice, but one that can now be made with clarity rather than guesswork.

What do you hope people will do with Welfare Footprint estimates?

What we hope is that people will use this information as a transparent basis for choice. When you tell someone that an extra dollar per kilo prevents 15–100 hours of intense pain, you are translating something abstract into a metric anyone can grasp and weigh against other considerations. For consumers, that might mean choosing products that demonstrably reduce negative experiences for animals. For companies, it provides a way to evaluate the cost-effectiveness of welfare standards alongside other sustainability goals. And for the broader public, it reframes animal welfare from a matter of subjective opinion into something that can be measured, compared, and improved. The Welfare Footprint doesn’t prescribe what people should value, but it gives them a clear, quantitative picture of what different choices mean for animals’ lived experiences. That clarity is the starting point for more informed decisions — whether at the level of an individual shopper, a corporate standard, or public policy.

Does the Welfare Footprint Institute use Artificial Intelligence or visualization tools?

Yes, but these are utilized strictly as supporting infrastructure, not as the conceptual heart of the framework itself. The WFI has developed specific tools like Processograms, which are interactive diagrams used to map the spatial and temporal configuration of animal production lifecycles. Additionally, the institute utilizes AI Large Language Models (such as the Hedonic-Track Custom GPT) to accelerate literature synthesis, scale data generation, and help rapidly draft preliminary analyses. However, while AI helps researchers think faster and process vast amounts of data, accountable human expertise and judgment remain the ultimate anchor for the framework. AI systems do not replace the rigorous human scrutiny required to validate the final published estimates.

Does measuring welfare within an existing system mean that WFI is endorsing that system?

No. The Welfare Footprint Institute’s role is to quantify welfare impacts as rigorously and transparently as possible within a clearly defined scope. A welfare assessment is not a blanket endorsement, a certification, or a final moral verdict; it is an evidence-based attempt to describe what animals are likely to experience under particular conditions.

The Welfare Footprint Framework was developed precisely to make those impacts more visible, comparable, auditable, and usable in decision-making. Its metrics can be used by many kinds of stakeholders — including advocates, policymakers, producers, regulators, investors, and consumers — each of whom may interpret the same evidence through different ethical or strategic frameworks. WFI’s institutional contribution is therefore to improve the welfare-relevant evidence base, make assumptions and uncertainties explicit, and support better-informed decisions that affect animals (see also https://welfarefootprint.org/2026/03/04/we-measure-the-evidence-you-make-the-call-heres-why/).