Responding to the Demand for Quicker Evaluation Findings
Heather Nunns1
Independent contractor
Abstract
Some public sector stakeholders are demanding evaluative findings within a short timeframe. Although evaluators want to be responsive to such requests, there are a number of barriers that hinder their ability to produce evaluative information more quickly. This paper describes the results of an investigation into ways to help evaluators respond to such evaluation "timeliness" issues. It examines the factors that underpin the issue and the barriers to addressing it. A review of the literature identifies three approaches evaluators can use to address the timeliness issue. An unintended result of the investigation is also presented. Based on the findings of the literature review, a tool (named the "time/resource matrix") has been developed for responding to and managing stakeholder demand for quicker evaluative findings.
The Issue
This investigation is the result of my experiences as an evaluator in a public sector organisation. Evaluation stakeholders (notably policy and programme managers) are requesting evaluations with short timeframes for the reporting of findings. An examination of requests for proposals (RFPs) posted on the Government Electronic Tendering (GETS) website in 2007 indicated that such demand is occurring across the public sector. Some RFPs allow only 6 to 12 weeks between the awarding of an evaluation contract and the reporting deadline.
Timeliness is an important consideration for evaluators. Utility is the first of the four categories of standards in The Program Evaluation Standards, which call for evaluation findings to be reported "so that they can be used in a timely fashion" (Sanders 1994:53--57). In New Zealand, the Social Policy Evaluation and Research Committee (SPEaR) has identified timeliness as one of four features that make research and evaluation useful for social policy purposes (Bedford et al. 2007).
The importance of the timeliness of evaluative findings is expressed succinctly by Grasso (2003:512): "Timing is almost everything". Other authors stress the relationship between use and timeliness:
The timeliness of information is no less critical than its accuracy, as exigencies often force program managers to make decisions before thorough analyses can be completed. In some instances, less rigorous analyses delivered at the right time may be superior to exhaustive analyses delivered too late. (McNall et al. 2004:287)
Given the relationship between the timeliness of evaluative findings and their subsequent use, it is appropriate for evaluators to consider how they can respond to requests for a quicker turn-around of evaluation results.
Contributing Factors in the Policy Environment
Three factors in the policy environment appear to contribute to this demand for quicker evaluation findings: the policy-making process, the conceptualisation of evaluation in the policy process, and misaligned timeframes.
Policy-making Process
New Zealand's policy environment is characterised by "a degree of volatility" (Williams 2003:199). This is due in part to New Zealand's three-year electoral cycle, which means that major policy changes can occur very rapidly (Williams 2003:199) and may also result in "political exigencies that leave policy analysts with little time or incentive to track down and digest evidence" (Baehler 2003:37). Baehler's observation is congruent with my experience as a public sector evaluator. The demand from policy analysts for quick turn-around of evaluative findings is often the result of a Minister's request for new or revised policy within a very short timeframe.
Conceptualisation of Evaluation in the Policy Process
The policy process is typically portrayed as a cyclical, unidirectional process, with evaluation as the end stage providing information for decision-making in the next cycle of the process (Baehler 2003, McKegg 2003). Both Baehler and McKegg challenge this conceptualisation. Baehler (2003:32) argues:
The typical textbook portrayal diverges from reality … single-loop models of policy making … fail to recognise the different arenas in which decisions are shaped and the different stages of the cycle where key actors may be more and less open to learning from evaluation results.
McKegg (2003) notes that this linear conceptualisation fails to address the wide range of possible purposes and uses of evaluation. As a result, it fails to capture the many ways in which evaluation can interact with and inform policy development and review, along with programme design and delivery.
Misaligned Timeframes
There is an inherent mismatch in the timeframes associated with the policy process and evaluation activity (Baehler 2003, Williams 2003). Policy processes are aligned with the electoral cycle, with policy making and funding for new initiatives typically occurring at the beginning of the three-year period. However, the funding of evaluations via the Government Budget process commences at the beginning of the financial year (1 July), which often does not fit with key decision-making cycles (Williams 2003). For example, policy makers require evaluative information approximately 12 months before the end of the funding of an existing programme in order to gain ongoing programme funding through the annual Budget-setting process. If an evaluation is scheduled towards the end of the policy cycle (as in the traditional conceptualisation described above), its findings will be too late to be used for decision-making purposes.
It should be noted that two of the factors described above -- the conceptualisation of evaluation in the policy process and the misalignment of policy and evaluation timeframes -- are not only contributing to the demand for quicker evaluation findings, but are also having a negative impact on evaluation use generally (McKegg 2003). These factors will only be addressed through structural change (such as the alignment of policy/programme funding allocation with evaluation funding allocation) and strategies to increase understanding about evaluative activity. Such strategies could include educating public sector managers about the ways evaluation can be used for decision-making, policy development, and programme design and development.
Additional Barriers
Within the context of the policy environment identified above there are other factors that further limit the ability of evaluators to respond to requests for rapid evaluation findings. These barriers relate to resourcing, evaluator supply, the evaluability of some programmes and policies, and stakeholder expectations. Each of these is briefly discussed below.
The first barrier concerns internal resourcing limitations. Many public sector evaluation teams are small (perhaps three to eight staff) compared with the teams in the organisation that commission evaluations (for example, policy teams, which may comprise 20--100 staff). Given an evaluation team's size and its annual work programme, it has limited capacity to respond to requests for "quick" evaluations, particularly when such requests are unplanned or ad hoc, as is often the case. Maintaining spare capacity for such unplanned, urgent work is not feasible given a team's routine work demands.
The second barrier concerns the supplier market. Just as internal evaluation teams have limited spare capacity to respond to requests for quick evaluations, the number of evaluators in New Zealand is similarly constrained. This is attributed to two factors -- the amount of work available for private sector evaluators and the limited capacity of the supplier market (Baehler 2003).
A third potential barrier concerns the evaluability of policies or programmes. Evaluability assessment is "a process that helps evaluators to identify evaluations that might be useful, explore what evaluations would be feasible, and design useful evaluations" (Wholey 2004:33). One consideration when assessing the evaluability of a policy or programme is the length of time it has been running, and whether this is sufficient for the stated purpose of the evaluation (for example, to measure outcomes)2. The evaluability of new policies or programmes is particularly problematic for outcome and impact evaluations. A new programme requires time to become established: a sufficient number of clients need to flow through the intervention, and the results of the programme need to become manifest, before an evaluation assessing the programme's outcomes can begin. Baehler (2003:31) notes that a minimum period of 18 months is required before an evaluation can report on the success of a new programme. These time requirements are often at odds with the demands from policy or other stakeholders for rapid evaluation results.
Another potential barrier concerns stakeholder expectations about methodology. In my experience, some of those commissioning evaluations have unrealistic expectations about the type of evaluation method that is feasible given the time and resources available. The methodology such stakeholders prefer reflects their desire for a level of evidence or certainty about the effectiveness of an intervention, despite limited evaluation funding and a short reporting timeframe. Discussion with evaluators in other government organisations suggests that such unrealistic expectations about methodology are not uncommon.
Solutions in the Literature
A review of the literature suggests three potential ways to meet the demand for quicker evaluation findings.
Time-saving Approaches
The first solution is provided by Bamberger et al. (2006). They describe two ways of thinking about time saving: by reducing the total amount of evaluator resources and effort, or by reducing the time involved in data collection, analysis and reporting, which reduces the overall length of the evaluation. Bamberger et al. argue that this resource/duration distinction is important because the best approach for a particular evaluation needs to be determined: is it more appropriate to reduce the level of evaluator effort and resourcing, or the duration of the evaluation? Each has different implications for how an evaluation will be designed, resourced and executed.
Evaluation Design
A second way for evaluators to save time is to adopt a simplified evaluation design, and Bamberger et al. (2006) identify a number of ways to achieve this. These include prioritising information needs to focus on critical issues (thereby eliminating the collection of non-essential data), using existing or secondary data sets, and reducing the sample size.
Bamberger et al. (2006:54) stress that evaluation designs that are simplified in response to timing constraints involve a "methodological compromise". Consequently, evaluators must give sufficient consideration to the potential threats to the validity of the evaluation findings.
Rapid Evaluation Methods
A third potential solution lies with research and evaluation methods referred to as "rapid evaluation and assessment methods" (REAM) (McNall and Foster-Fishman 2007). Substantial effort has already gone into addressing the demand for rapid findings in the international aid and development fields through the development of such methods.
A review of the REAM literature indicates that there is a large family of rapid research techniques, with a bewildering array of acronyms. For example, Beebe (2001) identifies more than 20 REAM approaches, which have originated from different disciplines and areas of practice. From the literature available it is difficult to identify which approach evolved from which other approach, and what features distinguish one approach from another. For the purposes of this paper, five REAM approaches have been selected for investigation: real-time evaluation (RTE), rapid feedback evaluation (RFE), rapid assessment (RA), rapid evaluation method (REM), and participatory rural appraisal (PRA). Table 1 provides an overview of each of these approaches.
McNall and Foster-Fishman (2007:159) identify the common features of REAM approaches: a short duration (from a few weeks to a few months), the use of mixed methods, collaborative team-based arrangements, and iterative research designs in which data are analysed as they are collected and preliminary findings are used to guide additional data collection. McNall and Foster-Fishman also describe REAM methods as participatory, such that "representatives of local populations and institutions are involved in the planning and implementation of the research" (2007:159). However, this claim is not upheld in the case of rapid feedback evaluation (Wholey 1983), which does not appear to have any participatory features.
McNall and Foster-Fishman have not, however, necessarily captured all of the common features of REAM approaches. Another common feature is summarised by Beebe's (2001) "sound enough" principle: the underlying principle of REAM methods appears to be adequacy for purpose; that is, evidence adequate for the level of precision needed for the purpose for which the findings will be used (Anker et al. 1993). REAM studies also appear to be emergent, requiring a high degree of flexibility. They also appear to be designed for instrumental use, such as decision-making and problem-solving, rather than for conceptual or process use3. Finally, there is rapid turn-around of findings, which appear to be reported incrementally, with increasing levels of content and analysis. An initial findings report is often provided before evaluators leave the field.
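The iterative logic described above can be sketched in code. The following Python sketch is a hypothetical illustration only, not a procedure drawn from the REAM literature: the helpers (collect_data, analyse, refine_questions) and the numeric adequacy score standing in for Beebe's "sound enough" judgement are invented for the example, simply to show how data collection, analysis and interim reporting interleave until the evidence is judged adequate for its purpose.

# Illustrative sketch only: an iterative "collect, analyse, refine" loop of the kind
# attributed to REAM approaches, with a "sound enough" stopping rule. All functions,
# scores and thresholds below are hypothetical and not taken from the cited authors.

from dataclasses import dataclass, field


@dataclass
class Findings:
    """Accumulated preliminary findings plus a rough adequacy-for-purpose score (0-1)."""
    evidence: list = field(default_factory=list)
    adequacy: float = 0.0


def collect_data(questions):
    # Stand-in for a short mixed-method field round (interviews, site visits, records).
    return ["evidence for: " + q for q in questions]


def analyse(findings, new_data):
    # Data are analysed as they are collected; each round nudges the adequacy score up.
    findings.evidence.extend(new_data)
    findings.adequacy = min(1.0, findings.adequacy + 0.25)
    return findings


def refine_questions(findings, questions):
    # Preliminary findings guide the next round of collection (here: keep open questions).
    remaining = [q for q in questions if "evidence for: " + q not in findings.evidence]
    return remaining or questions


def rapid_evaluation(questions, sound_enough=0.75, max_rounds=4):
    """Iterate until the evidence is judged adequate for its purpose, reporting incrementally."""
    findings = Findings()
    for round_no in range(1, max_rounds + 1):
        findings = analyse(findings, collect_data(questions))
        print("Interim report", round_no, "- adequacy approx", round(findings.adequacy, 2))
        if findings.adequacy >= sound_enough:  # adequacy-for-purpose stopping rule
            break
        questions = refine_questions(findings, questions)
    return findings


if __name__ == "__main__":
    rapid_evaluation([
        "Is the programme reaching its intended clients?",
        "Are delivery processes working as designed?",
    ])

The point of the sketch is the shape of the loop rather than any of its details: collection, analysis and reporting are interleaved, and work stops as soon as the evidence is adequate for the decision at hand rather than when a fixed design has been exhausted.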
The literature was searched to identify the strengths and weaknesses of each of the five REAM approaches described in Table 1. This search proved largely unsuccessful, which may be attributable to the lack of substantive differences among the REAM approaches (McNall and Foster-Fishman 2007:159):
The contrasts between rapid evaluation and assessment/appraisal techniques should not be overdrawn and are likely artifacts of their distinct intellectual lineages and historical contexts of application. At this point in their developmental history, there has been enough cross-fertilization between the variants of REAM that they have become almost indistinguishable.
The only significant difference appears to be that some REAM approaches (for example, REM and PRA) are more participatory than others. The participatory nature of REM and PRA could therefore be regarded as a strength of these approaches.
Bamberger et al. (2006:76) identify a significant shortcoming of REAM studies when they note that "such studies do not systematically address the increased threats to validity from being implemented in a quick fashion". They suggest that further work is needed to address "the trade-offs between time, quality and validity" (2006:76). The validity issue is also noted by McNall and Foster-Fishman (2007), who suggest the use of an adequacy criterion from an evaluation quality framework to assess the quality and rigour of the data and findings.
Use of REAM in the New Zealand Public Sector
Although REAM appears to offer New Zealand evaluators a way of producing rapid evaluative findings, its use must be approached cautiously. For REAM approaches to be credible for public sector purposes, they must be applied appropriately and in ways that do not compromise rigour. This is particularly important given the variable level of knowledge about evaluation among public sector managers and staff (McKegg 2003). REAM could be misused as an "easy and cheap" way of obtaining evaluative information. If undertaken inappropriately or incorrectly, REAM could therefore compromise the integrity of public sector evaluative activity.
If REAM is to be regarded as a credible evaluation approach for public sector purposes it is essential that it is fully understood by evaluators and those commissioning evaluations so that decisions can be made about whether it is an appropriate approach to use in a particular situation. This involves understanding the trade-offs or compromises involved in using REAM. The most significant compromise is that of time versus the type and amount of evaluative evidence required for credible findings. McNall and Foster-Fishman (2007:166) describe this compromise as involving "a balance between speed and trustworthiness." Beebe (2001) and Anker et al. (1993) shed further light on this compromise: Anker et al. describe REM as involving adequate evidence for the level of precision that is needed for the purpose for which the findings will be used, and this adequacy for purpose is echoed by Beebe's (2001) "sound enough" principle, which underpins RA.
As indicated by both Anker et al. (1993) and Beebe (2001), evaluation purpose is a major determinant of whether REAM is an appropriate approach to use in a particular situation. Evaluation purpose determines the nature and amount of evidence required to produce credible evaluative findings. In the public sector context, REAM methods appear to offer an appropriate approach for some evaluations with a formative4 purpose, such as increasing understanding of an evaluand5 and/or participants, diagnostic and/or problem solving, learning and development, design or process improvement, or state of play assessment. REAM is not appropriate for evaluations that aim to provide a summative6 assessment of performance, worth or merit, and for other evaluations where robust evidence is required for attribution or other purposes.
The Time/Resource Matrix
To summarise, the literature suggests three potential ways of addressing the demand for "quick" evaluation findings: controlling the evaluation's duration, effort and resources; simplifying the evaluation's design; and using rapid evaluation and assessment methods. Rather than being unrelated, however, each is a component of a larger picture. I have developed a matrix (Figure 1) to illustrate the relationships between these factors. The matrix has two axes: the horizontal axis represents time and the vertical axis represents resource. There are four quadrants, each of which represents different stakeholder priorities and involves evaluation designs and methods of differing levels of complexity, together with different time and resource requirements, as described in Table 2.
Figure 1 The Time/Resource Matrix
Table 2 Characteristics of each quadrant
Quadrant A: Higher resource + quicker time
When to use: for evaluations that are time critical for stakeholders.
Evaluation design: the design is more likely to be straightforward, to involve a team approach, and may use rapid evaluation methods.
Examples: evaluations with a formative purpose, such as increasing understanding of an evaluand and/or participants, diagnostic and/or problem-solving, learning and development, process improvement, or state-of-play assessment.

Quadrant B: Higher resource + longer time
When to use: for evaluations that are important to stakeholders but are less constrained by time, or evaluations where time is critical but in a different way to quadrant A; for example, evaluations that require a period of time to pass before the evaluand can be evaluated (such as a new initiative), or for programme results/outcomes to be achieved, or to enable time intervals to be compared.
Evaluation design: designs are more likely to be complex (e.g. longitudinal, action research, baseline/post, outcome, impact).
Examples: evaluations that measure change over time; evaluations where the level of attribution is important; outcome and impact evaluations; evaluations involving a summative judgement.

Quadrant C: Less resource + longer time
When to use: for evaluation projects with fewer time constraints and less resource, which are more likely to be lower-priority or "nice to do" evaluations.

Quadrant D: Less resource + quicker time
When to use: where sound existing data are available and require minimal additional effort or resource. If there are no existing data, or their quality is questionable, evaluators should not work in this quadrant, as doing so will compromise evaluation quality.
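To make the logic of Figure 1 and Table 2 concrete, the following minimal Python sketch (which is not part of the matrix as published) encodes the four quadrants and a paraphrase of the Table 2 guidance as a simple classification helper. The Quadrant enum, the classify function and the GUIDANCE text are illustrative assumptions introduced here.

# A minimal, hypothetical rendering of the time/resource matrix as a planning aid.
# The quadrant descriptions paraphrase Table 2; the code structure itself is an
# illustrative assumption, not part of the original tool.

from enum import Enum


class Quadrant(Enum):
    A = "Higher resource + quicker time"
    B = "Higher resource + longer time"
    C = "Less resource + longer time"
    D = "Less resource + quicker time"


GUIDANCE = {
    Quadrant.A: "Time-critical work: straightforward design, team approach, possibly rapid "
                "evaluation methods; suited to formative purposes.",
    Quadrant.B: "Important but less time-constrained work: more complex designs (longitudinal, "
                "outcome, impact); suited to summative judgements and attribution questions.",
    Quadrant.C: "Fewer time constraints and less resource: lower-priority or 'nice to do' evaluations.",
    Quadrant.D: "Only where sound existing data require minimal additional effort; otherwise "
                "working here risks evaluation quality.",
}


def classify(time_critical, higher_resource):
    """Place an evaluation request in a quadrant of the matrix."""
    if higher_resource:
        return Quadrant.A if time_critical else Quadrant.B
    return Quadrant.D if time_critical else Quadrant.C


if __name__ == "__main__":
    # Example: a time-critical, well-resourced request for formative feedback on a new programme.
    quadrant = classify(time_critical=True, higher_resource=True)
    print(quadrant.name, "-", quadrant.value)
    print(GUIDANCE[quadrant])

A structure of this kind could also be used to tag the projects in an annual work programme so that the spread of commitments across quadrants A, B and C is visible, in line with the planning use described in the next section.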
Use of the Matrix
Although the matrix does not solve the timeliness issue or address all of the identified barriers, it does provide evaluators with a means of conceptualising the timeliness issue and communicating with stakeholders. The matrix is a useful educational tool for explaining the time/resource/method relationships to stakeholders, thereby helping to manage their expectations and negotiate their requirements. The matrix also helps evaluators to explain to stakeholders any evaluability issues relating to time. Further, the matrix is a useful tool for planning and managing annual work programmes in public sector evaluation contexts. Care can be taken during the annual business planning process to ensure that work projects are distributed across quadrants A, B and C, so as to avoid over-committing evaluator resources and to prevent bottlenecks.
Limitations of the Matrix
Despite its usefulness for the above purposes, the matrix does not address the factors in the policy environment that contribute to the demand for quicker evaluative findings, particularly the conceptualisation of evaluation in the policy process, and the misalignment of policy and evaluation timeframes. As noted above, these factors will only be addressed through structural change (such as alignment of policy/programme funding allocation with evaluation funding allocation) and strategies to increase understanding about evaluative activity. Such strategies could include educating public sector managers about the ways evaluation can be used for decision-making, policy development, and programme design and development.
Conclusion
This paper describes an investigation into ways of meeting the demand from public sector evaluation stakeholders for "quicker" findings. The literature describes three potential solutions to enable evaluators to respond to such demand. The third of these solutions -- rapid evaluation and assessment methods (REAM) -- is suggested with some caution. Although REAM offers evaluators a means of providing rapid evaluative findings, this approach must be used appropriately and applied correctly. Failure to do so could compromise the credibility of evaluative activity in the public sector.
The three potential solutions to the timeliness issue are presented in the literature as unrelated elements, but I have developed a matrix to illustrate their relationships with each other. This Time/Resource Matrix provides a useful tool for managing stakeholder expectations and requirements.
References
Anker, M., R. Guidotti, S. Orzeszyna, S. Sapirie and M. Thuriaux (1993) “Rapid evaluation methods (REM) of health service performance: Methodological observations” Bulletin of the World Health Organization, 71(1):15--21.
Baehler, K. (2003) “Evaluation and the policy cycle” in N. Lunt, C. Davidson and K. McKegg (eds.) Evaluating Policy and Practice: A New Zealand Reader, Pearson Prentice Hall, Auckland.
Bamberger, M., R. Rugh and L. Mabry (2006) RealWorld Evaluation -- Working under Budget, Time, Data and Political Constraints, Sage, Thousand Oaks, CA.
Bedford, R., C. Davidson, R. Good and M. Tolich (2007) “Getting all the ducks all in a row: And then what?” paper presented at the Social Policy, Research and Evaluation Conference, Wellington, 3--5 April 2007.
Beebe, J. (2001) Rapid Assessment Process: An Introduction, Altamira Press, Oxford.
Chambers, R. (1994a) “Participatory rural appraisal: Analysis of experience” World Development, 22(9):1253--1268.
Chambers, R. (1994b) “The origins and practice of participatory rural appraisal” World Development, 22(7):953--969.
Davidson, E.J. (2005) Evaluation Methodology Basics: The Nuts and Bolts of Sound Evaluation, Sage Publications, Thousand Oaks, CA.
Grasso, P.G. (2003) “What makes an evaluation useful? Reflections from experience in large organizations” American Journal of Evaluation, 24(4):507--514.
Jamal, A. and J. Crisp (2002) Real-Time Humanitarian Evaluations: Some Frequently Asked Questions, Evaluation and Policy Analysis Unit, United Nations High Commissioner for Refugees, Geneva.
McKegg, K. (2003) “From margins to mainstream: The importance of people and context in evaluation utilization” in N. Lunt, C. Davidson and K. McKegg (eds.) Evaluating Policy and Practice: A New Zealand Reader, Pearson Prentice Hall, Auckland.
McNall, M.A., and P.G. Foster-Fishman (2007) “Methods of rapid evaluation, assessment and appraisal” American Journal of Evaluation, 28(2):151--168.
McNall, M.A., V.E. Welch, K.L. Ruh, C.A. Mildner and T. Soto (2004) “The use of rapid-feedback evaluation methods to improve the retention rates of an HIV/AIDS healthcare intervention” Evaluation and Program Planning, 27:287--294.
Patton, M.Q. (1997) Utilization-focused Evaluation (3rd ed.), Sage Publications, Thousand Oaks, CA.
Sanders, J.R. and The Joint Committee on Standards for Educational Evaluation (1994) The Program Evaluation Standards: How to Assess Evaluations of Educational Programs (2nd ed.), Sage Publications, Newbury Park, CA.
Sandison, P. (2003) Desk Review of Real-time Evaluation Experience, United Nations Children’s Fund, New York, www.unicef.org/evaldatabase/files/FINAL_Desk_Review_RTE.pdf
Scriven, M. (1991) Evaluation Thesaurus (4th ed.), Sage, Thousand Oaks, CA.
Vincent, N., S. Allsop and J. Shoobridge (2000) “The use of rapid assessment methodology (RAM) for investigating illicit drug use: A South Australian experience” Drug and Alcohol Review, 19:419--426.
Williams, B. (2003) “Getting the stuff used” in N. Lunt, C. Davidson and K. McKegg (eds.), Evaluating Policy and Practice: A New Zealand Reader, Pearson Prentice Hall, Auckland.
Wholey, J.S. (2004) “Evaluability assessment” in J.S. Wholey, H.P. Hatry and K.E. Newcomer (eds.), Handbook of Practical Program Evaluation, Jossey-Bass, San Francisco.
Footnotes
1Email for correspondence: heather.nunns@paradise.net.nz
2Note that the length of time a programme has been running is not an issue for formative evaluations; that is, evaluations that attempt to identify improvements to programme design, implementation or delivery.
3Instrumental use occurs when a decision or action follows, at least in part, from an evaluation. Conceptual use refers to the use of evaluation to influence thinking about issues in a general way. Process use refers to, and is indicated by, individual changes in thinking and behaviour, and programme or organisational changes in procedures and cultures, that occur among those involved in evaluation as a result of the learning that occurs in the evaluation process (Patton 1997:90).
4Evaluation for the purpose of improvement (Scriven 1991).
5That which is being evaluated (Davidson 2005).
6An overall assessment or evaluative judgment (Patton 1997).