Improvement Cycles

Active Implementation relies upon three improvement cycles: the plan-do-study-act (PDSA) cycle, usability testing, and practice-policy communication loops.  All three are grounded in the plan-do-study-act (PDSA) cycle, which grew out of Walter Shewhart's quality improvement work at Bell Labs in the 1920s to improve quality and reduce errors in design and manufacturing (Shewhart, 1931).  The PDSA cycle has been used successfully in many applications in human services (Joyce & Showers, 2002; Varkey, Reller, & Resar, 2007; Weick, Sutcliffe, & Obstfeld, 1999).

PDSA

The plan-do-study-act cycle involves a “trial-and-learning” approach in which the PDSA steps are repeated over iterative cycles designed to discover and solve problems, eventually achieving high standards while eliminating error.  For example, the “plan” can be the innovation as it is intended to be used in practice.  To carry out the plan, it must be operationalized (what to “do” and say to enact the plan).  This compels attention to the core innovation components and provides an opportunity to a) begin to develop a training process (e.g. here is how to do the plan) and b) create a measure of fidelity (e.g. did we do the plan as intended).  The budding fidelity measure can be used to interpret the outcomes in the “study” part of the PDSA cycle (e.g. did doing the plan produce desired results).  If the plan was done as intended but produced poor results (an innovation problem), the innovation needs to be changed before the next cycle.  If the plan was not done as intended (an implementation problem), the preparation of practitioners needs to be improved before the next cycle so they can “do” the innovation as planned.  Varkey, Reller, & Resar (2007) provide a vivid example of the use of the PDSA cycle to improve health care in one month at the Mayo Clinic.
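The “act” decision at the end of each cycle can be summarized as a small piece of decision logic.  The sketch below is illustrative only; the function and argument names (study_and_act, fidelity_met, outcomes_met) are hypothetical labels for the fidelity and outcome judgments described above, not part of any published PDSA protocol.

```python
# Illustrative sketch of the decision made at the end of one PDSA cycle.
# Names and structure are hypothetical; real cycles rely on locally
# defined fidelity measures and outcome criteria.

def study_and_act(fidelity_met: bool, outcomes_met: bool) -> str:
    """Decide what to adjust before the next PDSA cycle."""
    if fidelity_met and outcomes_met:
        # The plan was done as intended and produced desired results.
        return "Keep the plan and continue cycling to sustain and extend gains."
    if fidelity_met and not outcomes_met:
        # An innovation problem: done as intended, but results were poor.
        return "Change the innovation (revisit core components) before the next cycle."
    # An implementation problem: the plan was not done as intended.
    return "Improve practitioner preparation (training, coaching) before the next cycle."

if __name__ == "__main__":
    print(study_and_act(fidelity_met=True, outcomes_met=False))
    print(study_and_act(fidelity_met=False, outcomes_met=False))
```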

Usability Testing

As noted in the section on the innovation, an innovation (whether “evidence-based” or not) needs to meet the criteria for a “program.”  There is some evidence that the more clearly the core components of a program or practice are known and defined, the more readily the innovation can be implemented successfully (Bauman, Stein, & Ireys, 1991; Dale, Baker, & Racine, 2002; Winter & Szulanski, 2001).  The core components specify “which traits are replicable, how these attributes are created, and the characteristics of environments in which they are worth replicating” (Winter & Szulanski, 2001, p. 733).

Thus, the specification of the program is very important to the process of preparing for implementation.  McGrew, Bond, Dietzen, & Salyers (1994) caution that (1) most program models are not well defined conceptually, making it difficult to identify core intervention components, (2) when core intervention components have been identified, they are not operationally defined with agreed-upon criteria for implementation, and (3) only a few models have been around long enough to study planned and unplanned variations.  Consequently, Implementation Teams will use usability testing methods to establish the core components of ill-defined evidence-based programs and other innovations.

Usability testing consists of a series of tests of an innovation.  It grew out of the computer software and website design professions, where teams of programmers wrote hundreds of thousands of lines of code to develop such things as new word processing and spreadsheet programs.  Given the sheer volume of code, errors were expected, and a testing method was needed to detect and correct them quickly and accurately.  As researchers studied these issues, it became apparent that maximum benefit was derived from running multiple tests, because the real goal of usability testing is to improve the programs, not just document their weaknesses (e.g. Allen, 1996; Frick, Elder, Hebb, Wang, & Yoon, 2006; Genov, 2005; Nielsen, 2000; 2005; Rubin, 1994).  They found that one user might detect about 30% of the problems, while 4 or 5 users typically found about 85% of the problems.  More than 4 or 5 users in a test group produced redundant findings but did not uncover many more problems.  The problems detected in the first test would be corrected and the modified program would be tested again with a new group of 4 or 5 users.  The second test provides information on whether the corrections worked or not; sometimes the corrections themselves cause new problems that are detected in the second test.  Often, the first usability test detects the more obvious “surface problems” (e.g. the software did not load properly so the user had no opportunity to use the program itself; entries on a spreadsheet could not be combined in any useful way).  The second usability test finds more of the original usability problems that were not detected in the first test, and tests the adequacy of the corrections made prior to the second test.
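The figures above (roughly 30% of problems per user, about 85% with 4 or 5 users) are consistent with a simple cumulative-detection model often associated with Nielsen's usability research: if each user independently finds a given problem with probability L, then n users are expected to find a proportion 1 - (1 - L)^n of the problems.  The short sketch below is a minimal illustration of that model with an assumed per-user detection rate of 0.31; it is not a report of any particular study's data.

```python
# Minimal sketch of the cumulative-detection model: with a per-user
# detection rate L, n users are expected to find 1 - (1 - L)**n of the
# problems.  L = 0.31 is an assumed, commonly cited typical value.

L = 0.31  # assumed per-user detection rate

def proportion_found(n_users: int, per_user_rate: float = L) -> float:
    """Expected proportion of existing problems found by n users."""
    return 1.0 - (1.0 - per_user_rate) ** n_users

for n in (1, 3, 5, 10, 20):
    print(f"{n:2d} users -> {proportion_found(n):.0%} of problems found")
```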

With the more obvious problems out of the way, the third test probes deeper into the usability of the fundamental operations of the program and the match with user needs.  These important issues often are obscured in the first two rounds of testing, where users are stumped by obvious surface-level usability problems that prevent them from really digging into the program.  Thus, the third test serves as quality assurance regarding the changes made after the first two tests and provides deeper insights as well.  The third test will lead to a new (but smaller) list of usability problems to fix in a redesign.  Once again, not all the fixes will work, and more and deeper issues will be uncovered, so a fourth or fifth test may be needed as well.  Researchers have found that the ultimate user experience is improved much more by 4 tests with 5 users each than by a single test with 20 users.  This is a major difference between pilot testing and usability testing.
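The preference for several small rounds over one large round can be illustrated with the same assumed detection model if problems are pictured in layers, where a deeper layer only becomes testable once the layer above it has been largely fixed.  The simulation below is a hypothetical sketch of that reasoning; the layer sizes and detection rate are invented for the example and are not drawn from the cited research.

```python
# Hypothetical illustration: problems sit in "layers", and a deeper layer
# only becomes reachable by testers after the layer above it is fixed.
# Each round with k users finds 1 - (1 - L)**k of the reachable problems.

L = 0.31                   # assumed per-user detection rate
LAYERS = [20, 15, 10, 5]   # invented number of problems in each layer

def total_found(rounds: int, users_per_round: int) -> float:
    """Expected number of problems found across iterative test rounds."""
    hit_rate = 1.0 - (1.0 - L) ** users_per_round
    exposed = []           # problems currently reachable by testers
    found = 0.0
    for r in range(rounds):
        if r < len(LAYERS):
            exposed.append(float(LAYERS[r]))  # prior fixes expose a new layer
        for i, count in enumerate(exposed):
            found += count * hit_rate
            exposed[i] = count * (1.0 - hit_rate)
    return found

print(f"4 rounds x 5 users : {total_found(4, 5):.1f} of {sum(LAYERS)} problems")
print(f"1 round  x 20 users: {total_found(1, 20):.1f} of {sum(LAYERS)} problems")
```

Under these invented assumptions the four iterative rounds reach roughly 49 of the 50 problems, while the single large test can only touch the 20 surface problems, mirroring the point that iteration, not the size of any one test group, drives the improvement.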

Practice-Policy Communication Loop

Implementing effective innovations requires change at the practice, organization, and system levels.  But what needs to change, and how much change is needed to achieve desired outcomes?  Answering these questions is the purpose of the practice-policy communication loop.

Ulrich (2002) calls the current systems that produce current outcomes “legacy systems,” the result of “[d]ecades of quick fixes, functional enhancements, technology upgrades, and other maintenance activities [that] obscure application functionality to the point where no one can understand how a system functions” (pp. 41-42).  Ulrich was describing computer software applications, but he might as well have been describing human service systems.  Every system has problems with fragmentation, dysfunction, and wasted resources.  Yet, when legacy software systems containing billions of lines of code were reinvented to produce desired new results more efficiently and effectively, computer scientists found that about 80% of the “old” system still remained in the “new” system when the transformation was complete.  The good news is that a few changes can produce a very large difference – changes do not have to be revolutionary to be transformative (Crom, 2007).  The bad news is that no one can look at a system and know with certainty in advance what to keep and what to change.  Those decisions come from making changes, seeing the results, and then modifying or keeping the components being examined (Morgan & Ramirez, 1983; Senge, 2006).

The goal is to create systems that "are able to learn from their own experience and to modify their structure and design to reflect what they have learned" (Morgan & Ramirez, 1983, p. 4).  Part of that learning comes from the SEA soliciting, receiving, and responding to feedback from schools and districts regarding barriers and facilitators to implementation.  The use of this practice-policy improvement cycle helps the system develop the ability to monitor and question the context in which it operates and the rules that underlie its own operation.  The system gains the capacity to search for errors and faulty operating assumptions, to learn from them, and to make the changes needed to improve intended outcomes.

Increasingly, Federal and State governments are supporting and occasionally insisting upon the use of evidence-based programs and other reforms in human services.  They are developing policies and funding structures to encourage organizations and staff to use evidence-based programs and other reforms.  In our examination of the literature and discussions with system change agents globally, there are many examples where policy makers have mandated the use of evidence-based programs and other reforms with little impact on service delivery (e.g., Chapin Hall for Children, 2002; Jerald, 2005; Nutt, 2002; O’Donoghue, 2002; Stuit, 2011).  There are some examples where evidence-based programs or reforms were used with good outcomes for a period of time and then abandoned (e.g., Bryce, Gilroy, Jones, Hazel, Black, & Victora, 2010; Glennan, Bodilly, Galegher, & Kerr, 2004).  There also are a few examples of success where policies encouraged the use of effective innovations or reforms at the practice level, the innovations were supported with effective implementation efforts, and systems changed to encourage widespread use of the innovation (e.g., Glennan et al., 2004; Khatri & Frieden, 2002; Ogden et al., 2005; Rhoades, Bumbarger, & Moore, 2012).

What differentiated the large-scale successes from those with temporary or no outcomes?  The successes had direct and frequent communication from the practice level to the executive level (the practice-policy communication loop).  Policy-practice (top down) support was present in both the successes and the failures.  What was missing in many cases was the practice-policy (bottom up) communication with the leaders who initiated the process of change.  Our analyses suggest that the combination of practice-to-policy communication (detect problems, identify leverage points) and policy-to-practice supports (solve real problems in real time) is a critical feature of successful efforts to implement evidence-based programs and other reforms on a socially significant scale (Fixsen et al., in press).

In successful system change efforts, executive management teams meet frequently (at least monthly) to hear about what is helping or hindering efforts to make full and effective use of evidence-based programs at the practice level.  The information may consist of descriptions of experiences as well as data collected with reasonable precision.  These teams are thereby engaged in system change informed by the practice-policy communication loop.

Implementation Teams are essential to the practice-policy communication loop and to the organization and system change processes.  First, Implementation Teams have the knowledge, skills, and abilities to help practitioners and staff actually make full and effective use of the innovations enabled by policy.  This capacity to implement with fidelity and good outcomes is essential to the system change process.  If the policies or innovations are not being used as intended, or are being used as intended but not producing desired outcomes, those implementation and intervention issues need to be resolved at the practice level before asking the executive leadership to intervene in how the system functions.  Second, Implementation Team members have firsthand experience with the facilitators and barriers to making full and effective use of those innovations in human service settings.  At this point in the process it is clear to the Implementation Team (and others) that the innovations can be used as intended and can produce desired outcomes.  However, that experience also demonstrates that in some ways the system itself is not supporting the effective use of the innovation.  Given the competence of the Implementation Teams, if they are encountering difficulties, their concerns will have credibility with the executive management team, and the leaders will have sufficient information and confidence to change the system to better support improved outcomes for students.