Fidelity to Motivational Interviewing and subsequent cannabis cessation among adolescents

This study tested whether differences in cannabis cessation 3 months after a single session of Motivational Interviewing (MI) may be attributable to fidelity to MI. All audio-recordings with necessary 3-month follow-up data (n=75) delivered by four individual practitioners within a randomised controlled trial (RCT) were used. Participants were weekly or more frequent cannabis users aged 16-19 years old in Further Education colleges. All tapes were coded with the Motivational Interviewing Treatment Integrity (MITI) scale Version 2 by 2 coders. Satisfactory inter-rater reliability was achieved. Differences between and within practitioners in fidelity to MI were consistently detected. After controlling for practitioner effects, MI spirit and the proportion of complex reflections were independently predictive of cessation outcome. No other aspects of fidelity were associated with outcome. Two particular aspects of enhanced fidelity to MI are predictive of subsequent cannabis cessation 3 months after a brief intervention among young cannabis users.


INTRODUCTION
Drug and alcohol treatment outcomes have long been known to vary according to which practitioners actually deliver counseling interventions (Luborsky, McLellan, Woody, O'Brien, & Auerbach, 1985; Miller, Taylor, & West, 1980; Valle, 1981). Such variability persists in large, well-designed clinical trials with rigorous training and other quality control procedures (Project Match Research Group 1998). This appears to be less due to the attributes of the practitioners themselves and more to do with their behavior within sessions (Project Match Research Group 1998).
This need for dedicated process study applies not only to MI but also to other psychological treatment interventions (see (Morgenstern J, 2000) for example), and to brief adaptations of MI. These are often undertaken for preventive rather than treatment purposes, and target an increasingly wide range of health behaviors (Dunn, De Roo, & Rivara, 2001). Without process study it may be difficult to know to what extent so-called brief motivational interventions really adhere to both the principles and practice of MI. Process study may also assist determination of the extent to which it is MI itself, or other features of brief adaptations which are associated with better outcomes.
An early contribution to study in this area was made by Miller and colleagues (Miller, Benefield, & Tonigan, 1993), who found a large effect of greater practitioner confrontational behavior on poorer drinking outcomes 12 months later. Process study has developed rapidly in recent years, with a particular focus on fidelity to MI. The Motivational Interviewing Skills Code (MISC) was initially developed as a comprehensive process instrument (T. Moyers, Martin, Catley, Harris, & Ahluwalia, 2003). This permitted in-depth investigation of practitioner behavior, client behavior and their interaction. A subsequent briefer instrument, the Motivational Interviewing Treatment Integrity (MITI) scale, was derived from the MISC to focus only on practitioner behavior (T. Moyers, Martin, Manuel, Hendrickson, & Miller, 2005).
Studies using the MISC have robustly identified fidelity variables to be related to targeted client responses within sessions in interventions for alcohol treatment and smoking cessation purposes (T. B. Moyers & Martin, 2006; T. B. Moyers, et al., 2007; T. B. Moyers, Miller, & Hendrickson, 2005). Both global ratings of aspects of MI counseling style and specific verbal behaviors which are consistent with the MI approach have been found to be predictive of these within-session outcomes.
MI is fundamentally concerned with the nature of the interaction between practitioner and client. Change talk by the client is evoked and strengthened by the actions of the practitioner within the session specifically to increase the probability of subsequent behavior change being maintained (Miller & Rose, 2009). Evidence for the hypothesized causal chains has recently been evaluated across studies (Apodaca & Longabaugh, 2009). One way to conceptualize client within-session responses is as existing on a causal pathway between practitioner behavior and subsequent post-session outcome (Miller & Rose, 2009). Another pathway for the effects of MI is that practitioner behavior impacts behavioral outcome directly, unmediated by within-session client verbal response. In both scenarios effects of practitioner behavior on post-session outcomes are identifiable in principle, and have been identified in practice (Apodaca & Longabaugh, 2009).
MI process variables derived from validated measures have also been investigated in relation to subsequent behavioral outcomes. Thrasher and colleagues (Thrasher, et al., 2006) identified some relationships between MISC variables and adherence to antiretroviral therapy 8 weeks later. The effects of questions and reflections have been evaluated, with some evidence of impact on 3-month alcohol outcomes in brief intervention among university students (Tollison, et al., 2008). More recently, in the most sophisticated such study to date, Gaume and colleagues (2009) used the MISC to study relationships between a brief intervention delivered in the emergency department and 12-month outcomes. We thus included a dedicated process study within a trial of a single-session adaptation of MI among young cannabis users.

Overview of parent trial and participants
The parent trial took place in 11 London Further Education (F.E.) Colleges (McCambridge, Slym, & Strang, 2008). These are non-traditional educational and training institutions catering mainly for older teenagers, giving access to large numbers of young people. Eligibility criteria were age 16-19 years old, weekly or more frequent cannabis use, literacy sufficient for questionnaire completion and English language. Potentially eligible students were identified by college staff as well as being directly approached by researchers in informal areas. Those eligible and consenting completed baseline assessments and were randomized to a single session intervention of either MI or a standardized advice intervention (McCambridge, et al., 2008). The effectiveness trial was designed to evaluate possible differences in cannabis and other substance use outcomes after 3 and 6 months, in order to establish whether the specific content of the MI approach provided additional benefit to that which may be obtained in an individualised advice discussion. Study procedures were approved by the Maudsley/Institute of Psychiatry Ethical Committee.
At 3 months, the interval at which behavioral outcome was assessed in the present study, 83% (269/326) were followed up (McCambridge, et al., 2008). A bogus pipeline saliva collection procedure (Werch, Lundstrum, & Moore, 1989) at 3-month follow-up took place prior to the self-completion questionnaire. Preliminary analyses in the form of cross-tabulations of outcome by practitioner identified that cannabis cessation outcomes in the dataset of 75 sessions (see Session dataset and cessation outcome) were broadly comparable to those for the parent trial as a whole. The sociodemographic and prior substance use characteristics of the 75 participants in the present study were also representative of the parent trial study population. The 75 participants had a mean age of 18 years, 65% were male, and approximately 45% were Black, 11% White, and 44% Asian or other. 93% had ever smoked cigarettes and 83% had previously drunk alcohol. Other drug use was rare; ecstasy was the most common other drug ever used, by approximately 7%.

Brief intervention details
We built upon our earlier work adapting MI to develop an early intervention model in this setting for preventive purposes (Gray, McCambridge, & Strang, 2005; McCambridge & Strang, 2003; Slym, Day, & McCambridge, 2007). This intervention is a direct attempt to implement the counseling style of MI, albeit not in a formal counseling context. The earlier adaptation of MI was structured by a series of conversational topics, selections of which were made according to progress of the discussion during a single session of no more than one hour's duration (McCambridge & Strang, 2003). We simplified this structure, so that after rapport building, consideration of the benefits and costs of drug use was followed by discussion of: values and goals; risks, problems and concerns; decision-making; and self-monitoring or change as appropriate.
Discussions of cannabis use formed the bulk of sessions delivered, and although other drug use was also considered, this particular behavior was the primary target of the intervention (McCambridge, et al., 2008). Study participants may have benefited from consideration of their cannabis use in various ways, and for example decided to reduce the overall frequency or quantity of consumption, made some other alteration to pattern of use, ceased use or decided that there was no need for change.

Session dataset and cessation outcome
Although we did not make it a condition of participation in the parent trial, we invited participants to give permission for sessions to be audio-recorded, with a view to coding with the MITI (T. . Of the 164 participants who were allocated to MI, 17 did not attend. Approximately 64% (94/147) of MI sessions delivered were recorded, with all non-recorded sessions due to lack of research participant consent. Eight of the 94 participants for which there were audio-recorded sessions were not located for 3-month follow-up, leaving 86 with outcome data. Of these 5 refused to provide a saliva sample. As these refusers were found to have different self-reported outcomes than non-refusers, they were excluded from this study as a result of concern about the reliability of their outcome data (McCambridge, et al., 2008). Participants achieving cessation outcome in this study therefore both reported no cannabis use within the 30 days prior to three-month follow-up, and at that time provided a saliva sample on request prior to data collection. We specifically selected cessation as the outcome to be investigated in the present study due to these features even though reduced use frequency was the primary outcome for the parent trial. Of the 81 audio-recorded sessions for which 3-month follow-up data collection was satisfactorily completed, 6 sessions were delivered by 3 practitioners, providing insufficient data for study of between and within practitioner variability. The remaining 75 sessions were delivered by 4 individual practitioners, and these comprise the dataset for this study.
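The derivation of the 75-session dataset can be tallied as a simple arithmetic check. The following is a minimal sketch using only the counts stated in the paragraph above; the variable names are illustrative and are not taken from the study's own analysis files.

```python
# Participant flow as reported in the text; variable names are illustrative.
allocated_to_mi = 164
did_not_attend = 17
sessions_delivered = allocated_to_mi - did_not_attend        # 147

sessions_recorded = 94
recording_rate = sessions_recorded / sessions_delivered      # roughly 0.64

not_located_at_follow_up = 8
with_outcome_data = sessions_recorded - not_located_at_follow_up   # 86

refused_saliva_sample = 5
with_reliable_outcome = with_outcome_data - refused_saliva_sample  # 81

sessions_by_low_volume_practitioners = 6
analysed_sessions = with_reliable_outcome - sessions_by_low_volume_practitioners  # 75

print(f"delivered={sessions_delivered}, "
      f"recorded={recording_rate:.0%}, "
      f"analysed={analysed_sessions}")
```

Each subtraction corresponds to one exclusion step in the text, so the final figure agrees with the reported n=75.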

Practitioners
Practitioner 1 (JM) organized and facilitated a 2-day workshop with the assistance of Prof. Stephen Rollnick for the specific purposes of the trial. The other three practitioners were psychology graduates, having first degree (Practitioner 3), masters (Practitioner 2), and doctoral-level (Practitioner 4) qualifications respectively. They participated in this workshop and in later supervision sessions, which were organized ad hoc in response to need as perceived by the practitioner. These were delivered by Practitioner 1, who was not trained as a psychologist, though had some years' previous experience of adapting and delivering MI interventions, and delivered all interventions in the original trial. Just under 50% of sessions delivered by Practitioner 4 were audio-recorded, as compared to more than 80% of sessions delivered by Practitioners 1-3. There is thus a possibility that data for this particular practitioner were a biased sub-sample of the sessions delivered.

MITI data and coding
MITI Version 2 comprises 2 global ratings (scored 1-7) of MI spirit and empathy and the 7 behavior counts identified in Table 1. All 9 items focus upon practitioner behavior. The MI spirit global rating summarizes the extent to which the practitioner has a collaborative style, evokes the client's personal reasons for change and supports the client's autonomy (T. . These are the foundational principles of MI and this rating summarises the extent to which the practitioner delivers MI as it has been developed. Empathy is not specific to MI and the empathy global rating refers to the extent to which practitioners make attempts to elicit the client's own views about their situation (T. . In addition to these ratings, each utterance made by the practitioner is coded to one of 7 verbal behavior categories, with the total number of each of these counts being transformed into the 4 summary measures contained within Table 2. Closed questions provide a fixed number of response options, whilst open questions allow the client to elaborate when choosing how to respond directly to a question. A simple reflection is a statement which conveys understanding of what has been said, whilst a complex reflection is one which specifically adds substantial meaning to what has been said (T. . For example, complex reflections can communicate understanding of the feelings involved in what has been said and use tone and other verbal techniques in an exploratory fashion. Complex reflections encapsulate practitioner insights into what is going on for the client. When such statements are made they encourage a response by the client either to confirm and expand on what the practitioner has said or to correct it. Two categories define other verbal behaviors as consistent or inconsistent with an MI approach. Finally, there is a general information category.
Full definitions for all these measures and more detailed guidance on the actual process of coding are contained within the freely available manual (http://casaa.unm.edu/download/miti.pdf).
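To make the step from the 7 behavior counts to the 4 summary measures concrete, the sketch below computes the measures named in this paper (proportion of questions that are open, proportion of reflections that are complex, ratio of reflections to questions, and proportion of other behaviors coded as MI adherent). The class name, field names and the example counts are hypothetical; the MITI manual remains the authoritative source for the definitions.

```python
from dataclasses import dataclass


@dataclass
class BehaviorCounts:
    """Per-session utterance counts in the 7 categories (names are illustrative)."""
    closed_questions: int
    open_questions: int
    simple_reflections: int
    complex_reflections: int
    mi_adherent: int
    mi_non_adherent: int
    giving_information: int


def _ratio(numerator, denominator):
    # Return None rather than dividing by zero when a category is empty.
    return numerator / denominator if denominator else None


def summary_measures(c):
    """Derive the four summary measures from one session's counts."""
    questions = c.closed_questions + c.open_questions
    reflections = c.simple_reflections + c.complex_reflections
    other = c.mi_adherent + c.mi_non_adherent
    return {
        "proportion_open_questions": _ratio(c.open_questions, questions),
        "proportion_complex_reflections": _ratio(c.complex_reflections, reflections),
        "reflection_to_question_ratio": _ratio(reflections, questions),
        "proportion_mi_adherent": _ratio(c.mi_adherent, other),
    }


# Hypothetical counts for a single session, for illustration only.
session = BehaviorCounts(12, 8, 9, 3, 4, 1, 5)
measures = summary_measures(session)
for name, value in measures.items():
    print(name, value)
```

Returning `None` for an empty denominator is a design choice for this sketch: sessions with no MI-adherent or non-adherent behaviors coded would otherwise raise an error, and the paper notes that data in that category were sparse.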
In order to establish the inter-rater reliability of the MITI in this study, we double coded a 20% random sample (n=19) of the initial dataset of 94 audio-recordings of MI sessions. The first coder (MD) attended a 3-day workshop facilitated by an original author of the MITI from the University of New Mexico (Jennifer Knapp-Manuel). The second coder (BT) was directly trained by the first coder. Data from second codings of the same session were not further used after the establishment of inter-rater reliability. As this was a brief intervention with mean session duration of 27 minutes, we decided to code the entire tape for all sessions rather than the usual 20-minute segment, following discussion with the lead author of the MITI (Theresa Moyers). The two coders were not involved in interventions delivery and worked closely together to discuss issues as they arose and to prevent coder drift.

Data analyses
Intra-class correlation coefficients were computed for the inter-rater reliability analyses. We used standard guidance to evaluate these data (Cicchetti, 1994). Cross-tabulated categorical data were analyzed using Chi-squared tests. Mean differences in continuous data were compared by t test for 2 groups, as for cessation outcome, and by F test for multiple groups, as in analyses of practitioner differences. Two-sided significance tests were used throughout. Analyses were undertaken in both SPSS and STATA.
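As an illustration of the inter-rater reliability step, the sketch below computes a one-way random-effects intra-class correlation for two coders and labels it with the descriptive bands commonly attributed to Cicchetti (1994). It rests on two assumptions: the paper does not state which form of the intra-class correlation was used, so the one-way form here is a stand-in, and the ratings are hypothetical, not study data.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects ICC for sessions each rated by k coders.

    `ratings` is a list of per-session lists, one score per coder.
    ASSUMPTION: the ICC form is chosen for illustration only.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = mean(score for row in ratings for score in row)
    row_means = [mean(row) for row in ratings]
    # Between-session and within-session mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (score - m) ** 2 for row, m in zip(ratings, row_means) for score in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def reliability_band(icc):
    """Descriptive bands: below 0.40 poor, 0.40-0.59 fair, 0.60-0.74 good, 0.75+ excellent."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Hypothetical 1-7 global ratings from two coders for five sessions.
example_ratings = [[4, 4], [5, 6], [3, 3], [6, 6], [2, 3]]
icc_value = icc_oneway(example_ratings)
print(round(icc_value, 2), reliability_band(icc_value))
```

Statistical packages offer several ICC variants (one-way or two-way, single or average measures), and these can give different values for the same data, so the variant should be reported alongside the coefficient.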
To study fidelity, it is necessary to separate the extent to which the practitioner is adhering to MI in any given session (within-practitioner variability) from differences between practitioners (between-practitioner variability) in overall skill levels. This latter variable has previously been identified to be pivotal to securing client involvement within MI sessions (T. B. . Following examination of bivariate associations between process data and cessation outcome, a multi-level logistic regression model of cessation outcome was fitted, to take account of the fact that the fidelity data in individual sessions are nested within practitioners. The practitioner effects evaluated in the multi-level model were, however, too small to be meaningfully estimated (see Results). For this reason a simpler model was deemed more appropriate, in which standard errors were adjusted to take account of non-independence of observations as a consequence of clustering of individual session data within practitioners. The STATA command "cluster", which uses the Huber/White Sandwich estimator of variance to control for the effects of clustering was thus used.

Inter-rater reliability
The inter-rater reliability data are presented in Table 1. The levels of agreement of the global ratings compare well to those obtained in the original MITI validation study (T. . Higher correlations on global ratings were obtained in this study. The behavior count correlation data are broadly similar, with lower levels of agreement for reflections in both studies. Altogether, reliability was excellent for 6/9 ratings, good for 2/9 and fair for 1/9 (Cicchetti, 1994). The MITI can be reliably applied to this brief adaptation of MI.
***Insert Table 1 about here***

Bivariate associations between practitioners, fidelity variables and subsequent cessation
There was some evidence of variability in achievement of cessation outcome between practitioners (χ²[3] = 4.97, p=0.174), with Practitioner 1 having higher cessation rates: 50% (10/20) compared to 25% (5/20), 21% (5/24) and 27% (3/11) for Practitioners 2-4 respectively. When Practitioner 1 was compared with all three other practitioners together this difference attained statistical significance (χ²[1] = 4.79, p=0.029). The MITI data for the practitioners are presented in Table 2. It is clear that there was no simple sense in which Practitioner 1 differed from other practitioners in fidelity data. Statistically significant differences across all 4 practitioners were observed for all fidelity measures with the exception of the residual category of proportion of other behaviors coded as MI adherent, in which there were sparse data.
***Insert Table 2 about here***
There were some differences in MI fidelity data between those who ceased cannabis use and those who did not, without taking account of which practitioner was involved (see Table 3). Among the behavior count summary measures, the observed data are generally very similar for both groups. The exception to this pattern is the proportion of complex reflections, which differs between those who continued to smoke cannabis and those who did not. This difference is of approximately 0.5 standard deviations in magnitude. Differences were also apparent for both empathy and MI spirit global ratings, again in the predicted direction, though only the MI spirit difference was statistically significant. These variables are measured upon a 7-point scale and the differences amount to approximately 0.5 and 0.75 points respectively.
***Insert Table 3 about here***
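The two Chi-squared comparisons of cessation by practitioner can be reproduced from the counts given in the text. The sketch below is a minimal standard-library illustration, assuming a Pearson Chi-squared test without continuity correction; the function names are illustrative, and the upper-tail probability is implemented only for the 1 and 3 degrees of freedom needed here.

```python
import math


def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_upper_tail(x, df):
    """Upper-tail probability using closed forms for df = 1 and df = 3 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("this sketch handles only df = 1 or df = 3")


# Rows are [ceased, did not cease] for Practitioners 1-4, from the text.
by_practitioner = [[10, 10], [5, 15], [5, 19], [3, 8]]
stat4, df4 = pearson_chi2(by_practitioner)
p4 = chi2_upper_tail(stat4, df4)

# Practitioner 1 versus Practitioners 2-4 pooled.
pooled = [[10, 10], [13, 42]]
stat2, df2 = pearson_chi2(pooled)
p2 = chi2_upper_tail(stat2, df2)

print(f"chi2[{df4}] = {stat4:.2f}, p = {p4:.3f}")
print(f"chi2[{df2}] = {stat2:.2f}, p = {p2:.3f}")
```

With these counts the uncorrected Pearson statistic agrees with the reported values to the precision given, which suggests, though does not confirm, that no continuity correction was applied in the original analysis.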

Regression analyses
The findings on practitioners, fidelity variables and outcome considered thus far do not take account of the nested structure of the data. The intraclass correlation coefficient for practitioner was found to be very small, in fact too small to be estimated in the multi-level model. This indicates considerable variability in the behavior of the practitioners. In other words, practitioners appear to behave very differently in different sessions. After taking account of this practitioner variability in the multi-level logistic regression model, both MI spirit and the complex reflections measure remained statistically significant predictors of 3-month cessation outcome (MI spirit logOR=0.5 [0.04-0.96], p=0.033; complex reflections logOR=3.9 [0.81-6.98], p=0.013).
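The reported log odds ratios can be re-expressed as odds ratios, and their intervals checked against the reported p-values. The sketch below assumes that the bracketed values are 95% Wald confidence intervals on the log-odds scale, which the text does not state explicitly; the function name and dictionary keys are illustrative.

```python
import math


def wald_summary(log_or, ci_low, ci_high, z_crit=1.96):
    """Back out the standard error, z and two-sided p from a symmetric Wald interval.

    ASSUMPTION: the interval is a 95% Wald interval on the log-odds scale.
    """
    se = (ci_high - ci_low) / (2 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return {
        "odds_ratio": math.exp(log_or),
        "or_ci_low": math.exp(ci_low),
        "or_ci_high": math.exp(ci_high),
        "se": se,
        "z": z,
        "p": p,
    }


# Values as reported in the text for the multi-level model.
mi_spirit = wald_summary(0.5, 0.04, 0.96)
complex_reflections = wald_summary(3.9, 0.81, 6.98)

for label, result in [("MI spirit", mi_spirit),
                      ("complex reflections", complex_reflections)]:
    print(f"{label}: OR = {result['odds_ratio']:.2f} "
          f"[{result['or_ci_low']:.2f}-{result['or_ci_high']:.2f}], "
          f"p = {result['p']:.3f}")
```

Under this assumption the recovered p-values agree with those reported. If the complex reflections measure is scaled as a proportion from 0 to 1, its coefficient describes a change across the full range, so a smaller increment (for example 0.1, giving exp(0.39), about 1.48) may be easier to interpret.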
In the regression model with appropriately adjusted standard errors, neither the empathy global score nor any other variable, including the interaction term for MI spirit and complex reflections, was predictive of outcome or improved the fit of the model. This procedure leaves the estimates of effect unchanged whilst reducing the associated p-values. The model comprising only MI spirit and complex reflections is thus the final model and is presented in Table 4.
***Insert Table 4 about here***

DISCUSSION
After controlling for within and between practitioner effects, two specific aspects of fidelity of Motivational Interviewing, MI spirit and the proportion of reflections which were complex, were both predictive of cannabis cessation three months after brief intervention. No other aspects of fidelity to MI had any relationship with this outcome. This study adds to recent evidence of fidelity impact extending beyond within-session target outcomes to subsequent behavior. In particular, these findings extend those derived from MISC on the predictive effects of MI spirit and complex reflections on subsequent adult drinking by Gaume and colleagues (2009) to young drug users.
Caution is warranted by the correlational rather than experimental nature of these findings, particularly in relation to vulnerability to confounding by unmeasured variables (Francis, et al., 2005;Miller, et al., 1993). Specifically, it may be that it was more straightforward for practitioners to manifest MI spirit and formulate more complex reflections in sessions with study participants who were more likely to subsequently cease cannabis use. The self-reported nature of the cessation outcome, notwithstanding the use of the bogus pipeline procedure, adds another dimension to this issue. It is possible that those who were best engaged by the practitioner may have been more likely to later report cessation when this was not actually the case. The specificity of these findings suggests otherwise. Also, although participants were randomly allocated to study condition, they were not randomly allocated to practitioners. This means that it is possible or likely that there were differences between practitioners in participant levels of receptivity to intervention.
Findings which incorporate within-session client verbal response as a mediator of subsequent outcome would make this type of evidence more compelling, as would randomised designs in studies of between-practitioner variability. The recent meta-analytic review indicates the level of progress made in relation to the basic mediational hypothesis (Apodaca & Longabaugh, 2009). In this review the two previous studies which provided specific data on practitioner effects on behavioral outcomes included our earlier process study (McNally, Palfai, & Kahler, 2005;. Neither of these studies used validated MI fidelity measures, and in our case a brief post-session rating questionnaire was used. We applied the same questionnaire within this study and did not find that it was predictive of outcome. At the outset we planned this fidelity study to examine outcome at first follow-up in light of the known tendency of brief intervention effects to diminish over time, as in our previous work. Perhaps unsurprisingly, when we repeated the analyses reported here at the 6-month interval we found weaker effects which were not statistically significant (data not reported).
This study uses the MITI as a research instrument, although it was not designed for this purpose (T. . The inter-rater reliability data obtained in this study provide further validation support for this instrument. Here this relatively simple instrument has demonstrated potential to capture significant hypothesized elements of the means by which MI obtains its effects. The validation data are impressive, not only in their cross-national nature, but also in the context of the present investigation of a brief intervention with an overwhelmingly minority ethnic study population. The use of MITI for research purposes beyond the establishment of fidelity has also been supported. Cross-cultural validity of the MITI and the MISC may, however, be worthy of further attention, particularly in light of the data in Table 2. In the only other U.K. study using the MITI, Bennett and colleagues (Bennett GA, 2007) identified a particular problem in coding questions as closed when they functioned within the discussion as open questions, contributing to lower levels of measured competence. Here we encountered the same problem. Questions such as "could you tell me more?", "anything else?", and "can you tell me a little bit about yourself?" are indeed technically closed questions, and were coded as such, yet they rarely produced closed responses. This clearly has an impact upon the questions measure, and it is interesting that Practitioner 3, for whom this was less of a problem, was the only non-U.K. national involved.
The apparent levels of competence observed here are low (McCambridge, et al., 2008), and may be in part due to cross-cultural differences. They may also be to do with the adaptation characteristics of this intervention and the circumstances of the trial, which was an effectiveness rather than an efficacy study, designed to provide indications of possible benefit in naturalistic conditions rather than employing already highly skilled and experienced practitioners. Practitioner 1 was much more likely to discuss drug use other than the target behavior captured by the MITI (data not reported). Notwithstanding these comments, practitioner preparation for MI delivery in this study has been clearly sub-optimal in view of the observed data, and the lack of a true program of ongoing supervision is likely to be responsible (Miller, Yahne, Moyers, Martinez, & Pirritano, 2004).
Notwithstanding the cautionary remarks, the findings of the present study are important, as they provide direct evidence that greater fidelity to MI is associated with improvement of brief intervention outcome. There is sparse evidence currently available in relation to the content of brief interventions and their possible mechanisms of effect (Gaume J., 2008). The underdeveloped nature of this evidence-base extends beyond the particular drug use outcome investigated in this study to other forms of substance use and other behaviors. The findings of the present study, therefore, assist in opening the "black box" of brief intervention to identify candidate components of effect. Gaume and colleagues (2009) similarly found MI spirit and complex reflections were predictive of behavioral outcome, and that the proportion of all questions which were open was not. They also identified a possible much weaker effect of empathy independent of MI spirit. In contrast, however, MI adherence and the ratio of reflections to questions were also found to be predictive in that study. These discrepancies may be partly explained by the fact that the brief intervention being evaluated there was not a direct attempt to implement MI, as was the case in the present study, in addition to the possible effects of different populations and settings. The London Further Education college setting and the study population, comprising mainly Black and Asian older adolescents, should be borne in mind when considering the generalisability of these findings. It is also possible that study participants may have accessed other sources of support for cannabis cessation before or during the follow-up period. Although we did not measure this, we consider it unlikely given the non-help-seeking nature of the study population.
The findings provide prima facie empirical support for the emphasis given to the combination of spirit and technique by Miller and Rollnick (2002). The specificity of these data is interesting and the lack of association between other MITI variables and cessation outcome in this study should also be carefully considered. It is an attractive possibility that more skilful complex reflection and greater embodiment of MI spirit together may have a particularly important contribution to make in boosting the effectiveness of MI. The interaction between the two was not, however, statistically significant in this study, where they were thus found to have independent effects on outcome. Further study exploring possible relationships between MI spirit and specific verbal behaviors is thus needed.
An important null finding in this study should not be overlooked: the lack of any consistent practitioner effect due to highly variable performance within sessions. To some extent this is likely to result from the application of a highly person-centred approach in a population with very heterogeneous needs and motivations to change behavior who were pro-actively recruited to study participation. Similar effects have also been reported in the wider psychotherapy literature (Baldwin, Wampold, & Imel, 2007). Attaining more consistency in MI fidelity would seem an obvious route to boosting the effects of brief intervention, and is certainly worth pursuing. We also need to understand better the distinct features of any particular brief intervention, and how these features may help or hinder effectiveness. Practitioners necessarily say and do things differently when delivering MI, and it does matter what exactly they say and do. The specific words uttered, in the form of complex reflections, and the way they are uttered, to the extent that they manifest MI spirit, have been found here to be predictive of behavior change in the form of cannabis cessation among young people, after three months.