Correlations between arm motor behavior and brain function following bilateral arm training after stroke: a systematic review

Abstract
Background: Bilateral training (BT) of the upper limb (UL) might enhance recovery of arm function after stroke. To better understand the therapeutic potential of BT, this study aimed to determine the correlation between arm motor behavior and brain structure/function as a result of bilateral arm training poststroke.
Methods: A systematic review of quantitative studies of BT evaluating both UL motor behavior and neuroplasticity was conducted. Eleven electronic databases were searched. Two reviewers independently selected studies, extracted data and assessed methodological quality, using the Effective Public Health Practice Project (EPHPP) tool.
Results: Eight studies comprising 164 participants met the inclusion criteria. Only two studies rated "strong" on the EPHPP tool. Considerable heterogeneity of participants, BT modes, comparator interventions and measures contraindicated pooled outcome analysis. Modes of BT included: in-phase and anti-phase; functional movements involving objects; and movements only. Movements were mechanically coupled, free, auditory-cued, or self-paced. The Fugl-Meyer Assessment (UL section) was used in six of eight studies; however, different subsections were used by different studies. Neural correlates were measured using fMRI and TMS in three and five studies, respectively, using a wide variety of variables. Associations between changes in UL function and neural plasticity were inconsistent and only two studies reported a statistical correlation following BT.
Conclusions: No clear pattern of association between UL motor and neural response to BT was apparent from this review, indicating that the neural correlates of motor behavior response to BT after stroke remain unknown. To understand the full therapeutic potential of BT and its different modes, further investigation is required.


Introduction
Stroke is the second highest cause of death and the leading cause of disability globally (Di Carlo 2009). Of those who survive, only a third regain some functional use of the upper limb (UL) (Kwakkel et al. 2003), which impacts on independence, mood and participation (Levy et al. 2001;Langhorne et al. 2009;Morris et al. 2013). Considering that most activities of daily living (ADL) involve the UL, it is crucial to improve UL motor behavior after stroke.
UL motor rehabilitation focuses mostly on unilateral training (i.e., training the affected UL only) (Winstein et al. 2004). Interventions include: electromyography biofeedback (Woodford and Price 2007), electrostimulation (Pomeroy et al. 2009), robotic-assisted training (Mehrholz et al. 2012) and constraint-induced therapy (CIT) (Sirtori et al. 2009). In contrast, bilateral UL training (BT) is a form of training where both ULs perform identical movements simultaneously, yet independently (e.g., carrying a box) (Mudie and Matyas 1996;Stewart et al. 2006). BT can be undertaken in different modes. In the in-phase mode, both ULs move in the same direction at the same time (e.g., bending both elbows). In the antiphase mode, one UL moves in one direction (e.g., bending the elbow) as the other moves in the opposite direction (e.g., extending the elbow). BT is not to be confused with bimanual training, where both limbs move simultaneously but perform different movement patterns (e.g., tying shoelaces, opening a jar).
BT emerged from motor control theories and observations in nonimpaired people that, during rhythmic movement of both limbs, a coupling effect occurs in which both limbs adopt similar spatial and temporal movement characteristics, leading to a stable form of coordination (Kelso 1984;Swinnen 2002). Beneficial effects of BT are thought to arise from this interlimb coupling effect, which in people with stroke may lead to facilitation of paretic arm movement by the nonparetic arm (Mudie and Matyas 1996;Swinnen 2002;Stewart et al. 2006). It is postulated that the simultaneous activation of both hemispheres facilitates activation of the damaged hemisphere (Cauraugh et al. 2008;Stinear et al. 2008) through rebalancing of interhemispheric inhibition that has been disrupted following a stroke. BT is thought to reduce transcallosal inhibition from the nonaffected hemisphere to the affected hemisphere, thereby increasing output of the latter (Stinear and Byblow 2004;Cauraugh and Summers 2005). Additional pathways that may be facilitated during BT include ipsilateral uncrossed corticospinal pathways (Cauraugh and Summers 2005) and spared indirect corticospinal pathways, which receive input from bilateral reticulospinal and rubrospinal pathways (Mazevet 2003).
Individual studies with stroke survivors have reported benefits of BT including: increased velocity and smoothness of movement (Cunningham et al. 2002;Harris-Love et al. 2005), long-term functional recovery (Whitall et al. 2000;Luft et al. 2004;Cauraugh et al. 2008;Stinear et al. 2008) and changes in brain activation. BT has been recommended in the UK National Clinical Guidelines for Stroke for patients with persistent UL impairment (Intercollegiate Stroke Working Party, 2012). However, there is as yet no conclusive evidence from systematic reviews to show significant benefits of BT over unilateral UL training, placebo or no training (Stewart et al. 2006;Langhorne et al. 2009;Cauraugh et al. 2010;Coupar et al. 2010;Latimer et al. 2010;Van Delden et al. 2012;Pollock et al. 2014a). A Cochrane overview of systematic reviews found unilateral UL training to be more beneficial than bilateral UL training in terms of UL motor behavior and ADL, but no more beneficial in terms of arm motor behavior. However, the evidence was of moderate GRADE quality only (Pollock et al. 2014a).
The lack of conclusive evidence of the efficacy of BT has been attributed to the inadequate matching of diverse BT methods to patient characteristics and/or the use of inappropriate outcome measures (Sleimen-Malkoun et al. 2011). It is conceivable that BT may have a restorative effect in some stroke survivors. Assessing the potential of BT as an intervention in stroke rehabilitation, therefore, requires a better understanding of the neural correlates of motor behavior response to BT. Moreover, given that stroke recovery is associated with reorganization of neural networks (Cramer and Bastings 2000;Calautti and Baron 2003), it is important to understand how BT impacts on neuroplasticity in order to advance knowledge of how to match rehabilitation interventions to individual patients. Over the last decade, several reviews have provided evidence for associations between arm motor recovery and neuroplastic changes in brain structure/function, using methods including transcranial magnetic stimulation (TMS) or functional magnetic resonance imaging (fMRI) (Calautti and Baron 2003;Ward 2006;Buma et al. 2010).
While there are other reviews of BT (Stewart et al. 2006;Langhorne et al. 2009;Cauraugh et al. 2010;Coupar et al. 2010;Latimer et al. 2010;Van Delden et al. 2012;Veerbeek et al. 2014), they have not addressed changes in neuroplasticity alongside changes in UL motor behavior. The aim of the systematic review reported here was to identify the relationship between arm motor behavior and brain structure/function in response to BT after stroke. To the authors' knowledge, this is the first systematic review to address the question of how BT affects UL motor behavior and neural function after stroke.

Methods

Design
A systematic review with pre-set inclusion criteria, independent identification of studies and data extraction, and narrative synthesis.

Type of study
Quantitative studies of any design were included as this is a novel topic (Armstrong et al. 2007) and as studies reporting on neuroplasticity may include case series or small cohort studies.

Type of participants
Adults (≥18 years old) with a clinical diagnosis of stroke (WHO Monica Project Principal Investigators, 1988). Studies with participants at any time since stroke, with any type and locality of stroke, initial UL impairment, previous stroke(s), and comorbidities were included.

[...] without objects. In the absence of a minimum standard for the dose of BT, we accepted studies that included at least a single session of any intensity, similar to the Cochrane systematic review on BT (Coupar et al. 2010). Only studies allowing the effects of BT to be analyzed as a single intervention were included.

Type of outcomes
Studies had to include measures of the effects of BT on both (1) UL motor behavior (e.g., Action Research Arm Test [ARAT], Fugl-Meyer Assessment [FM]) and (2) neuroplasticity (change in brain structure or function, e.g., number of active voxels during a movement task, excitability of the corticospinal pathway).

Exclusion criteria
Studies not available in English or in full text were excluded.

Search terms
A combination of controlled vocabulary (MeSH) and free-text terms relating to the condition "Stroke," intervention "Simultaneous bilateral upper limb training" and body part "Upper limb" was used in the search strategy. These key words were modified to suit each database (see Appendix S1).

Study identification
One review author (PLC) conducted the literature search and eliminated obviously irrelevant titles and duplicates. Two authors (PLC and FvW) independently read the abstracts of the remaining studies and applied the above inclusion and exclusion criteria to classify each as "definitely relevant," "possibly relevant" or "definitely irrelevant." Those labeled "possibly relevant" were classified through discussion. The authors then independently reviewed the full texts of the "definitely relevant" and "possibly relevant" studies and used the same criteria to classify each as "include," "unsure," or "exclude." Studies that both reviewers classified as "include" were included in the review, and studies that both classified as "exclude" were excluded. The remaining studies were classified through discussion. Where a decision could not be agreed between the authors, the opinion of a third author (HG) was sought and consensus reached through discussion.

Assessment of methodological quality
Two authors (PLC and FvW) independently assessed the methodological quality of the included studies using the Effective Public Health Practice Project (EPHPP) tool (Thomas et al. 2004). This tool was selected after considering the research questions and recommendations from four reviews (US Department of Health and Human Services, 2002;Deeks et al. 2003;Sanderson et al. 2007;Crowe and Sheppard 2011). Where necessary, a third author (HG) was involved to resolve any disagreement between PLC and FvW through discussion.

Data extraction
Data extraction pertaining to transcranial magnetic stimulation (TMS) was independently performed by two authors (PLC and VP). The same applied to fMRI data (PLC and HG). The remaining data were independently extracted by two authors (PLC and FvW). Reports of adverse events were also documented. All data extraction was conducted using standardized forms. If there were queries about the published reports then authors were contacted for clarification.

Data synthesis
Included studies were clustered according to whether they utilized either a single phase mode or a combination of in-phase and anti-phase modes of BT. If data were suitable for pooling, all outcome measures were to be analyzed as continuous data. Standardized mean differences (SMD) and 95% confidence intervals (CI) were also to be calculated. Heterogeneity was to be quantified using the I² statistic. Meta-analysis was planned as both fixed-effect and random-effects modeling to assess sensitivity to the choice of modeling approach. We also intended to undertake subgroup analyses for time since stroke, type of stroke, severity of arm impairment, and mode of BT. However, considerable heterogeneity of participants, BT modes, comparator interventions and measures emerged from the findings, which contraindicated the pooled outcome analyses, as well as the planned subgroup analyses. A narrative synthesis was therefore performed. Publication bias was to be assessed if more than ten studies were included in the review.
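The pooled analysis described in this section was planned but not carried out. Purely as an illustration of the quantities named in the plan, the following Python sketch computes a standardized mean difference, an approximate 95% CI, and the I² statistic from Cochran's Q with inverse-variance (fixed-effect) weights. The choice of the Hedges' g estimator is an assumption, as the review does not state which SMD estimator was intended; all function and parameter names are hypothetical, and no data from the included studies are used.

```python
import math


def hedges_g(mean_bt, sd_bt, n_bt, mean_ctrl, sd_ctrl, n_ctrl):
    """Return (SMD, variance) for two independent groups (Hedges' g; assumed estimator)."""
    df = n_bt + n_ctrl - 2
    # Pooled standard deviation of the two groups
    sd_pooled = math.sqrt(((n_bt - 1) * sd_bt ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / df)
    d = (mean_bt - mean_ctrl) / sd_pooled
    # Small-sample correction factor
    j = 1 - 3 / (4 * df - 1)
    var_d = (n_bt + n_ctrl) / (n_bt * n_ctrl) + d ** 2 / (2 * (n_bt + n_ctrl))
    return j * d, j ** 2 * var_d


def ci95(estimate, variance):
    """Approximate 95% confidence interval using the normal quantile 1.96."""
    half_width = 1.96 * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


def i_squared(effects, variances):
    """I² (in %) derived from Cochran's Q with inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention
    return max(0.0, (q - df) / q) * 100
```

Under a random-effects model, the same Q statistic would additionally feed a between-study variance estimate; that step is omitted here to keep the sketch short.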

Characteristics of included studies
From a total of 41,438 titles, eight studies comprising 164 participants were identified for inclusion in this systematic review. The PRISMA flow chart is presented in Figure 1. Characteristics of the included studies are described in Table 1, while the TMS and fMRI methodologies are described in Tables 2 and 3, respectively.

Design
Four of the eight studies were reported as RCTs (Luft et al. 2004;Wu et al. 2010;Whitall et al. 2011;Stinear et al. 2014). Four studies did not report study design (Lewis and Byblow 2004;Stinear and Byblow 2004;Lewis and Perreault 2007;Summers et al. 2007). Of these, one used an interrupted time series (Lewis and Byblow 2004) and two used an RCT design (Stinear and Byblow 2004;Summers et al. 2007). The design for the remaining study was unclear (Lewis and Perreault 2007).

Participants
A total of 164 participants were involved in the included studies. Their demographic and clinical characteristics can be found in Table 1. A mean of 21 participants were included per study (range 6-57). The reported mean age varied between 52.8 and 65.3 years (range 31-97), 55% were male and 62% were more than 3 months after stroke. Side of stroke was reported in seven studies (Lewis and Byblow 2004;Luft et al. 2004;Stinear and Byblow 2004;Lewis and Perreault 2007;Summers et al. 2007;Wu et al. 2010;Whitall et al. 2011), for 107 participants. Of these, 42 (39%) had a left hemisphere stroke, 64 (60%) had a right hemisphere stroke and 1 (1%) had a bilateral stroke. One study reported that 23 (40%) participants had a stroke in the dominant hemisphere (Stinear et al. 2014). Information relating to the type of stroke could not be synthesized due to varied methods used for classification, while precise lesion sites were rarely reported. Participants of seven of the eight studies had different initial levels of UL severity (Lewis and Byblow 2004;Luft et al. 2004;Stinear and Byblow 2004;Lewis and Perreault 2007;Summers et al. 2007;Whitall et al. 2011;Stinear et al. 2014), while those in the remaining study had only mild UL impairment (Wu et al. 2010).

Figure 1. PRISMA flow chart (excerpt). Full-text articles excluded (n = 77). Reasons: not a quantitative trial (n = 1), not BT (n = 58), BT with other intervention (n = 1), not stroke population (n = 8), not adult population (n = 1), not including both motor and neurophysiological outcomes (n = 8).

Content
The mode of BT in the included studies varied considerably: three studies involved only the in-phase mode of BT (Lewis and Byblow 2004;Summers et al. 2007;Stinear et al. 2014), four involved both in-phase and anti-phase modes (Luft et al. 2004;Lewis and Perreault 2007;Wu et al. 2010;Whitall et al. 2011) and one study used in-phase or anti-phase BT (Stinear and Byblow 2004). Of the five studies that involved both modes, only three specified the sequencing of BT practice modes (Stinear and Byblow 2004;Lewis and Perreault 2007;Whitall et al. 2011).
Three studies involved functional tasks, including reaching and grasping objects (Wu et al. 2010), peg targeting/inversion (Lewis and Byblow 2004) and dowel placement (Summers et al. 2007). The remaining five focused on specific UL movements. One of these involved "free" forearm pronation/supination (Lewis and Perreault 2007), while in the remaining four studies, the ULs were mechanically coupled (Luft et al. 2004;Stinear and Byblow 2004;Whitall et al. 2011;Stinear et al. 2014). Two of these studies used bilateral arm training with rhythmic auditory cueing (BATRAC), during which the participant bilaterally pulled and pushed two T-bar handles upon auditory cues (Luft et al. 2004;Whitall et al. 2011). During Active-Passive Bilateral Therapy (APBT, Stinear and Byblow 2004) and Active-Passive Bilateral Priming (APBP, Stinear et al. 2014), the participant used a purpose-built manipulandum to passively flex/extend their paretic wrist through active, rhythmical flexion/extension of their nonparetic wrist. In contrast to APBT (Stinear and Byblow 2004), which was a stand-alone intervention, APBP (Stinear et al. 2014) was utilized as a priming technique before UL physiotherapy.
The interventions of three of the eight studies involved more than one UL movement or task (Lewis and Byblow 2004;Lewis and Perreault 2007;Wu et al. 2010), whereas the other five studies involved only a single movement or task throughout the intervention period (Luft et al. 2004;Stinear and Byblow 2004;Summers et al. 2007;Whitall et al. 2011;Stinear et al. 2014).

Dose
The dose of BT varied considerably across the included studies. Aside from the single session study (Lewis and Perreault 2007), the overall duration of the intervention periods in the remaining seven intervention studies ranged from 6 days (Summers et al. 2007) to 6 weeks (Luft et al. 2004;Whitall et al. 2011). The frequency of intervention ranged from 3 days/week (Luft et al. 2004;Whitall et al. 2011) to 7 days/week (Stinear and Byblow 2004). The amount of practice per training session ranged from 33 (Lewis and Byblow 2004) to 50 repetitions/day (Summers et al. 2007), while a target of 500-1500 repetitions was set in the APBP study (Stinear et al. 2014). In other studies, active practice time for the affected UL ranged from 1 h (including 20 min of actual practice) (Luft et al. 2004;Whitall et al. 2011) to 2 h a day (Wu et al. 2010). Lewis and Perreault (2007) quantified the amount of practice for the motor session by time (i.e., 105 s).

Comparison interventions
There was considerable variation between the comparison conditions. One study (Lewis and Perreault 2007)

UL outcome measures and time points
The included studies used different measures at different time points. The Fugl-Meyer Assessment (FM) was the most common UL outcome measure, used in six of the seven intervention studies (Lewis and Byblow 2004;Luft et al. 2004;Stinear and Byblow 2004;Wu et al. 2010;Whitall et al. 2011;Stinear et al. 2014). However, its use was not standardized; only three of these studies used the entire FM UL assessment (Luft et al. 2004;Wu et al. 2010;Whitall et al. 2011). Of the remaining three studies, one (Lewis and Byblow 2004) included only the hand and forearm subsection, another (Stinear and Byblow 2004) used the wrist, hand and coordination subsections, and in the third it was unclear which part of the FM UL section had been used (Stinear et al. 2014). Two studies used motion analysis to measure UL kinematics (Lewis and Perreault 2007;Summers et al. 2007). Only two of the intervention studies included a follow-up assessment: at 12 and 26 weeks after stroke (Stinear et al. 2014), and 4 months after end of intervention (Whitall et al. 2011). One study included a follow-up assessment for only those participants (n = 5 of 9) who had more than 10% improvement in FM score (Stinear and Byblow 2004).

Neurophysiological measures
Three studies used fMRI (Luft et al. 2004;Wu et al. 2010;Whitall et al. 2011) and five used TMS (Lewis and Byblow 2004;Stinear and Byblow 2004;Lewis and Perreault 2007;Summers et al. 2007;Stinear et al. 2014). The TMS studies utilized a wide range of variables to assess cortical excitability. Variables used for fMRI and TMS are described in Table 1.

Methodological quality
The methodological quality of the studies is detailed in Table 4. The global rating for methodological quality was "strong" for two studies (Luft et al. 2004;Stinear et al. 2014), "moderate" for two studies (Lewis and Byblow 2004;Whitall et al. 2011) and "weak" for the remaining four studies.

Table 5 (partial). Effects of BT in the included studies
Task performance: For rating of task performance, significant improvements within 2 days of switching to BT were found only in simulated drinking (n = 2), rapid transfer (n = 1) and peg targeting (n = 1) (P < 0.05). A significant reduction in progression was found in the block placement task (n = 1) with the commencement of BT (P < 0.05).
TMS, contralateral motor pathway (lesioned hemisphere to paretic UL): MEPs could only be elicited in 3 participants. Complete data were available for only 1 of the 3 participants who could elicit MEPs during all sessions. This participant had a decrease in map area of ECR from assessment 1 to 2 (i.e., before-after UT) and a small increase in the same map area from assessment 2 to 3 (i.e., before-after BT). The map area of FDI was relatively consistent across all 3 assessments.
TMS, ipsilateral motor pathway (contralesional hemisphere to paretic UL): MEPs could be elicited in 5 of the 6 participants. No significant difference in the total number of iMEPs from both muscles across the 3 assessment times (P = 0.8).
Effect of task conditions: For the paretic arm, post hoc analysis found a significantly larger CV of movement amplitude in AP compared to UT (P = 0.004) only. Significantly higher uniformity of relative phase values in IP compared to AP (P = 0.001).
TMS: Complete data for only 13 of the 15 participants due to contraindications to TMS.
Contralateral MEP amplitude: In the paretic arm, no significant difference in MEP amplitude between activation of arm contralateral to stimulation and activation of bilateral arms (F 1,12 = 2, ...
Ipsilateral MEPs: In the paretic arm, no significant difference in number of iMEPs activated between activation of arm ipsilateral to stimulation and activation of bilateral ...
Correlations: None of the correlations were significant (P > 0.05).
TMS: Complete data for only 6 of the 12 participants due to technical difficulty, unusable data and contraindications.

Neural correlates of bilateral training
This section describes the effects of BT on UL motor outcomes, neural function and the associations between these two types of variables. As the results could not be pooled for reasons explained earlier, only a narrative synthesis will be presented. The effects of BT in the included studies are described in Table 5.

BT including either in-phase or anti-phase
This cluster of studies included those featuring practice of functional tasks (Lewis and Byblow 2004;Summers et al. 2007) or UL joint movements (Stinear and Byblow 2004;Lewis and Perreault 2007;Stinear et al. 2014). One study, in which participants commenced with unilateral training (UT) of three functional tasks, followed by bilateral training (BT) of the same tasks, found mixed results (Lewis and Byblow 2004). There were no significant differences in FM scores between the unilateral and bilateral phases (P = 0.05). However, comparing video ratings of performance of functional tasks, within-subject improvements were reported following switching from UT to BT in a minority of participants, while one participant's performance had deteriorated. TMS was used to map the neurophysiological responses to the interventions, but data were complete for one of the six participants only (Table 5).
An RCT comparing UT with BT of a dowel placement task (Summers et al. 2007) found a significant improvement in the Modified Motor Assessment Scale (MAS), but not in any other measures, in the BT compared to the UT group (P = 0.0094). TMS data were complete for only three of six participants in each group and differences between the two groups were not analyzed. In terms of the association between changes in UL motor outcomes and neurophysiological measures, a significant negative correlation was reported between change in map volume for the contralesional hemisphere and change in MAS score (rho = −0.883, P = 0.02), but not for the ipsilesional hemisphere. This finding related to all six participants, from both UT and BT groups, for whom TMS data were available.
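To make the reported statistic concrete, the following Python sketch shows how a rank correlation between two change scores can be computed for a sample of six. It assumes that "rho" denotes Spearman's rank correlation, which the source does not state explicitly; the function names are hypothetical, and the sketch uses no data from Summers et al. (2007).

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks.

    Undefined (division by zero) if either variable is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)
```

With only six paired observations, rho can take relatively few distinct values, so a single participant changing rank can shift the estimate considerably; this is one reason small-sample correlations of this kind are best read cautiously.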
Two other studies comprised APBT (Stinear and Byblow 2004) and APBP (Stinear et al. 2014). Findings from these two studies do not permit synthesis, as one study utilized APBT as a stand-alone intervention (Stinear and Byblow 2004), whereas the other used APBP as a priming technique (Stinear et al. 2014). In the randomized APBT trial comparing in-phase with anti-phase bilateral wrist flexion/extension (Stinear and Byblow 2004), between-group differences were not presented for UL motor or TMS data. The average FM score of participants in both groups improved over the intervention period (P = 0.02) as well as over the preceding baseline period (P = 0.04). Using TMS, the authors found no significant difference in mean map volume over the intervention period (P = 0.07). The correlation between change in FM score and change in map volume was not significant; however, correlations were not reported for each group separately. The RCT comparing APBP with a control group given Transcutaneous Electrical Nerve Stimulation (Stinear et al. 2014) reported a significantly greater number of participants in the APBP group attaining a plateau of UL recovery at 12 weeks than in the control group (P = 0.039). Those in the APBP group were also three times more likely to achieve this plateau over the same period than those in the control group (OR 3.32, 95% CI 1.1-10.7); however, actual UL outcome data were not presented. No other significant differences were reported. Compared to the control group, where no significant differences were found following the intervention, intrahemispheric cortico-motor excitability in the APBP group was reported to be significantly increased in the ipsilesional hemisphere (P < 0.028) and significantly decreased in the contralesional hemisphere (P = 0.010). Interhemispheric inhibition in the ipsilesional hemisphere showed an increase in the APBP group and a decrease in the control group (P < 0.028) after the intervention. However, only 30 of 57 participants (52.6%) were able to maintain paretic wrist extension against gravity, a requirement for this test. The authors did not report any statistical associations between UL motor and neurophysiological outcomes of BT.
The single-session study comparing different modes of unilateral and bilateral UL forearm pronation and supination found a significantly larger variation in movement amplitude in the bilateral anti-phase condition compared to the unilateral condition (P = 0.004) and significantly higher uniformity of relative phase values in bilateral in-phase compared to anti-phase tasks (P = 0.001) (Lewis and Perreault 2007). The correlations between FM score and measures of cortical excitability were all nonsignificant.

BT including a mix of in-phase and anti-phase
This cluster of studies included those featuring practice of functional tasks (Wu et al. 2010) or UL joint movements (Luft et al. 2004;Whitall et al. 2011).
In an RCT comparing BT, involving a range of functional tasks, with distributed Constraint-Induced Therapy (dCIT), all UL motor outcomes improved for both the BT (n = 4) and the dCIT group (n = 2) (Wu et al. 2010). Using fMRI during movement of the affected hand, the authors reported an increase in the number of activated voxels in both cerebral hemispheres for both groups. As the sample size was small, only descriptive data were reported.
Two RCTs comparing BATRAC with dose-matched therapeutic exercise (DMTE) used a comparable methodology (Luft et al. 2004;Whitall et al. 2011). Both identified a significant increase in activated voxels in the ipsilesional precentral gyrus in the BATRAC compared with the DMTE group; however, some of the other findings did not concur (Table 5). One of these RCTs (Luft et al. 2004) found no significant difference in any of the UL motor outcomes between the two groups. Using fMRI, a significant increase was found in the BATRAC compared to the DMTE group in terms of the number of activated voxels in the ipsilesional cerebellum (P < 0.001), ipsilesional medial precentral gyrus (P = 0.02), and contralesional medial precentral gyrus (P = 0.03) for paretic arm movement. A correlation between UL motor and neurophysiological measures was not reported. Subgroup analysis of six of nine BATRAC participants who showed before-after differences in brain activation (i.e., recruitment of premotor area and primary motor cortex) found a significant increase in FM scores (P = 0.02). The second of these RCTs (Whitall et al. 2011) did report a significant improvement in paretic wrist extension for the BATRAC group compared to the DMTE group (P < 0.05). However, not all participants underwent fMRI, and changes in UL motor outcomes were not available for the subcohorts of both groups that did undergo fMRI. A significant increase was reported in the number of activated voxels in the contralesional superior frontal gyrus (P = 0.012) and ipsilesional precentral gyrus (P = 0.011) for paretic arm movement. Additionally, a significant negative correlation was reported between changes in time taken for the paretic arm to complete the Wolf Motor Function Test (WMFT) and an increase in number of activated voxels in the contralesional superior frontal gyrus, bilateral anterior cingulate cortex, and bilateral supramarginal gyrus for BATRAC participants (r ≤ −0.62, P ≤ 0.01). No correlation was found in the DMTE group between these variables.

Adverse events
Adverse events were not reported in any of the eight included studies.

Discussion
The key findings of the present systematic review are that the neural correlates of UL motor behavior in response to BT after stroke remain unclear. A quantitative synthesis was not possible due to heterogeneity across the included studies for types of BT interventions, comparator interventions and measures. The narrative synthesis is limited further because: (1) only two studies were rated as "strong" for methodological quality and none were "strong" for selection bias or blinding, and (2) only 164 people participated in the eight included studies. Detailed discussion of these key findings is provided below in sections for neural correlates, methodological quality of the evidence base, strengths and limitations of the present review and implications for research and practice.
Correlations between arm motor behavior and brain structure/function following BT
Drawing conclusions from this body of evidence was hampered by the heterogeneity which contraindicated meta-analysis. This was compounded by the lack of data available for analysis from individual studies, either due to technical difficulties with TMS (Lewis and Byblow 2004;Lewis and Perreault 2007;Summers et al. 2007;Stinear et al. 2014), lack of outcome data (Wu et al. 2010), or incomparable data sets (Lewis and Byblow 2004;Stinear and Byblow 2004). Statistical associations between changes in UL motor performance and neural function as a result of UL interventions were reported in three of eight studies (Stinear and Byblow 2004;Summers et al. 2007;Whitall et al. 2011). However, only two of those (Stinear and Byblow 2004;Whitall et al. 2011) reported an association particular to BT: one study (Stinear and Byblow 2004) found no significant correlation between FM score and change in map volume, but this included data from both the in-phase and the anti-phase BT groups. The other study found a significant correlation between the speed at which the Wolf Motor Function Test was completed and an increase in activation of the contralesional superior frontal gyrus following BATRAC but not DMTE (Whitall et al. 2011).
It is clear from this review that, in order to elucidate the neural processes associated with responses to BT in people with arm impairment due to stroke, several methodological issues (especially data collection methods and reporting) need to be addressed in future studies. These will be discussed in subsequent sections.
It remains unclear whether there are any differences in neural activation between in-phase and anti-phase BT; the only study in this review to have systematically compared unilateral and bilateral in-phase and anti-phase modes (Lewis and Perreault 2007) did not examine brain activity patterns associated with the same arm movements. Evidence was found for more variability in affected arm motor performance during bilateral anti-phase compared to unilateral arm movement, suggesting differences in neuromuscular control between the two modes. Findings were also indicative of stronger coupling between the two arms in the bilateral in-phase compared to the anti-phase condition. These findings are not surprising, as healthy individuals perform bilaterally identical (i.e., in-phase) and unilateral tasks with ease (Franz et al. 1991;Fontaine and Lee 1997;Swinnen 2002). However, bilaterally different (e.g., anti-phase) movements have demonstrated weaker temporal and spatial coupling compared to bilaterally identical movements (Kelso et al. 1979;Serrien and Swinnen 1999;Swinnen 2002). The rationale for using bilaterally different movements in interventions aimed at improving UL function might therefore be questioned. Another comparison requiring further investigation is auditory-cued versus nonauditory-cued BT, as neither of the BATRAC studies (Luft et al. 2004;Whitall et al. 2011) differentiated between the effects of BT and those of auditory cueing, which is known to result in an immediate reduction in spatiotemporal variability of unilateral reaching patterns (Thaut et al. 2002).
Comparison of these findings with those of other evidence syntheses of BT after stroke (Stewart et al. 2006;Langhorne et al. 2009;Cauraugh et al. 2010;Coupar et al. 2010;Latimer et al. 2010;Sleimen-Malkoun et al. 2011;Van Delden et al. 2012;Pollock et al. 2014a) is limited, as only those studies reporting measures of both UL motor behavior and neural function were eligible for the present review. Other reviews that included measures of UL motor behavior and neural function did not focus exclusively on BT as an intervention (Kreisel et al. 2006;Buma et al. 2010) but also concluded that, with the exception of reactivation of the lesioned motor areas as the most consistent predictor of recovery (Calautti et al. 2001;Tombari et al. 2004), there is no single pattern of neuroplasticity during stroke recovery (Kreisel et al. 2006;Buma et al. 2010). Current evidence from this review indicates that the neural correlates of even a specific intervention (i.e., UL BT) are poorly understood and require further investigation. Essentially, the review reported here contributes to a robust indication that there is currently insufficient evidence on which to delineate any specific pattern of UL motor recovery and brain activity in response to BT after stroke.

Methodological quality of included studies
The assessment of methodological quality of the studies included in this review clearly indicates the need to strengthen the evidence, as ratings were "strong" for only two of eight studies (Luft et al. 2004;Stinear et al. 2014). Reporting of withdrawals and dropouts, as well as adverse events, was poor and often not in accordance with the CONSORT guidelines (Schulz et al. 2010). Blinding was not rated "strong" in any of the studies, but this could reflect an issue with the EPHPP tool itself, where a "strong" score can only be obtained if, in addition to the assessor being blinded, study participants are not aware of the research question. It is usually not possible to ascertain the latter, resulting in a downgrade of the score on this item.
Another potential contributor to the assessment of methodological quality could be that the studies meeting the inclusion criteria for this systematic review are mostly early phase trials or experimental studies rather than definitive clinical trials. However, in deriving evidence from any study it is still important to consider the potential risk of bias in the results.
A large proportion of missing data, as shown in Table 5, threatens the internal validity of these studies and calls into question the suitability of the methods used in a stroke population. Using TMS in this population appeared to be particularly challenging, and methods that achieve lower attrition rates than those seen in the studies included in this review are required to improve the quality of research in this area. Furthermore, the measures selected to assess neurophysiological variables should be shown to be valid and based on a clear rationale.

Strengths and limitations of the present review
The methodology for study selection, quality assessment, and data extraction for this review, using independent assessors, was systematic and rigorous. The quality assessment tool used in this review was chosen methodically for (1) its purpose of assessing primary research designs in systematic reviews, (2) its applicability to randomized and nonrandomized studies, (3) its form of a checklist with a summary score, (4) its key domains relevant to the research questions, and (5) evidence of careful development including validity and reliability. There are, however, some limitations in its validation process (i.e., inappropriate methods of assessing levels of agreement between EPHPP and a comparison tool) (Thomas et al. 2004). This review was limited by the inclusion of English language papers only, so the possibility remains that some relevant papers were missed. Otherwise, the search strategy was comprehensive.

Implications for research and practice
The key implication for research and practice is that this review did not find consistent evidence of the neural correlates of UL motor response to BT after stroke. In order to clearly determine the therapeutic potential of different modes of BT for patients with different types of brain lesions, these modes need to be examined systematically. The challenge in this area of research is thus threefold: identifying effective BT modes, matching the optimal mode to patients with specific stroke lesions and UL impairments, and using valid and appropriate measures based on a clear rationale.
This review indicates that participant characteristics such as type and precise site of stroke need to be reported more consistently and comprehensively in future studies (Kreisel et al. 2006). For example, diffusion tensor imaging (DTI) could be used to describe corticospinal tract integrity. This information would contribute to a better understanding of motor and neural responses to different BT modes (Van Delden et al. 2012). Future studies should also investigate the influence of stroke chronicity on responses to BT, so as to understand the neurophysiological markers of recovery (Ward et al. 2013).
As mentioned earlier, this review found diversity in BT modes between studies. Clearly, these modes provide different types of sensory input, which may have a differential effect on UL recovery, depending on the integrity of the neural pathways involved. Studies are required to systematically compare different modes of BT in participants with known stroke lesions. Since different modes of BT are thought to exploit distinct neural mechanisms (Cauraugh and Summers 2005), future studies should also explore the unknown relationship between different modes of UL movement and brain activity patterns.
The frequency, intensity, duration, proportion of rest and actual practice, and organization of BT practice were all poorly reported and should be clearly detailed in future studies to enable replication (see Table 1). Additionally, the treatment dose in many of the included studies was low compared to recent studies. A recent Cochrane review suggested that intervention sessions of 30-60 min, 5-7 days a week may provide a treatment effect (Pollock et al. 2014b), whereas Birkenmeier et al. (2010) found a few hundred repetitions per session beneficial for stroke participants. Four studies were well below this suggested dosage (Lewis and Byblow 2004;Luft et al. 2004;Summers et al. 2007;Whitall et al. 2011), whereas the APBP priming intervention (Stinear et al. 2014) included 500-1500 repetitions, albeit driven by the nonaffected side. Future intervention studies should aim for an adequate dose of practice where possible.
Comparisons should be designed so that the effects of BT can be analyzed as a single intervention. In this review, two studies compared UL motor outcomes and brain activation between BATRAC and DMTE (Luft et al. 2004;Whitall et al. 2011). While BATRAC is considered a mode of BT, the additional element of rhythmic auditory stimulation embedded within it could confound results. In terms of other confounders, one of the three fMRI papers did not control for mirror movements (Table 3) (Wu et al. 2010). fMRI methodology should minimize and control for mirror movements as they can confound bilateral activation patterns (Kim et al. 2003).
Another complication in synthesizing the findings from this review was the wide range and variation of neurophysiological measures in the studies included (Tables 2 and 3). For fMRI, this was mainly due to differences in the analysis methods for the regions of interest (ROI). For the TMS studies, eliciting MEPs in stroke participants appeared to be challenging, which limited the amount of usable data. Additionally, a wide range of TMS measures was used, which limited comparison between studies. As sample sizes in future studies on this topic are unlikely to be large, standardization of measures across different studies would allow data to be pooled for meta-analyses, enabling clearer conclusions to be drawn.
In future studies it would also be useful to determine correlations between neural and behavioral measures to shed light on the neural processes underlying BT. Studies should, however, avoid analyses of inadequately powered subgroups of participants.
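As an illustration only, one way such a correlation could be computed is as a Spearman rank correlation between each participant's change in a motor score and change in a neural measure. This is a minimal sketch under stated assumptions: the choice of Spearman's coefficient, the function names, and the argument names are hypothetical and are not taken from any of the included studies, and no study data are reproduced here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


def spearman_change_correlation(motor_pre, motor_post, neural_pre, neural_post):
    """Spearman correlation between per-participant change scores.

    All four arguments are hypothetical per-participant lists in the same
    participant order (for example, a motor score and a neural measure
    recorded before and after the intervention).
    """
    lengths = {len(motor_pre), len(motor_post), len(neural_pre), len(neural_post)}
    if len(lengths) != 1:
        raise ValueError("all inputs must contain one value per participant")
    motor_change = [post - pre for pre, post in zip(motor_pre, motor_post)]
    neural_change = [post - pre for pre, post in zip(neural_pre, neural_post)]
    # Spearman's coefficient is the Pearson correlation of the ranks.
    return pearson(average_ranks(motor_change), average_ranks(neural_change))
```

With the small samples typical of this field, a coefficient computed in this way is imprecise, which is one reason to report it for the whole sample rather than for small subgroups.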
Until the pattern of neuroplasticity associated with different BT modes is delineated, however, the potential value of BT as a therapeutic intervention should not be dismissed prematurely. Therapists should continue to use clinical reasoning when selecting BT, by considering various modes of delivery in conjunction with patients' UL impairments and functional goals.