A comparison of group prediction approaches in longitudinal discriminant analysis

Abstract
Longitudinal discriminant analysis (LoDA) can be used to classify patients into prognostic groups based on their clinical history, which often involves longitudinal measurements of various clinically relevant markers. Patients' longitudinal data are first modelled using multivariate generalised linear mixed models, allowing markers of different types (e.g. continuous, binary, counts) to be modelled simultaneously. We describe three approaches to calculating a patient's posterior group membership probabilities which have been outlined in previous studies, based on the marginal distribution of the longitudinal markers, the conditional distribution and the distribution of the random effects. Here we compare the three approaches, first using data from the Mayo Primary Biliary Cirrhosis study and then by way of a simulation study to explore in which situations each of the three approaches is expected to give the best prediction. We demonstrate situations in which the marginal or random-effects approach performs well, but find that the conditional approach offers little information beyond that provided by the random-effects and marginal approaches.


INTRODUCTION
Regular surveillance of a patient is a crucial step in determining if/when they will develop a particular disease. Patients thought to be at risk of a disease may be asked to attend periodic clinic appointments at which a number of clinically relevant variables (referred to as markers) are measured. These variables can be used to assess the risk that a particular patient has of developing a disease, possibly within a set time frame. Patients are classified into prognostic groups based on their risk of having the disease. One may consider a two-group case where patients are allocated to the disease group or the no disease group. Alternatively, a multiple group scenario could be considered where patients are classified into groups based on the anticipated severity of their disease (e.g. stages of cancer). Such a clinical problem can be addressed using methods of discriminant analysis.
In many clinical settings, only the most recent information (obtained at the most recent clinic visit) is considered in assessing the risk of developing a disease for a particular patient. All previously gathered information is not considered, which could be an inefficient use of data. It may also be the case that the change in a patient's marker values over time is more informative in predicting their risk than simply the most recent value of the marker. To allow for a more flexible classification approach, in recent years, longitudinal discriminant analysis (LoDA) methods have been developed, which classify patients into prognostic groups based on their longitudinal history.
In each of the LoDA methods referenced above, a linear mixed model is first used to model the longitudinal evolution of each marker for patients of known prognosis. A useful feature of mixed models is that random effects are used to allow patient-specific deviation from the mean profile as well as to model correlation between observations of a marker at different time points and also between markers. In the case of markers of different types (Fieuws et al., 2008;Hughes et al., 2016), the linear mixed model is extended to a multivariate generalised linear mixed model (MGLMM). A MGLMM is fit separately to data from each prognostic group. The output of these models is then used in the LoDA to inform a classification rule. In other words, we use the longitudinal data on markers, from patients of known prognosis, to derive a classification rule which predicts the future disease status of a patient of unknown prognosis based on their individual longitudinal history. Morrell, Brant, and Sheng (2007) specify three alternative ways to use the output from the mixed model to predict disease status, namely marginal, conditional and random-effects prediction. In each case, the prediction has a different focus. For marginal prediction, the marginal distribution of the new patient's observed longitudinal data is used to predict their future status. That is, the prediction is focused on the mean evolution of the markers over time. We are interested in which of the group-specific mean longitudinal profiles, calculated using the MGLMM, the new patient's trajectory is closest to. The conditional prediction replaces the marginal distribution with the conditional density of the observed longitudinal data given the estimate of the new patient's random effects. In this case, the prediction is based on the patient-specific evolution of markers over time, ignoring any error in the variability of the patient's estimated random effects. 
This method could be thought of as comparing the mean longitudinal profiles for a subset of patients with 'similar' random effects in each group to the conditional longitudinal profile of the new patient. Finally, for random-effects prediction the density of the patient's estimated random effects is used for prediction and the focus is on the patient-specific evolution of the markers.
Most applications of LoDA have focused on the so-called 'marginal' prediction approach. To the best of our knowledge, Morrell et al. (2007) were the first to propose the use of conditional and random-effects prediction as alternatives. Relatively little work has been done to assess which of the three methods is most appropriate to use, or whether different approaches suit some scenarios more than others. In work that aims to identify patients with prostate cancer based on the evolution over time of prostate-specific antigen (PSA), Morrell et al. (2007) and Morrell, Sheng, and Brant (2011) compared the three prediction approaches using a number of measures. In terms of sensitivity (proportion of correctly identified cancer cases) and lead time (mean time before clinical diagnosis that a patient is correctly predicted as a cancer case), the marginal method performed best, whilst in terms of specificity (proportion of correctly identified non-cases) and probability of correct classification (PCC), the random-effects method performed best. By contrast, Komárek et al. (2010) used the three methods to identify patients with primary biliary cirrhosis (PBC) based on three continuous longitudinal markers and concluded that the random-effects method gave the best prediction. Hughes et al. (2016) also compared the three approaches to identify patients with refractory epilepsy and showed that, in their application, the marginal and conditional approaches performed similarly, with a slight preference for the marginal approach, whilst the random-effects approach performed poorly.
All previous comparisons of the three approaches to LoDA have been based on specific data sets and have provided different conclusions as to which approach works best. In this paper we investigate this matter further, by way of simulation study, to determine whether the three approaches are sensitive to different types of differences between the prognostic groups.
An outline of this paper is as follows. In Section 2, we give an overview of the LoDA methodology and explain in more detail the marginal, conditional and random-effects approaches. Section 3 gives a real data application of LoDA using the PBC data available within the mixAK (Komárek & Komárková, 2014) package in R (R Core Team, 2016). We describe a simulation study comparing the three approaches in two different scenarios in Section 4. We highlight some conclusions in Section 5.

Multivariate generalised linear mixed model
Our aim in this paper is to use data from patients of known prognosis to predict the group membership at some future point for new patients. We first introduce some notation following the definitions of Hughes et al. (2016). Each patient may belong to one of $G$ groups based on a diagnosis at a specific time $T$. We represent this by a value of the random variable $U \in \{0, \ldots, G-1\}$, which is only observed at time $T$. We assume that for each patient, measurements are made on $R \geq 1$ markers at times $t_r = (t_{r,1}, \ldots, t_{r,n_r})^\top$, $t_{r,1} < \cdots < t_{r,n_r} < T$, $r = 1, \ldots, R$. In common with the entire MGLMM methodology, this approach does not require that each marker is measured at the same time points, or even the same number of times. Neither is it necessary for all patients to have the same number of measurements or identical visit schedules. For each marker, these longitudinal observations for a particular patient are denoted $Y_r = (Y_{r,1}, \ldots, Y_{r,n_r})^\top$, $r = 1, \ldots, R$. The longitudinal evolution of each marker may depend on additional covariate vectors $v_{r,1}, \ldots, v_{r,n_r} \in \mathbb{R}^p$, which we denote collectively as $\mathcal{C}$. We aim to use the information collected for a patient up until some time $t < T$ to predict the future group, $U$, to which the patient belongs. The prediction is based on the information gathered about the patient at time $t$ and also all previous data for that patient.
We first fit separate MGLMMs to the longitudinal data for each prognostic group. The expected value (transformed by an appropriate link function) for the $j$-th observation ($j = 1, \ldots, n_r$) of the $r$-th marker ($r = 1, \ldots, R$) of a patient in group $g$ (denoted $\mu^g_{r,j}$) is given by
$$h_r^{-1}(\mu^g_{r,j}) = h_r^{-1}\{\mathrm{E}(Y_{r,j} \mid b_r, U = g)\} = (x^g_{r,j})^\top \alpha^g_r + (z^g_{r,j})^\top b_r, \qquad (1)$$
where $h_r^{-1}$ is a chosen link function (dependent on the particular exponential family distribution being modelled, e.g. normal, Poisson or Bernoulli, with possible dispersion parameters $\phi^g_r$), $x^g_{r,j} = x^g_{r,j}(\mathcal{C})$ and $z^g_{r,j} = z^g_{r,j}(\mathcal{C})$ are covariate vectors used in a model for the prognostic group $g$, and $\alpha^g_r$, $r = 1, \ldots, R$, $g = 0, \ldots, G-1$, denote unknown regression coefficients. The unobserved random-effects vector $b = (b_1, \ldots, b_R)$ accounts for possible correlation between repeated observations of the same marker and also between different markers on the same patient. Typically, the random-effects vector is assumed to jointly follow a normal distribution. However, Hughes et al. (2016) allow additional flexibility by specifying a mixture of normal distributions for the joint distribution of the random-effects vector in each prognostic group (see also Komárek et al., 2010; Verbeke & Lesaffre, 1996). That is, they assume $b \mid U = g \sim \sum_{k=1}^{K^g} w^g_k \, \mathcal{MVN}(\mu^g_k, D^g_k)$, where $\mathcal{MVN}(\mu, D)$ stands for a multivariate normal distribution with mean vector $\mu$ and covariance matrix $D$. The mixture components are weighted by factors $w^g_k$ ($k = 1, \ldots, K^g$). This multivariate normal distribution has a density denoted as $\varphi(\cdot; \mu, D)$.
To fit this MGLMM, we need to estimate the fixed-effects regression coefficients and dispersion parameters from (1), denoted $\psi^g := (\alpha^g_1, \ldots, \alpha^g_R, \phi^g_1, \ldots, \phi^g_R)$, and additionally the mixture-related parameters, denoted $\theta^g$. Full details of this MGLMM, which is based on the MGLMM proposed by Komárek and Komárková (2013), can be found in Hughes et al. (2016).

Group probabilities for individual patients
The aim of the discriminant analysis is to use the model parameters, $\psi^g$ and $\theta^g$, estimated from the MGLMM in each group, to classify a new patient based on their longitudinal history. An application of Bayes' theorem gives the probability that a patient belongs to group $g$ given their observed longitudinal data $y$, covariate data $\mathcal{C}$ and the model parameters $\psi$ and $\theta$ from the MGLMMs fitted to patients of known status,
$$\mathrm{P}(U = g \mid y, \mathcal{C}; \psi, \theta) = \frac{\pi_g \, f_{g,\mathrm{new}}}{\sum_{\tilde{g}=0}^{G-1} \pi_{\tilde{g}} \, f_{\tilde{g},\mathrm{new}}}, \qquad (2)$$
where $f_{g,\mathrm{new}}$ denotes a predictive density of the observed markers given the group and model parameters (or, in the case of the random-effects approach, the density of the random effects given the group-specific mixture parameters). Here the prior probabilities of belonging to each group are denoted by $\pi_g = \mathrm{P}(U = g)$, $g = 0, \ldots, G-1$, and are often taken to be the proportions of the prognostic groups in the study population. In a frequentist setting, $f_{g,\mathrm{new}}$ is estimated using the maximum likelihood estimates of the relevant model parameters in group $g$. The proposed MGLMM produces a likelihood function involving intractable integrals, and so instead Hughes et al. (2016) propose the use of Bayesian estimates of the group membership probabilities. In a Bayesian setting, $f_{g,\mathrm{new}}$ is estimated as the mean of the posterior predictive density estimated from samples from a Markov chain Monte Carlo (MCMC) scheme (see Komárek & Komárková, 2013, for details of the MCMC procedure and also for the full specification of the Bayesian model). As already indicated, in this paper we investigate three different ways of specifying the predictive density $f_{g,\mathrm{new}}$ in order to classify patients into prognostic groups.
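The Bayes step in (2), combined with averaging over MCMC draws, can be sketched in a few lines of Python. This is a minimal illustration, not the mixAK implementation: the function names, the layout of `density_draws` and all numbers are assumptions made here, and the predictive densities are simply taken as given inputs.

```python
def group_probabilities(priors, densities):
    """Bayes' rule for one parameter draw: pi_g * f_g / sum over groups."""
    weighted = [prior * dens for prior, dens in zip(priors, densities)]
    total = sum(weighted)
    return [w / total for w in weighted]


def averaged_group_probabilities(priors, density_draws):
    """Average the per-draw group probabilities over all MCMC draws.

    density_draws[m][g] holds the predictive density for group g evaluated
    at the parameters of draw m (a hypothetical input layout).
    """
    n_draws = len(density_draws)
    sums = [0.0] * len(priors)
    for densities in density_draws:
        for g, prob in enumerate(group_probabilities(priors, densities)):
            sums[g] += prob
    return [s / n_draws for s in sums]


# Illustrative numbers only: two groups, three MCMC draws.
priors = [0.8, 0.2]
density_draws = [[0.010, 0.030], [0.012, 0.020], [0.008, 0.040]]
probs = averaged_group_probabilities(priors, density_draws)
predicted_group = max(range(len(probs)), key=lambda g: probs[g])
```

The same two functions apply unchanged to the marginal, conditional and random-effects approaches; only the way the densities in `density_draws` are computed differs.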

Marginal prediction
The marginal prediction approach is the most commonly used approach in the LoDA literature. The aim of this approach is to compare the longitudinal profiles of a new patient to the group-specific average profiles (computed from the historical data).
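The new patient is assigned to the prognostic group to which their longitudinal profiles lie closest. Here the predictive density $f_{g,\mathrm{new}}$ is taken as the marginal density of $Y$. That is,
$$f^{\mathrm{marg}}_{g,\mathrm{new}} = \int f_{\mathrm{cond}}(y \mid b, U = g; \psi^g) \, f_{\mathrm{ranef}}(b \mid U = g; \theta^g) \, \mathrm{d}b, \qquad (3)$$
where $f^{\mathrm{marg}}_{g,\mathrm{new}}$ is the marginal density and $f_{\mathrm{cond}}$ denotes a (conditional) density of the observed markers in the prognostic group $g$ given the random-effects vector,
$$f_{\mathrm{cond}}(y \mid b, U = g; \psi^g) = \prod_{r=1}^{R} \prod_{j=1}^{n_r} f_r(y_{r,j} \mid b_r; \alpha^g_r, \phi^g_r).$$
Here $f_r(\cdot \mid b_r; \alpha^g_r, \phi^g_r)$ denotes an exponential family density of the random variable $Y_{r,j}$ related to the GLMM (1). The random-effects density in (3) in the prognostic group $g$ is
$$f_{\mathrm{ranef}}(b \mid U = g; \theta^g) = \sum_{k=1}^{K^g} w^g_k \, \varphi(b; \mu^g_k, D^g_k).$$
The group membership probabilities, $\mathrm{P}(U = g \mid y, \mathcal{C}; \psi^{(m)}, \theta^{(m)})$, are evaluated at each draw $m = 1, \ldots, M$ of the MCMC procedure and approximate group membership probabilities are calculated as the average across all samples.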
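To make the integral in (3) concrete, the sketch below evaluates it for a deliberately simple toy model: one Gaussian marker, a random intercept only, a one-component random-effects distribution and a fixed time slope. These simplifications, the function names and the use of the trapezoidal rule (rather than the MCMC-based evaluation used in the paper) are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def conditional_density(y, times, b, slope, resid_var):
    """f_cond(y | b): independent normal observations around slope * t + b."""
    dens = 1.0
    for y_j, t_j in zip(y, times):
        dens *= normal_pdf(y_j, slope * t_j + b, resid_var)
    return dens


def marginal_density(y, times, slope, resid_var, re_mean, re_var,
                     n_grid=4001, width=10.0):
    """Integrate f_cond(y | b) * f_ranef(b) over b with the trapezoidal rule.

    The grid spans re_mean +/- width standard deviations of the random effect.
    """
    sd = math.sqrt(re_var)
    lower = re_mean - width * sd
    step = 2.0 * width * sd / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        b = lower + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        total += (weight * conditional_density(y, times, b, slope, resid_var)
                  * normal_pdf(b, re_mean, re_var))
    return total * step


# Hypothetical patient data and group parameters (not taken from the paper).
y = [3.4, 3.3, 3.1, 3.0]
times = [0.0, 0.5, 1.0, 2.0]
f_marg = marginal_density(y, times, slope=-0.2, resid_var=0.04,
                          re_mean=3.5, re_var=0.09)
```

For this Gaussian toy model the integral also has a closed form (a multivariate normal density), which makes it a convenient check; for binary or count markers no such closed form exists, which is why numerical or simulation-based evaluation is needed in general.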

Conditional prediction
For the conditional approach, the marginal distribution of $Y$ is replaced by the conditional distribution of $Y$ given a patient-specific estimate of the unknown random effects as the form of the predictive density $f_{g,\mathrm{new}}$. That is, we use
$$f^{\mathrm{cond}}_{g,\mathrm{new}} = f_{\mathrm{cond}}(y \mid b = \hat{b}^g, U = g; \psi^g),$$
the conditional density of the markers given the random effects evaluated at the estimate $\hat{b}^g$, in place of $f^{\mathrm{marg}}_{g,\mathrm{new}}$ in (2), and the conditional group membership probabilities are calculated as the average over draws in the MCMC procedure. In this case, the random effects for the patient must be estimated, and the mean of the conditional distribution of the random effects given the patient data and the model parameters is typically used (Hughes et al., 2016; Komárek et al., 2010).
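For intuition, in the special case of Gaussian markers with identity link and a one-component random-effects distribution, the conditional mean of the random effects has the familiar closed form from linear mixed model theory. The display below is a sketch under those assumptions only; the stacked design matrices $X$ and $Z$ and the group-specific residual variance $\sigma_g^2$ are notation introduced here for illustration.

```latex
\hat{b}^{g} = \mathrm{E}(b \mid y, U = g)
  = \mu^{g} + D^{g} Z^{\top}
    \left( Z D^{g} Z^{\top} + \sigma_g^{2} I \right)^{-1}
    \left( y - X \alpha^{g} - Z \mu^{g} \right)
```

When $\sigma_g^2$ is large relative to $D^g$, the estimate is shrunk strongly towards the group mean $\mu^g$, so it carries little patient-specific information.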

Random-effects prediction
Random-effects prediction focuses on the patient-specific evolution of the longitudinal markers. As with the conditional approach, a suitable estimate of the patient-specific random effect is required. The predictive density $f_{g,\mathrm{new}}$ is taken to be the density of the random effects evaluated at the patient- and group-specific estimate of the random effect given the marker data, $\hat{b}^g$. That is, $f^{\mathrm{ranef}}_{g,\mathrm{new}} = f_{\mathrm{ranef}}(\hat{b}^g \mid U = g; \theta^g)$, $g = 0, \ldots, G-1$. As previously, the mean group membership probabilities, to be used for classification, are calculated by averaging over the MCMC samples.
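The two plug-in densities, conditional and random-effects, differ only in which density is evaluated at the estimated random effect. The sketch below shows both for a toy model with one Gaussian marker, a random intercept and a single normal random-effects distribution per group; the model, the dictionary keys and all parameter values are hypothetical and chosen purely for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def estimate_random_intercept(y, times, group):
    """Posterior mean of the random intercept given the data (Gaussian toy model)."""
    n = len(y)
    resid_sum = sum(y_j - group["slope"] * t_j - group["re_mean"]
                    for y_j, t_j in zip(y, times))
    shrinkage = group["re_var"] / (group["resid_var"] + n * group["re_var"])
    return group["re_mean"] + shrinkage * resid_sum


def conditional_predictive_density(y, times, group):
    """Density of the observed data given the estimated random intercept."""
    b_hat = estimate_random_intercept(y, times, group)
    dens = 1.0
    for y_j, t_j in zip(y, times):
        dens *= normal_pdf(y_j, group["slope"] * t_j + b_hat, group["resid_var"])
    return dens


def random_effects_predictive_density(y, times, group):
    """Random-effects density evaluated at the estimated random intercept."""
    b_hat = estimate_random_intercept(y, times, group)
    return normal_pdf(b_hat, group["re_mean"], group["re_var"])


# Hypothetical parameter values for two groups (not taken from the paper).
groups = [
    {"slope": -0.05, "resid_var": 0.04, "re_mean": 3.6, "re_var": 0.09},
    {"slope": -0.20, "resid_var": 0.04, "re_mean": 3.2, "re_var": 0.25},
]
y = [3.4, 3.3, 3.1, 3.0]
times = [0.0, 0.5, 1.0, 2.0]
cond = [conditional_predictive_density(y, times, g) for g in groups]
ranef = [random_effects_predictive_density(y, times, g) for g in groups]
```

Either list of densities can then be combined with the prior group probabilities through Bayes' theorem to give group membership probabilities.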

Classification rules
The estimates of the marginal, conditional and random-effects group membership probabilities for a patient are then used to classify the patient into a prognostic group. Typically, for each scheme, the patient is assigned to the group with the largest probability. For example, for marginal prediction, the predicted future status of a (new) patient would be $\hat{U} = \arg\max_{g} \hat{P}^{\mathrm{marg}}_{g,\mathrm{new}}$, where $\hat{P}^{\mathrm{marg}}_{g,\mathrm{new}}$ is the estimated marginal probability of belonging to group $g$. This is equivalent to setting a cut-off probability of 0.5 in the two-group classification case. An alternative scheme would be to classify a patient into a group only if the probability of belonging to that group is greater than a chosen cut-off $c$. This cut-off is typically chosen through analysis of a receiver operating characteristic (ROC) curve (by selecting, for example, the cut-off that gives the closest point on the ROC curve to the top left corner). In the Bayesian methods outlined, the MGLMMs do not need to be refitted to classify new patients. The group membership probabilities are simply calculated and an appropriate classification rule is applied. In this paper, all the longitudinal information gathered on a patient up until the time of prediction is used to calculate a patient's group membership probabilities. However, the LoDA approach can also be used to calculate dynamic predictions where the patient's group probabilities are recalculated each time new information becomes available, as was described in Hughes et al. (2016).

PRIMARY BILIARY CIRRHOSIS DATA
Komárek et al. (2010) present an application of LoDA to data from the Dutch Multicenter Primary Biliary Cirrhosis study, using three continuous markers to show that, for this application, the random-effects prediction approach performs better than the marginal and conditional approaches. A similar PBC data set (the Mayo Clinic trial; Dickson, Grambsch, Fleming, Fisher, & Langworthy, 1989; Murtaugh et al., 1994) is presented in Komárek and Komárková (2013) in the context of cluster analysis and this data set is included within the mixAK (Komárek & Komárková, 2014) package in R (R Core Team, 2016; the data are available in Appendix D of Fleming & Harrington, 1991, and also electronically at http://lib.stat.cmu.edu/datasets/pbcseq). We present here an application of multivariate LoDA using continuous, binary and Poisson markers to the Mayo PBC data. PBC is a rare, but fatal liver disease. The initial study aimed to determine if the use of D-penicillamine increased the length of patient survival. Data on a large number of clinical parameters were recorded for 312 patients over a median of 6.3 years per patient.
Our aim is to use only the data collected up until 2.5 years to predict those patients who will die or require transplant within five years. Therefore, we focus on patients known to be alive and without a liver transplant after two and a half years, and for whom we also know their condition after five years. We identified 202 patients who were known to be alive without transplant after five years and 51 patients who died or had a liver transplant at some point in time between 2.5 and 5 years. Four longitudinal markers were considered for the multivariate LoDA, specifically the continuous markers albumin and logarithmic serum bilirubin, the platelet count (Poisson) and a binary marker indicating blood vessel malformations. See Figure 1 for individual patient profiles for each marker.
The GLMM for each of the continuous and count markers contained a random intercept and a random time slope, whilst the GLMM for the binary marker contained a random intercept and a fixed effect for time (in each model time was recorded in months). To keep things simple, and to allow easy comparison with the simulations presented in Section 4, we consider a one-component mixture distribution (i.e. $K = 1$, see Hughes et al., 2016) for the random-effects distribution.
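Written out, the model structure described above might look as follows. This is a sketch under two assumptions that are not stated explicitly in the text: that the canonical links (identity, log and logit) are used, and that the mean profile of each marker with random effects is carried by the (non-zero) mean of the random effects rather than by separate fixed effects. The marker indices and the symbols $b_{r,0}$, $b_{r,1}$ and $\alpha_4$ are labels introduced here for illustration.

```latex
\begin{aligned}
\text{albumin:} \quad
  & \mathrm{E}(Y_{1,j} \mid b_1) = b_{1,0} + b_{1,1}\, t_{1,j}, \\
\text{log(bilirubin):} \quad
  & \mathrm{E}(Y_{2,j} \mid b_2) = b_{2,0} + b_{2,1}\, t_{2,j}, \\
\text{platelet count:} \quad
  & \log \mathrm{E}(Y_{3,j} \mid b_3) = b_{3,0} + b_{3,1}\, t_{3,j}, \\
\text{blood vessel malformations:} \quad
  & \operatorname{logit} \mathrm{P}(Y_{4,j} = 1 \mid b_4) = b_{4,0} + \alpha_{4}\, t_{4,j},
\end{aligned}
\qquad
b = (b_{1,0}, b_{1,1}, b_{2,0}, b_{2,1}, b_{3,0}, b_{3,1}, b_{4,0})^{\top}
  \sim \mathcal{MVN}(\mu, D).
```

Under this reading the random-effects vector has seven components per patient, and the single covariance matrix $D$ captures both within-marker and between-marker correlation.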
FIGURE 1 Observed longitudinal profiles of albumin (mg/dl), log(bilirubin) (log(mg/dl)), platelet counts and blood vessel malformation for patients who are known to be alive at 5 years (Group 0, solid lines) and who die between 2.5 and 5 years (Group 1, dashed lines); the thick lines show fitted mean over time
TABLE 1 Prediction accuracy from leave-one-out cross-validation of random-effects, marginal and conditional prediction for PBC data (columns: Random, Marginal, Conditional)
To predict the group membership of a patient, separate MGLMMs were fit to patients in each group excluding the data of the patient for whom prediction was being made. Table 1 shows the predictive accuracy of this leave-one-out cross-validation study applied to the PBC data. The cut-off was chosen to give the point closest to the top left corner of the ROC curve (Fig. 2) and the predictive accuracies relate to the cut-off reported for each of the three methods. For the PBC data, all three methods give reasonably good prediction of whether or not a patient will be alive without transplant after five years of observation. However, the conditional approach gives worse predictions than the other two approaches, whilst the marginal approach gives the best prediction, with 78% of patients who will die or require transplant correctly identified (sensitivity), 81% of patients who will be alive without transplant correctly identified (specificity) and 81% of patients correctly identified overall. The area under the ROC curve (AUC) summarises the performance of the classification methods over a range of cut-offs and again the marginal prediction approach performs best. A positive predictive value (PPV) of 51% for the marginal approach shows the percentage of patients predicted to die or require transplant who ultimately did die or require transplant, whilst the negative predictive value (NPV) of 94% shows that 94% of patients predicted to be alive without transplant were indeed alive after five years without requiring transplant. Profiles of the longitudinal markers in each group are shown in Figure 1. The thick lines represent the group average profile. Except for the platelet count, the mean group profiles clearly differ between the two groups (i.e. there exist marginal differences between the groups). The variability around the mean group profiles also appears to be different between the groups. These factors explain why the marginal and random-effects approaches give good classification accuracy. Komárek et al. (2010) showed that the random-effects approach gives best prediction when using LoDA on the Dutch Multicenter Primary Biliary Cirrhosis data. However, they have approximately 10 years of follow-up per patient with 13 observations per patient on average (every three months for the first year and then annually after that). We believe that the increased number of observations per patient allowed better estimation of the patient-specific random effects, hence showing the improved prediction accuracy from the random-effects approach. In fact, when we analysed all the Mayo PBC data (average of 7.03 visits per patient), and not just the first 2.5 years of data per patient (3.53 visits per patient) we also observed that the random-effects approach gave better predictive accuracy. This suggests that the random-effects approach can give added information and improvement in classification accuracy, but only if the random effects are precisely estimated.
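The accuracy measures reported for the PBC analysis, and the rule of choosing the cut-off closest to the top left corner of the ROC curve, can be sketched as follows. The function names and the example probabilities are made up for illustration; label 1 stands for "died or required transplant" and label 0 for "alive without transplant".

```python
import math


def accuracy_measures(probs, labels, cutoff):
    """Sensitivity, specificity, PPV, NPV and PCC for one cut-off.

    A patient is predicted to be in group 1 when their probability of
    belonging to group 1 exceeds the cut-off.
    """
    tp = fp = tn = fn = 0
    for prob, label in zip(probs, labels):
        predicted = 1 if prob > cutoff else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp > 0 else float("nan"),
        "npv": tn / (tn + fn) if tn + fn > 0 else float("nan"),
        "pcc": (tp + tn) / len(labels),
    }


def best_cutoff(probs, labels, cutoffs):
    """Cut-off whose ROC point (1 - specificity, sensitivity) is closest to (0, 1)."""
    def distance_to_corner(cutoff):
        m = accuracy_measures(probs, labels, cutoff)
        return math.hypot(1.0 - m["specificity"], 1.0 - m["sensitivity"])
    return min(cutoffs, key=distance_to_corner)


# Made-up group 1 probabilities for eight patients (three true cases).
probs = [0.92, 0.81, 0.36, 0.62, 0.31, 0.22, 0.11, 0.04]
labels = [1, 1, 1, 0, 0, 0, 0, 0]
cutoffs = [i / 20 for i in range(1, 20)]
chosen = best_cutoff(probs, labels, cutoffs)
measures = accuracy_measures(probs, labels, chosen)
```

In a leave-one-out study, `probs` would hold each patient's probability obtained from models fitted without that patient.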

SIMULATION STUDY
In Section 3, we presented an application of multivariate LoDA in which the marginal and the random-effects prediction method gave good predictive accuracy. However, as noted in Section 1, there have been contrasting findings in published studies as to which prediction method is best (Hughes et al., 2016; Morrell et al., 2007, 2011). This suggests that the type of data being considered influences which method will give the best prediction accuracy. To explore this further, we considered simulation scenarios, based on the PBC data, but altered to reflect situations in which we believed the marginal and conditional approaches would lead to the most accurate predictions. We simulate data from 200 patients who are alive after five years without requiring transplant and 50 patients who were alive at 2.5 years but subsequently died or required transplant before five years, approximately reflecting the prevalence of the PBC data. For each patient, we simulated four clinic visits (following Komárek & Komárková, 2013). The first visit occurred at $t = 0$ and the remaining visits were generated from uniform distributions in the intervals (170, 200), (350, 390) and (710, 770) days. This approximates to a visit after six months and then visits at one and two years. To more easily control the simulation differences we consider only a single normal distribution for the random effects (i.e. $K = 1$, no mixture).
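The visit schedule described above is straightforward to generate. The following sketch uses Python's standard `random` module; the function names, the dictionary layout and the seed are choices made here for illustration, not part of the original simulation code.

```python
import random

# Windows (in days) for the second, third and fourth visits.
VISIT_WINDOWS_DAYS = [(170, 200), (350, 390), (710, 770)]


def simulate_visit_days(rng):
    """First visit at day 0, later visits uniform within their windows."""
    return [0.0] + [rng.uniform(low, high) for low, high in VISIT_WINDOWS_DAYS]


def simulate_schedule(n_group0=200, n_group1=50, seed=1):
    """Visit days for every simulated patient, with the group label attached."""
    rng = random.Random(seed)
    patients = []
    for group, n_patients in ((0, n_group0), (1, n_group1)):
        for _ in range(n_patients):
            patients.append({"group": group, "visit_days": simulate_visit_days(rng)})
    return patients


schedule = simulate_schedule()
```

Marker values would then be drawn from the group-specific GLMM at each patient's visit days.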
In each group, marker values were simulated from the appropriate GLMM at each of the four time points for each of the four markers considered in Section 3 (albumin, log(bilirubin), platelet count and blood vessel malformations). The values used to simulate the marker data from a GLMM are given in Table 2. We consider two alternative scenarios in our simulations.
In Scenario 1, we keep the fixed effects parameters and the means of the random effects as they are for the PBC data in both groups, with the only difference being that the random-effects variance-covariance matrix, $D$, is set to be the same in each group. In this setting, the differences between the groups are in the mean profiles and so we would expect the marginal prediction method to give the best prediction. In each group there is approximately the same amount of variability around the group average for each marker. The focus of this simulation scenario is on the marginal differences between groups.
In all the published comparisons of the three prediction approaches, either the random-effects or marginal method has given the most accurate prediction. We are not aware of any studies in which the conditional method is the best. Further, we find it difficult to envisage a situation in which the conditional approach would outperform both the marginal and random-effects approaches simultaneously. We suspect emphasising differences between the marginal profiles in each group would lead to the conditional approach outperforming the random-effects method, but not the marginal method. In contrast, greater differences in the random-effects structure would allow the conditional approach to outperform the marginal approach, but would be unlikely to lead to the conditional approach being better than the random-effects approach.
Morrell et al. (2011) discuss the three approaches and speculate that the conditional approach may work well in the case where the residual error is large in comparison to the random-effects variance. For our second scenario, we investigate this possibility further. Since only the continuous markers have a residual error term, in this scenario we only consider an MGLMM including the continuous markers in our simulation. In this case, the means and variances of the random effects are set to be the same in each group and the only difference is the value of the residual error. This reflects a scenario in which the measurement error in one group is larger than in the other group.

TABLE 2 Parameter estimates for the PBC data and the modifications used for each simulation scenario
For each scenario, we simulated 100 data sets. The MGLMMs in each group were based on 10,000 iterations of 1:10 thinned MCMC after a burn-in of 500 iterations. In each case, leave-one-out cross-validation was used to provide individual patient predictions. MGLMMs were fitted using the GLMM_MCMC function, and LoDA was performed using the GLMM_longitDA2 function from the R package mixAK (Komárek & Komárková, 2014). The reported prediction accuracies and model parameters are based on the averages over 100 simulated data sets. Source code to reproduce the results is available from the corresponding author upon request. Table 3 shows the mean parameter estimates for the MGLMM in each group across 100 simulated data sets. The models fitted to the simulated data sets approximate the true model well, as shown by the low values of bias and MSE for most parameters. The coverage reports the proportion of simulated data sets in which the true model parameter was within the estimated 95% credible interval for the parameter. The random slope variances for the continuous markers are poorly estimated in the simulated data sets. This is shown by the low coverage values of 0.45 and 0.76 in Group 0 and 0.57 and 0.50 in Group 1. We believe this may be due to the fact that the 'true' random-effects variances for the slopes are smaller than the residual error, making them difficult to estimate accurately (Table 2). However, the remaining GLMM parameters are well approximated.
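The summary measures used for the parameter estimates (bias, standard deviation, MSE and coverage) can be computed per parameter as sketched below. The function name, the input layout and the example numbers are illustrative assumptions; in the study the inputs would be the posterior means and 95% credible intervals from each of the 100 simulated data sets.

```python
import math


def summarise_estimates(true_value, estimates, intervals):
    """Bias, SD, MSE and interval coverage for one parameter.

    estimates[i] is the point estimate from simulated data set i and
    intervals[i] is the matching (lower, upper) credible interval.
    """
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    bias = mean_estimate - true_value
    sd = math.sqrt(sum((e - mean_estimate) ** 2 for e in estimates) / (n - 1))
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    coverage = sum(1 for low, high in intervals if low <= true_value <= high) / n
    return {"mean": mean_estimate, "bias": bias, "sd": sd,
            "mse": mse, "coverage": coverage}


# Made-up results from four simulated data sets for a parameter with true value 1.
summary = summarise_estimates(
    true_value=1.0,
    estimates=[0.9, 1.1, 1.0, 1.2],
    intervals=[(0.7, 1.1), (0.9, 1.3), (0.8, 1.2), (1.05, 1.4)],
)
```

A coverage well below 0.95, as reported for the random slope variances, indicates that the credible intervals miss the true value more often than their nominal level suggests.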

Results for Scenario 1
Under Scenario 1, the marginal method gave the best predictive accuracy in terms of AUC, specificity, PCC and PPV (Table 4). The choice of method is not so clear-cut as in Table 1 as the random-effects approach gives the best sensitivity and NPV, although with a much worse specificity. These accuracies were calculated by selecting the optimal cut-off for each simulated data set and averaging the respective sensitivities, specificities, etc. Nevertheless Figure 3, which averages the sensitivity and specificity at each cut-off across the 100 simulated data sets, shows that the marginal approach consistently outperforms the other methods. This is consistent with what we expected since the main differences between the groups were in the fixed effects and expected values of the random effects.
We conclude from Scenario 1 that when the main differences between the groups are in the mean longitudinal evolution, the marginal method will be the best tool to classify patients. This was the case in Brant et al. (2003) and Morrell et al. (2011), where the marginal approach was shown to give the best classification results. The expected PSA level was seen to increase substantially between visits for patients who developed prostate cancer, and so the marginal approach was able to detect a difference between the largely stable PSA profile of healthy patients and the generally increasing longitudinal PSA profiles of patients who would ultimately develop prostate cancer. By contrast, Figure 1 and Tables 2 and 3 of Komárek et al. (2010), in which the random-effects approach gave best prediction, show that although there were some differences between the mean longitudinal profiles of each group, there were also substantial differences in the patient-specific variability around the group mean in each group. Incorporating this additional information (which the random-effects approach does) led to the random-effects approach most accurately identifying patients who would require liver transplant or die. Table 6 shows that the bias, standard deviation and MSE of the estimated parameters were generally very low, demonstrating that each simulated sample approximated the true model well.

Results for Scenario 2
In Scenario 2, the only difference between the two groups is the value of the residual variance ( Table 2). The random-effects approach is unable to detect this difference. In addition, since the residual variance is larger than the random-effects variances, T A B L E 3 Simulation study Scenario 1: Posterior means, highest posterior density (HPD) intervals, bias, standard deviation (SD), mean square error (MSE) and coverage for the fixed and random effects

Note.
These measurements are the average of 100 simulations.
T A B L E 4 Scenario 1 prediction accuracy from leave-one-out cross-validation of random-effects, marginal and conditional prediction; the reported values are the averages over the 100 simulated data sets the model is unable to make accurate estimates of the individual random effects leading to poor prediction (Table 5 and Fig. 4). The poor estimation of the random-effects parameters is also seen in the worse coverage rates in Table 6. The marginal and conditional approaches are still able to make accurate classification of patients with 90% and 89% of patients correctly identified, respectively. It is noticeable however, that even in a situation which we thought would most favour the conditional approach the marginal approach is just as good on all measures of accuracy. Figure 4 shows that whilst the marginal and conditional approaches classify the patients well, the random-effects approach performs little better than chance. According to Sections 4.1 and 4.2 of Komárek et al. (2010), in the case of continuous longitudinal markers, the normal distributions used to calculate group membership probabilities for both conditional and marginal methods use the residual error. For the marginal approach the variance of the multivariate normal distribution is influenced by the residual variance whilst for the conditional approach both mean and variance are affected. The normal distribution for the random-effects approach does T A B L E 6 Simulation study Scenario 2: Posterior means, highest posterior density (HPD) intervals, bias, standard deviation (SD), mean square error (MSE) and coverage for the fixed and random effects not use the residual variance and relies on an estimate of the individual random effects, which we noted above, has been poorly estimated due to the high residual error. This demonstrates why both conditional and marginal methods are able to detect a difference in the residual variance between the groups but the random-effects approach cannot. 
It should be noted that we observed large variation in the prediction accuracy of the random-effects approach across the simulated data sets. This accounts for the fact that the average 'best' sensitivities and specificities in Table 5 are noticeably better than the ROC curve for the random-effects approach in Figure 4 (where sensitivity and specificity are averaged across the 100 data sets at each cut-off). The prediction accuracy of the marginal and conditional approaches was, in contrast, much more stable. This is demonstrated in Figure 5, which shows the sensitivity, specificity, PCC and AUC for each simulated data set under Scenario 2. For each of the measures considered, the values in each simulated data set are very similar for both the marginal and conditional methods. However, the inability of the random-effects approach to correctly estimate the individual patient random effects leads to very unstable estimates of, for example, sensitivity and specificity (a similar effect was observed in Scenario 1). It is noticeable that many of the simulated data sets gave a sensitivity of 1 and a specificity of 0, reflecting the fact that the random-effects approach was unable to distinguish between the two groups. This leads us to conclude that when analysing data in which large measurement error seems likely, researchers should be wary of using the random-effects approach and may wish to focus on the marginal approach.
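The gap between the averaged 'best' values and the averaged ROC curve can be sketched numerically. The example below is illustrative only: the predicted probabilities are synthetic, the helper `accuracy_at_cutoff` is hypothetical, and the 'best' cut-off is assumed to be the one maximising the Youden index (sensitivity + specificity - 1), which may differ from the criterion used in the study.

```python
import numpy as np

def accuracy_at_cutoff(prob_group1, truth, cutoff):
    """Sensitivity, specificity and proportion correctly classified (PCC)
    when patients with prob_group1 >= cutoff are allocated to Group 1."""
    pred = prob_group1 >= cutoff
    sens = np.mean(pred[truth])        # Group 1 patients correctly identified
    spec = np.mean(~pred[~truth])      # Group 0 patients correctly identified
    pcc = np.mean(pred == truth)
    return sens, spec, pcc

rng = np.random.default_rng(0)
cutoffs = np.linspace(0.05, 0.95, 19)
truth = np.r_[np.zeros(200, dtype=bool), np.ones(50, dtype=bool)]

# Synthetic, weakly informative probabilities for 100 "simulated data sets".
youden = np.empty((100, len(cutoffs)))
for s in range(100):
    prob = np.clip(0.5 + 0.05 * truth + rng.normal(0, 0.2, truth.size), 0, 1)
    for k, c in enumerate(cutoffs):
        sens, spec, _ = accuracy_at_cutoff(prob, truth, c)
        youden[s, k] = sens + spec - 1

avg_of_best = youden.max(axis=1).mean()   # pick the best cut-off per data set, then average
best_of_avg = youden.mean(axis=0).max()   # average at each cut-off, then pick the best
print(avg_of_best, best_of_avg)
```

The average of per-data-set maxima can never be smaller than the maximum of the averaged curve, and the difference grows as the per-data-set results become more variable.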

DISCUSSION
In this paper, we have compared three approaches to predicting group membership using LoDA, specifically the marginal, conditional and random-effects approaches. These approaches have been compared previously using a number of real data sets, with contrasting results regarding which approach gives the most accurate prediction. The marginal and random-effects approaches are shown to give the most accurate classification in an application of multivariate LoDA to the real data of the Mayo PBC study. We investigated the three approaches further by way of a simulation study comprising two scenarios designed to favour the marginal and conditional approaches.
When the average profile is noticeably different between prognostic groups then the marginal approach is expected to provide good classification accuracy. However, if the main difference between prognostic groups is dominated by the variability about the mean profile (differences in subject-specific variability across the groups) then the marginal approach is not able to distinguish patients as well and the random-effects approach is expected to work best.
The 95% credible interval coverage for the simulations indicated that for some of the parameters the coverage was considerably below 95%, suggesting poor estimation. On the other hand, a coverage around 99% was observed for some of the random-effects covariance terms, which may have been influenced by (i) the magnitude of the true values, which tend to be small in comparison to the residual error variance and (ii) the fact that we are attempting to fit a reasonably complicated model to fairly small numbers of patients (200 and 50 for Group 0 and Group 1, respectively), and with only four observations per patient. It is possible that over a larger number of simulated data sets, or with more repeated measurements of each marker, more precise credible intervals could be calculated which would in turn influence the coverage.

Although three approaches have been reported in the literature (and compared in this paper), we have been unable to simulate a scenario in which the conditional approach works better than the marginal and random-effects approaches simultaneously. The conditional approach seems to offer little additional value to these two approaches.
There has been insufficient guidance as to which prediction approach to use in applications of LoDA. We suggest that a data analyst first plots longitudinal profiles of their markers for patients in each prognostic group. If there are differences in the group mean profiles and similar between- and within-subject variability between groups, then the marginal approach should be expected to provide the best accuracy.
If, in addition, there seems to be a difference in the level of variability about the group mean in each group, then the random-effects approach is expected to offer additional information leading to more accurate classification. However, if the variability between patients is dominated by a large measurement error, then the random-effects approach should be avoided since estimates of the individual random effects are inaccurate. In such a case, the marginal approach would be preferred. In addition, if there are only a few repeated measurements per patient, it may be that estimates of individual patient random effects are not sufficiently precise to detect differences between the groups. In the case of only a few measurements per patient, we suggest the marginal approach is a good first option.
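Alongside plotting the profiles, an analyst might compute simple per-group summaries before choosing an approach. The sketch below is one possible way to do this and is not part of the original analysis: it assumes a single continuous marker measured at the same visits for every patient with no missing values, and the function `group_variability_summary` and the simulated inputs are hypothetical. The between- and within-subject standard deviations are crude moment-based summaries, not mixed-model estimates.

```python
import numpy as np

def group_variability_summary(y, group):
    """y: (n_patients, n_visits) array for one marker; group: (n_patients,) labels.
    Returns, per group, the mean profile and crude between-/within-subject SDs."""
    out = {}
    for g in np.unique(group):
        yg = y[group == g]
        mean_profile = yg.mean(axis=0)              # group mean at each visit
        dev = yg - mean_profile                     # deviations from the group profile
        subj = dev.mean(axis=1)                     # each patient's average deviation
        within = dev - subj[:, None]                # scatter about the patient's own level
        out[int(g)] = {
            "mean_profile": mean_profile,
            "between_sd": subj.std(ddof=1),
            "within_sd": np.sqrt(np.mean(within.var(axis=1, ddof=1))),
        }
    return out

# Synthetic example: same random-intercept SD, larger residual SD in Group 1.
rng = np.random.default_rng(1)
y0 = rng.normal(0, 1, (200, 1)) + rng.normal(0, 0.5, (200, 4))
y1 = rng.normal(0, 1, (50, 1)) + rng.normal(0, 2.0, (50, 4))
y = np.vstack([y0, y1])
group = np.r_[np.zeros(200, dtype=int), np.ones(50, dtype=int)]

summary = group_variability_summary(y, group)
for g, stats in summary.items():
    print(g, stats["mean_profile"].round(2),
          round(stats["between_sd"], 2), round(stats["within_sd"], 2))
```

A within-subject SD that is large relative to the between-subject SD would, following the guidance above, argue against the random-effects approach; note that the crude between-subject SD is itself inflated by residual noise when there are few visits.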
Further work could consider the effect that group prevalence has on the prediction accuracy of each method. The overall sample size and the number of longitudinal observations per patient may also influence the choice of which approach is preferable (e.g. the random-effects approach relies on having enough data collected to properly characterise subject-specific profiles).