Ben Arancibia and Yoni Sidi

Introduction and Overview of SWE WG

Mixed Models for Repeated Measures - Why is it a Problem?

MMRM Package

Why this is not “yet another package”

Long Term Perspective

Comparing MMRM R Package to SAS - Demo

Closing and Next Steps

- Ben Arancibia: Director of Data Science at GSK sitting within Statistical Data Sciences Innovation Hub
- Yoni Sidi: Director of Modeling and Simulation at Sage Therapeutics

- Open-source software has gained increasing popularity in Biostatistics over the last two decades
- Pros: rapid uptake of novel statistical methods and unprecedented opportunities for collaboration and innovation
- Cons: users face huge variability in software quality, in particular reliability, efficiency and maintainability

- Developing high-quality software with good coding practices, reproducible outputs, and self-sufficient documentation is critical to inform clinical and regulatory decisions

- To deal with the issues of statistical quality assurances for R packages and creating high quality statistical software a group came together to create through the “Software Engineering Working Group” (SWE WG) to look R packages from a statistical point of view

- An official working group of the ASA Biopharmaceutical Section
- Formed in August 2022
- Cross-industry collaboration with more than 30 members from over 20 organizations
- Home page at rconsortium.github.io/asa-biop-swe-wg

**Primary Goal**: Collaborate to engineer R packages that implement important statistical methods to fill in critical gaps**Secondary Goal**: Develop and disseminate best practices for engineering high-quality open-source statistical software

- First R package
`mmrm`

was published on CRAN in October 2022 and updated in December- We aim to estabilish this package as a new standard for fitting mixed models for repeated measures (MMRM)
- We have been developing and adopting best practices for software in the
`mmrm`

package, and open sourced it at github.com/openpharma/mmrm - Currently under active development to add more features

- Mixed Models for Repeated Measures (MMRM) is a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials
- No great R Package - initially thought that the MMRM problem was solved by using a combination of
`lme4`

and`lmerTest`

- Learned that this approach failed on large data sets (slow, did not converge)
`nlme`

does not give Satterthwaite adjusted degrees of freedom, has convergence issues, and with`emmeans`

it is only approximate- tried to extend
`glmmTMB`

to calculate Satterthwaite adjusted degrees of freedom

- First try to improve existing package
- Here we tried to extend
`glmmTMB`

to calculate Satterthwaite adjusted degrees of freedom - But it did not work

- Here we tried to extend
- Think about long term maintenance and responsibility

- Because
`glmmTMB`

is always using a random effects representation, we cannot have a real unstructured model (uses \(\sigma = \varepsilon > 0\) trick) - We only want to fit a fixed effects model with a structured covariance matrix for each subject
- The idea is then to use the Template Model Builder (
`TMB`

) directly - as it is also underlying`glmmTMB`

- but code the exact model we want - We do this by implementing the log-likelihood in
`C++`

using the`TMB`

provided libraries

`TMB`

- Fast
`C++`

framework for defining objective functions (`Rcpp`

would have been alternative interface) - Automatic differentiation of the log-likelihood as a function of the variance parameters
- We get the gradient and Hessian exactly and without additional coding
- This can be used from the R side with the
`TMB`

interface and plugged into optimizers

- Ongoing maintenance and support from the pharmaceutical industry that is supported by American Statistical Association
- Package is part of the mission, but to emphasize our goal is to push out information on practices for engineering high-quality open-source statistical software

To run an MMRM model in SAS it is recommended to use either the PROC MIXED or PROC GLM procedures.

Less model assumptions are applied in PROC MIXED, primarily how one treats missingness.

We will compares the PROC MIXED procedure to the

`{mmrm}`

package in the following attributes:

- Documentation
- Unit Testing
- Estimation Methods

- Covariance structures
- Degrees of Freedom
- Contrasts

Both languages have online documentation of the technical details of the estimation and degrees of freedom methods and the different covariance structures available.

`{mmrm}`

`PROC MIXED`

One major advantage of the `{mmrm}`

over `PROC MIXED`

is that the unit testing in `{mmrm}`

is transparent. It uses the `{testthat}`

framework with `{covr}`

to communicate the testing coverage. Unit tests can be found in the GitHub repository under ./tests.

**Note**

The integration tests in `{mmrm}`

are set to a tolerance of 10e^-3 when compared to SAS outputs.

Method | `{mmrm}` |
`PROC MIXED` |
---|---|---|

ML | X | X |

REML | X | X |

- SAS has 23 non-spatial covariance structures, while mmrm has 10.
- 9 structures intersect with SAS
- Ante-dependence (homogeneous) is only in the mmrm package.

- SAS has 14 spatial covariance structures compared to the spatial exponential one available in mmrm.

**Tip**

For users that need more structure `{mmrm}`

is easily extensible via feature requests in the GitHub repository.

Covariance structures | `{mmrm}` |
`PROC MIXED` |
---|---|---|

Unstructured (Unweighted/Weighted) | X/X | X/X |

Toeplitz (hetero/homo) | X/X | X/X |

Compound symmetry (hetero/homo) | X/X | X/X |

Auto-regressive (hetero/homo) | X/X | X/X |

Ante-dependence (hetero/homo) | X/X | X |

Spatial exponential | X | X |

Method | `{mmrm}` |
`PROC MIXED` |
---|---|---|

Contain | X* | X |

Between/Within | X* | X |

Residual | X* | X |

Satterthwaite | X | X |

Kenward-Roger | X | X |

Kenward-Roger (Linear)** | X | X |

*Available through the `emmeans`

package.

**This is not equivalent to the KR2 setting in `PROC MIXED`

Contrasts and LSMeans estimates are available in `mmrm`

using

`mmrm::df_1d`

,`mmrm::df_md`

- S3 method that is compatible with
`emmeans`

- LS means difference can be produced through emmeans (
`pairs`

method) - Degrees of freedom method is passed from
`mmrm`

to`emmeans`

- By default
`PROC MIXED`

and`mmrm`

do not adjust for multiplicity, whereas`emmeans`

does.

**Note**

These are comparable to the LSMEANS statement in `PROC MIXED`

.

- Software engineering is a critical competence in producing high-quality statistical software
- A lot of work needs to be done regarding the establishment, dissemination and adoption of best practices for engineering open-source software
- Improving the way software engineering is done will help improve the efficiency, reliability and innovation within Biostatistics

- Prepare public training materials to disseminate best practice for software engineering in the Biostatistics community
- At the beginning of February, a face-to-face workshop will take place in Basel, Switzerland with a focus on open-source software for clinical trials
- Organize conference sessions with a focus on statistical software engineering at CEN, JSM and ASA/FDA Workshop
- Video series on best practices for software engineering (link)

- sasr
- HTA
- Bayesian MMRM

**Note**

Have an interest in working on these topics? Come work with us, information on the SWE WG can be found here: ASA BIOP SWE WG