---
title: "An introduction to MetaInsight"
vignette: >
  %\VignetteIndexEntry{An introduction to MetaInsight}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  out.width = "300px",
  fig.width =  1.5625, # 150px / 96dpi
  fig.height =  1.5625, # makes 300 px plots
  dpi = 96,
  fig.align = "center"
)
```

```{r eval = TRUE, echo = FALSE}
show_plot <- function(svg){
  old_mar <- par()["mar"]
  on.exit(par(old_mar))
  par(mar=rep(0,4))
  charToRaw(svg) |> 
  rsvg::rsvg_png(width = 300) |> 
  magick::image_read() |>
  plot()
}
```

The **metainsight** package contains functions to conduct network meta-analyses (NMA). 
NMA can compare treatments between studies even when no direct comparisons have been made and help determine the 'best' intervention for patients. 
Options are provided for conducting frequentist NMA, Bayesian NMA and Bayesian meta-regression including baseline risk can all be conducted using a single data format and consistent terminology.
MetaInsight is primarily designed to be used through the Shiny app, but the functions can also be used outside of the app and provide a consistent and simple-to-use interface to the underlying analytical packages ([bnma](https://cran.r-project.org/package=bnma), [gemtc](https://cran.r-project.org/package=gemtc) and [netmeta](https://cran.r-project.org/package=netmeta), with additional functionality from [meta](https://cran.r-project.org/package=meta) and [metafor](https://cran.r-project.org/package=metafor)).

## Installation

Install **metainsight** using:

```r
install.packages('metainsight')
```

or from GitHub using:
  
```r
remotes::install_github("CRSU-Apps/metainsight")
```

## Running the app

To run the app use:

```{r echo = TRUE, eval=FALSE}
library(metainsight)
run_metainsight()
```

If you have a saved file from the app, you can restore the app state using:

```r
run_metainsight(load_file = "<path to file>")
```

## Plots

Most plots are produced as scalable vector graphics (svg) which has the advantage of being consistent when they are saved to different formats and avoids you having to manually adjust plot dimensions depending on your dataset. 
When plotting functions are called in an rmarkdown or quarto chunk, the plots will display as images, but if called from the console they will produce code. To show them as images via the console, pipe them to `htmltools::browsable` and they will be visible in the Viewer panel of RStudio. 
To save plots to file, use the `write_plot` function which supports exporting as png, pdf or svg. For example:

```{r write_plot, echo = TRUE, eval = FALSE}
summary_study(configured_data) |> write_plot("summary_study_plot.pdf")
```

In this vignette, to keep the file size small, a different function called `show_plot` is used to display plots, which is not part of the package.

Plots produced by `*_deviance` and `*_mcmc` functions are not produced in this way, partly because they are not intended for publication and also because the deviance plots are interactive inside the app.

## Conducting an analysis

Functions are categorised depending on the stage in the analysis and type of analysis being conducted and match the tabs at the top of the app.
For all analyses, the data must be loaded and configured prior to producing outputs.

### Setup

Two example datasets are built into the package for either a binary or continuous outcome. 
We will use the built in continuous dataset for these examples which are loaded when `data_path` is not provided to the `setup_load` function. 
Data can be provided either in a long format, where each row is a study arm or a wide format, where each row is a study. 
Studies must contain a minimum of two arms and share a treatment with another study.

For continuous outcomes, long data should contain the columns:

- `Study` - an identifier, e.g. author and year
- `T` - name of treatment
- `N` - number of participants 
- `Mean` - mean value of the outcome
- `SD` - standard deviation of the outcome. 

Wide data for continuous outcomes should contain at least the following columns where the number refers to the arm of the study and extra columns should be added depending on the number of arms. 
For studies with fewer arms, the later-numbered columns should be left blank or set as `NA`.

- `Study`
- `N.1` 
- `N.2` 
- `Mean.1`
- `Mean.2`
- `SD.1`
- `SD.2`

For binary outcomes, long data should contain the columns: 

- `Study`
- `T`
- `N` (as for continuous data)
- `R` - the number of participants with the outcome of interest. 

Wide data for binary outcomes follows the same convention and should contain at least the following columns: 

- `Study`
- `T.1` 
- `T.2`
- `R.1`
- `R.2`
- `N.1`
- `N.2`

Additionally, a `covar.<name>` column can be added to all formats containing covariate data where `<name>` should be replaced with the name of the covariate. 
For long data, covariate values must be equal for every study arm within a study. 
Risk of bias data can also be included with all columns containing values ranging from 1 (low risk) to 3 (high risk): `rob` for the overall risk of bias, `indirectness` for indirectness and `rob.<name>` for up to ten individual components.

`setup_load` checks that the data is valid and reports any errors that make it unsuitable to analyse.

The function returns a list of objects, with the uploaded data in the `data` object.

```{r setup_load}
library(metainsight)

loaded_data <- setup_load(outcome = "continuous")

loaded_data$data[1:6, 1:7] |> knitr::kable()
```

After loading the data, the analysis needs to be configured using the `setup_configure` function which has the parameters:

- `reference_treatment`: One of the treatments in the dataset which the others will be compared to. Typically placebo or standard care but can be any treatment.
- `effects`: What type of models to fit, either `fixed` or `random`.
- `outcome_measure`: The outcome measure to use in the analysis. For binary outcomes this can be `OR` (odds ratio), `RR` (relative risk) or `RD` (risk difference) and for continuous outcomes this can be `MD` (mean difference) or `SMD` (standardised mean difference). The outcome measure chosen has an impact on the rest of the analysis as not all analyses can be conducted for all outcome measures. All analyses of continuous outcomes can be conducted using `MD` and all analyses of  binary outcomes can be conducted using `OR`.
- `ranking_option`: Whether low outcome values are `good` (for example blood pressure when treating hypertension) or `bad` (for example bone density when treating osteoporosis)
- `seed`: A value used to generate random numbers that ensures the analyses are reproducible. 

If your dataset contains studies which are not connected to the others (i.e. do not share a common treatment), they will be removed from the analysis. 

The result of `configured_data` can be passed to all the `summary_*` and `freq_*` functions and to the `baseline_model`, `bayes_model`, `bayes_nodesplit` and `covariate_model` functions.

```{r setup_configure}
configured_data <- setup_configure(loaded_data = loaded_data,
                                   reference_treatment = "Placebo",
                                   effects = "random",
                                   outcome_measure = "MD",
                                   ranking_option = "good",
                                   seed = 123)
```

Studies can be excluded from the analysis using `setup_exclude` and the resulting object can be passed to the same functions as the configured data. 
`setup_configure` sanitises the study names to ensure they are suitable for analysis by removing special characters and spaces and these must be used in `setup_exclude`. 
To view the sanitised study names use `unique(configured_data$connected_data$Study)`

```{r setup_exclude}
subsetted_data <- setup_exclude(configured_data = configured_data, 
                                exclusions = c("Study01", "Study02"))
```

### Summarising the data

The `summary_char` function produces three tables that characterise the data. 
`network` contains information about the network, `treatments` contains information about each treatment and `pairs` contains information about each treatment pair. 

```{r summary_char}
characteristics <- summary_char(configured_data)

characteristics$network |> knitr::kable()
```

The `summary_study` function produces a forest plot showing the results of each study. 
If risk of bias data was uploaded this is also presented. 
An interactive version of this plot can be used to exclude studies from the analysis in the app.

```{r summary_study}
summary_study(configured_data) |> show_plot()
```

The `summary_network` function produces a diagram of the network, visualising how many studies contain each treatment. 
Two styles are available - `netplot` which shows the number of studies as the size of nodes and edges and `netgraph` which shows the number of studies on the edges.

```{r summary_network}
summary_network(configured_data, style = "netplot") |> show_plot()
```

### Frequentist network meta-analysis

Frequentist analyses use netmeta and the model is fitted in `setup_configure` so there is no explicit function to fit the model.
Outputs are generated using the `freq_*` functions.
`freq_forest` produces a forest plot showing the treatment effects relative to the reference treatment.

```{r freq_forest}
freq_forest(configured_data) |> show_plot()
```

`freq_compare` produces a matrix of all treatment comparisons. Above the leading diagonal are estimates from pairwise meta-analyses, below the leading diagonal are estimates from network meta-analyses:

```{r freq_compare}
freq_compare(configured_data) |> knitr::kable()
```

`freq_inconsistency` summarises the direct and indirect evidence for each treatment pair and any inconsistencies between the direct and indirect evidence. 
For this dataset where all studies were only comparing with the reference treatment, for each treatment pair the evidence is entirely direct or indirect so no differences occur.

```{r freq_inconsistency}
freq_inconsistency(configured_data) |> knitr::kable()
```

`freq_summary` summarises all the results in a graphical format, ranking the treatments by their effect on the outcome. 

```{r freq_summary}
freq_summary(configured_data) |> show_plot()
```

### Bayesian network meta-analysis

Bayesian analyses use gemtc and all the functions have the `bayes_*` prefix. To fit the model pass the configured data to `bayes_model`. To make the model fit more quickly, `n_adapt` and `n_iter` are set very low but in actual usage can be unspecified so that the default values are used:

```{r bayes_model}
fitted_bayes_model <- bayes_model(configured_data,
                                  n_adapt = 100,
                                  n_iter = 100)
```

Some outputs are directly equivalent to the frequentist outputs, for example `bayes_forest` and `bayes_compare`. 
Outputs are generated by passing the fitted model to the functions. 

Two plots are available for displaying treatment rankings based on SUCRA scores, either the `rankogram` which displays treatments rankings as a line chart or `radial` which displays the rankings on a radar plot, overlayed with the network structure.

```{r bayes_ranking}
ranking_data <- bayes_ranking(fitted_bayes_model, configured_data)
ranking_plot(ranking_data, "rankogram") |> show_plot()
```

Model diagnostic plots are produced by `bayes_mcmc` to aid with assessing model convergence. Three types of plots are produced and one of each is produced per parameter of the model. The plots are not displayed here for brevity.

```{r bayes_mcmc, eval = FALSE}
mcmc_plots <- bayes_mcmc(fitted_bayes_model)

mcmc_plots$gelman_plots
mcmc_plots$trace_plots
mcmc_plots$density_plots
```

Deviance plots are produced by `bayes_deviance` that show the effect of individual studies on the fitted model. These plots are interactive and hovering over the points displays the study name, aiding their exclusion from sensitivity analyses. They are also not displayed here.

```{r bayes_deviance, eval = FALSE}
deviance_plots <- bayes_deviance(fitted_bayes_model)

deviance_plots$scat_plot
deviance_plots$stem_plot
deviance_plots$lev_plot
```

#### Nodesplitting models

Nodesplitting models test the consistency of direct and indirect evidence for each treatment pair within the network. 
These models cannot be fitted to every dataset and the function will return an error when it is not possible.
Nodesplitting is not possible with the default dataset because all the evidence for each treatment pair is either entirely direct or indirect so a different bundled dataset is used in this example:

```{r bayes_nodesplit}
nodesplit_path <- system.file("extdata", 
                              "continuous_nodesplit.csv", 
                              package = "metainsight")

nodesplit_loaded_data <- setup_load(data_path = nodesplit_path,
                                    outcome = "continuous")

nodesplit_configured_data <- setup_configure(loaded_data = nodesplit_loaded_data,
                                             reference_treatment = "Placebo",
                                             effects = "random",
                                             outcome_measure = "MD",
                                             ranking_option = "good",
                                             seed = 123)

nodesplit_model <- bayes_nodesplit(nodesplit_configured_data, 
                                   n_adapt = 100,
                                   n_iter = 100)

bayes_nodesplit_plot(nodesplit_model) |> show_plot()
```

### Bayesian network meta-regression with a covariate

If a covariate is included in the loaded data, the `covariate_` functions can be used to assess whether the covariate affects the outcome. 
As for the Bayesian NMA models, these models are fitted with gemtc, but two parameters are required in addition to the configured data. 
The `covariate_value` determines the covariate value at which to fit the model and this value must be within the range of covariate values in the data. 
The `regressor_type` specifies how slopes are fitted for each treatment. When `shared` the regression coefficients are the same for all treatment comparisons, when `exchangeable` they differ but all come from a shared distribution and when  `unrelated` they all differ.

```{r covariate_model}
fitted_covariate_model <- covariate_model(configured_data,
                                          covariate_value = 50,
                                          regressor_type = "shared",
                                          n_adapt = 100,
                                          n_iter = 100)
```

The results can be summarised using `metaregression_plot` which first requires that the data is processed further with `covariate_regression`. The `comparators` are the treatments to compare against the reference treatment.

```{r covariate_regression}
regression_data <- covariate_regression(fitted_covariate_model, 
                                        configured_data)

metaregression_plot(fitted_covariate_model,
                    configured_data,
                    regression_data,
                    comparators = c("Glucocorticoids",
                                    "Ketamine",
                                    "Gabapentinoids")) |> show_plot()
```

Other outputs are directly equivalent to those produced for with the `bayes_*` functions and the `covariate_*` functions have the same suffixes. Nodesplitting is not currently available for these models.

### Bayesian baseline-risk network meta-regression

Baseline risk adjusts for an approximation of the baseline health of participants in the studies by including the expected outcome in the reference arm as a covariate. This outcome is imputed for studies that did not include the reference treatment.

These models are fitted with bnma but the outputs are converted into a format consistent with gemtc. 
All the functions use the `baseline_*` prefix.

The data can be summarised with `baseline_summary`:

```{r baseline_summary}
baseline_summary(configured_data) |> show_plot()
```

Fitting the model with `baseline_model` uses the same syntax as `covariate_model` but no covariate value is specified as this is a property of the dataset. `n_iter`, `max_iter` and `check_iter` are set very low but in actual use should be unspecified so as to use the default values.

```{r baseline_model}
fitted_baseline_model <- baseline_model(configured_data,
                                        regressor_type = "shared",
                                        n_iter = 120,
                                        max_iter = 120,
                                        check_iter = 12)
```

### Exporting to CINeMA

Models fitted in metainsight can be exported into a format that can be uploaded to <a href="https://cinema.med.auth.gr/" target="_blank">CINeMA</a> (Confidence in Network Meta-Analysis). 
Currently, only the frequentist and Bayesian NMA models are supported.

```{r export_cinema}
cinema_project <- export_cinema(configured_data)
writeLines(cinema_project, tempfile(fileext = ".json"))
```

### App data

All the data can be downloaded from the app and advanced users may wish to conduct further analyses. 
This section describes the structure and where important objects can be found.
The data are saved as an rds file which can be read with `readRDS` and following the convention internal to the app, we name this object `common`:

```{r eval = FALSE}
common <- readRDS("<path to file>")
```

Within this object, data is structured as follows:

- `common$configured_data` contains the configured data
- `common$subsetted_data` contains the configured data but with the selected studies excluded
- Within both of these objects `freq` contains the frequentist analyses, with `freq$netmeta` containing the results from `netmeta::netmeta` and `pairwise` containing the results from `meta::pairwise`.
- The other models are stored as separate objects in `common` and follow a consistent pattern where `_all` refers to models made fitted to `configured_data` and `_sub` refers to models fitted to `subsetted_data`: 
  - `bayes_model_all`
  - `bayes_model_sub`
  - `bayes_nodesplit_all`
  - `bayes_nodesplit_sub`
  - `covariate_model_all`
  - `covariate_model_sub`
  - `baseline_model_all`
  - `baseline_model_sub`
  
Other objects are also stored and you can see the full list using `names(common)`.
 



