---
title: "Plate Randomization Vignette"
author: "Olink DS team"
date: "`r Sys.Date()`"
output:
  html_vignette:
    toc: true
    toc_depth: 3
    includes:
      in_header: ../man/figures/logo.html

vignette: >
  %\VignetteIndexEntry{Plate Randomization Vignette}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(warning = FALSE,
                      fig.height = 6,
                      fig.width = 6,
                      message = FALSE)
```

## Introduction

This vignette describes how to use Olink® Analyze to randomize samples across
plates, with an option to keep subjects on the same plate. When a study is well
randomized, the experimental variables can be considered evenly distributed
across plates, even in a larger study. Which variables to randomize for should
be decided for each study, as these vary with the study purpose. Correct sample
randomization strengthens your study and minimizes the risk of introducing bias
that can confound downstream analyses. If at all possible, samples taken from
the same subject at different time points in a longitudinal study should all be
placed on the same plate to further limit variation in the data. The subjects
should then be distributed so that the remaining experimental variables are
evenly dispersed between plates. If randomization is not performed, true
biological variation may be missed or misidentified as, for example, technical
variation. In practice, it is rarely possible to control for every variable.
Most of the time, a complete randomization is carried out, and the outcome is
assessed by visualizing the plate layouts. In this vignette, you will learn how
to do this using the `olink_plate_randomizer()` function. You will also learn
how to use `olink_displayPlateLayout()` and `olink_displayPlateDistributions()`
to evaluate the randomization based on a given grouping variable.

## Sample randomization

The input manifest should be a tibble/data frame in long format containing all
sample IDs. The sample ID column must be named `SampleID`. An example manifest
is shown in Table 1.

```{r manifest, echo = FALSE, message = FALSE}
OlinkAnalyze::manifest |>
  dplyr::slice_head(
    n = 10L
  ) |>
  knitr::kable(
    format = "html",
    booktabs = TRUE,
    linesep = "",
    digits = 4L,
    longtable = TRUE,
    row.names = FALSE,
    caption = "Table 1. Example of Sample manifest",
    label = "manifest"
  )
```
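
The manifest requirements described above can be checked before randomizing. The
following is a minimal sketch in base R. `check_manifest()` and the toy manifest
are hypothetical and not part of Olink® Analyze, and flagging duplicated sample
IDs is an assumption made here rather than a documented requirement.

```{r, eval = FALSE}
# Hypothetical helper (not part of Olink Analyze): basic manifest checks.
check_manifest <- function(manifest) {
  stopifnot(is.data.frame(manifest))
  if (!"SampleID" %in% names(manifest)) {
    stop("The manifest must contain a column named 'SampleID'.")
  }
  ids <- as.character(manifest[["SampleID"]])
  list(
    n_samples = nrow(manifest),
    n_missing_ids = sum(is.na(ids)),
    # IDs that occur more than once (assumed to be undesirable)
    duplicated_ids = unique(ids[duplicated(ids) & !is.na(ids)])
  )
}

# Toy manifest, made up for illustration only.
toy_manifest <- data.frame(
  SampleID = c("S1", "S2", "S3", "S3"),
  SubjectID = c("A", "A", "B", "B")
)

manifest_check <- check_manifest(toy_manifest)
manifest_check$duplicated_ids
```

For the toy manifest, the only duplicated ID reported is `"S3"`. The same call
could be applied to your own manifest before passing it to
`olink_plate_randomizer()`.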

### Complete randomization

The simplest way to randomize the samples is to perform complete randomization.

```{r complete_randomization, message = FALSE, warning = FALSE, results = 'hide'}
randomized_manifest <- OlinkAnalyze::olink_plate_randomizer(
  Manifest = OlinkAnalyze::manifest,
  seed = 123456
)
```

### Generate randomization scheme that keeps subjects on the same plate

In the case of multiple samples per subject (e.g. in longitudinal studies),
Olink recommends keeping each subject on the same plate to further reduce
variation in the data. The individuals should then be distributed so that the
remaining experimental variables of interest are evenly distributed across
plates. This could be achieved by using the `SubjectColumn` argument. The
`SubjectColumn` argument must correspond to a column within the manifest and
cannot be SampleID. If `SubjectColumn` is specified, every SampleID must have a
value in the subject column, even if there is only 1 sample for that subject.

However, if there are too many samples per subject (>8), complete randomization
is recommended. The plate size can either be set directly to 48 or 96 using the
`PlateSize` argument, or be inferred by setting the `Product` argument to
"Target 96", "Target 48", "Explore 384", "Explore 3072", or "Explore HT". The
number of Olink external controls can be set using the `num_ctrl` argument,
which defaults to 8 controls. The Olink external controls can also be randomized
throughout the plate by setting `rand_ctrl = TRUE`.

```{r randomize_subject, message = FALSE, eval = TRUE, echo = TRUE}
# Example longitudinal randomization with subjects kept on the same plate.
randomized_manifest_subject <- OlinkAnalyze::olink_plate_randomizer(
  Manifest = OlinkAnalyze::manifest,
  Product = "Explore HT",
  SubjectColumn = "SubjectID",
  num_ctrl = 10L
)
```

```{r randomize_controls, message = FALSE, eval = FALSE, echo = TRUE}
# Randomize Olink external control samples.
randomized_manifest_random <- OlinkAnalyze::olink_plate_randomizer(
  Manifest = OlinkAnalyze::manifest,
  Product = "Explore HT",
  SubjectColumn = "SubjectID",
  num_ctrl = 10L,
  rand_ctrl = TRUE
)
```
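
Before using `SubjectColumn`, it can be useful to confirm that the subject
column has no missing values and to see which subjects exceed the 8 samples
mentioned above. The sketch below uses base R only. `check_subject_column()`,
its `max_per_subject` parameter, and the toy data are hypothetical and not part
of Olink® Analyze.

```{r, eval = FALSE}
# Hypothetical helper (not part of Olink Analyze): summarise a candidate
# subject column before randomization.
check_subject_column <- function(manifest, subject_col,
                                 max_per_subject = 8L) {
  stopifnot(
    is.data.frame(manifest),
    subject_col %in% names(manifest),
    subject_col != "SampleID"
  )
  subjects <- as.character(manifest[[subject_col]])
  per_subject <- table(subjects) # NA values are dropped by table()
  list(
    n_missing = sum(is.na(subjects)),
    over_limit = names(per_subject)[as.vector(per_subject) > max_per_subject]
  )
}

# Toy manifest: subject A has 9 samples and one sample has no subject.
toy_subjects <- data.frame(
  SampleID = paste0("S", 1:12),
  SubjectID = c(rep("A", 9), rep("B", 2), NA)
)

subject_check <- check_subject_column(toy_subjects, "SubjectID")
```

Here `subject_check$n_missing` is 1 and `subject_check$over_limit` contains
`"A"`, which would suggest filling in the missing subject and considering
complete randomization instead.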


```{r randomized_controls_table, echo = FALSE, message = FALSE, warning = FALSE}
randomized_manifest_subject |>
  dplyr::slice_head(
    n = 10L
  ) |>
  knitr::kable(
    format = "html",
    booktabs = TRUE,
    linesep = "",
    digits = 4L,
    longtable = TRUE,
    row.names = FALSE,
    caption = paste0(
      "Table 2. Example of Randomized Sample manifest ",
      "with subjects kept on the same plate."
    ),
    label = "manifest"
  )
```
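
Whether subjects really ended up on a single plate can also be checked
numerically. The following sketch assumes only that the randomized output has a
`plate` column and a subject column; `subjects_split_across_plates()` and the
toy layout are hypothetical and not part of Olink® Analyze.

```{r, eval = FALSE}
# Hypothetical helper (not part of Olink Analyze): list subjects whose
# samples are spread over more than one plate.
subjects_split_across_plates <- function(randomized,
                                         subject_col = "SubjectID") {
  stopifnot(
    is.data.frame(randomized),
    all(c(subject_col, "plate") %in% names(randomized))
  )
  # Number of distinct plates per subject; rows without a subject are ignored.
  plates_per_subject <- tapply(
    as.character(randomized[["plate"]]),
    as.character(randomized[[subject_col]]),
    function(p) length(unique(p))
  )
  names(plates_per_subject)[as.vector(plates_per_subject) > 1L]
}

# Toy layout, made up for illustration: subject B is split over two plates.
toy_layout <- data.frame(
  SubjectID = c("A", "A", "B", "B", "C"),
  plate = c("Plate 1", "Plate 1", "Plate 1", "Plate 2", "Plate 2")
)

split_subjects <- subjects_split_across_plates(toy_layout)
```

For the toy layout the result is `"B"`. Applied to a manifest randomized with
`SubjectColumn`, an empty result would indicate that every subject was kept on
one plate.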

The number of samples on each plate can be specified via the `available.spots`
argument. For example, the following code will lead to 48 samples on the first
and the second plate, and 42 samples on the third plate. The number of
iterations for fitting subjects on the same plate can be set with the
`iterations` argument. Increasing the number of iterations gives the algorithm
additional randomization attempts to satisfy the constraints, at the expense of
increased computation time. The randomization can be reproduced by setting the
`seed` argument to a specific number.

```{r subjects_randomization, message = FALSE, results = 'hide'}
randomized_manifest_num <- OlinkAnalyze::olink_plate_randomizer(
  Manifest = OlinkAnalyze::manifest,
  SubjectColumn = "SubjectID",
  available.spots = c(48L, 48L, 42L),
  iterations = 500L,
  seed = 123456
)
```
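
When choosing values for `available.spots`, the total number of spots has to be
at least the number of samples. The arithmetic can be sketched as below.
`spots_summary()` is a hypothetical helper, not part of Olink® Analyze, and the
138 samples are simply the total implied by the example above.

```{r, eval = FALSE}
# Hypothetical helper (not part of Olink Analyze): compare plate capacity
# with the number of samples to place.
spots_summary <- function(n_samples, available_spots) {
  stopifnot(n_samples >= 0, all(available_spots >= 0))
  total_spots <- sum(available_spots)
  list(
    total_spots = total_spots,
    unused_spots = total_spots - n_samples,
    enough_spots = total_spots >= n_samples
  )
}

# 48 + 48 + 42 = 138 spots for an assumed 138 samples.
spots_check <- spots_summary(
  n_samples = 138L,
  available_spots = c(48L, 48L, 42L)
)
```

With a real manifest, `n_samples` would be the number of rows in the manifest.
After randomization, `table(randomized_manifest_num$plate)` shows the number of
rows assigned to each plate.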

## Multi-study manifests

In the case of a manifest that includes multiple studies, the optional `study`
argument can be used to group samples by study and randomize the samples within
the study. In the example below, "Site" is used as a surrogate for study. After
randomization the sites are kept together across plates.

```{r}
randomized_manifest_multi <- OlinkAnalyze::olink_plate_randomizer(
  Manifest = OlinkAnalyze::manifest,
  study = "Site"
)
```

## Visualization

To assess the quality of the randomization, both the
`olink_displayPlateLayout()` and `olink_displayPlateDistributions()` functions
can be used.

### Plate layouts

The `olink_displayPlateLayout()` function can be used to visualize the layout
of the plates, with wells colored by the variable of interest specified in the
`fill.color` argument.

```{r figure_captions, message = FALSE, echo = FALSE}
fcap1 <- paste0(
  "Figure 1. Randomized samples in 96 well plate format, ",
  "colored by Subject ID."
)
fcap2 <- paste0(
  "Figure 2. Randomized samples in 96 well plate format ",
  "with labeled wells."
)
fcap3 <- paste0(
  "Figure 3. Randomized samples in a 96-well plate format, ",
  "with the number of samples per plate pre-defined."
)
fcap4 <- paste0(
  "Figure 4. Randomized samples in 96 well plate format, ",
  "colored by Studies."
)
fcap5 <- "Figure 5. Distribution of Subject ID across randomized plates."
fcap6 <- "Figure 6. Distribution of Site across randomized plates."
```

```{r, fig.height = 6, fig.width = 7.5, fig.align = "center", fig.cap = fcap1}
OlinkAnalyze::olink_displayPlateLayout(
  data = randomized_manifest,
  fill.color = "SubjectID",
  include.label = FALSE
)
```

The label of the colored variable could be shown in the plot via the
`include.label` argument.

```{r, fig.height = 6, fig.width = 7.5, fig.align = "center", fig.cap = fcap2}
OlinkAnalyze::olink_displayPlateLayout(
  data = randomized_manifest,
  fill.color = "SubjectID",
  include.label = TRUE
)
```

```{r, fig.height = 6, fig.width = 7.5, fig.align = "center", fig.cap = fcap3}
OlinkAnalyze::olink_displayPlateLayout(
  data = randomized_manifest_num,
  fill.color = "SubjectID",
  include.label = FALSE
)
```

```{r, fig.height = 6, fig.width = 7.5, fig.align = "center", fig.cap = fcap4}
OlinkAnalyze::olink_displayPlateLayout(
  data = randomized_manifest_multi,
  fill.color = "Site"
)
```

### Plate distribution

The distribution of a given grouping variable on each plate can be visualized
as a bar chart using the `olink_displayPlateDistributions()` function. By
setting `fill.color = 'SubjectID'`, we can check whether all the samples from
the same subject were placed on the same plate.

```{r, fig.align = "center", fig.cap = fcap5}
OlinkAnalyze::olink_displayPlateDistributions(
  data = randomized_manifest,
  fill.color = "SubjectID"
)
```

We could also check the distribution of other variables. For example, the
distribution of the variable `Site` could be visualized by setting
`fill.color = 'Site'`.

```{r, fig.align = "center", fig.cap= fcap6}
OlinkAnalyze::olink_displayPlateDistributions(
  data = randomized_manifest,
  fill.color = "Site"
)
```
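
The counts behind such a bar chart can also be tabulated directly. The sketch
below assumes only a `plate` column and a grouping column in the randomized
output; `plate_distribution()` and the toy data are hypothetical and not part of
Olink® Analyze.

```{r, eval = FALSE}
# Hypothetical helper (not part of Olink Analyze): counts and within-plate
# proportions of a grouping variable.
plate_distribution <- function(randomized, group_col) {
  stopifnot(
    is.data.frame(randomized),
    all(c(group_col, "plate") %in% names(randomized))
  )
  counts <- table(
    randomized[[group_col]],
    randomized[["plate"]],
    dnn = c(group_col, "plate")
  )
  list(
    counts = counts,
    # margin = 2: proportions are computed within each plate (column)
    proportions = round(prop.table(counts, margin = 2L), 2)
  )
}

# Toy layout, made up for illustration: two sites balanced over two plates.
toy_sites <- data.frame(
  Site = c("X", "X", "Y", "Y"),
  plate = c("Plate 1", "Plate 2", "Plate 1", "Plate 2")
)

site_distribution <- plate_distribution(toy_sites, "Site")
```

In the toy data each site makes up half of each plate, so every entry of
`site_distribution$proportions` is 0.5. Large differences in these proportions
between plates would point to an uneven distribution of that variable.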

If the randomization algorithm fails to converge, or the resulting layout is
not satisfactory, the randomization can be run again with different settings
for `seed` and `iterations`.

## Data Output

The output of the randomization function is a tibble in which SampleID,
SubjectID etc. are assigned to well positions. It includes the original manifest
columns with the following additional ones:

  - plate: Plate number

  - column: Column on the plate

  - row: Row on the plate

  - well: Well location on the plate

The randomized manifest can be exported as an Excel spreadsheet using the
function `write_xlsx()` from the R package `writexl`.

```{r, eval = FALSE}
writexl::write_xlsx(
  x = randomized_manifest,
  path = "randomized.manifest.xlsx"
)
```

## Contact Us

We are always happy to help. Email us with any questions:

-   biostat\@olink.com for statistical services and general stats
    questions

-   support\@olink.com for Olink lab product and technical support

-   info\@olink.com for more information

## Legal Disclaimer

© `r format(Sys.Date(), "%Y")` Olink Proteomics AB, part of Thermo Fisher
Scientific.

Olink products and services are For Research Use Only. Not for use in diagnostic
procedures.

All information in this document is subject to change without notice. This
document is not intended to convey any warranties, representations and/or
recommendations of any kind, unless such warranties, representations and/or
recommendations are explicitly stated.

Olink assumes no liability arising from a prospective reader’s actions based on
this document.

OLINK, NPX, PEA, PROXIMITY EXTENSION, INSIGHT and the Olink logotype are
trademarks registered, or pending registration, by Olink Proteomics AB. All
third-party trademarks are the property of their respective owners.

Olink products and assay methods are covered by several patents and patent
applications [https://www.olink.com/patents/](https://olink.com/patents/).
