An Introduction to BFI

Hassan Pazira

Marianne Jonker

2024-04-25

Overview

The R package BFI (Bayesian Federated Inference) provides functions for carrying out the Bayesian Federated Inference method for two kinds of models (GLM and survival) on multicenter data, without combining or sharing the data across centers. This tutorial focuses on GLMs only, for which the current version of the package supports two commonly used families: "binomial" and "gaussian". The most frequently used functions are bfi(), MAP.estimation(), and inv.prior.cov(). In the following, we show how the BFI package can be applied to the real datasets included in the package.

How to use it?

Before we go on, we first install and load the BFI package using the devtools package:

# First install and load the package 'devtools'
#if(!require(devtools)) {install.packages("devtools")}
library(devtools)

# Now install BFI from GitHub
#devtools::install_github("hassanpazira/BFI", force = TRUE)

# load BFI
library(BFI)

The following code shows that two datasets are available in the package: trauma and Nurses.

data(package = "BFI")

The trauma data can be used with the "binomial" family and the Nurses data with the "gaussian" family. To avoid repetition, we use only the trauma dataset. Load and inspect it as follows:

# Load 'trauma' in the R workspace
data("trauma")

# Get the number of rows and columns
dim(trauma)
## [1] 371   6
# To get an idea of the dataset, print the first 7 rows
head(trauma, 7)
##   sex age hospital ISS GCS mortality
## 1   1  20        3  24  15         0
## 2   0  38        3  34  13         0
## 3   0  37        3  50  15         0
## 4   0  17        3  43   4         1
## 5   0  49        3  29  15         0
## 6   0  30        3  22  15         0
## 7   1  84        2  66   3         1

This dataset contains data on 371 trauma patients from three hospitals: a peripheral hospital without a neuro-surgical unit (hospital = 1), a peripheral hospital with a neuro-surgical unit (hospital = 2), and an academic medical center (hospital = 3).
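Because BFI is meant for settings in which each center keeps its own data, it can help to see how the rows break down by hospital. The following is a minimal sketch and not part of the package: it rebuilds only the seven rows printed above as a stand-in data frame (the name `trauma_head` is hypothetical) so that it runs without BFI installed, then counts and splits the rows by `hospital`. With the package loaded, the same base R calls could be applied to `trauma` itself.

```r
# Stand-in for the first seven rows of 'trauma' (values copied from the printout above)
trauma_head <- data.frame(
  sex       = c(1, 0, 0, 0, 0, 0, 1),
  age       = c(20, 38, 37, 17, 49, 30, 84),
  hospital  = c(3, 3, 3, 3, 3, 3, 2),
  ISS       = c(24, 34, 50, 43, 29, 22, 66),
  GCS       = c(15, 13, 15, 4, 15, 15, 3),
  mortality = c(0, 0, 0, 1, 0, 0, 1)
)

# Number of patients per hospital; fixing the levels to 1:3 keeps
# hospitals without any rows visible in the table
n_per_hospital <- table(factor(trauma_head$hospital, levels = 1:3))
print(n_per_hospital)

# One data frame per hospital, mimicking data that stay local to each center
local_data <- split(trauma_head, trauma_head$hospital)
names(local_data)
```

In these seven rows only hospitals 2 and 3 occur, so `local_data` has two elements; the full dataset covers all three hospitals.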

As we can see, it has 6 columns:

(col_name <- colnames(trauma))
## [1] "sex"       "age"       "hospital"  "ISS"       "GCS"       "mortality"

The covariates sex (dichotomous), age (continuous), ISS (Injury Severity Score, continuous), and GCS (Glasgow Coma Scale, continuous) are the predictors, and mortality is the response variable. hospital is a categorical variable that indicates which of the three hospitals each patient belongs to. For more information about this dataset, see its help file:

# Get some info about the dataset from the help file
?trauma
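The roles described above (four predictors, one binary response) can be made explicit by separating the response from the covariates before any model fitting. This is only a sketch under stated assumptions: it again uses a hypothetical stand-in data frame `trauma_head` built from the seven printed rows, and the object names `y`, `X`, and `predictors` are illustrative. The argument formats that the package functions expect are documented in their help files, and nothing here relies on them.

```r
# Stand-in for the first seven rows of 'trauma' (values copied from the printout above)
trauma_head <- data.frame(
  sex       = c(1, 0, 0, 0, 0, 0, 1),
  age       = c(20, 38, 37, 17, 49, 30, 84),
  hospital  = c(3, 3, 3, 3, 3, 3, 2),
  ISS       = c(24, 34, 50, 43, 29, 22, 66),
  GCS       = c(15, 13, 15, 4, 15, 15, 3),
  mortality = c(0, 0, 0, 1, 0, 0, 1)
)

# Column names of the predictors; 'hospital' is left out because it
# identifies the center rather than describing the patient
predictors <- c("sex", "age", "ISS", "GCS")

# Response vector (0/1) and covariate data frame
y <- trauma_head$mortality
X <- trauma_head[, predictors]

# Treat the dichotomous covariate as a factor with levels 0 and 1
X$sex <- factor(X$sex, levels = c(0, 1))

# Check the resulting types and dimensions
str(X)
table(y)
```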

References