---
title: "Exogenous dyadic covariates"
author: "Francisco Richter"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Exogenous dyadic covariates}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>",
    fig.width = 6,
    fig.height = 4
)
```

```{r}
library(amorem)
```

Exogenous information — such as geographic distance between actors — can
drive the rate at which relational events occur.  `amorem` supports this
through the `contribution_logits` argument of `simulate_relational_events()`,
which accepts any sender × receiver matrix of log-intensities.

## US state distance matrix

The package ships a 56 × 56 matrix of distances (in metres) between US states
and territories. We load it, rescale it to units of 100 km, and apply a
`log(1 + x)` transform so that a distance of zero stays at zero:

```{r}
data("dist_matrix", package = "amorem")

# rescale metres to units of 100 km, then take log(1 + x) to compress the range
dist_log <- log(dist_matrix / 100000 + 1)
```
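
To see what the transform does, the sketch below applies it to a few
hypothetical distances in metres. The values in `toy_metres` are illustrative
assumptions and are not taken from `dist_matrix`.

```{r log-transform-sketch}
# Hypothetical distances in metres: 0 km, 100 km, 1000 km, 5000 km
toy_metres <- c(0, 1e5, 1e6, 5e6)

# Same transform as above: units of 100 km, shifted by 1 before the log
toy_log <- log(toy_metres / 100000 + 1)

round(toy_log, 3)
```

A fifty-fold increase in distance (100 km to 5000 km) becomes a change of
roughly 3.2 units on the transformed scale.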

## Defining a non-linear effect

For this example, we take the true effect of distance on the log-rate to be
a smooth, non-linear function:

$$f(d) = \sin\!\bigl(-d / 1.5\bigr)$$

where $d$ is the log-transformed distance.

```{r}
true_effect <- sin(-dist_log / 1.5)
```

We can visualise this curve:

```{r true-effect-curve}
d_seq <- seq(0, max(dist_log), length.out = 200)
plot(d_seq, sin(-d_seq / 1.5),
    type = "l", lwd = 2, col = "red",
    xlab = "log-distance", ylab = "f(d)",
    main = "True non-linear distance effect"
)
```
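
Because the effect acts on the log-rate, $\exp\{f(d)\}$ is the factor by which
a dyad's rate is multiplied. As a rough illustration, the sketch below
evaluates that multiplier at a few log-distances. The grid `toy_d` is an
assumption chosen for readability, not a set of values from the data.

```{r rate-multiplier-sketch}
# Illustrative log-distances, including the minimiser d = 0.75 * pi
toy_d <- c(0, 1, 1.5 * pi / 2, 4)
toy_f <- sin(-toy_d / 1.5)

data.frame(
    log_distance    = round(toy_d, 3),
    f               = round(toy_f, 3),
    rate_multiplier = round(exp(toy_f), 3)
)
```

At $d = 0.75\pi \approx 2.36$ the effect reaches its minimum of $-1$, so the
rate is multiplied by $e^{-1} \approx 0.37$ relative to a dyad at $d = 0$.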

## Simulating events with exogenous covariates

We pass the effect matrix directly as `contribution_logits`. The Gillespie
algorithm uses these values to weight which dyad fires next. We also
request one control per event (`n_controls = 1`), so that each event can be
contrasted with a dyad that did not fire, for downstream inference:

```{r}
set.seed(42)

states <- rownames(dist_matrix)

events <- simulate_relational_events(
    n_events        = 800,
    senders         = states,
    receivers       = states,
    contribution_logits = true_effect,
    allow_loops     = FALSE,
    n_controls      = 1
)

head(events)
```
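
For intuition, a single Gillespie-style step can be sketched in base R. This
is not `amorem`'s implementation: the 3 × 3 matrix `toy_logits`, the actor
names, and the uniform sampling of the control dyad are all assumptions made
for illustration.

```{r gillespie-sketch}
# Toy log-intensity matrix for hypothetical actors a, b, c
toy_actors <- c("a", "b", "c")
toy_logits <- matrix(
    c( 0.0, -0.5, -1.0,
      -0.5,  0.0, -0.2,
      -1.0, -0.2,  0.0),
    nrow = 3, byrow = TRUE,
    dimnames = list(toy_actors, toy_actors)
)

# Dyad rates; self-loops get rate zero (mirrors allow_loops = FALSE)
toy_rates <- exp(toy_logits)
diag(toy_rates) <- 0

set.seed(1)
# Waiting time until the next event: exponential with the total rate
toy_wait <- rexp(1, rate = sum(toy_rates))
# Firing dyad: sampled with probability proportional to its rate
fired <- sample(length(toy_rates), size = 1, prob = as.vector(toy_rates))
# One control (assumed uniform) among the other non-loop dyads
eligible <- setdiff(which(toy_rates > 0), fired)
control <- eligible[sample.int(length(eligible), 1)]

# Convert linear indices back to (sender, receiver) positions
idx <- arrayInd(c(fired, control), dim(toy_rates))
data.frame(
    sender   = toy_actors[idx[, 1]],
    receiver = toy_actors[idx[, 2]],
    event    = c(1, 0)
)
```

Repeating this step and accumulating the waiting times would give a sequence
of timed events, each paired with one control.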

## Recovering the effect with a GAM

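For each event–control pair we compute the **difference** in log-distance.
A GAM with a smooth term `s(delta_dist)` then models the log-odds that the
observed dyad, rather than its control, fired as a function of that
difference. For a linear effect this contrast identifies the slope exactly.
For a non-linear $f$ the log-odds is
$f(d_{\text{case}}) - f(d_{\text{control}})$, which depends on both distances,
so the smooth in the difference should be read as an approximation to the
true curve.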

```{r fit-gam}
library(mgcv)

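# Look up log-distances for vectors of senders and receivers
# (matrix indexing with cbind() is already vectorised)
get_dist <- function(s, r) {
    dist_log[cbind(match(s, states), match(r, states))]
}

events$dist_val <- get_dist(events$sender, events$receiver)

# Split into events and controls, aligned by stratum
# (assumes exactly one control per event, as requested above)
cases <- events[events$event == 1, ]
controls <- events[events$event == 0, ]
cases <- cases[order(cases$stratum), ]
controls <- controls[order(controls$stratum), ]

# Constant response: each row is a case-minus-control contrast
fit_df <- data.frame(
    y          = 1,
    delta_dist = cases$dist_val - controls$dist_val
)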

fit <- gam(y ~ s(delta_dist) - 1, family = binomial, data = fit_df)
summary(fit)
```
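
With one control per event, the probability that the observed dyad fired
rather than its control is
$\operatorname{logit}^{-1}\{f(d_{\text{case}}) - f(d_{\text{control}})\}$.
One possible way to target $f$ itself is to expand both distances in a shared
spline basis and regress on the difference of the basis rows. The sketch
below swaps in `splines::ns()` and `glm()` for the `mgcv` smooth and runs on
simulated toy pairs. The names `toy_case` and `toy_ctrl`, the sample size,
and `df = 5` are assumptions for illustration, not part of `amorem`.

```{r basis-contrast-sketch}
library(splines)

set.seed(123)
n_pairs <- 3000
f_true <- function(d) sin(-d / 1.5)

# Two candidate log-distances per stratum; the first fires with
# probability plogis(f(d1) - f(d2)), otherwise the second does
d1 <- runif(n_pairs, 0, 4)
d2 <- runif(n_pairs, 0, 4)
first_fires <- runif(n_pairs) < plogis(f_true(d1) - f_true(d2))
toy_case <- ifelse(first_fires, d1, d2)
toy_ctrl <- ifelse(first_fires, d2, d1)

# Shared natural-spline basis, evaluated at case and control distances
basis <- ns(c(toy_case, toy_ctrl), df = 5)
B_diff <- predict(basis, toy_case) - predict(basis, toy_ctrl)

# Logistic regression on basis differences with a constant response of 1
y_one <- rep(1, n_pairs)
contrast_fit <- glm(y_one ~ B_diff - 1, family = binomial)

# Estimated curve on a grid; f is identified only up to a constant,
# so shift the estimate to match the truth at the first grid point
d_grid <- seq(0.1, 3.9, length.out = 50)
f_hat <- as.vector(predict(basis, d_grid) %*% coef(contrast_fit))
f_hat <- f_hat - f_hat[1] + f_true(d_grid[1])

# Agreement between estimated and true curve on the grid
round(cor(f_hat, f_true(d_grid)), 3)
```

The estimate lives on the same axis as $f$, so it can be compared with the
true curve directly, which a smooth in the difference does not allow.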

## Plotting the estimated effect

```{r effect-plot}
x_grid <- seq(min(fit_df$delta_dist), max(fit_df$delta_dist), length.out = 300)
pred <- predict(fit, newdata = data.frame(delta_dist = x_grid), type = "link")

plot(x_grid, pred,
    type = "l", lwd = 2,
    xlab = expression(Delta ~ "log-distance"),
    ylab = "Estimated effect",
    main = "GAM-estimated smooth of the log-distance difference"
)
abline(h = 0, lty = 2, col = "grey50")
```

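The fitted smooth shows how the log-odds that a dyad fires, relative to its
control, varies with the difference in log-distance. The dashed line marks
zero effect. The workflow illustrates how exogenous dyadic covariates enter
`amorem` through `contribution_logits`, and how the sampled controls support
downstream estimation.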
