To launch the app, the user needs to provide a data frame with the observations.
The variables are automatically sorted into numeric and other
variables, and any missing data are imputed using
VIM::kNN() if VIM is installed; otherwise, entries with
missing data are removed. Numeric variables can be sorted into the
clustering space (space1) and the linked space
(space2) on the data screen once the app has loaded. Variables can be
added to or removed from each space with the drop-down selector that
appears when clicking on the box.
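The preprocessing described above can be sketched roughly as follows. This is not the package's internal code: the helper name prepare_data() and the example data are assumptions made for illustration, and only VIM::kNN() and the fallback of dropping incomplete rows come from the text.

```r
# Hypothetical helper mirroring the described behaviour:
# impute with VIM::kNN() when VIM is available, otherwise drop incomplete rows,
# then split the columns into numeric and non-numeric variables.
prepare_data <- function(df) {
  if (requireNamespace("VIM", quietly = TRUE)) {
    # imp_var = FALSE suppresses the extra *_imp indicator columns
    df <- as.data.frame(VIM::kNN(df, imp_var = FALSE))
  } else {
    df <- stats::na.omit(df)
  }
  is_num <- vapply(df, is.numeric, logical(1))
  list(
    numeric = df[, is_num, drop = FALSE],
    other   = df[, !is_num, drop = FALSE]
  )
}

# Example with a built-in data set that contains missing values
demo <- airquality
demo$MonthName <- factor(month.abb[demo$Month])
prepared <- prepare_data(demo)
names(prepared$numeric)
names(prepared$other)
```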
Alternatively, the data can be passed in as two separate arrays, with
the clustering variables in df = space1 and the linked
variables in linked = space2. The variables are then preselected in
space 1 and space 2 on the data screen when the app loads. This can be
done with a call such as pandemonium(df = space1, linked = space2).
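As a minimal sketch of the two-space input, the built-in iris data can be split into a clustering space and a linked space. The choice of data set and of which columns go into which space is purely illustrative; only the df and linked arguments come from the text. The call is guarded so that the app is launched only in an interactive session with pandemonium installed.

```r
# Illustrative split of one data set into two spaces (an assumption,
# not an example taken from the package)
space1 <- iris[, c("Sepal.Length", "Sepal.Width")]   # clustering variables
space2 <- iris[, c("Petal.Length", "Petal.Width")]   # linked variables

if (interactive() && requireNamespace("pandemonium", quietly = TRUE)) {
  pandemonium::pandemonium(df = space1, linked = space2)
}
```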
A complete input for pandemonium() includes optional
data and function inputs. All inputs are shown in the following
call.
```r
pandemonium(df,
  cov = NULL, is.inv = FALSE, exp = NULL, linked = NULL, linked.cov = NULL,
  linked.exp = NULL, group = NULL, label = NULL, user_dist = NULL,
  dimReduction = list(tSNE = tSNE, umap = umap),
  getCoords = list(normal = normCoords), getScore = NULL
)
```

| Input | Type | Applies to | Default | Purpose |
|---|---|---|---|---|
| label | vector, length = n | points | row index | Shown in tour and dimension reduction hover text |
| group | vector / data.frame | points | none | Defines user-specified groups; categorical or numeric |
| cov, linked.cov\* | matrix | group/space | computed via stats::cov | Used in getScore, getCoords, and the anomaly tour |
| exp, linked.exp | data frame with a column value, of length equal to the number of variables in the space | variables | mean vector | Reference point in the space, used in getCoords |
| user_dist | matrix | space1 | ignored | Advanced: overrides getDists output |

\* cov can also be the inverse covariance matrix by
setting is.inv = TRUE
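The optional data inputs in the table could be assembled along these lines. This is a sketch under assumptions: the iris data stand in for real observations, and the layout of exp (one row per variable with a column named value) is one reading of the table rather than a confirmed specification. The call itself is guarded so that it only runs interactively with pandemonium installed.

```r
# Illustrative data (an assumption, not part of the package)
space1 <- iris[, c("Sepal.Length", "Sepal.Width")]
space2 <- iris[, c("Petal.Length", "Petal.Width")]

# Covariance matrix of the clustering space and its inverse
cov1     <- stats::cov(space1)
inv_cov1 <- solve(cov1)

# Reference point: assumed layout is one row per variable, column `value`
exp1 <- data.frame(value = colMeans(space1))

# One label per point and a user-specified grouping
label <- seq_len(nrow(space1))
group <- iris$Species

if (interactive() && requireNamespace("pandemonium", quietly = TRUE)) {
  # Passing the inverse covariance matrix requires is.inv = TRUE
  pandemonium::pandemonium(
    df = space1, cov = inv_cov1, is.inv = TRUE, exp = exp1,
    linked = space2, group = group, label = label
  )
}
```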
| Input | Type | Use |
|---|---|---|
| getCoords | named list of coordinate functions | computes coordinates for distance calculations |
| getScore | function that returns a named list | computes scores and/or bins for use in plotting |

See vignette("get-scores") and vignette("get-coords") for more information on these inputs.
Once a call to pandemonium() is made, the app opens on the data page,
shown below.
On this page, variables can be removed from either space or moved between them. A coordinate function can be selected from the input functions. There are also two additional inputs: one for groupings or flags, and one for a label.
In the groupings/flags input you can select variables passed to pandemonium in the
group= input, as well as any variables removed from space 1
or space 2 by deleting them in their inputs. Variables that were
automatically removed from the two spaces for being non-numeric can also
be selected. This input is designed for categorical data, so that it can be
compared to clustering results. The selected variable(s) will be
converted into a single factor with interaction().
In the label input you can select the label passed to pandemonium in the
label= input, as well as the same removed variables as
above. This input is designed to give a unique label for each point, so
row numbers or unique IDs are recommended.
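To illustrate what combining several grouping variables with interaction() produces, here is a small base R example. The mtcars columns are chosen only for illustration and are not connected to the package.

```r
# Two categorical-like variables from a built-in data set
cyl <- factor(mtcars$cyl)  # 3 levels: 4, 6, 8
am  <- factor(mtcars$am)   # 2 levels: 0, 1

# interaction() returns one factor whose levels are the level combinations
combined <- interaction(cyl, am)
levels(combined)
table(combined)
```

Each point receives exactly one combined level, so the result can be compared against cluster assignments in the same way as a single grouping variable.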
One of the key features of the data input page is the ability to move
variables across spaces after loading the application. This feature can
have some unexpected effects on the covariance matrix and reference
point, if these were provided. Removing a variable from a space slices out the
corresponding entries from the covariance matrix and reference point.
Adding variables to a space causes the covariance matrix and
reference point to be recalculated: cov() is used for the
covariance matrix and colMeans() for the reference point.
If the provided covariance matrix is an inverse covariance matrix, it is
first inverted using solve() before slicing. In some cases
this may behave unexpectedly, and altering variables may be better done
by relaunching pandemonium with the correctly filtered data.
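The update rules described above can be sketched in base R as follows. The function names drop_variable() and add_variables() are hypothetical and this is not the app's internal code; it only mirrors the described steps (slice on removal, invert an inverse matrix with solve() before slicing, recompute with cov() and colMeans() on addition).

```r
# Hypothetical helper: remove one variable from a space.
# If the supplied matrix is an inverse covariance matrix, invert it first.
drop_variable <- function(cov_mat, ref_point, var_out, is_inv = FALSE) {
  if (is_inv) cov_mat <- solve(cov_mat)
  keep <- setdiff(colnames(cov_mat), var_out)
  list(
    cov = cov_mat[keep, keep, drop = FALSE],
    ref = ref_point[keep]
  )
}

# Hypothetical helper: after adding variables, recompute both quantities
# from the data, discarding any user-supplied values.
add_variables <- function(data, vars) {
  sub <- data[, vars, drop = FALSE]
  list(cov = stats::cov(sub), ref = colMeans(sub))
}

# Illustration on the numeric columns of iris
x    <- iris[, 1:4]
full <- stats::cov(x)
ref  <- colMeans(x)

sliced     <- drop_variable(full, ref, "Petal.Width")
sliced_inv <- drop_variable(solve(full), ref, "Petal.Width", is_inv = TRUE)
recomputed <- add_variables(x, names(x))
```

Note that slicing an inverse covariance matrix directly would not give the same result as inverting first and then slicing, which is one reason relaunching with filtered data can be the safer route.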