Package 'lsbclust' reference manual

Title:	Least-Squares Bilinear Clustering for Three-Way Data
Description:	Functions for performing least-squares bilinear clustering of three-way data. The method uses the bilinear decomposition (or bi-additive model) to model two-way matrix slices while clustering over the third way. Up to four different types of clusters are included, one for each term of the bilinear decomposition. In this way, matrices are clustered simultaneously on (a subset of) their overall means, row margins, column margins and row-column interactions. The orthogonality of the bilinear model results in separability of the joint clustering problem into four separate ones. Three of these sub-problems are specific k-means problems, while a special algorithm is implemented for the interactions. Plotting methods are provided, including biplots for the low-rank approximations of the interactions.
Authors:	Pieter Schoonees [aut, cre], Patrick Groenen [ctb]
Maintainer:	Pieter Schoonees <[email protected]>
License:	GPL (>= 2)
Version:	1.1
Built:	2025-03-11 04:00:31 UTC
Source:	https://github.com/cran/lsbclust

Least Squares Latent Class Matrix Factorization

Description

Functions for least squares latent class matrix factorizations.

Author(s)

Pieter C. Schoonees [aut, cre], Patrick J.F. Groenen [aut]

References

Van Rosmalen, J., Van Herk, H., & Groenen, P. J. F. (2010). Identifying response styles: A latent-class bilinear multinomial logit model. Journal of Marketing Research, 47(1), 157-172.

K-Means Over One Way of a Three-Way Array

Description

Vectorize the matrix slices over a specific way of a three-way array, and run kmeans on the resulting vectors.

Usage

akmeans(data, centers, margin = 3L, ndim = NULL, ...)

Arguments

`data`	Three-way data array
`centers`	Passed to `kmeans`
`margin`	Integer indicating which way to cluster over
`ndim`	The rank of the low dimensional approximation of the matrix slices to construct before clustering (using `svd`)
`...`	Additional arguments passed to `kmeans`

Examples

set.seed(1)
res <- akmeans(data = carray(dcars), margin = 3L, centers = 5, nstart = 10)
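
For intuition, the idea described above can be sketched in base R without the package. This is only an illustration under stated assumptions: the toy array `arr`, the helper `lowrank` and the choice of three centers are hypothetical, and the package's own implementation may differ in its details.

```r
# Hypothetical toy array: 4 x 3 slices for 20 observations (third way).
set.seed(1)
arr <- array(rnorm(4 * 3 * 20), dim = c(4, 3, 20))

# Replace a matrix by its best rank-`ndim` approximation using svd().
lowrank <- function(m, ndim) {
  s <- svd(m, nu = ndim, nv = ndim)
  s$u %*% diag(s$d[seq_len(ndim)], nrow = ndim) %*% t(s$v)
}
ndim <- 2
arr_lr <- array(apply(arr, 3, lowrank, ndim = ndim), dim = dim(arr))

# Vectorize: one row per slice of the third way, one column per matrix cell.
vecs <- t(apply(arr_lr, 3, as.vector))

# Ordinary k-means on the vectorized slices.
fit <- kmeans(vecs, centers = 3, nstart = 10)
table(fit$cluster)
```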

Bilinear Decomposition of a Matrix

Description

Decomposes a matrix into an overall mean matrix, row margins matrix, column margins matrix and an interaction matrix, depending on delta.

Usage

bicomp(x, delta = c(1, 1, 1, 1), which = 0L:4L)

Arguments

`x`	A matrix to be decomposed.
`delta`	A vector of length four with 0/1 entries which controls the type of decomposition made.
`which`	A vector giving the elements to return, with 0 = original data, 1 = overall means, 2 = row means, 3 = column means and 4 = interactions.

Value

An object of class bicomp, possibly also inheriting from class data.frame, which is either a named list with the required components, or a single matrix if a single component is requested. An additional attribute return_type gives information on the type of matrices returned.
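
As a rough illustration of the four terms, the familiar two-way decomposition into overall mean, row effects, column effects and interactions can be written in base R as below. This sketch assumes that it corresponds to the fully constrained case delta = c(1, 1, 1, 1); the variable names are hypothetical and it does not reproduce the other settings of delta.

```r
set.seed(1)
x <- matrix(rnorm(12), nrow = 4, ncol = 3)

overall <- mean(x)                  # overall mean (scalar)
rowdev  <- rowMeans(x) - overall    # row margins as deviations from the overall mean
coldev  <- colMeans(x) - overall    # column margins as deviations from the overall mean

# Expand each term to a full matrix so that the four parts add up to x.
overall_mat <- matrix(overall, nrow(x), ncol(x))
row_mat     <- matrix(rowdev, nrow(x), ncol(x))                # entry [i, j] is rowdev[i]
col_mat     <- matrix(coldev, nrow(x), ncol(x), byrow = TRUE)  # entry [i, j] is coldev[j]
inter_mat   <- x - overall_mat - row_mat - col_mat             # what remains: interactions

all.equal(overall_mat + row_mat + col_mat + inter_mat, x)
```

In this sketch the interaction matrix has zero row sums and zero column sums, which is what makes the four terms orthogonal to one another.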

Double-Centre a Three-way Array

Description

Double-centre the matrix slices of a three-way array.

Usage

carray(array, margin = 3L, rows = TRUE, columns = TRUE)

Arguments

`array`	A three-way array
`margin`	The way of the array over which the centring must be done
`rows`	Logical indicating whether to centre the rows of the matrix slices
`columns`	Logical indicating whether to centre the columns of the matrix slices
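
A minimal base R sketch of double-centring the slices of the third way follows. It assumes that centring the rows means subtracting each row's mean and centring the columns means subtracting each column's mean; the helper `double_centre` and the toy array are hypothetical and are not part of the package.

```r
set.seed(1)
arr <- array(rnorm(4 * 3 * 5), dim = c(4, 3, 5))

# Centre one matrix slice; either step can be switched off.
double_centre <- function(m, rows = TRUE, columns = TRUE) {
  if (columns) m <- sweep(m, 2, colMeans(m))  # remove column means
  if (rows)    m <- sweep(m, 1, rowMeans(m))  # remove row means
  m
}

# Apply to every slice of the third way and restore the array shape.
centred <- array(apply(arr, 3, double_centre), dim = dim(arr))
round(rowSums(centred[, , 1]), 12)
round(colSums(centred[, , 1]), 12)
```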

Compare Simulation Results

Description

Generic function to compare simulation results in lsbclust.

Usage

cfsim(fitted, actual, method = c("diag", "cRand"))

Arguments

`fitted`	An object of class `lsbclust` containing the fitted results.
`actual`	An object of class `lsbclust_sim` containing the simulated data.
`method`	The type of statistics to calculate, passed to `cl_agreement`

Compare LSBCLUST Simulation Results

Description

This function compares cluster membership and parameter estimates for the results of akmeans on simulated data, constructed using rlsbclust, to the true underlying values.

Usage

## S3 method for class 'akmeans'
cfsim(fitted, actual, method = c("diag", "cRand"))

Arguments

`fitted`	An object of class `akmeans` containing the fitted results.
`actual`	An object of class `lsbclust_sim` containing the simulated data.
`method`	The method for calculating cluster agreement across random starts, passed on to `cl_agreement`. None is calculated when set to `NULL`.

Examples

## Simulate LSBCLUST data, fit akmeans on double-centered data, and compare
set.seed(1)
dat <- rlsbclust(ndata = 1, nobs = 100, size = c(10, 8), nclust = c(5, 4, 6, 5))
dat[[1]]$data <- carray(dat[[1]]$data)
res <- akmeans(data = dat[[1]]$data, centers = 5, margin = 3, ndim = 2)
cfsim(res, dat[[1]])

Compare LSBCLUST Simulation Results

Description

This function compares cluster membership and parameter estimates for the results of lsbclust on simulated data to the true underlying values.

Usage

## S3 method for class 'lsbclust'
cfsim(fitted, actual, method = c("diag", "cRand"))

Arguments

`fitted`	An object of class `lsbclust` containing the fitted results.
`actual`	An object of class `lsbclust_sim` containing the simulated data.
`method`	The type of statistics to calculate, passed to `cl_agreement`

Examples

## Simulate LSBCLUST data, fit LSBCLUST, and compare
set.seed(1)
dat <- rlsbclust(ndata = 1, nobs = 100, size = c(10, 8), nclust = c(5, 4, 6, 5))
res <- lsbclust(data = dat[[1]]$data, nclust = c(5, 4, 6, 5))
cfsim(res, dat[[1]])

Compare LSBCLUST Simulation Results

Description

This function compares cluster membership and parameter estimates for the results of T3Clusf on simulated data, constructed using rlsbclust, to the true underlying values.

Usage

## S3 method for class 'T3Clusf'
cfsim(fitted, actual, method = c("diag", "cRand"))

Arguments

`fitted`	An object of class `T3Clusf` containing the fitted results.
`actual`	An object of class `lsbclust_sim` containing the simulated data.
`method`	The method for calculating cluster agreement across random starts, passed on to `cl_agreement`. None is calculated when set to `NULL`.

Examples

## Simulate LSBCLUST data, fit T3Clusf on double-centered data, and compare
set.seed(1)
dat <- rlsbclust(ndata = 1, nobs = 100, size = c(10, 8), nclust = c(5, 4, 6, 5))
dat[[1]]$data <- carray(dat[[1]]$data)
res <- T3Clusf(X = dat[[1]]$data, Q = 2, G = 5)
cfsim(res, dat[[1]])

S3 export

Description

These export into the framework set out in package clue.

Usage

## S3 method for class 'int.lsbclust'
cl_class_ids(x)

## S3 method for class 'int.lsbclust'
is.cl_partition(x)

## S3 method for class 'int.lsbclust'
is.cl_hard_partition(x)

## S3 method for class 'lsbclust_sim_part'
cl_class_ids(x)

## S3 method for class 'lsbclust_sim_part'
is.cl_partition(x)

## S3 method for class 'lsbclust_sim_part'
is.cl_hard_partition(x)

## S3 method for class 'T3Clusf'
cl_class_ids(x)

## S3 method for class 'T3Clusf'
is.cl_partition(x)

## S3 method for class 'T3Clusf'
is.cl_hard_partition(x)

## S3 method for class 'akmeans'
cl_class_ids(x)

## S3 method for class 'akmeans'
is.cl_partition(x)

## S3 method for class 'akmeans'
is.cl_hard_partition(x)

Arguments

`x`	An object of class `int.lsbclust`, `lsbclust_sim_part`, `T3Clusf` or `akmeans`

C++ Function for Cluster Means

Description

This function calculates the cluster means in vectorized form based on the current value of the clustering vector.

Usage

ClustMeans(nclust, start, data)

Arguments

`nclust`	The number of clusters.
`start`	The current clustering vector.
`data`	The concatenated data, with J * K rows and N columns

Value

A numeric matrix with nclust rows and J*K columns.
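
The computation can be mimicked in base R as follows. This is a sketch of the arithmetic only, assuming that each column of `data` is one vectorized slice and that the cluster mean is the plain average of the columns assigned to that cluster; the toy dimensions are hypothetical and the C++ routine itself is not called.

```r
set.seed(1)
JK <- 6; N <- 10; nclust <- 3
data  <- matrix(rnorm(JK * N), nrow = JK, ncol = N)  # J * K rows, N columns
start <- rep(seq_len(nclust), length.out = N)        # clustering vector, no empty cluster

# Average the columns belonging to each cluster; one row of the result per cluster.
means <- t(sapply(seq_len(nclust), function(k) {
  rowMeans(data[, start == k, drop = FALSE])
}))
dim(means)
```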

Centring Matrix

Description

A utility function for calculating centring matrices.

Usage

cmat(k)

Arguments

`k`	An integer determining the dimensions of the centring matrix.
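
For reference, the standard k by k centring matrix is I - (1/k) 1 1'. The base R sketch below assumes that this is the matrix meant here; it builds the matrix directly rather than calling cmat, and the names `J` and `x` are hypothetical.

```r
k <- 4
J <- diag(k) - matrix(1 / k, k, k)   # I - (1/k) * 1 1'

x <- matrix(as.numeric(1:12), nrow = k)
colMeans(J %*% x)                    # premultiplying removes the column means
all.equal(J %*% J, J)                # a centring matrix is idempotent
```

Premultiplying a matrix by `J` centres its columns, and postmultiplying by a centring matrix of matching size centres its rows.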

Dutch Cars Data

Description

This data set relates to 187 Dutch households rating 10 automobile manufacturers according to 8 variables (original Dutch terms in parentheses): price (prijsniveau), design (vormgeving), safety (veiligheid), operating cost (gebruikskosten), sportiness (sportiviteit), size (modelgrootte), reliability (betrouwbaarheid) and features (uitrusting). A rating scale from 1 to 10 was used.

Usage

dcars

Format

A three-way array with cars in the first dimension, variables in the second and consumers in the third dimension.

The items and labels for the endpoints of the scales are (original Dutch labels in parentheses):

Affordability: A rating from 1 = Expensive (duur) to 10 = Cheap (goedkoop)
Attractiveness: A rating from 1 = Ugly (lelijk) to 10 = Beautiful (mooi)
Safety: A rating from 1 = Bad (slecht) to 10 = Good (goed)
OperatingCost: A rating from 1 = Low (laag) to 10 = High (hoog)
Sportiness: A rating from 1 = Slow (langzaam) to 10 = Fast (snel)
Size: A rating from 1 = Large (groot) to 10 = Small (klein)
Reliability: A rating from 1 = Bad (slecht) to 10 = Good (goed)
Features: A rating from 1 = Simple (eenvoudig) to 10 = Luxurious (luxe)

Details

The original sample consisted of 188 households. However, one of these households (code 87845) was discarded because it appears to have used a rating scale from 0 to 10 instead of from 1 to 10. Note that all rating scales have been reversed so that higher scores are better for most items. The exceptions are OperatingCost and Size, where larger values mean higher costs and smaller cars respectively.

Source

Tammo Bijmolt, Michel van de Velden

Examples

data("dcars")
set.seed(5448)
m <- lsbclust(data = dcars, delta = c(1, 1, 1, 1), nclust = c(5, 3, 6, 8), nstart = 5, 
              nstart.kmeans = 10, parallel = FALSE, fixed = "columns")

Extract Fitted Values for akmeans

Description

An S3 method for fitted for class "akmeans".

Usage

## S3 method for class 'akmeans'
fitted(object, ...)

Arguments

`object`	An object of class `"akmeans"`
`...`	Unimplemented

Value

An array approximating the original data

Extract Fitted Values for LSBCLUST

Description

An S3 method for fitted for class "lsbclust".

Usage

## S3 method for class 'lsbclust'
fitted(object, ...)

Arguments

`object`	An object of class `"lsbclust"`
`...`	Unimplemented

Value

An array approximating the original data

Extract Fitted Values for T3Clusf

Description

An S3 method for fitted for class "T3Clusf".

Usage

## S3 method for class 'T3Clusf'
fitted(object, ...)

Arguments

`object`	An object of class `"T3Clusf"`
`...`	Unimplemented

Value

An array approximating the original data

Generalized Procrustes Rotation

Description

This function finds K orthogonal rotation matrices so that the rotated versions of the input configurations match each other optimally in the least-squares sense. The algorithm depends on the starting values for the rotation matrices. At present identity matrices are used as starting values. Only rotations / reflections are considered – no scaling or translation factors are included.

Usage

genproc(configs, maxit = 50L, reltol = 1e-06, random = FALSE)

Arguments

`configs`	A list of original configuration matrices
`maxit`	The maximum number of iterations allowed
`reltol`	The relative error tolerance for determining numeric convergence.
`random`	Logical indicating whether or not to use random starts (only applicable when the dimensionality is two).
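
The building block of such an algorithm is the two-configuration orthogonal Procrustes problem, which has a closed-form solution through the singular value decomposition. The base R sketch below shows only that building block on hypothetical toy matrices `A` and `B`; it is not the iterative generalized procedure implemented in genproc.

```r
set.seed(1)
A <- matrix(rnorm(10), nrow = 5, ncol = 2)

# Build B as a rotated copy of A (rotation by 30 degrees).
theta <- pi / 6
R <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
B <- A %*% R

# Orthogonal Procrustes: the orthogonal Q minimising ||A Q - B|| is U V',
# where U and V come from the SVD of A'B.
s <- svd(crossprod(A, B))
Q <- s$u %*% t(s$v)

all.equal(A %*% Q, B)   # the rotation is recovered
```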

References

Gower, J. C., & Dijksterhuis, G. B. (2004). Procrustes problems (Vol. 3). Oxford: Oxford University Press.

Create Array of Indicator Matrices

Description

This function takes a matrix or data.frame and the number of rating categories maxcat and produces a three-way array of m by maxcat indicator matrices, one for each of the n rows. The input x must be a matrix or data.frame of dimensions n by m which contains the ratings on a scale of 1 to maxcat for m items. Note that missing values (NA's) will not appear in the columns.

Usage

indarr(x, maxcat, na.add = TRUE)

Arguments

`x`	a `matrix` or `data.frame`
`maxcat`	an integer indicating the maximum of the rating scale (which is assumed to start with 1)
`na.add`	logical indicating whether to add a designated category for missings or not. Defaults to TRUE.

Value

A list of rating by item indicator matrices.
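
To make the construction concrete, the indicator matrix for a single respondent can be built in base R as sketched below. The orientation (items in rows, rating categories in columns) follows the Description and is an assumption, the variable names are hypothetical, and the sketch ignores the `na.add` option by simply leaving the row of a missing rating empty.

```r
ratings <- c(3, 1, 5, NA)   # one respondent, m = 4 items, scale 1..maxcat
maxcat  <- 5

# Row i has a 1 in the column of the category chosen for item i.
ind <- outer(ratings, seq_len(maxcat), FUN = "==") * 1
ind[is.na(ind)] <- 0        # a missing rating selects no category
ind
```

Stacking one such matrix per respondent along a third dimension gives a three-way array of the kind described above.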

Author(s)

Pieter C. Schoonees

Examples

data("lov")
arr <- indarr(lov[1:10, 1:9], maxcat = 9)
str(arr)

Interaction Clustering in Least Squares Bilinear Clustering

Description

This function implements the interaction clustering part of the Least Squares Bilinear Clustering method of Schoonees, Groenen and Van de Velden (2014).

Usage

int.lsbclust(data, margin = 3L, delta, nclust, ndim = 2,
  fixed = c("none", "rows", "columns"), nstart = 50, starts = NULL,
  alpha = 0.5, parallel = FALSE, mc.cores = detectCores() - 1,
  maxit = 100, verbose = 1, method = "diag", minsize = 3L,
  return_data = FALSE)

Arguments

`data`	A three-way array representing the data.
`margin`	An integer giving the single subscript of `data` over which the clustering will be applied.
`delta`	A four-element binary vector (logical or numeric) indicating which sum-to-zero constraints must be enforced.
`nclust`	An integer giving the desired number of clusters. If it is a vector, the algorithm will be run for each element.
`ndim`	The required rank for the approximation of the interactions (a scalar).
`fixed`	One of `"none"`, `"rows"` or `"columns"` indicating whether to fix neither set of coordinates, or whether to fix the row or column coordinates across clusters respectively. If a vector is supplied, only the first element will be used.
`nstart`	The number of random starts to use.
`starts`	A list containing starting configurations for the cluster membership vector. If not supplied, random initializations will be generated.
`alpha`	Numeric value in [0, 1] which determines how the singular values are distributed between rows and columns.
`parallel`	Logical indicating whether to parallelize over different starts or not.
`mc.cores`	The number of cores to use in case `parallel = TRUE`, passed to `makeCluster`.
`maxit`	The maximum number of iterations allowed.
`verbose`	Integer controlling the amount of information printed: 0 = no information, 1 = Information on random starts and progress, and 2 = information is printed after each iteration for the interaction clustering.
`method`	The method for calculating cluster agreement across random starts, passed on to `cl_agreement`. None is calculated when set to `NULL`.
`minsize`	Integer giving the minimum size of cluster to uphold when reinitializing empty clusters.
`return_data`	Logical indicating whether to include the data in the return value or not
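
The role of `ndim` and `alpha` can be illustrated with a plain singular value decomposition. The base R sketch below assumes that the row coordinates receive the singular values raised to the power alpha and the column coordinates receive the power 1 - alpha; which side gets which power in the package is not confirmed here, and the helper `biplot_coords` is hypothetical.

```r
# Split a rank-`ndim` SVD approximation of M into row and column coordinates.
biplot_coords <- function(M, ndim = 2, alpha = 0.5) {
  s <- svd(M, nu = ndim, nv = ndim)
  d <- s$d[seq_len(ndim)]
  list(C = s$u %*% diag(d^alpha, nrow = ndim),        # row coordinates
       D = s$v %*% diag(d^(1 - alpha), nrow = ndim))  # column coordinates
}

set.seed(1)
M <- matrix(rnorm(12), nrow = 4, ncol = 3)
a0 <- biplot_coords(M, alpha = 0)
a1 <- biplot_coords(M, alpha = 1)

# The product C D' is the same rank-2 approximation for every alpha.
all.equal(tcrossprod(a0$C, a0$D), tcrossprod(a1$C, a1$D))
```

Under this assumption, `alpha` changes only how the biplot is scaled, not the fitted low-rank approximation.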

Value

An object of class int.lsbclust

Examples

data("supermarkets")
out <- int.lsbclust(data = supermarkets, margin = 3, delta = c(1,1,0,0), nclust = 4, ndim = 2, 
           fixed = "rows", nstart = 1, alpha = 0)

C++ Function for Weighted K-Means

Description

This function does a weighted K-means clustering.

Usage

ComputeMeans(cm, data, weight, nclust)

AssignCluster(data, weight, M, nclust)

KMeansW(nclust, start, data, weight, eps = 1e-08, IterMax = 100L)

Arguments

`cm`	Numeric vector of class indicators.
`data`	The concatenated data, with N rows and M columns. Currently, the columns are clustered.
`weight`	A vector of nonnegative weights of length `nrow(data)`.
`nclust`	The number of clusters.
`M`	Matrix of cluster means.
`start`	The current cluster membership vector.
`eps`	Numerical absolute convergence criterion for the K-means.
`IterMax`	Integer giving the maximum number of iterations allowed for the K-means.

Value

A list with the following values.

`centers`	the `nclust` by M matrix `centers` of cluster means.
`cluster`	vector of length N with cluster memberships.
`loss`	vector of length `IterMax` with the first entries containing the loss.
`iterations`	the number of iterations used (corresponding to the number of nonzero entries in `loss`)
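
The weighted cluster means at the heart of weighted k-means can be written in base R as below. This sketch assumes that the rows of `data` are the objects being clustered (in line with `centers` being `nclust` by M and `cluster` having length N) and that each center is the weighted average of its members; the toy data are hypothetical and the C++ routines are not called.

```r
set.seed(1)
data   <- matrix(rnorm(20), nrow = 10, ncol = 2)  # N = 10 rows, M = 2 columns
weight <- runif(10)                               # nonnegative weights, one per row
cm     <- rep(1:2, each = 5)                      # cluster membership of each row

# Weighted mean of the rows in each cluster: sum(w_i * x_i) / sum(w_i).
wmeans <- t(sapply(1:2, function(k) {
  idx <- cm == k
  colSums(data[idx, , drop = FALSE] * weight[idx]) / sum(weight[idx])
}))
wmeans
```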

Examples

set.seed(1)
clustmem <- sample.int(n = 10, size = 100, replace = TRUE)
mat <- rbind(matrix(rnorm(30*4, mean = 3), nrow = 30), 
             matrix(rnorm(30*4, mean = -2), nrow = 30), 
             matrix(rnorm(40*4, mean = 0), nrow = 40))
wt <- runif(100)
testMeans <- lsbclust:::ComputeMeans(cm = clustmem, data = mat, weight = wt, nclust = 3)
testK <- lsbclust:::KMeansW(start = clustmem, data = mat, weight = wt, nclust = 3)

C++ Function for Interaction Loss Function

Description

This function calculates the loss function for the interaction clustering for all data slices and clusters means. The inputs are numeric matrices.

Arguments

`x`	The data matrix, with the N slices strung out as vectors in the columns.
`y`	The matrix of cluster means, with each mean represented by a row.

Value

A numeric matrix with nclust rows and N columns.
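
The shape of the result can be illustrated in base R. The sketch below assumes that the loss between a slice and a cluster mean is the squared Euclidean distance between the two vectors, which is an assumption consistent with a least-squares method but not confirmed by this page; the toy matrices are hypothetical.

```r
set.seed(1)
x <- matrix(rnorm(6 * 5), nrow = 6, ncol = 5)  # N = 5 slices, each a vector of length 6
y <- matrix(rnorm(2 * 6), nrow = 2, ncol = 6)  # nclust = 2 cluster means, one per row

# For each slice (column of x), the squared distance to every cluster mean (row of y).
loss <- sapply(seq_len(ncol(x)), function(i) rowSums(sweep(y, 2, x[, i])^2))

dim(loss)                   # nclust rows, N columns
apply(loss, 2, which.min)   # index of the closest mean for each slice
```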

List-of-values Data Set

Description

This is the list-of-values data set used in Van Rosmalen, Van Herk & Groenen (2010). Column names and factor labels differ slightly from that paper. Missing values are encoded as NA as usual. The first nine columns are items answered on a nine-point rating scale, with rating 1 representing 'very important' and rating 9 'not important at all'. The respondents were asked how important each of these items is as a guiding principle in their lives.

Usage

data("lov")

Format

A data frame with 4514 observations on the following 12 variables.

Belonging: a numeric vector; 'a sense of belonging'
Excitement: a numeric vector
Relationships: a numeric vector; 'warm relationships with others'
Self-fulfilment: a numeric vector
Respected: a numeric vector; 'being well-respected'
Enjoyment: a numeric vector; 'fun and enjoyment'
Security: a numeric vector
Self-respect: a numeric vector
Accomplishment: a numeric vector; 'a sense of accomplishment'
Country: a factor with levels Britain, France, Germany, Italy and Spain
Education: a factor with levels Low and High
Age: a factor with levels -25, 25-39, 40-54 and 55+

Source

Joost van Rosmalen

References

Van Rosmalen, J., Van Herk, H., & Groenen, P. J. F. (2010). Identifying response styles: A latent-class bilinear multinomial logit model. Journal of Marketing Research, 47(1), 157-172.

Examples

data("lov")

## Construct array
lovarr <- indarr(lov[, 1:9], maxcat = 9)

## Run analysis
set.seed(13841)
fit <- lsbclust(data = lovarr, margin = 3, delta = c(0, 1, 0, 0), nclust = c(NA, 11, NA, 5), 
                 fixed = "rows", nstart = 1, iter.max = 50, nstart.kmeans = 10)

Least-squares Bilinear Clustering of Three-way Data

Description

This function clusters along one way of a three-way array (as specified by margin) while decomposing along the other two dimensions. Four types of clusterings are allowed based on the respective two-way slices of the array: on the overall means, row margins, column margins and the interactions between rows and columns. Which clusterings can be fit is determined by the vector delta, with four binary elements. All orthogonal models are fitted. The nonorthogonal case delta = (1, 1, 0, 0) returns an error. See the reference for further details.

Usage

lsbclust(data, margin = 3L, delta = c(1L, 1L, 1L, 1L), nclust,
  ndim = 2L, fixed = c("none", "rows", "columns"), nstart = 20L,
  starts = NULL, nstart.kmeans = 500L, alpha = 0.5,
  parallel = FALSE, maxit = 100L, verbose = 1, method = "diag",
  type = NULL, sep.nclust = TRUE, ...)

Arguments

`data`	A three-way array representing the data.
`margin`	An integer giving the single subscript of `data` over which the clustering will be applied.
`delta`	A four-element binary vector (logical or numeric) indicating which sum-to-zero constraints must be enforced.
`nclust`	A vector of length four giving the number of clusters for the overall mean, the row margins, the column margins and the interactions (in that order) respectively. Alternatively, a vector of length one, in which case all components will have the same number of clusters.
`ndim`	The required rank for the approximation of the interactions (a scalar).
`fixed`	One of `"none"`, `"rows"` or `"columns"` indicating whether to fix neither set of coordinates, or whether to fix the row or column coordinates across clusters respectively. If a vector is supplied, only the first element will be used (passed to `int.lsbclust`).
`nstart`	The number of random starts to use for the interaction clustering.
`starts`	A list containing starting configurations for the cluster membership vector. If not supplied, random initializations will be generated (passed to `int.lsbclust`).
`nstart.kmeans`	The number of random starts to use in `kmeans`.
`alpha`	Numeric value in [0, 1] which determines how the singular values are distributed between rows and columns (passed to `int.lsbclust`).
`parallel`	Logical indicating whether to parallelize over different starts or not (passed to `int.lsbclust`).
`maxit`	The maximum number of iterations allowed in the interaction clustering.
`verbose`	Integer controlling the amount of information printed: 0 = no information, 1 = Information on random starts and progress, and 2 = information is printed after each iteration for the interaction clustering.
`method`	The method for calculating cluster agreement across random starts, passed on to `cl_agreement` (passed to `int.lsbclust`).
`type`	One of `"rows"`, `"columns"` or `"overall"` (or a unique abbreviation of one of these) indicating whether clustering should be done on row margins, column margins or the overall means of the two-way slices respectively. If more than one option is supplied, the algorithm is run for all (unique) options supplied (passed to `orc.lsbclust`). This is an optional argument.
`sep.nclust`	Logical indicating how nclust should be used across different `type`'s. If `sep.nclust` is `TRUE`, `nclust` is recycled so that each `type` can have a different number of clusters. If `sep.nclust` is `FALSE`, the same vector `nclust` is used for all `type`'s.
`...`	Additional arguments passed to `kmeans`.

Value

Returns an object of S3 class lsbclust, a list with the following components:

`overall`	Object of class `ovl.kmeans` for the overall means clustering
`rows`	Object of class `row.kmeans` for the row means clustering
`columns`	Object of class `col.kmeans` for the column means clustering
`interactions`	Object of class `int.lsbclust` for the interaction clustering
`call`	The function call used to create the object
`delta`	The value of `delta` in the fit
`df`	Breakdown of the degrees-of-freedom across the different subproblems
`loss`	Breakdown of the loss across subproblems
`time`	Time taken in seconds to calculate the solution
`cluster`	Matrix of cluster membership per observation for all cluster types

References

Schoonees, P.C., Groenen, P.J.F., Van de Velden, M. Least-squares Bilinear Clustering of Three-way Data. Econometric Institute Report, EI2014-23.

Biplots of

Description

Construct simple two-dimensional biplots given matrices representing the rows and columns of a two-dimensional matrix using ggplot2.

Usage

meanbiplot(rows, cols)

Arguments

`rows`	A list of matrices representing the rows
`cols`	A list of matrices representing the columns

Examples

set.seed(1)
dat <- rlsbclust(ndata = 1, nobs = 100, size = c(10, 8), nclust = c(5, 4, 6, 5))
meanbiplot(dat[[1]]$interactions$C, dat[[1]]$interactions$D)

Plot Heatmap of A Matrix

Description

Construct a heatmap of a matrix using ggplot2.

Usage

meanheatmap(x)

Arguments

`x`	Matrix or list of matrices to be plotted

Examples

set.seed(1)
dat <- rlsbclust(ndata = 1, nobs = 100, size = c(6, 6), nclust = c(5, 4, 6, 5))
meanheatmap(Map(tcrossprod, dat[[1]]$interactions$C, dat[[1]]$interactions$D))

K-means on the Overall Mean, Row Margins or Column Margins

Description

This function conducts k-means on the overall mean, the row margins or column margins of a set of N matrices. These matrices are two-way slices of a three-dimensional array.

Usage

orc.lsbclust(data, margin = 3L, delta, nclust, sep.nclust = TRUE,
  type = NULL, verbose = 1, ...)

Arguments

`data`	A three-way array representing the data.
`margin`	An integer giving the single subscript of `data` over which the clustering will be applied.
`delta`	A four-element binary vector (logical or numeric) indicating which sum-to-zero constraints must be enforced.
`nclust`	An integer giving the desired number of clusters. In case `type` specifies more than one method, `nclust` can be a vector containing the number of clusters to be determined for each type of cluster, and in the correct order as determined by `type` (after matching the arguments). If `type` is of length greater than one and `nclust` is of length one, the behaviour is governed by `sep.nclust`.
`sep.nclust`	Logical indicating how nclust should be used across different `type`'s. If `sep.nclust` is `TRUE`, `nclust` is recycled so that each `type` can have a different number of clusters. If `sep.nclust` is `FALSE`, the same vector `nclust` is used for all `type`'s.
`type`	One of `"overall"`, `"rows"` or `"columns"` (or a unique abbreviation of one of these) indicating whether clustering should be done on the overall means, row margins or column margins of the two-way slices respectively. If more than one option is supplied, the algorithm is run for all (unique) options supplied.
`verbose`	Integer controlling the amount of information printed: 0 = no information, 1 = Information on random starts and progress, and 2 = information is printed after each iteration for the interaction clustering.
`...`	Additional arguments passed to `kmeans`.

Value

A list containing a subset of the classes row.kmeans, col.kmeans and ovl.kmeans which are specific versions of class kmeans. In case type is a vector, a list is returned containing the results for each of the (unique) elements of type, with the same classes as before. See kmeans for an overview of the structure of these objects.

Plot a `bicomp` Object

Description

Plot method for an object of class bicomp (see bicomp).

Usage

## S3 method for class 'bicomp'
plot(x, which = 0L:4L, arrange = TRUE,
  col = c("red4", "beige", "blue4"), strip.legend = TRUE,
  add.titles = FALSE, ...)

Arguments

`x`	An object of class `bicomp`.
`which`	A numeric vector indicating which matrices to plot, with 0 = original data, 1 = overall means, 2 = row means, 3 = column means and 4 = interactions.
`arrange`	Logical indicating whether to arrange the plots side-by-side via `grid.arrange` or not.
`col`	A character vector of length three giving the parameters `low`, `mid` and `high` for `scale_fill_gradient2`.
`strip.legend`	Logical indicating whether to strip the legend off the plot or not.
`add.titles`	Logical indicating whether to add titles to the plots or not.
`...`	Additional arguments to `theme`.

Plot method for class 'col.kmeans'

Description

Simple plot method for an object of class 'col.kmeans' as output by orc.lsbclust.

Usage

## S3 method for class 'col.kmeans'
plot(x, which = 1L, ...)

Arguments

`x`	An object of class `col.kmeans`
`which`	Which type of plot to produce (only 3 types are implemented).
`...`	additional arguments passed to `theme`.

Author(s)

Pieter C. Schoonees

Examples

data("dcars")
m <- orc.lsbclust(data = dcars, margin = 3, delta = c(1,1,1,1), nclust = 5, type = "columns")
plot(m)

Plot Method for Class 'int.lsbclust'

Description

Two-dimensional plot method for object of class 'int.lsbclust' as output by int.lsbclust.

Usage

## S3 method for class 'int.lsbclust'
plot(x, which = seq_len(nclust),
  plot.type = c("biplots", "means", "estimates"), segments = NULL,
  biplot.axes = TRUE, nmarkers = 5, alpha = NULL,
  check.alpha = TRUE, fix.alpha = FALSE, probs = 0,
  arrange = FALSE, fix.limits = TRUE, limit.exp = 1.05,
  lambda.scale = TRUE, procrustes.rotation = x$fixed == "none",
  fix.lambda = FALSE, labs.grey = TRUE, label.0 = FALSE,
  tick.length = 0.0075 * diff(lims), axis.col = "grey60",
  label.size = 3, axis.size = 0.25, axis.title.size = 4,
  draw.axis = NULL, points.col = list(rows = "red", columns = "blue2"),
  offset.tick.labels = 3.5, offset.axis.title = list(rows = 0.015 *
  max(nchar(rnms)), columns = 0.015 * max(nchar(cnms))),
  axis.arrow = grid::arrow(angle = 20, length = grid::unit(0.0175,
  "npc")), ...)

Arguments

`x`	An object of class `int.lsbclust`.
`which`	A vector indicating which item segments to plot.
`plot.type`	Character string giving the type of plots to produce: either `"biplots"` for the biplots approximating the cluster means, `"means"` for level plots of the cluster means themselves or `"estimates"` for level plots of the low-rank approximations of the cluster means (as represented in the biplots).
`segments`	A logical vector with two elements, indicating whether the rows and columns should be plotted as line segments or not.
`biplot.axes`	A logical indicating whether to plot calibrated biplot axes for the line segments indicated in `segments` or not.
`nmarkers`	Either a single integer giving the number of desired markers per biplot axis for all axes, or a named list. This is passed as the argument `n` to `pretty`. See `Details` for information on the list option.
`alpha`	Numeric value in [0, 1] which determines how the singular values are distributed between rows and columns. It will trigger a recomputation of the updates if it does not correspond to the value used when fitting the model. Do not confuse this with the term "alpha" used in the context of colour transparency.
`check.alpha`	Logical indicating whether to look for a better alpha. This is only used when `alpha = NULL` is used. Do not confuse this with the term "alpha" used in the context of colour transparency.
`fix.alpha`	Logical indicating whether to fix alpha across all clusters or not when `fixed == "none"`. Do not confuse this with the term "alpha" used in the context of colour transparency.
`probs`	Argument passed to `quantile` to determine the alpha value. The corresponding quantile of the distances of all points in the biplots to the origin will be used to determine alpha in case check.alpha = TRUE.
`arrange`	Logical indicating whether to arrange the plots side-by-side via `grid.arrange` or not.
`fix.limits`	Logical indicating whether biplot x- and y-limits must be fixed across clusters or not. Note that this is automatically set to `TRUE` when `fixed == "rows"` or `fixed == "columns"`. When limits are fixed, the axis calibrations are also turned off.
`limit.exp`	A numeric expansion factor applied multiplicatively to the plot limits, but only when `fixed` equals `"rows"` or `"columns"`.
`lambda.scale`	Logical indicating whether to apply lambda scaling to the coordinates or not. If true, the scaling is done such that the average squared distance to the origin is equal for the row and column coordinates.
`procrustes.rotation`	Logical indicating whether to do Procrustes rotations so that the location of the axes indicated as segments (see argument `segments`) are similar across configurations.
`fix.lambda`	Logical indicating whether to fix lambda across all clusters or not.
`labs.grey`	Logical indicating whether to apply greying to the text labels as well.
`label.0`	Logical indicating whether to label the origin or not.
`tick.length`	The required tick length as a `unit` object. It defaults to a proportion of the width of the plot region (through lazy evaluation).
`axis.col`	The colour of the biplot axes.
`label.size`	The size of the labels for the markers on the biplot axes.
`axis.size`	Line size for biplot axes.
`axis.title.size`	Size of biplot axis titles.
`draw.axis`	A list with up to two components which must be named `"rows"` and `"columns"`. Each element contains a vector indicating which biplot axes should be drawn. The vectors can be character vectors containing the names of the axes to be drawn, numeric vectors containing indices indicating which axes to draw, or logical vectors indicating which biplot axes to draw. In case of the default value `NULL`, the elements of `segments` are used for the `"rows"` and `"columns"` entries.
`points.col`	A named list containing the colours to use for plotting the sets of points. The elements `"rows"` and `"columns"` contain vectors giving the colours for the points. Single element vectors are recycled across the different points, otherwise the vectors must be of the appropriate length.
`offset.tick.labels`	A numeric value giving the offset factor of the biplot axis marker labels from their respective tick marks. Higher (lower) values lead to labels being further from (nearer to) their respective tick marks.
`offset.axis.title`	A named list of (up to) two numeric values giving the fixed length offset of the biplot axis title label from the end of the axis segment. The two elements must have names `"rows"` and `"columns"`.
`axis.arrow`	An `arrow` object to be used for the endpoints of biplot axis segment lines. This is passed to `geom_segment`.
`...`	Additional arguments passed to `theme`.
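
The roles of `alpha` and `lambda.scale` can be illustrated with a minimal base-R sketch. It assumes that a cluster mean is approximated through its truncated singular value decomposition; the matrix `M` and the helpers `coords` and `msd` are illustrative names, not package internals, and the lambda formula shown is only one way to equalise the average squared distances.

```r
## Minimal base-R sketch; all object names are illustrative assumptions.
set.seed(1)
M <- matrix(rnorm(10 * 8), nrow = 10, ncol = 8)  # stand-in for a cluster mean
ndim <- 2
s <- svd(M, nu = ndim, nv = ndim)
d <- s$d[seq_len(ndim)]

## Distribute the singular values between rows and columns via alpha
coords <- function(alpha) {
  list(rows = s$u %*% diag(d^alpha, ndim),
       columns = s$v %*% diag(d^(1 - alpha), ndim))
}
c1 <- coords(0.5)
c2 <- coords(1)

## The rank-ndim approximation does not depend on alpha
approx1 <- tcrossprod(c1$rows, c1$columns)
approx2 <- tcrossprod(c2$rows, c2$columns)
all.equal(approx1, approx2)

## Lambda scaling: multiply rows by lambda and divide columns by lambda so
## that the average squared distance to the origin is equal for both sets
msd <- function(x) mean(rowSums(x^2))
lambda <- (msd(c2$columns) / msd(c2$rows))^(1 / 4)
rows_scaled <- c2$rows * lambda
cols_scaled <- c2$columns / lambda
c(msd(rows_scaled), msd(cols_scaled))

## The inner products, and hence the approximation, are unchanged
all.equal(tcrossprod(rows_scaled, cols_scaled), approx2)
```

Both choices only change how the biplot looks: the inner products between row and column points, which are what the biplot displays, stay the same.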

Details

In case nmarkers is a list, it can have up to two elements. These are required to be named "rows" and/or "columns", otherwise an error will be thrown. The elements of the list contain either single numeric values each or numeric vectors of the appropriate lengths indicating the n argument passed to pretty.
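
As a rough illustration of the list form of nmarkers, the sketch below calls `pretty` directly. The axis range and the list values are made up for the example, and the two-element vector for "columns" assumes a hypothetical plot with two column axes.

```r
## Illustrative values only: the range covered by a hypothetical biplot axis
axis_range <- c(-2.3, 4.1)

## Hypothetical named list: one value for all row axes, one value per column axis
nmarkers <- list(rows = 5, columns = c(3, 10))
stopifnot(all(names(nmarkers) %in% c("rows", "columns")))

## Each value is used as the n argument of pretty()
markers_rows <- pretty(axis_range, n = nmarkers$rows)
markers_cols <- lapply(nmarkers$columns, function(n) pretty(axis_range, n = n))
markers_rows
markers_cols
```

Note that `pretty` treats `n` as a target only, so the number of markers returned can differ from the value requested.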

In some cases, the row and/or column fit values can contain non-finite values. If that occurs, colour transparency cannot and will not be used for that particular element (and this can vary between clusters). This relates to the alpha parameter in the plotting routines.
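
The idea behind the `procrustes.rotation` argument can be sketched with an ordinary orthogonal Procrustes rotation of one configuration towards another. This is a two-configuration illustration in base R under that assumption, not the routine used by the package; the matrices `A`, `B` and `R0` are constructed purely for the example.

```r
## Orthogonal Procrustes sketch: rotate B so that it matches the target A
set.seed(1)
A <- matrix(rnorm(8 * 2), nrow = 8, ncol = 2)   # target configuration
theta <- pi / 3
R0 <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
B <- A %*% R0                                   # the same points, rotated

## The orthogonal Q minimising ||A - B Q|| is U V', where t(B) A = U D V'
s <- svd(crossprod(B, A))
Q <- tcrossprod(s$u, s$v)
B_rot <- B %*% Q

all.equal(B_rot, A)          # the rotation is undone
all.equal(crossprod(Q), diag(2))
```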

Plot method for class 'lsbclust'

Description

This plot method simply plots each of the components in the list of class lsbclust.

Usage

## S3 method for class 'lsbclust'
plot(x, type = c("overall", "rows", "columns",
  "interactions"), biplot.axes = TRUE, ...)

Arguments

`x`	An object of class `lsbclust`
`type`	A character vector indicating which component(s) of `x` to plot: a combination of `"overall"`, `"rows"`, `"columns"` and `"interactions"`.
`biplot.axes`	A logical indicating whether to plot calibrated biplot axes for the line segments indicated in `segments` or not.
`...`	additional arguments passed to the plot methods of the respective components, typically to `theme`. Use e.g. `plot(x$interactions)` for more control over the respective plots.

Author(s)

Pieter C. Schoonees

Examples

data("dcars")
m <- lsbclust(data = dcars, margin = 3, delta = c(1, 1, 1, 1), nclust = 5, nstart = 1)
plot(m)

Plot method for class 'ovl.kmeans'

Description

Simple plot method for object of class 'ovl.kmeans' as output by orc.lsbclust.

Usage

## S3 method for class 'ovl.kmeans'
plot(x, which = 1L, ...)

Arguments

`x`	An object of class `ovl.kmeans`
`which`	Which type of plot to produce. Currently only `which = 1` is implemented.
`...`	additional arguments passed to `theme`.

Author(s)

Pieter C. Schoonees

Examples

data("dcars")
m <- orc.lsbclust(data = dcars, margin = 3, delta = c(1,1,1,1), nclust = 5, type = "overall")
plot(m)

Plot method for class 'row.kmeans'

Description

Simple plot method for object of class 'row.kmeans' as output by orc.lsbclust.

Usage

## S3 method for class 'row.kmeans'
plot(x, which = 1L, ...)

Arguments

`x`	An object of class `row.kmeans`
`which`	Which type of plot to produce (only 3 types are implemented).
`...`	additional arguments passed to `theme`.

Author(s)

Pieter C. Schoonees

Examples

data("dcars")
m <- orc.lsbclust(data = dcars, margin = 3, delta = c(1,1,1,1), nclust = 5, type = "rows")
plot(m)

Plot method for class 'step.lsbclust'

Description

Plot 'step.lsbclust' objects.

Usage

## S3 method for class 'step.lsbclust'
plot(x, which = 1L:5L, col.all = NULL,
  arrange = FALSE, chull = FALSE, ...)

Arguments

`x`	An object of class `step.lsbclust`
`which`	Which type of plot to produce.
`col.all`	A character vector of length one indicating which of `"overall"`, `"rows"`, `"columns"` or `"interactions"` should be mapped to colour in the plot for all possible models. Care needs to be taken that the stated component is included in the fit.
`arrange`	Logical indicating whether to arrange the plots side-by-side via `grid.arrange` or not.
`chull`	Logical indicating whether to plot the estimated convex hull or not.
`...`	additional arguments passed to `theme`.

Author(s)

Pieter C. Schoonees

Plot Method for Class 'T3Clusf'

Description

Two-dimensional plot method for object of class 'T3Clusf' as output by T3Clusf.

Usage

## S3 method for class 'T3Clusf'
plot(x, which = seq_len(nclust), arrange = FALSE,
  ...)

Arguments

`x`	An object of class `T3Clusf`.
`which`	An integer vector indicating which item segments to plot.
`arrange`	Logical indicating whether to arrange the plots on a single page or not
`...`	Additional arguments to `theme`

Print method for object of class 'lsbclust'

Description

Print a 'lsbclust' object.

Usage

## S3 method for class 'lsbclust'
print(x, ...)

Arguments

`x`	An object of class 'lsbclust'
`...`	Unimplemented.

Simulate from LSBCLUST Model

Description

Simulate three-way arrays adhering to the LSBCLUST framework (see lsbclust).

Usage

rlsbclust(ndata = 50L, nobs, size, nclust, clustsize = NULL,
  delta = rep(1L, 4L), ndim = 2L, alpha = 0.5, fixed = c("none",
  "rows", "columns"), err_sd = 1, svmins = 1, svmax = 6)

Arguments

`ndata`	Integer giving the number of data sets to generate with the same underlying parameters.
`nobs`	Integer giving the number of observations to sample.
`size`	Vector with two elements giving the number of rows and columns respectively of each simulated observation.
`nclust`	A vector of length four giving the number of clusters for the overall mean, the row margins, the column margins and the interactions (in that order) respectively. Alternatively, a vector of length one, in which case all components will have the same number of clusters.
`clustsize`	A list of length four, with each element containing a vector of the same length as the corresponding entry in `nclust`, indicating the number of elements to contribute to each sample. Naturally, each of these vectors must sum to `nobs`, or an error will result. Positional matching is used, in the order "overall", "rows", "columns" and "interactions". If `NULL`, all clusters will be of equal size.
`delta`	A four-element binary vector (logical or numeric) indicating which sum-to-zero constraints must be enforced.
`ndim`	The required rank for the approximation of the interactions (a scalar).
`alpha`	Numeric value in [0, 1] which determines how the singular values are distributed between rows and columns (passed to `int.lsbclust`).
`fixed`	One of `"none"`, `"rows"` or `"columns"` indicating whether to fix neither set of coordinates, or to fix the row or column coordinates across clusters, respectively. If a vector is supplied, only the first element will be used (passed to `int.lsbclust`).
`err_sd`	The standard deviation of the error distribution, as passed to `rnorm`
`svmins`	Vector of minimum values for the singular values (as passed to `simsv`). Optionally, if all minima are equal, a single numeric value which will be expanded to the correct length.
`svmax`	The maximum possible singular value (as passed to `simsv`)
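
The constraints on `clustsize` can be checked before simulating. The sketch below uses the values from the unbalanced example and a hypothetical helper `expand_sizes`; it is not part of the package, and the equal-size fallback assumes that the number of clusters divides `nobs`.

```r
nobs <- 100
nclust <- c(5, 4, 6, 5)   # overall, rows, columns, interactions
clustsize <- list(NULL, NULL, c(40, 25, 15, 10, 5, 5), c(40, 25, 15, 10, 10))

## Hypothetical helper: fill in equal sizes for NULL entries and validate
expand_sizes <- function(sizes, k, nobs) {
  if (is.null(sizes)) sizes <- rep(nobs / k, k)  # assumes k divides nobs
  stopifnot(length(sizes) == k, sum(sizes) == nobs)
  sizes
}
sizes <- lapply(seq_along(nclust),
                function(i) expand_sizes(clustsize[[i]], nclust[i], nobs))
names(sizes) <- c("overall", "rows", "columns", "interactions")

## Cluster membership vector implied by the interaction cluster sizes
membership <- rep(seq_along(sizes$interactions), times = sizes$interactions)
table(membership)
```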

Examples

## Nothing fixed, balanced classes
set.seed(1)
dat <- rlsbclust(ndata = 1, nobs = 100, size = c(10, 8), nclust = c(5, 4, 6, 5))
res <- lsbclust(data = dat[[1]]$data, nclust = c(5, 4, 6, 5))
cfsim(res, dat[[1]])

## Rows fixed, balanced classes
set.seed(2)
dat <- rlsbclust(ndata = 1, nobs = 100, size = c(10, 8), nclust = c(5, 4, 6, 5), 
                 fixed = "rows")
res <- lsbclust(data = dat[[1]]$data, nclust = c(5, 4, 6, 5), fixed = "rows")
cfsim(res, dat[[1]])

## Columns fixed, unbalanced classes
set.seed(3)
dat <- rlsbclust(ndata = 1, nobs = 100, size = c(10, 8), nclust = c(5, 4, 6, 5), 
                 fixed = "columns", 
                 clustsize = list(NULL, NULL, c(40, 25, 15, 10, 5, 5), c(40, 25, 15, 10, 10)))
res <- lsbclust(data = dat[[1]]$data, nclust = c(5, 4, 6, 5), fixed = "columns")
cfsim(res, dat[[1]])

Generate A Random Orthonormal Matrix

Description

Uniformly sample an orthonormal matrix from the collection of all possible orthonormal matrices of a certain size. The QR decomposition is used on a matrix containing Gaussian random numbers. The QR decomposition might not be the most efficient algorithm under some circumstances.

Usage

rorth(nrow, ncol, sd = 1)

Arguments

`nrow`	Integer giving the number of rows required.
`ncol`	Integer giving the number of columns required.
`sd`	The standard deviation passed to `rnorm`

References

Stewart, G. W. (1980). The efficient generation of random orthogonal matrices with an application to condition estimators. SIAM Journal on Numerical Analysis, 17(3), 403-409.

Examples

set.seed(1)
rorth(5, 2)
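
The QR-based construction from the Description can be sketched in base R as follows. The function `rorth_sketch` is a hypothetical stand-in and is not claimed to match the internals of `rorth`. In particular, the sign correction on the columns is an assumption: it is a common way to remove the sign ambiguity of the QR decomposition.

```r
## Hypothetical helper illustrating the QR approach; not the package's rorth()
rorth_sketch <- function(nrow, ncol, sd = 1) {
  stopifnot(nrow >= ncol)
  ## Matrix of independent Gaussian random numbers
  z <- matrix(rnorm(nrow * ncol, sd = sd), nrow = nrow, ncol = ncol)
  qrz <- qr(z)
  q <- qr.Q(qrz)
  ## Flip column signs so that the diagonal of R is positive (assumed convention)
  sgn <- sign(diag(qr.R(qrz)))
  sweep(q, 2, sgn, "*")
}

set.seed(1)
Q <- rorth_sketch(5, 2)
crossprod(Q)   # approximately the 2 x 2 identity matrix
```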

Simulate and Analyze LSBCLUST

Description

Perform a single simulation run for the LSBCLUST model. Multiple data sets are generated for a single set of underlying parameters.

Usage

sim_lsbclust(ndata, nobs, size, nclust, clustsize = NULL,
  delta = rep(1L, 4L), ndim = 2L, alpha = 0.5, fixed = c("none",
  "rows", "columns"), err_sd = 1, svmins = 0.5, svmax = 5,
  seed = NULL, parallel = FALSE, parallel_data = TRUE, verbose = 0,
  nstart_T3 = 20L, nstart_ak = 20L, mc.cores = detectCores() - 1,
  include_fits = FALSE, include_data = FALSE, nstart, nstart.kmeans)

Arguments

`ndata`	Integer giving the number of data sets to generate with the same underlying parameters.
`nobs`	Integer giving the number of observations to sample.
`size`	Vector with two elements giving the number of rows and columns respectively of each simulated observation.
`nclust`	A vector of length four giving the number of clusters for the overall mean, the row margins, the column margins and the interactions (in that order) respectively. Alternatively, a vector of length one, in which case all components will have the same number of clusters.
`clustsize`	A list of length four, with each element containing a vector of the same length as the corresponding entry in `nclust`, indicating the number of elements to contribute to each sample. Naturally, each of these vectors must sum to `nobs`, or an error will result. Positional matching is used, in the order "overall", "rows", "columns" and "interactions". If `NULL`, all clusters will be of equal size.
`delta`	A four-element binary vector (logical or numeric) indicating which sum-to-zero constraints must be enforced.
`ndim`	The required rank for the approximation of the interactions (a scalar).
`alpha`	Numeric value in [0, 1] which determines how the singular values are distributed between rows and columns (passed to `int.lsbclust`).
`fixed`	One of `"none"`, `"rows"` or `"columns"` indicating whether to fix neither set of coordinates, or to fix the row or column coordinates across clusters, respectively. If a vector is supplied, only the first element will be used (passed to `int.lsbclust`).
`err_sd`	The standard deviation of the error distribution, as passed to `rnorm`
`svmins`	Vector of minimum values for the singular values (as passed to `simsv`). Optionally, if all minima are equal, a single numeric value which will be expanded to the correct length.
`svmax`	The maximum possible singular value (as passed to `simsv`)
`seed`	An optional seed to be set for the random number generator
`parallel`	Logical indicating whether to parallelize over random starts. Note that `parallel_data` has precedence over this
`parallel_data`	Logical indicating whether to parallelize over the data sets. If `FALSE`, parallelization is done over random starts (depending on `parallel`).
`verbose`	Integer giving the number of iterations after which the loss value is printed.
`nstart_T3`	The number of random starts to use for `T3Clusf`
`nstart_ak`	The number of random starts to use for `akmeans`
`mc.cores`	The number of cores to use, passed to `makeCluster`
`include_fits`	Logical indicating whether to include the model fits, or only the fit statistics
`include_data`	Logical indicating whether to include the simulated data fitted on, or only the results
`nstart`	From `lsbclust`
`nstart.kmeans`	From `lsbclust`

Examples

set.seed(1)
res <- sim_lsbclust(ndata = 5, nobs = 100, size = c(10, 8), nclust = rep(5, 4), 
                    verbose = 0, nstart_T3 = 2, nstart_ak = 1, parallel_data = FALSE,
                    nstart = 2, nstart.kmeans = 5 )

Randomly Generate Positive Singular Values

Description

Generate random singular values for a specified number of clusters for use in simulations. A mixture distribution is used with truncation to ensure that the singular values differ between clusters, are ordered, and are nonnegative.

Usage

simsv(nclust, ndim = 2, mins = 1, max = 5)

Arguments

`nclust`	Integer giving the number of clusters for which to sample singular values.
`ndim`	Integer; the number of singular values required.
`mins`	Numeric vector of length `ndim` giving the minimum values for the respective singular values.
`max`	Numeric value giving the maximum possible value for the mean of the cluster-specific singular value distribution, relative to the `mins`

Model Search for lsbclust

Description

Fit lsbclust models for different numbers of clusters and/or different values of delta. The resulting output can be inspected through its plot method to facilitate model selection. Each component of the model is fitted separately.

Usage

step.lsbclust(data, margin = 3L, delta = c(1, 1, 1, 1), nclust,
  ndim = 2, fixed = c("none", "rows", "columns"), nstart = 20,
  starts = NULL, nstart.kmeans = 500, alpha = 0.5,
  parallel = FALSE, maxit = 100, verbose = -1, type = NULL, ...)

Arguments

`data`	A three-way array representing the data.
`margin`	An integer giving the single subscript of `data` over which the clustering will be applied.
`delta`	A four-element binary vector (logical or numeric) indicating which sum-to-zero constraints must be enforced.
`nclust`	Either a vector or a list giving the number of clusters which will be applied to each element of the model, that is to (a subset of) the overall mean, row margins, column margins and interactions. If it is a list, arguments are matched by the names `"overall"`, `"rows"`, `"columns"` and `"interactions"`. If the list does not have names, the components are extracted in the aforementioned order.
`ndim`	The required rank for the approximation of the interactions (a scalar).
`fixed`	One of `"none"`, `"rows"` or `"columns"` indicating whether to fix neither set of coordinates, or to fix the row or column coordinates across clusters, respectively. If a vector is supplied, only the first element will be used (passed to `int.lsbclust`).
`nstart`	The number of random starts to use for the interaction clustering.
`starts`	A list containing starting configurations for the cluster membership vector. If not supplied, random initializations will be generated (passed to `int.lsbclust`).
`nstart.kmeans`	The number of random starts to use in `kmeans`.
`alpha`	Numeric value in [0, 1] which determines how the singular values are distributed between rows and columns (passed to `int.lsbclust`).
`parallel`	Logical indicating whether to parallelize over different starts or not (passed to `int.lsbclust`).
`maxit`	The maximum number of iterations allowed in the interaction clustering.
`verbose`	The number of iterations after which information on progress is provided (passed to `int.lsbclust`).
`type`	One of `"rows"`, `"columns"` or `"overall"` (or a unique abbreviation of one of these) indicating whether clustering should be done on row margins, column margins or the overall means of the two-way slices respectively. If more than one option is supplied, the algorithm is run for all (unique) options supplied (passed to `orc.lsbclust`). This is an optional argument.
`...`	Additional arguments passed to `kmeans`.

Examples

m <- step.lsbclust(data = dcars, margin = 3, delta = c(1, 0, 1, 0), nclust = 4:5, 
                     ndim = 2, fixed = "columns", nstart = 1, nstart.kmeans = 100, 
                     parallel = FALSE)
                     
## For a list of all deltas                     
delta <- expand.grid(replicate(4, c(0,1), simplify = FALSE))
delta <- with(delta, delta[!(Var1 == 0 & Var3 == 1), ])
delta <- with(delta, delta[!(Var2 == 0 & Var4 == 1),])
delta <- delta[-4,]
delta <- as.list(as.data.frame(t(delta)))
m2 <- step.lsbclust(data = dcars, margin = 3, delta = delta, nclust = 4:5, 
                     ndim = 2, fixed = "columns", nstart = 1, nstart.kmeans = 100, 
                     parallel = FALSE)

Summary Method for Class "int.lsbclust"

Description

Some goodness-of-fit diagnostics are provided for all three margins.

Usage

## S3 method for class 'int.lsbclust'
summary(object, digits = 3, ...)

Arguments

`object`	An object of class 'int.lsbclust'.
`digits`	The number of digits in the printed output.
`...`	Unimplemented.

Summary Method for Class "lsbclust"

Description

Summarize a lsbclust object.

Usage

## S3 method for class 'lsbclust'
summary(object, digits = 3, ...)

Arguments

`object`	An object of class 'lsbclust'.
`digits`	The number of digits in the printed output.
`...`	Unimplemented.

Dutch Supermarkets Data Set

Description

This data set relates to 220 consumers rating 10 Dutch supermarket chains according to 8 variables. A rating scale from 1 to 10 was used.

Usage

supermarkets

Format

A three-way array with supermarkets in the first dimension, variables in the second and consumers in the third dimension.

Source

Michel van de Velden

Examples

data("supermarkets")
fit <- lsbclust(data = supermarkets, nclust = 6, fixed = "rows", nstart = 2)
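
The array layout described under Format, and the bilinear decomposition of a single consumer's slice, can be illustrated without loading the data. The sketch below uses simulated ratings as a stand-in with the same dimensions (10 supermarkets, 8 variables, 220 consumers). The names `X`, `slice`, `a`, `b` and `inter` are illustrative, and the decomposition shown assumes that all sum-to-zero constraints are enforced.

```r
set.seed(1)
## Simulated stand-in: supermarkets x variables x consumers, ratings from 1 to 10
X <- array(sample(1:10, 10 * 8 * 220, replace = TRUE), dim = c(10, 8, 220))
dim(X)

## Two-way slice for the first consumer (the third way is clustered over)
slice <- X[, , 1]

## Overall mean, row margins, column margins and interactions
m <- mean(slice)
a <- rowMeans(slice) - m
b <- colMeans(slice) - m
inter <- slice - m - outer(a, b, "+")

## The four terms add up to the original slice
all.equal(m + outer(a, b, "+") + inter, slice)

## The interactions have zero row and column means
range(rowMeans(inter))
range(colMeans(inter))
```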

T3Clusf: Tucker3 Fuzzy Cluster Analysis

Description

This is an implementation of the T3Clusf algorithm of Rocci & Vichi (2005).

Usage

T3Clusf(X, Q, R = Q, G = 2, margin = 3L, alpha = 1, eps = 1e-08,
  maxit = 100L, verbose = 1, nstart = 1L, parallel = TRUE,
  mc.cores = detectCores() - 1L, minsize = 3L)

Arguments

`X`	Three-way data array, with no missing values.
`Q`	Integer giving the number of dimensions required for mode B (variables). This is the first mode of the array, excluding the mode clustered over (see `margin`).
`R`	Integer giving the number of dimensions required for mode C (occasions). This is the second mode of the array, excluding the mode clustered over (see `margin`).
`G`	Integer giving the number of clusters required.
`margin`	Integer giving the margin of the array to cluster over. The remaining two modes, in the original order, correspond to `Q` and `R`.
`alpha`	Numeric value giving the fuzziness parameter.
`eps`	Small numeric value giving the empirical convergence threshold.
`maxit`	Integer giving the maximum number of iterations allowed.
`verbose`	Integer giving the number of iterations after which the loss values are printed.
`nstart`	Integer giving the number of random starts required.
`parallel`	Logical indicating whether to parallelize over random starts if `nstart > 1`.
`mc.cores`	Argument passed to `makeCluster`.
`minsize`	Integer giving the minimum size of cluster to uphold when reinitializing empty clusters.

References

Rocci, R., & Vichi, M. (2005). Three-mode component analysis with crisp or fuzzy partition of units. Psychometrika, 70(4), 715-736.

Examples

data("dcars")
set.seed(13)
res <- T3Clusf(X = carray(dcars), Q = 3, R = 2, G = 3, alpha = 1)
