Package 'mixturemodelsr' reference manual

Title:	R Wrapper for the Mixture-Models Python Library
Description:	First R package enabling mixture models for high-dimensional data through gradient-based optimization with Automatic Differentiation (AD). Provides an R interface to the 'Mixture-Models' Python package (Kasa et al., 2024) via reticulate. Unlike traditional EM-based approaches (e.g., mclust, flexmix), this package uses AD and gradient-based optimization (including second-order Newton-CG) to fit Gaussian Mixture Models (GMM), Mixture of Factor Analyzers (MFA), Parsimonious GMM (PGMM), MCLUST family, and t-mixture models without requiring stringent modeling constraints, making it suitable for high-dimensional settings where the number of parameters exceeds the sample size. Reference: Kasa, S. R., Yijie, H., Kasa, S. K., & Rajan, V. (2024). Mixture-Models: a one-stop Python Library for Model-based Clustering using various Mixture Models. arXiv preprint arXiv:2402.10229.
Authors:	Siva Rajesh Kasa [aut, cre, cph]
Maintainer:	Siva Rajesh Kasa <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.20
Built:	2026-05-31 10:33:11 UTC
Source:	https://github.com/kasakh/mixturemodelsr

Compute AIC for fitted model

Description

Compute AIC for fitted model

Usage

mm_aic(fit)
mm_aic(fit)

Arguments

fit

An mm_fit object

Value

Numeric AIC value

Examples

## Not run: 
fit <- mm_gmm_fit(iris[,1:4], k = 3)
mm_aic(fit)

## End(Not run)
## Not run: 
fit <- mm_gmm_fit(iris[,1:4], k = 3)
mm_aic(fit)

## End(Not run)

Compute BIC for fitted model

Description

Compute BIC for fitted model

Usage

mm_bic(fit)
mm_bic(fit)

Arguments

fit

An mm_fit object

Value

Numeric BIC value

Examples

## Not run: 
fit <- mm_gmm_fit(iris[,1:4], k = 3)
mm_bic(fit)

## End(Not run)
## Not run: 
fit <- mm_gmm_fit(iris[,1:4], k = 3)
mm_bic(fit)

## End(Not run)

Fit a Constrained Gaussian Mixture Model

Description

Fits a GMM with a common covariance matrix across all components.

Usage

mm_gmm_constrained_fit(
  x,
  k,
  optimizer = "Newton-CG",
  scale = 1,
  use_kmeans = TRUE,
  ...
)
mm_gmm_constrained_fit(
  x,
  k,
  optimizer = "Newton-CG",
  scale = 1,
  use_kmeans = TRUE,
  ...
)

Arguments

x

Numeric matrix or data.frame (rows = observations, columns = features)

k

Number of mixture components

optimizer

Optimizer name: "Newton-CG" (default), "grad_descent", "rms_prop", or "adam"

scale

Initialization scale parameter (default 1.0)

use_kmeans

Logical, whether to use k-means for initializing component means (default TRUE)

...

Additional arguments passed to Python init_params() or fit() methods

Value

An mm_fit object

Examples

## Not run: 
fit <- mm_gmm_constrained_fit(iris[, 1:4], k = 3)
mm_bic(fit)

## End(Not run)
## Not run: 
fit <- mm_gmm_constrained_fit(iris[, 1:4], k = 3)
mm_bic(fit)

## End(Not run)

Fit a Gaussian Mixture Model (GMM)

Description

Fits a Gaussian mixture model to data using gradient-based optimization. This is a wrapper around the Python Mixture-Models GMM implementation.

Usage

mm_gmm_fit(x, k, optimizer = "Newton-CG", scale = 1, use_kmeans = TRUE, ...)
mm_gmm_fit(x, k, optimizer = "Newton-CG", scale = 1, use_kmeans = TRUE, ...)

Arguments

x

Numeric matrix or data.frame (rows = observations, columns = features)

k

Number of mixture components

optimizer

Optimizer name: "Newton-CG" (default), "grad_descent", "rms_prop", or "adam"

scale

Initialization scale parameter (default 1.0)

use_kmeans

Logical, whether to use k-means for initializing component means (default TRUE)

...

Additional arguments passed to Python init_params() or fit() methods

Value

An mm_fit object containing:

py_model

Python model object

params_store

Full optimization path

params

Final fitted parameters

model_name

Model family name

k

Number of components

call

Original function call

n

Sample size

d

Number of dimensions

optimizer

Optimizer used

Examples

## Not run: 
# Setup (first time only)
mm_setup()

# Fit a 3-component GMM on iris data
fit <- mm_gmm_fit(iris[, 1:4], k = 3)
print(fit)

# Get cluster labels
labels <- mm_predict(fit)

# Model selection
mm_bic(fit)
mm_aic(fit)

## End(Not run)
## Not run: 
# Setup (first time only)
mm_setup()

# Fit a 3-component GMM on iris data
fit <- mm_gmm_fit(iris[, 1:4], k = 3)
print(fit)

# Get cluster labels
labels <- mm_predict(fit)

# Model selection
mm_bic(fit)
mm_aic(fit)

## End(Not run)

Compute log-likelihood for fitted model

Description

Compute log-likelihood for fitted model

Usage

mm_likelihood(fit)
mm_likelihood(fit)

Arguments

fit

An mm_fit object

Value

Numeric log-likelihood value

Examples

## Not run: 
fit <- mm_gmm_fit(iris[,1:4], k = 3)
mm_likelihood(fit)

## End(Not run)
## Not run: 
fit <- mm_gmm_fit(iris[,1:4], k = 3)
mm_likelihood(fit)

## End(Not run)

Fit an MCLUST Family Model

Description

Fits a model from the MCLUST family of constrained Gaussian mixture models. MCLUST models specify different parameterizations of the covariance structure.

Usage

mm_mclust_fit(
  x,
  k,
  model_type = NULL,
  optimizer = "Newton-CG",
  scale = 1,
  use_kmeans = TRUE,
  ...
)
mm_mclust_fit(
  x,
  k,
  model_type = NULL,
  optimizer = "Newton-CG",
  scale = 1,
  use_kmeans = TRUE,
  ...
)

Arguments

x

Numeric matrix or data.frame (rows = observations, columns = features)

k

Number of mixture components

model_type

Character string specifying the MCLUST model type. Common types include: "EII", "VII", "EEI", "VEI", "EVI", "VVI", "EEE", "VEE", "EVE", "VVE", "EEV", "VEV", "EVV", "VVV" If NULL, uses default from Python implementation.

optimizer

Optimizer name: "Newton-CG" (default), "grad_descent", "rms_prop", or "adam"

scale

Initialization scale parameter (default 1.0)

use_kmeans

Logical, whether to use k-means for initializing component means (default TRUE)

...

Additional arguments passed to Python init_params() or fit() methods

Details

MCLUST model types follow a three-letter naming convention:

First letter: Volume (E=equal across components, V=variable across components)
Second letter: Shape (E=equal, V=variable, I=spherical/identity)
Third letter: Orientation (E=equal, V=variable, I=axis-aligned/identity)

Common model types:

"EII": Spherical, equal volume
"VII": Spherical, variable volume
"EEE": Ellipsoidal, equal volume, shape, and orientation
"VVV": Ellipsoidal, variable volume, shape, and orientation (most flexible)

Value

An mm_fit object

Examples

## Not run: 
# Fit MCLUST with VVV (most flexible) model
fit <- mm_mclust_fit(iris[, 1:4], k = 3, model_type = "VVV")
mm_bic(fit)

# Fit MCLUST with spherical, equal volume model
fit_eii <- mm_mclust_fit(iris[, 1:4], k = 3, model_type = "EII")
mm_bic(fit_eii)

# Compare models
labels <- mm_predict(fit)
table(labels, iris$Species)

## End(Not run)
## Not run: 
# Fit MCLUST with VVV (most flexible) model
fit <- mm_mclust_fit(iris[, 1:4], k = 3, model_type = "VVV")
mm_bic(fit)

# Fit MCLUST with spherical, equal volume model
fit_eii <- mm_mclust_fit(iris[, 1:4], k = 3, model_type = "EII")
mm_bic(fit_eii)

# Compare models
labels <- mm_predict(fit)
table(labels, iris$Species)

## End(Not run)

Fit a Mixture of Factor Analyzers (MFA)

Description

Fits a mixture of factor analyzers model to data using gradient-based optimization.

Usage

mm_mfa_fit(
  x,
  k,
  q = NULL,
  optimizer = "Newton-CG",
  scale = 1,
  use_kmeans = TRUE,
  ...
)
mm_mfa_fit(
  x,
  k,
  q = NULL,
  optimizer = "Newton-CG",
  scale = 1,
  use_kmeans = TRUE,
  ...
)

Arguments

x

Numeric matrix or data.frame (rows = observations, columns = features)

k

Number of mixture components

q

Number of latent factors (NULL for automatic selection)

optimizer

Optimizer name: "Newton-CG" (default), "grad_descent", "rms_prop", or "adam"

scale

Initialization scale parameter (default 1.0)

use_kmeans

Logical, whether to use k-means for initializing component means (default TRUE)

...

Additional arguments passed to Python init_params() or fit() methods

Value

An mm_fit object

Examples

## Not run: 
# Fit MFA with 3 components and 2 latent factors
fit <- mm_mfa_fit(iris[, 1:4], k = 3, q = 2)
mm_bic(fit)

# Get cluster labels
labels <- mm_predict(fit)

## End(Not run)
## Not run: 
# Fit MFA with 3 components and 2 latent factors
fit <- mm_mfa_fit(iris[, 1:4], k = 3, q = 2)
mm_bic(fit)

# Get cluster labels
labels <- mm_predict(fit)

## End(Not run)

Get parameter values from fit object

Description

Extracts the parameter values from a fitted mixture model.

Usage

mm_params(fit, convert = TRUE)
mm_params(fit, convert = TRUE)

Arguments

fit

An mm_fit object

convert

Logical, whether to convert Python objects to R (default TRUE)

Value

Parameter object (structure depends on model family)

Examples

## Not run: 
fit <- mm_gmm_fit(iris[,1:4], k = 3)
params <- mm_params(fit)

## End(Not run)
## Not run: 
fit <- mm_gmm_fit(iris[,1:4], k = 3)
params <- mm_params(fit)

## End(Not run)

Fit a Parsimonious Gaussian Mixture Model (PGMM)

Description

Fits a parsimonious GMM with constraints on the covariance structure.

Usage

mm_pgmm_fit(
  x,
  k,
  model_type = NULL,
  q = NULL,
  optimizer = "Newton-CG",
  scale = 1,
  use_kmeans = TRUE,
  ...
)
mm_pgmm_fit(
  x,
  k,
  model_type = NULL,
  q = NULL,
  optimizer = "Newton-CG",
  scale = 1,
  use_kmeans = TRUE,
  ...
)

Arguments

x

Numeric matrix or data.frame (rows = observations, columns = features)

k

Number of mixture components

model_type

Character string specifying the PGMM model type (e.g., "VVV", "EEE", "VEV"). If NULL, uses default from Python implementation.

q

Number of latent factors (NULL for automatic selection)

optimizer

Optimizer name: "Newton-CG" (default), "grad_descent", "rms_prop", or "adam"

scale

Initialization scale parameter (default 1.0)

use_kmeans

Logical, whether to use k-means for initializing component means (default TRUE)

...

Additional arguments passed to Python init_params() or fit() methods

Details

PGMM model types follow the mclust naming convention:

First letter: Volume (E=equal, V=variable)
Second letter: Shape (E=equal, V=variable)
Third letter: Orientation (E=equal, V=variable, I=identity)

Value

An mm_fit object

Examples

## Not run: 
# Fit PGMM with variable volume, shape, and orientation
fit <- mm_pgmm_fit(iris[, 1:4], k = 3, model_type = "VVV")
mm_bic(fit)

## End(Not run)
## Not run: 
# Fit PGMM with variable volume, shape, and orientation
fit <- mm_pgmm_fit(iris[, 1:4], k = 3, model_type = "VVV")
mm_bic(fit)

## End(Not run)

Predict cluster labels

Description

Predicts cluster membership labels for observations using a fitted mixture model.

Usage

mm_predict(fit, newx = NULL)
mm_predict(fit, newx = NULL)

Arguments

fit

An mm_fit object returned by mm_*_fit functions

newx

Optional matrix or data.frame of new observations. If NULL, predictions are made on the training data.

Value

Integer vector of cluster labels (0-indexed from Python, converted to 1-indexed for R)

Examples

## Not run: 
fit <- mm_gmm_fit(iris[,1:4], k = 3)
labels <- mm_predict(fit)
table(labels, iris$Species)

## End(Not run)
## Not run: 
fit <- mm_gmm_fit(iris[,1:4], k = 3)
labels <- mm_predict(fit)
table(labels, iris$Species)

## End(Not run)

Python Configuration Diagnostics

Description

Prints detailed information about the Python environment configuration for mixturemodelsr. Useful for troubleshooting installation issues.

Usage

mm_py_info()
mm_py_info()

Value

Invisibly returns a list with diagnostic information

Examples

## Not run: 
mm_py_info()

## End(Not run)
## Not run: 
mm_py_info()

## End(Not run)

Check if Python module is available

Description

Check if Python module is available

Usage

mm_python_available()
mm_python_available()

Value

Logical indicating if either mixture_models or Mixture_Models is importable

Setup mixturemodelsr (user-friendly wrapper)

Description

One-time setup for R users. Provisions a dedicated conda environment with Python 3.10 + NumPy 1.23.5 and installs Mixture-Models==0.0.8.

Usage

mm_setup(force = FALSE)
mm_setup(force = FALSE)

Arguments

force

Logical, force reinstallation by deleting and recreating the env

Value

TRUE invisibly on success

Fit a t-Mixture Model (TMM)

Description

Fits a mixture of multivariate t-distributions to data using gradient-based optimization. T-distributions are more robust to outliers than Gaussian distributions.

Usage

mm_tmm_fit(x, k, optimizer = "Newton-CG", scale = 1, use_kmeans = TRUE, ...)
mm_tmm_fit(x, k, optimizer = "Newton-CG", scale = 1, use_kmeans = TRUE, ...)

Arguments

x

Numeric matrix or data.frame (rows = observations, columns = features)

k

Number of mixture components

optimizer

Optimizer name: "Newton-CG" (default), "grad_descent", "rms_prop", or "adam"

scale

Initialization scale parameter (default 1.0)

use_kmeans

Logical, whether to use k-means for initializing component means (default TRUE)

...

Additional arguments passed to Python init_params() or fit() methods

Details

T-mixture models use multivariate t-distributions instead of Gaussians, making them more robust to outliers. Each component has its own degrees of freedom parameter, allowing different tail behaviors.

Value

An mm_fit object

Examples

## Not run: 
# Fit TMM with 3 components (robust to outliers)
fit <- mm_tmm_fit(iris[, 1:4], k = 3)
mm_bic(fit)

# Get cluster labels
labels <- mm_predict(fit)
table(labels, iris$Species)

## End(Not run)
## Not run: 
# Fit TMM with 3 components (robust to outliers)
fit <- mm_tmm_fit(iris[, 1:4], k = 3)
mm_bic(fit)

# Get cluster labels
labels <- mm_predict(fit)
table(labels, iris$Species)

## End(Not run)

Print method for mm_fit objects

Description

Print method for mm_fit objects

Usage

## S3 method for class 'mm_fit'
print(x, ...)
## S3 method for class 'mm_fit'
print(x, ...)

Arguments

x

An mm_fit object

...

Additional arguments (ignored)

Package 'mixturemodelsr'

Help Index

Compute AIC for fitted model

Description

Usage

Arguments

Value

Examples

Compute BIC for fitted model

Description

Usage

Arguments

Value

Examples

Fit a Constrained Gaussian Mixture Model

Description

Usage

Arguments

Value

Examples

Fit a Gaussian Mixture Model (GMM)

Description

Usage

Arguments

Value

Examples

Compute log-likelihood for fitted model

Description

Usage

Arguments

Value

Examples

Fit an MCLUST Family Model

Description

Usage

Arguments

Details

Value

Examples

Fit a Mixture of Factor Analyzers (MFA)

Description

Usage

Arguments

Value

Examples

Get parameter values from fit object

Description

Usage

Arguments

Value

Examples

Fit a Parsimonious Gaussian Mixture Model (PGMM)

Description

Usage

Arguments

Details

Value

Examples

Predict cluster labels

Description

Usage

Arguments

Value

Examples

Python Configuration Diagnostics

Description

Usage

Value

Examples

Check if Python module is available

Description

Usage

Value

Setup mixturemodelsr (user-friendly wrapper)

Description

Usage

Arguments

Value

Fit a t-Mixture Model (TMM)

Description