Introduction

FMM is an approach for describing a great variety of rhythmic patterns in oscillatory signals through a single or a sum of waves. The main goal of this package is to implement all required functions to fit and explore FMM models. Specifically, the FMM package allows:

Fit single and multi-wave FMM models, as well as a restricted version for including a priori knowledge about the wave shapes.
Generate synthetic data from FMM models.
Visualize graphically the results of the fitting process.

Examples of real-word biological oscillations are also included to illustrate the potential of this new methodology.

Please visit the FMM Project website for complete and up-to-date information on the progress made on the FMM approach.

Getting starting

FMM is an R package available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=FMM. To install the package directly from CRAN, start R and enter:

install.packages("FMM")

The R source code is also provided via the GitHub repository at https://github.com/alexARC26/FMM. To install this development version enter:

install.packages("devtools")
devtools::install_github("alexARC26/FMM")

Once FMM is installed, it can be loaded by the following command:

library("FMM")

Background

The FMM model

The FMM package implements the homonymous methodology: a novel approach to describe rhythmic patterns in oscillatory signals as an additive nonlinear parametric regression model. The FMM model is capable of fitting a great extent of heterogeneous shapes including non-sinusoidal ones. Bellow the FMM models are briefly described.

The FMM wave

In general, at the time point t, a frequency modulated Möbius (FMM) wave is defined as W(t; A, α, β, ω) = Acos (ϕ(t; α, β, ω)) where A ∈ ℜ⁺ represents the wave amplitude and, $$ \phi\left(t; \alpha, \beta, \omega\right)= \beta + 2\arctan\left(\omega \tan\left(\frac{t - \alpha}{2}\right)\right) $$ the wave phase. α ∈ [0, 2π] is a translation parameter, whereas β ∈ [0, 2π] and ω ∈ [0, 1] describe the wave shape.

Two important features of a wave are the peak and trough, defined as the highest and lowest points above and below the rest position, respectively. In many applications, the peak and trough times could be very useful tools to extract practical information of a wave, since they capture important aspects of the dynamics. These two interesting parameters can be directly derived from the main parameters of an FMM wave as

$$ t^U = \alpha + 2\arctan\left(\frac{1}{\omega}\tan\left(-\frac{\beta}{2}\right)\right) \\ t^L = \alpha + 2\arctan\left(\frac{1}{\omega}\tan\left(\frac{\pi-\beta}{2}\right)\right) $$ where t^U and t^L denote the peak and trough times, respectively.

Monocomponent FMM model

Let X(t_i), t₁ < t₂ < … < t_n be the vector of observations. A monocomponent FMM model is defined as

X(t_i) = M + W(t_i; A, α, β, ω) + e(t_i), i = 1, …, n where M ∈ ℜ is an intercept parameter describing the baseline level of the signal, W(t_i; A, α, β, ω) is an FMM wave, and it is assumed that the errors e(t_i) are independent and normally distributed with zero mean and a common variance σ².

Multicomponent FMM model

Let X(t_i), t₁ < t₂ < … < t_n be the vector of observations. A multicomponent FMM model of order m, denoted by FMM_m, is defined as $$ X(t_i) = M + \sum_{J=1}^{m}W_J(t_i) + e(t_i) , \quad i = 1,\dots,n $$ where M ∈ ℜ is an intercept parameter describing the baseline level of the signal, and W_J(t_i) = W(t_i; A_J, α_J, β_J, ω_J) is the Jth FMM wave.

Real-world example data sets

The FMM approach can be useful to analyze oscillatory signals from different disciplines. In particular, it has already been used successfully for the analysis of several biological oscillatory signals such as the circadian rhythms, the electrocardiogram (ECG) signal, the neuronal action potential (AP) curves and the blood pressure signal.

The FMM package includes four real-world example datasets, in RData format, which illustrate the use of this package on the analysis of real signals from these areas. Bellow is a brief description of each of them.

mouseGeneExp. The mouseGeneExp data set contains expression data of the Iqgap2 gene from mouse liver. Gene expression values are collected along two periods of 24 hours. Samples are pooled and analyzed using Affymetrix arrays. The complete database is freely available at NCBI GEO, with GEO accession number GSE11923.
ecgData. The ecgData data set contains the voltage of the heart’s electric activity, measured in mV, from the patient sel100 of QT database (Laguna et al. (1997)). The data illustrate the typical ECG signal heartbeat from a healthy subject. Specifically, the ECG signal contained in this dataset corresponds to lead II in the fifth of the thirty annotated heartbeats. Recordings are publicly available on Physionet website.
neuronalSpike. The neuronalSpike data set contains the voltage data (in mV) of a neuronal AP curve, the oscillatory signal that measures the electrical potential difference between inside and outside the cell due to an external stimulus and serves as basic information unit between neurons. The data have been simulated with the Hodgkin-Huxley model (Hodgkin and Huxley (1952)) using a modified version of the python package NeuroDynex available at Gerstner et al. (2014).
neuronalAPTrain. The neuronalAPTrain data set contains data of a spike train composed of three similar shaped AP curves. The neuronal APs have been simulated with the Hodgkin-Huxley model (Hodgkin and Huxley (1952)).

Background references

For a detailed background on methodology, computational algorithms and diverse applications, see the following references.

Reference	Description
Rueda, Larriba, and Peddada (2019)	The single-component FMM model.
Rueda, Rodrı́guez-Collado, and Larriba (2021)	The multi-component FMM model.
Rueda, Larriba, and Lamela (2021)	The FMM approach for describing ECG signals.
Rueda et al. (2021)	A detailed review of FMM approach to analyze biomedical signals.
Rodrı́guez-Collado and Rueda (2021a)	Hodgkin-Huxley model representation using a particular restricted FMM model.
Rodrı́guez-Collado and Rueda (2021b)	The potential of FMM features to classify neurons.
Larriba and Rueda (2021)	The potential of FMM to solve problems in chronobiology.
Fernández et al. (2021)	The R package that allows implementing the model.

FMM usage

The remainder of this vignette will focus on usage of FMM functions. For each of the sections below, refer to the FMM R package manual for specific technical details on usage, arguments and methods or use ? to access individual manual pages.

FMM object

FMM is the main class in the FMM package. An object of class FMM contains the slots summarized in the following table.

Slot	Description
timePoints	Time points for each data point over one single observed period.
data	Data to be fitted to an FMM model. Data could be collected over multiple periods.
summarizedData	The summarized data at each time point across all considered periods.
nPeriods	The number of periods in data.
fittedValues	The fitted values by the FMM model.
M	Value of the estimated intercept parameter M.
A	Vector of the estimated FMM wave amplitude parameter(s) A.
alpha	Vector of the estimated FMM wave phase translation parameter(s) α.
beta	Vector of the estimated FMM wave skewness parameter(s) β.
omega	Vector of the estimated FMM wave kurtosis parameter(s) ω.
SSE	Value of the residual sum of squares values.
R2	Vector specifying the explained variance by each of the fitted FMM components.
nIter	Number of iterations of the backfitting algorithm.

The standard methods implemented for displaying relevant information of an object of the class FMM include the functions:

coef() to display the estimates of each FMM wave parameters.
fitted() to display the fitted values.
resid() to display the residuals of the model.
summary() to display the FMM wave parameter estimates, as well as the peak and trough times, together with the signal values at those times, a brief description of the residuals, and the proportion of variance explained by each FMM component and by the global model. The summary() output can be assigned to an object to get a list of all the displayed results.
getters for each of the FMM object slots such as getA(), getAlpha(), etc.

Simulating FMM data

The function generateFMM() can be used to simulate data from an FMM model. The main arguments of this function are the FMM model parameters: M, A, alpha, beta and omega. For generating data from a monocomponent FMM model enter:

generateFMM(M=2,A=3,alpha=1.5,beta=2.3,omega=0.1,
            plot=TRUE,outvalues=FALSE)

For an FMM model with m components, all these arguments are numeric vectors of length m, except M, which has length 1. For example, you can generate data from an FMM₂ model with the code:

fmm2.data <-generateFMM(M=0,A=c(2,2),alpha=c(1.5,3.4),beta=c(0.2,2.3),omega=c(0.1,0.2),
                         plot=FALSE, outvalues=TRUE)
str(fmm2.data)
#> List of 3
#>  $ input:List of 5
#>   ..$ M    : num 0
#>   ..$ A    : num [1:2] 2 2
#>   ..$ alpha: num [1:2] 1.5 3.4
#>   ..$ beta : num [1:2] 0.2 2.3
#>   ..$ omega: num [1:2] 0.1 0.2
#>  $ t    : num [1:200] 0 0.0316 0.0631 0.0947 0.1263 ...
#>  $ y    : num [1:200] 1.18 1.4 1.64 1.9 2.18 ...

A scatter plot of simulated data against time points can be drawn by setting plot = TRUE. When outvalues = TRUE, a list with input parameters, time points and simulated data are returned.

By default, the data will be simulated at a sequence of 100 equally spaced time points from 0 to 2π. The time point sequence can be modified by from, to and length.out arguments. It can be also manually set using the argument timePoints, in which case from, to and length.out will be ignored.

A Gaussian noise can be added by sigmaNoise argument, whose value sets the corresponding standard deviation of the normally distributed noise. Here, a Gaussian noise with σ = 0.3 is added to the fmm2.data data simulated above:

set.seed(15)
fmm2.data <-generateFMM(M=0,A=c(2,2),alpha=c(1.5,3.4),beta=c(0.2,2.3),omega=c(0.1,0.2),
                        plot=TRUE, outvalues=TRUE,
                        sigmaNoise=0.3)

Fitting FMM models

An FMM model can be fitted using the function fitFMM(). As result an object of the FMM class is obtained.

Basic fitting

The fitFMM() function requires the vData argument with the data to be fitted. In addition, to control a basic fitting, two other arguments can be used: timePoints for the specific time points of a single period, and nback with the number of FMM components to be fitted. For the fmm2.data simulated above, a bicomponent FMM model can be fitted by the code:

fit.fmm2 <- fitFMM(vData = fmm2.data$y, timePoints = fmm2.data$t, nback = 2)
#> |--------------------------------------------------|
#> |==================================================|
#>  Stopped by reaching maximum iterations (2 iteration(s))
summary(fit.fmm2)
#> 
#> Title:
#> FMM model with 2 components
#> 
#> Coefficients:
#> M (Intercept): -0.1793
#>                   A  alpha   beta  omega
#> FMM wave 1:  1.9045 3.3988 2.3253 0.2715
#> FMM wave 2:  1.9835 1.4846 0.1318 0.0909
#> 
#> Peak and trough times and signals:
#>              t.Upper Z.Upper t.Lower Z.Lower
#> FMM wave 1:   0.4910  3.7076  5.4190 -0.1864
#> FMM wave 2:   0.2283  2.9537  4.6142 -3.8835
#> 
#> Residuals:
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#> -0.94229 -0.21722  0.03078  0.00000  0.21033  0.74636 
#> 
#> R-squared:
#> Wave 1 Wave 2  Total 
#> 0.7515 0.2043 0.9558

The summary() method allows the return of fitFMM() to be presented in tabular form, where each row corresponds to a component and each column to an FMM wave parameter. From the above summary results, we can see that the variance explained by the fitted model is 95.58% and that the FMM waves are labelled in decreasing order according to the R² value: the explained variance is 75.15% and 20.43% by FMM “Wave 1” and “Wave 2”, respectively.

The location of the peak and trough of each FMM wave, as well as the value of the signal at these time points, can be also estimated by the getFMMPeaks() function. When timePointsIn2pi=TRUE the peak and trough locations to be returned into the interval from 0 to 2π.

getFMMPeaks(fit.fmm2, timePointsIn2pi = TRUE)
#> $tpeakU
#> [1] 0.4909644 0.2282809
#> 
#> $tpeakL
#> [1] 5.419044 4.614189
#> 
#> $ZU
#> [1] 3.707557 2.953744
#> 
#> $ZL
#> [1] -0.1864053 -3.8834602

FMM wave estimation

To solve the estimation problem of a FMM wave a two-way grid search over the choice of the (α, ω) parameters is performed. Then, for each pair of (α, ω) fixed values, the estimates for M, A and β are obtained by solving a least square problem.

The lengthAlphaGrid and lengthOmegaGrid arguments are used to set the grid resolution by specifying the number of equally spaced α and ω values. When both arguments are large, the computational demand can be high. The algorithm will be computationally more efficient by reducing the size of the sequences of the α and ω parameters and repeating the fitting process a numReps of times, in such a way that, at each repetition, a new two-dimensional grid of (α, ω) points is created around the previous estimates.

The example code below shows two different configurations of the arguments lengthAlphaGrid, lengthOmegaGrid and numReps to estimated the previously simulated fmm2.data.

fit1 <- fitFMM(vData = fmm2.data$y, timePoints = fmm2.data$t, nback = 2, 
               lengthAlphaGrid = 48, lengthOmegaGrid = 24, numReps = 3,
               showTime = TRUE)
#> |--------------------------------------------------|
#> |==================================================|
#>  Stopped by reaching maximum iterations (2 iteration(s)) 
#> Time difference of 1.064178 secs
fit2 <- fitFMM(vData = fmm2.data$y, timePoints = fmm2.data$t, nback = 2, 
               lengthAlphaGrid = 14, lengthOmegaGrid = 7, numReps = 6,
               showTime = TRUE)
#> |--------------------------------------------------|
#> |==================================================|
#>  Stopped by reaching maximum iterations (2 iteration(s)) 
#> Time difference of 0.3097422 secs
getR2(fit1)
#> [1] 0.7515071 0.2043392
getR2(fit2)
#> [1] 0.7515011 0.2043493

By combining the value of these arguments of fitFMM() a balance between computational cost and the accuracy of estimates has been achieved. The execution time has been reduced considerably by setting shorter sequences for both lengthAlphaGrid and lengthOmegaGrid arguments, and increasing the number of repetitions from 3 to 6. Note that both configurations achieve the same accuracy.

FMM_m model estimation

A backfitting algorithm is used for the estimation of the multicomponent models.

The argument maxiter sets the maximum number of backfitting iterations. By default, maxiter iterations will be forced, but we can use the argument stopFunction to modify the stopping criteria.

Two criteria have been implemented as stop functions in this package. When stopFunction = alwaysFalse, maxiter iterations will be forced. If stopFunction = R2(), the algorithm will be stopped when the difference between the explained variability in two consecutive iterations is less than a value pre-specified in the difMax argument of R2() function. In the example above, we can use the argument stopFunction = R2(difMax = 0.01) to continue the search until there is an improvement, in terms of explained variability, of less than 1%.

fit3 <- fitFMM(vData = fmm2.data$y, timePoints = fmm2.data$t, nback = 2, 
               maxiter = 5, stopFunction = R2(difMax = 0.01),
               showTime = TRUE, showProgress = TRUE)
#> |--------------------------------------------------|
#> |==================================================|
#>  Stopped by the stopFunction (3 iteration(s)) 
#> Time difference of 1.630063 secs

A restricted FMM_m model

In some applications, it is not uncommon signals with repetitive shape-similar waves. The fitFMM() function allows fitting a restricted version of multicomponent FMM models that incorporate equality constraints on the β and ω parameters in order to obtain more efficient estimators. In particular, d blocks of restrictions can be added:

$$ \begin{array}{cc} \beta_1 = \dots = \beta_{m_1} & \omega_1 = \dots = \omega_{m_1} \\ \beta_{m_1+1} = \dots = \beta_{m_2} & \omega_{m_1+1} = \dots = \omega_{m_2} \\ \dots & \dots \\ \beta_{m_{d-1}+1} = \dots = \beta_{m_d} & \omega_{m_{d-1}+1} = \dots = \omega_{m_d} \end{array} $$

The argument betaOmegaRestrictions sets the equality constraints for the β and ω parameters. To add restrictions, integer vectors of length m can be passed to this argument, so that positions with the same numeric value correspond to FMM waves whose parameters, β and ω, are forced to be equal.

For example, the following code generates a set of 100 observations from an FMM model of order m = 4 with

intercept parameter M = 3,
amplitude parameters: A₁ = 4, A₂ = 3, A₃ = 1.5 and A₄ = 1,
phase translation parameters: α₁ = 3.8, α₂ = 1.2, α₃ = 4.5 and α₄ = 2, and
with regard to the shape parameters, pairs of waves are equal, satisfying: $$ \begin{array}{cc} \beta_1 = \beta_2 = 3 & \omega_1 = \omega_2 = 0.1 \\ \beta_3 = \beta_4 = 1 & \omega_3 = \omega_4 = 0.05 \end{array} $$

set.seed(1115)
rfmm.data <-generateFMM(M = 3, A = c(4,3,1.5,1), 
                        alpha = c(3.8,1.2,4.5,2),
                        beta = c(rep(3,2),rep(1,2)),
                        omega = c(rep(0.1,2),rep(0.05,2)),
                        plot = TRUE, outvalues = TRUE,
                        sigmaNoise = 0.3)

To impose the shape restriction on the fitting process, we use the argument betaOmegaRestrictions = c(1,1,2,2).

fit.rfmm <- fitFMM(vData = rfmm.data$y, timePoints = rfmm.data$t, nback = 4,
                   betaOmegaRestrictions = c(1, 1, 2, 2),
                   lengthAlphaGrid = 24, lengthOmegaGrid = 15, numReps = 5)
#> |--------------------------------------------------|
#> |==================================================|
#>  Stopped by reaching maximum R2 (4 iteration(s))
summary(fit.rfmm)
#> 
#> Title:
#> FMM model with 4 components
#> 
#> Coefficients:
#> M (Intercept): 3.4987
#>                   A  alpha   beta  omega
#> FMM wave 1:  4.2503 3.8062 3.0417 0.0805
#> FMM wave 2:  3.1200 1.1994 3.0417 0.0805
#> FMM wave 3:  1.4128 4.5100 0.9774 0.0371
#> FMM wave 4:  1.0554 2.0094 0.9774 0.0371
#> 
#> Peak and trough times and signals:
#>              t.Upper Z.Upper t.Lower Z.Lower
#> FMM wave 1:   0.6726  5.8381  4.9182 -2.5240
#> FMM wave 2:   4.3491  3.6011  2.3115 -2.2094
#> FMM wave 3:   1.5077 -1.4109  1.3290 -4.0140
#> FMM wave 4:   5.2902 -1.7981  5.1116 -3.7935
#> 
#> Residuals:
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#> -0.99881 -0.26135 -0.03754  0.00000  0.26322  1.17515 
#> 
#> R-squared:
#> Wave 1 Wave 2 Wave 3 Wave 4  Total 
#> 0.5378 0.3623 0.0365 0.0218 0.9584

Additional arguments of `fitFMM()` function

nPeriods. For some applications, data are collected over multiple periods. This information is received by the fitFMM() function through the input argument nPeriods. When nPeriods>1, the FMM fitting is carried out by averaging the data collected at each time point across all considered periods.
parallelize. When parallelize = TRUE a parallel processing implementation is used for better performance.
restrExactSolution. When the argument restrExactSolution = FALSE the ω parameters of the restricted version will be estimated by a two-nested backfitting algorithm. Otherwise, the optimal solution for the restricted fitting, computationally more intensive, will be obtain.
showProgress. When showProgress = TRUE a progress indicator and information about the stopping criterion of the backfitting algorithm is displayed on the console.
showTime. When showTime = TRUE the execution time is displayed on the console.

Plotting FMM models

The FMM package includes the function plotFMM() to visualize an object of class FMM. An FMM object can be plotted in two ways:

The default plot that represents as points the original data and as a line the fitted signal.
A component plot that separately represents each centered FMM wave. This plot is displayed when the boolean argument components = TRUE. In this type of representation, when legendInComponentsPlot = TRUE, a legend appears to identify the represented waves.

The following code can be used to display graphically the fit.fmm2 object created in the previous section:

titleText <- "Two FMM waves"
par(mfrow=c(1,2))
# default plot
plotFMM(fit.fmm2, textExtra = titleText)
# component plot
plotFMM(fit.fmm2, components = TRUE, textExtra = titleText, legendInComponentsPlot = TRUE)

The argument textExtra has been used to add an extra text to the title of both graphical representations.

When the argument use_ggplot2 = TRUE, a more aesthetic and customizable plot is created using the ggplot2 package instead of base R graphics. For example, for creating ggplots for the object fit.rfmm which contains the restricted FMM model of order m = 4 previously fitted, we can enter:

library("RColorBrewer")
library("ggplot2")
library("gridExtra")

titleText <- "Four restricted FMM waves"
# default plot
defaultrFMM2 <- plotFMM(fit.rfmm, use_ggplot2 = TRUE, textExtra = titleText)
defaultrFMM2 <- defaultrFMM2 + theme(plot.margin=unit(c(1,0.25,1.3,1), "cm")) +
  ylim(-5, 6)
# component plot
comprFMM2 <- plotFMM(fit.rfmm, components=TRUE, use_ggplot2 = TRUE, textExtra = titleText)
comprFMM2 <- comprFMM2 + theme(plot.margin=unit(c(1,0.25,0,1), "cm")) +
  ylim(-5, 6) + scale_color_manual(values = brewer.pal("Set1",n = 8)[3:6])

grid.arrange(defaultrFMM2, comprFMM2, nrow = 1)

References

Fernández, Itziar, Alejandro Rodrı́guez-Collado, Yolanda Larriba, Adrián Lamela, Christian Canedo, and Cristina Rueda. 2021. “FMM: An R Package for Modeling Rhythmic Patterns in Oscillatory Systems.” arXiv Preprint arXiv:2105.10168.

Gerstner, Wulfram, WM Kistler, R Naud, and L Paninski. 2014. Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition. Cambridge University Press. https://doi.org/10.1017/CBO9781107447615.

Hodgkin, Alan L, and Andrew F Huxley. 1952. “A Quantitative Description of Membrane Current and Its Application to Conduction and Excitation in Nerve.” The Journal of Physiology 117 (4): 500–544. https://doi.org/10.1113/jphysiol.1952.sp004764.

Laguna, Pablo, Roger G Mark, A Goldberg, and George B Moody. 1997. “A Database for Evaluation of Algorithms for Measurement of QT and Other Waveform Intervals in the ECG.” In Computers in Cardiology 1997, 673–76. IEEE. https://doi.org/10.1109/CIC.1997.648140.

Larriba, Yolanda, and Cristina Rueda. 2021. “A Circadian Gene Expression Atlas in Humans.” Preprint.

Rodrı́guez-Collado, Alejandro, and Cristina Rueda. 2021a. “A Simple Parametric Representation of the Hodgkin-Huxley Model.” PLoS ONE 16 (7): e0254152. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0254152.

———. 2021b. “Electrophysiological and Transcriptomic Features Reveal a Circular Taxonomy of Cortical Neurons.” Frontiers in Human Neuroscience 15: 410. https://doi.org/10.3389/fnhum.2021.684950.

Rueda, Cristina, Itziar Fernández, Yolanda Larriba, and Alejandro Rodrı́guez-Collado. 2021. “The FMM Approach to Analyze Biomedical Signals: Theory, Software, Applications and Future.” Mathematics 9 (10): 1145. https://doi.org/10.3390/math9101145.

Rueda, Cristina, Yolanda Larriba, and Adrian Lamela. 2021. “The Hidden Wave in the ECG Uncovered Revealing a Sound Automated Interpretation Method.” Scientific Reports 11: 3724. https://doi.org/10.1038/s41598-021-82520-w.

Rueda, Cristina, Yolanda Larriba, and Shyamal D Peddada. 2019. “Frequency Modulated Möbius Model Accurately Predicts Rhythmic Signals in Biological and Physical Sciences.” Scientific Reports 9 (1): 18701. https://doi.org/10.1038/s41598-019-54569-1.

Rueda, Cristina, Alejandro Rodrı́guez-Collado, and Yolanda Larriba. 2021. “A Novel Wave Decomposition for Oscillatory Signals.” IEEE Transactions on Signal Processing 69: 960–72. https://ieeexplore.ieee.org/document/9321700.

FMM Vignette