Chapter 2. AirXCell intents, features and limitations

2.1. AirXCell features

AirXCell intends to provide the user with a GUI (Graphical User Interface) for the most common tasks one usually performs with a spreadsheet application or a mathematical computation environment. It attempts to get the best of both worlds within a single, coherent and unified application.

AirXCell is built on top of R (The R Project for Statistical Computing). R is a free and open source advanced mathematical computing environment. Initially, R was oriented towards statistical computing. Nowadays R is a fairly complete mathematical computing environment that can be used for a very broad range of mathematical applications. R is also robust and scalable, making it the first choice for the AirXCell calculation engine.

AirXCell is not only a spreadsheet or an R GUI: it provides the user with an R Console (4.2.4), an R Code Editor (6), a Data Frame Viewer and Editor (8), a Chart Wizard (7) and various Dynamic Forms (2.1.5) that the site administrator can easily create to build any kind of User Interface on top of the R calculation engine.

2.1.1. What are modules ?

In fact, almost everything in AirXCell, except the global UI (User Interface) canvas with the Console, the Menu Bar, etc., is a module (2.1.1).

AirXCell is not a spreadsheet application or an R Code Editor per se. Rather, AirXCell is a framework for building online Web applications using the R calculation engine. Various modules are written on top of this framework to provide the user with utilities of all kinds. The framework behind AirXCell provides the module execution platform, the GUI (Graphical User Interface) components and the abstractions that help modules use the R calculation engine, as well as any specific behaviours they may require.

As of version 0.5.10-SNAPSHOT, there are four modules implemented :

In future versions, there will likely be several additional modules.

Each module effectively implements a complete new application on top of the AirXCell framework. For instance, the Code Editor (6) module has little to do with the Calculation Sheet (5) module, even though a script executed from the Code Editor has full access to the underlying R matrix representing a Calculation Sheet.

This separation of concerns is not enforced though. For instance the Chart Wizard (7) is tightly coupled to the Calculation Sheet (5) since its only goal is to provide the user with assistance in creating charts and graphs from values in a Calculation Sheet.
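To illustrate the kind of script the Code Editor could run against the R matrix behind a Calculation Sheet, here is a minimal sketch. The variable name sheet and its contents are hypothetical stand-ins: how AirXCell actually names and exposes the matrix of a given sheet is not described in this section.

```r
# Hypothetical stand-in for the R matrix representing a Calculation Sheet.
# matrix() fills column by column: quantity = 10, 20, 30; price = 1.5, 2.5, 3.5.
sheet <- matrix(c(10, 20, 30,
                  1.5, 2.5, 3.5),
                nrow = 3, ncol = 2,
                dimnames = list(NULL, c("quantity", "price")))

# Compute a new column from two existing columns.
totals <- sheet[, "quantity"] * sheet[, "price"]

# Append the computed column to the matrix.
sheet <- cbind(sheet, total = totals)

# Summarise every column of the sheet.
print(colSums(sheet))
```

Only base R matrix operations are used here, so the same statements apply to any numeric matrix, whatever its origin.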

In addition to modules, AirXCell provides the concept of dynamic forms (2.1.5). Dynamic forms are much more limited than modules in terms of User Interface and possibilities, since they are edited from within the application. However, they can be developed completely dynamically, i.e. without any change to the AirXCell codebase, which is a strong advantage: a change can be implemented and made available to the user within seconds. The runtime support for dynamic forms is itself implemented by a specific module.

2.1.1.1. Module instances

A module is an application, almost a complete piece of software in its own right, not a particular use of that application. As an example, whenever the user creates a new Calculation Sheet (5) in her Workspace (2.1.4), we say that she has instantiated the Calculation Sheet module.

Modules instantiated in a user workspace, much like new sheets added in a conventional spreadsheet application, are called module instances. A user can create or open as many module instances in her workspace as required.

2.1.2. Keyboard shortcuts support

AirXCell supports a wide range of keyboard shortcuts. Every feature available from the menu or a toolbar displays the corresponding keyboard shortcut as a hint whenever the user hovers the mouse over it for a moment.

In addition to these features, most common shortcuts such as Copy / Paste with Ctrl + C / Ctrl + V or switching the currently displayed tab using Ctrl + Shift + PG_UP / PG_DOWN are supported.

2.1.3. Chart Interception

Graphical facilities in R :

From http://cran.r-project.org/doc/manuals/R-intro.html#Graphics :

Graphical facilities form an important and extremely versatile set of components of the R environment. With R, it is possible to create almost every possible kind of graphic or even to build entirely new types of graphic.

The graphical facilities can be used in both interactive and batch modes, but in most cases, interactive use is more productive. Interactive use is also easy because at startup time R initiates a graphics device driver which opens a special graphics window for the display of interactive graphics.

Charting commands are divided into three basic groups:

  • High-level charting functions create a new chart on the graphics device, possibly with axes, labels, titles and so on.
  • Low-level charting functions add more information to an existing chart, such as extra points, lines and labels.
  • Interactive graphics functions allow you to interactively add information to, or extract information from, an existing chart, using a pointing device such as a mouse.

In addition, R maintains a list of graphical parameters which can be manipulated to customize your charts.

Graphical facilities in AirXCell :

AirXCell fully supports all high-level and low-level charting functions. It provides the user with a virtual graphic device acting as an interceptor on charts generated by the underlying R environment. The intercepted graphic is sent to the user's browser and displayed as an image inside the AirXCell application. This feature is called Chart Interception.

Interactive graphics functions are not supported however. See AirXCell limitations - interactive charts. (2.3)

Chart interception occurs transparently, without any specific action from the user. Whenever a chart is generated within the R environment, it is intercepted and displayed to the user. The source of the chart does not matter: it can be the R Console (4.2.4), a Chart Module (7) instance or a Dynamic Form (2.1.5) instance.
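As an illustration of the high-level and low-level functions mentioned above, the following sketch builds one chart using only standard R graphics functions. The data is made up for the example. Under chart interception, one would expect the resulting chart to be displayed as an image without any further action, although the exact rendering in AirXCell is not shown here.

```r
# Made-up sample data: one period of a sine wave and a noisy copy of it.
set.seed(42)
x <- seq(0, 2 * pi, length.out = 50)
y <- sin(x)
noisy <- y + rnorm(length(x), sd = 0.2)

# High-level function: creates a new chart with axes, labels and a title.
plot(x, noisy, main = "Noisy sine wave", xlab = "x", ylab = "value")

# Low-level functions: add information to the chart created above.
lines(x, y, col = "blue", lwd = 2)   # the exact sine curve
abline(h = 0, lty = 2)               # dashed horizontal reference line
legend("topright", legend = c("noisy", "sin(x)"),
       col = c("black", "blue"), pch = c(1, NA), lty = c(NA, 1))
```

Note that the script never selects or changes a graphics device; it only draws on whichever device the environment provides.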

2.1.4. Workspace

Just as in the R environment, a unit of work in AirXCell, i.e. the collection of all module instances (2.1.1.1) belonging to the same task or opened in the AirXCell application at the same time, is called a Workspace. A Workspace in AirXCell plays the same role as a Workbook in other popular applications.

The term Workspace comes from the R environment, where it designates an R session. Workspaces in AirXCell can be stored on the user's local computer or online on the remote R server (4.4.1).

2.1.5. Dynamic Forms

AirXCell Modules (2.1.1) are interesting since they enable the AirXCell developers to implement various kinds of applications on top of the AirXCell framework. Yet every new module, and even every single change applied to a module, requires the AirXCell administrator to deploy a brand new version of AirXCell; hence the need for something else: the Dynamic Forms.

The Dynamic Form Subsystem is a means offered to administrators, maintainers or webmasters of an AirXCell deployment to create brand new custom applications using the R calculation engine, the AirXCell server backend and the AirXCell GUI. The idea is to implement new HTML forms and R programs within AirXCell without changing anything in the AirXCell codebase and without redeploying or reinstalling the software.

Dynamic forms consist of an HTML form, the one shown to the user, and an R script that is tightly bound to this form. The AirXCell Dynamic Form Subsystem intercepts the values set by the user and the actions performed on the HTML form and injects them into the R script before it is executed within the R environment.

This way, it becomes easy and almost straightforward to build any kind of Web Graphical User Interface on top of any R program. A dynamic form, just as a real application, needs to be executed by the user. The execution is triggered either by pushing a Submit button placed on the form itself, or using the Execute button on the dynamic form module toolbar.
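The idea of injecting form values into an R script before executing it can be sketched in plain R as follows. This is only an illustration under stated assumptions: the field names (n, mu, sigma), the script text and the use of an environment for the injection are all hypothetical and do not describe how the AirXCell Dynamic Form Subsystem is actually implemented.

```r
# Hypothetical values a user might have entered in the HTML form.
form_values <- list(n = 100, mu = 5, sigma = 2)

# Hypothetical R script bound to the form; it refers to the form fields by name.
script <- "
set.seed(1)
sample_data <- rnorm(n, mean = mu, sd = sigma)
result <- c(mean = mean(sample_data), sd = sd(sample_data))
"

# Inject the form values as variables of a fresh environment,
# then run the script inside that environment.
env <- list2env(form_values, parent = globalenv())
eval(parse(text = script), envir = env)

# The script's results live in the same environment.
print(env$result)
```

Running the script in its own environment keeps the injected values and the results separate from the rest of the session, which is one possible design among others.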

A few dynamic forms are already implemented in the current version of AirXCell:

The features available from the toolbar are as follows:

  • Show Dynamic Formular : show the dynamic form.

  • Show Results : show the results returned by the server upon form execution.

  • Run : run the form and get the results.

2.2. AirXCell is built on top of R

AirXCell is built on top of R. Each session a user opens on AirXCell is bound to a virtual R session. That R session has a few limitations, though. For instance, it is not possible for the end user to install any new package.

The set of R packages available to end users is listed below. Note that these packages need to be loaded by the user before their functions become available in the workspace. A package is loaded within R using the plain old R command : require('package_name').

Table 2.1.  List of available R packages within AirXCell

  • abind : multi-dimensional array combination function
  • Amelia : package supporting multiple imputation of missing data
  • AMORE : flexible neural network package
  • backtest : exploring portfolio-based conjectures about financial instruments
  • bayesm : package for Bayesian inference
  • bitops : package implementing bitwise operations
  • BLCOP : Black-Litterman and copula-opinion pooling frameworks
  • boot : package for bootstrapping functions from Davison and Hinkley
  • BsMD : Bayes screening and model discrimination
  • car : companion to Applied Regression by John Fox
  • caTools : package providing various utility functions
  • chron : package for chronologically ordered objects
  • class : package for classification
  • cluster : package for cluster analysis by Rousseeuw et al
  • coda : output analysis and diagnostics for MCMC simulations in R
  • codetools : package providing code analysis tools
  • colorspace : color space manipulation
  • combinat : package with utilities for combinatorics
  • datasets : variety of datasets
  • date : package for date handling
  • Design : regression modeling strategies tools by Frank Harrell
  • DiagnosisMed : medical diagnostic test accuracy analysis toolkit
  • DistributionUtils : Distribution Utilities
  • DoE.base : Full factorials, orthogonal arrays and base utilities for DoE packages
  • dynlm : Dynamic Linear Regression
  • eco : routines for Bayesian ecological inference
  • effects : graphical and tabular effects display for glm models
  • Epi : epidemiological analysis
  • epibasix : elementary epidemiological functions
  • epicalc : epidemiological calculator
  • epiR : functions for analysing epidemiological data
  • epitools : epidemiology tools for data and graphics
  • eRm : package for extended Rasch modelling
  • evd : functions for extreme value distributions
  • fArma : package for financial engineering -- fArma
  • fAsianOptions : package for financial engineering -- fAsianOptions
  • fAssets : package for financial engineering -- fAssets
  • fBasics : package for financial engineering -- fBasics
  • fBonds : package for financial engineering -- fBonds
  • fCopulae : package for financial engineering -- fCopulae
  • fEcofin : package for financial engineering -- fEcofin
  • fExoticOptions : package for financial engineering -- fExoticOptions
  • fExtremes : package for financial engineering -- fExtremes
  • fGarch : package for financial engineering -- fGarch
  • fImport : package for financial engineering -- fImport
  • FKF : Fast Kalman Filter
  • fMultivar : package for financial engineering -- fMultivar
  • fNonlinear : package for financial engineering -- fNonlinear
  • fOptions : package for financial engineering -- fOptions
  • foreach : foreach looping support
  • foreign : package to read/write data from other stat. systems
  • fPortfolio : package for financial engineering -- fPortfolio
  • fRegression : package for financial engineering -- fRegression
  • FrF2 : Fractional Factorial designs with 2-level factors
  • fTrading : package for financial engineering -- fTrading
  • fUnitRoots : package for financial engineering -- fUnitRoots
  • fUtilities : package for financial engineering -- fUtilities
  • gam : Generalized Additive Models
  • g.data : package for delayed-data
  • gdata : package with data manipulation tools by Greg Warnes et al
  • GenABEL : package for genome-wide SNP association analysis
  • genetics : package for population genetics
  • gmaps : support for producing geographic maps with grid graphics
  • gmodels : package with tools for model fitting by Greg Warnes et al
  • gplots : package with tools for plotting data by Greg Warnes et al
  • gregmisc : package with miscellaneous functions by Greg Warnes et al
  • gss : package for structural multivariate function estimation using smoothing splines.
  • gtools : package with R programming tools by Greg Warnes et al
  • haplo.stats : package for haplotype analysis
  • hdf5 : package interfacing the NCSA HDF5 library
  • Hmisc : miscellaneous functions by Frank Harrell
  • HyperbolicDist : package providing functions for the hyperbolic and related distributions.
  • iterators : iterator support for vectors, lists and other containers
  • its : package for handling irregular time series
  • KernSmooth : package for kernel smoothing and density estimation
  • lattice : package for 'Trellis' graphics
  • latticeExtra : package of additional graphical displays based on lattice
  • lme4 : package for linear mixed effects model fitting
  • lmtest : package for diagnostic checking in linear models
  • lpSolve : package providing linear program solvers
  • mapdata : support for producing geographic maps (supplemental data)
  • mapproj : support for cartographic projections of map data
  • maps : support for producing geographic maps
  • maptools : tools for reading and handling spatial objects
  • MASS : functions and datasets to support Venables and Ripley, 'Modern Applied Statistics with S'
  • MatchIt : package of nonparametric matching methods
  • Matrix : package of classes for dense and sparse matrices
  • MCMCpack : routines for Markov chain Monte Carlo model estimation
  • medAdherence : medication adherence: commonly used definitions
  • mgcv : package for multiple parameter smoothing estimation
  • misc3d : collection of 3d chart functions and rgl-based isosurfaces
  • mnormt : package providing multivariate normal and t distribution
  • MNP : package for fitting multinomial probit (MNP) models
  • msm : multi-state Markov and hidden Markov models in continuous time
  • multcomp : package for multiple comparison procedures
  • mvtnorm : package to compute multivariate Normal and T distributions
  • nlme : package for (non-)linear mixed effects models
  • nnet : package for feed-forward neural networks
  • numDeriv : methods for calculating accurate numerical first and second order derivatives
  • nws : package for distributed programming via NetWorkSpaces
  • PerformanceAnalytics : econometric tools for performance and risk analysis
  • plotrix : package providing various plotting functions
  • plyr : tools for splitting, applying and combining data
  • polspline : package providing polynomial spline fitting
  • portfolio : package for analysing equity portfolios
  • portfolioSim : framework for simulating equity portfolio strategies
  • pscl : package for discrete data models
  • psy : procedures for psychometrics
  • qtl : package for genetic marker linkage analysis
  • quadprog : package for solving quadratic programming problems
  • quantmod : quantitative financial modelling framework
  • qvalue : package for Q-value estimation for FDR control
  • randomForest : package implementing the random forest classifier
  • randtoolbox : toolbox for pseudo and quasi random number generation and RNG tests
  • RaschSampler : package for sampling binary matrices with fixed margins
  • RColorBrewer : package providing suitable color palettes
  • relimp : package for inference on relative importance of regressors
  • reshape2 : lets you flexibly restructure and aggregate data using just two functions: melt and cast.
  • rggobi : package for the GGobi data visualization system
  • Rglpk : interface to the GNU Linear Programming Kit
  • rjags : Bayesian graphical models using MCMC
  • R.methodsS3 : utility function for defining S3 methods
  • rngWELL : toolbox for WELL random number generators
  • robustbase : package providing basic robust statistics
  • ROCR : package to prepare and display ROC curves
  • R.oo : R object-oriented programming with or without references
  • rpart : package for recursive partitioning and regression trees
  • RQuantLib : package interfacing the QuantLib finance library
  • R.utils : provides utility methods useful when programming and developing R packages.
  • sandwich : package for model-robust standard error estimates
  • scatterplot3d : package for Visualizing Multivariate Data
  • schwartz97 : package on the Schwartz two-factor commodity model
  • sfsmisc : utilities from Seminar fuer Statistik ETH Zurich
  • SkewHyperbolic : the Skew Hyperbolic Student t-Distribution
  • slam : sparse lightweight arrays and matrices package
  • sn : package providing skew-normal and skew-t distribution
  • snow : package for simple network of workstations
  • sp : classes and methods for spatial data
  • spatial : package for spatial statistics
  • spatstat : spatial Point Pattern analysis, model-fitting, simulation, tests
  • spc : statistical process control
  • stabledist : stable Distribution Functions
  • stats : contains functions for statistical calculations and random number generation.
  • stats4 : contains functions and classes for statistics using the S version 4 class system.
  • stockPortfolio : build stock models and analyze stock portfolios
  • stringr : make it easier to work with strings
  • strucchange : package for structural change regression estimation
  • surveillance : temporal and Spatio-Temporal Modeling and Monitoring of Epidemic Phenomena
  • survival : package for survival analysis
  • taRifx : collection of utility and convenience functions
  • timeDate : package for financial engineering -- timeDate
  • timeSeries : package for financial engineering -- timeSeries
  • tseries : package for time-series analysis and comp. finance
  • urca : package providing unit root and cointegration tests
  • VarianceGamma : the Variance Gamma Distribution
  • vcd : visualizing Categorical Data
  • VGAM : package for estimating vector generalized additive models
  • XML : package for XML parsing and generation
  • xts : eXtensible Time Series
  • Zelig : package providing a unified front-end for estimating statistical models
  • zoo : package for totally ordered indexed observations
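Loading one of the packages listed above can be sketched as follows. The example uses stats4, which appears in the list and ships with R itself. The check on the return value is a suggestion rather than a requirement of AirXCell: require() returns FALSE with a warning, instead of stopping, when a package cannot be loaded.

```r
# require() returns TRUE when the package is attached, FALSE (with a warning) otherwise.
if (!require("stats4")) {
  stop("package 'stats4' is not available in this R session")
}

# Functions from the package are now visible in the workspace,
# e.g. mle() for maximum likelihood estimation.
print(exists("mle"))
```

Package names are case-sensitive in R, so the name passed to require() has to match the package name exactly.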


2.3. AirXCell limitations

AirXCell limitations as of version 0.5.10-SNAPSHOT are as follows :

  • Static plots are supported in the form of a PNG device. All the usual limitations of the R PNG image device apply to AirXCell. In addition, the user should not use any of R's mechanisms to change the plotting device, since they are not supported by AirXCell so far. (Any attempt to do so would lead to unpredictable results and is very likely to corrupt the AirXCell workspace.)

  • Interactive charting functions are not supported. AirXCell does not yet implement a fully functional virtual interactive graphic device on the R backend side. As such, it is not yet possible to forward user actions performed on the graphic within the GUI to the R side.

  • When working in an R environment, one often relies heavily on data loaded from files on the R side.

    A problem arises here with AirXCell, since file upload is not yet supported. One can still call R's file import functions from an R Code Editor (6) or from simple R commands executed in the R Console (4.2.4), but since it is not yet possible (as of version 0.5.10-SNAPSHOT) to upload any file to the remote R environment, the ability to load data from files is of little use.

    There is, however, a workaround to this limitation, described in Section 6.2, “Loading files in the R environment”: it is possible to upload a CSV file to the spreadsheet.

  • Whenever the user executes an endless script, for instance an infinite loop, from within the R Console (4.2.4) or the R Code Editor (6), there is no way to recover the workspace, since the AirXCell backend never returns control to the AirXCell User Interface.

    In this case, the workspace is killed on the AirXCell backend after 5 minutes of continuous execution. At that moment the user receives a failure notification and can only close her workspace and restart from scratch.

    We therefore cannot stress enough how important it is, when developing R scripts with AirXCell, to avoid infinite loops and endless scripts.
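One common way to guard against endless scripts is to give every iterative computation an explicit upper bound on its number of iterations. The sketch below is a generic R pattern, not an AirXCell feature; the iteration cap, the tolerance and the example computation (Newton's method for the square root of 2) are arbitrary choices made for illustration.

```r
max_iter <- 1000     # hard cap: the loop cannot run longer than this
tol <- 1e-8          # stop as soon as two successive estimates are this close
x <- 1               # initial guess
converged <- FALSE

# A bounded for loop replaces an open-ended while (TRUE) or repeat loop.
for (i in seq_len(max_iter)) {
  x_new <- (x + 2 / x) / 2          # Newton step for sqrt(2)
  done <- abs(x_new - x) < tol
  x <- x_new
  if (done) {
    converged <- TRUE
    break
  }
}

if (!converged) {
  warning("no convergence after ", max_iter, " iterations; result may be inaccurate")
}
print(x)
```

If the computation fails to converge, the script still terminates and reports the problem instead of running until the backend kills the workspace.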