Changes in R

The ‘Changes in R’ article from the 2012-2 issue.

The R Core Team
2012-12-01

CHANGES IN R VERSION 2.15.2

NEW FEATURES

  • The X11() window gains an icon, which may be especially useful on Ubuntu’s ‘Unity’ interface.

    The WM_CLASS should be set in circumstances where the Window Manager failed to make use of X11 resource settings.

    (Contributed by Philip Johnson.)

  • The "Date" and "POSIXt" methods for cut() will accept an unsorted breaks argument (as the default method does, although this was undocumented). (Wish of PR#14961.)

  • Reference class methods (in the methods package) that use other methods in an indirect way (e.g. by sapply()) must tell the code analysis to include that method. They can now do so by invoking $usingMethods().

  • More Polish translations are available: for the RGui menus and for several recommended packages.

  • Multistratum MANOVA works. In fact, it seems to have done so for years in spite of the help page claiming it did not.

  • qqline() has new optional arguments distribution, probs and qtype, following the example of lattice’s panel.qqmathline().

  • The handling of single quotes in the en@quot pseudo-language has been slightly improved. Double quotes are no longer converted.

  • New functions checkPoFiles() and checkPoFile() have been added to the tools package to check for consistency of format strings in translation files.

  • model.matrix(~1, ...) now also contains the same rownames that less trivial formulae produce. (Wish of PR#14992, changes the output of several packages.)

  • Misuse of rep() on undocumented types of objects (e.g. calls) is now reported as an error.

  • The included LAPACK has been updated to 3.4.1, with some patches from the current SVN sources. (Inter alia, this resolves PR#14692.)

  • file.copy(recursive = TRUE) has some additional checks on user error leading to attempted infinite recursion (and on some platforms to crashing R).

  • PCRE has been updated to version 8.31, a bug-fix release.

  • The included version of liblzma has been updated to version 5.0.4, a minor bug-fix release.

  • New function .bincode(), a ‘bare-bones’ version of cut.default(labels = FALSE) for use in packages with image() methods.

  • The HTML manuals now use directional single quotes.

  • maintainer() now converts embedded new lines to spaces. It no longer gives a non-obvious error for non-installed packages.

  • The X11() device has some protection against being used with forked processes via package parallel.

  • Setting the environment variable R_OSX_VALGRIND (to any value) allows R to be run under valgrind on Mac OS 10.6 and 10.7 (valgrind currently has very limited support for 10.8), provided system() is not used (directly or indirectly). This should not be needed for valgrind >= 3.8.1.

  • The "model.frame" method for lm() uses xlevels: this is safer if data was supplied or model = FALSE was used and the levels of factors used in the fit had been re-ordered since fitting.

    Similarly, model.frame(fm, data=<data>) copies across the variables used for safe prediction from the fit.

  • Functions such as parLapply() in package parallel can make use of a default cluster if one is set. (Reported by Martin Morgan.)

  • chol(pivot = TRUE, LINPACK = FALSE) is now available using LAPACK 3.2 subroutine DPSTRF.

  • The functions .C(), .Call(), .External() and .Fortran() now check that they are called with an unnamed first argument: the formal arguments were changed from name= to .NAME= in R 2.13.0, but some packages were still using the old name. This is currently a warning, but will be an error in future.

  • step() no longer tries to improve a model with AIC of -Inf (a perfect fit).

  • spline() and splinefun() gain a new method "hyman", an implementation of Hyman’s method of constructing monotonic interpolation splines. (Based on contributions of Simon Wood and Rob Hyndman.)

  • On Windows, the C stack size has been increased to 64MB (it has been 10MB since the days of 32MB RAM systems).
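
A few of the additions above can be tried directly at the prompt. The following is only a sketch: the data values and variable names are invented for illustration, and it assumes R 2.15.2 or later. It exercises cut() with unsorted "Date" breaks, .bincode() alongside cut.default(labels = FALSE), the pivoted chol() decomposition, and the "hyman" method of splinefun().

```r
## cut() on dates with breaks supplied out of order (illustrative data)
d    <- as.Date(c("2012-01-15", "2012-03-10", "2012-05-20"))
brks <- as.Date(c("2012-04-01", "2012-01-01", "2012-07-01"))
f_unsorted <- cut(d, breaks = brks)
f_sorted   <- cut(d, breaks = sort(brks))

## .bincode() returns the integer codes that cut(labels = FALSE) gives
x     <- c(0.5, 1.5, 2.5)
cuts  <- c(0, 1, 2, 3)
codes <- .bincode(x, cuts)

## Pivoted Cholesky decomposition; reordering the columns by the
## "pivot" attribute recovers the original matrix
m      <- crossprod(matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 10), 3))
R      <- chol(m, pivot = TRUE)
piv    <- attr(R, "pivot")
m_back <- crossprod(R[, order(piv)])

## Monotone interpolation of monotone data with Hyman's method
xs   <- 1:6
ys   <- c(0, 0.1, 0.2, 3, 3.1, 5)
mono <- splinefun(xs, ys, method = "hyman")
grid <- seq(1, 6, length.out = 201)
```

Comparing f_unsorted with f_sorted, codes with cut(x, cuts, labels = FALSE), and m_back with m is a quick way to confirm the documented behaviour on a given installation. Evaluating mono(grid) should give a non-decreasing sequence, since the input data are monotone.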

PERFORMANCE IMPROVEMENTS

  • array() is now implemented in C code (for speed) when data is atomic or an unclassed list (so it is known that as.vector(data) will have no class to be used by rep()).

  • rep() is faster and uses less memory, substantially so in some common cases (e.g. if times is of length one or length.out is given, and each = 1).

  • findInterval(), tabulate(), cut(), hist() and image.default() all use .Call() and are more efficient.

  • duplicated(), unique() and similar now support vectors of lengths above \(2^{29}\) on 64-bit platforms.

  • Omitting PACKAGE in .C() etc calls was supposed to make use of the DLL from the namespace within which the enclosing function was defined. It was less successful in doing so than it might be, and gave no indication it had failed.

    A new search strategy is very successful and gives a warning when it fails. In most cases this is because the entry point is not actually provided by that package (and so PACKAGE should be used to indicate which package is intended) but in some the namespace does not have a DLL specified by a useDynLib() directive so PACKAGE is required.
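
None of these speed-ups requires changes to user code. As a small sketch with made-up inputs, these are the kinds of calls the items above refer to, including the rep() cases singled out as substantially faster (times of length one, or length.out given):

```r
x <- c("a", "b", "c")

## rep() with 'times' of length one, and with 'length.out'
r_times <- rep(x, times = 2)
r_len   <- rep(x, length.out = 7)

## findInterval() and tabulate() are now called via .Call() internally
idx    <- findInterval(c(0.5, 1.5, 2.5), vec = c(0, 1, 2, 3))
counts <- tabulate(c(1, 2, 2, 3, 3, 3), nbins = 4)

## unique() and duplicated() keep first occurrences in order
u   <- unique(c(3, 1, 3, 2, 1))
dup <- duplicated(c(3, 1, 3, 2, 1))
```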

UTILITIES

  • R CMD check now checks if a package can be loaded by library(pkgname, lib.loc = "somewhere") without being on the library search path (unless it is already installed in .Library, when it always will be).

  • R CMD check --as-cran notes ‘hidden’ files and directories (with names starting with a dot) that are not needed for the operation of R CMD INSTALL or R CMD build: such files should be excluded from the published tarball.

  • R CMD check (if checking subdirectories) checks that the R code in any demos is ASCII and can be parsed, and warns if not.

  • When R CMD Rd2pdf is used with inputenx.sty, it allows further characters (mainly for Eastern European languages) by including ix-utf8enc.dfu (if available). (Wish of PR#14989.)

  • R CMD build now omits several types of hidden files/directories, including inst/doc/.Rinstignore and vignettes/.Rinstignore (.Rinstignore should be at top level), .deps under src, .Renviron, .Rprofile, .Rproj.user, .backups, .cvsignore, .cproject, .directory, .dropbox, .exrc, .gdb.history, .gitattributes, .gitignore, .gitmodules, .hgignore, .hgtags, .htaccess, .latex2html-init, .project, .seed, .settings, .tm_properties and various leftovers.

  • R CMD check now checks for .C(), .Call(), .External() and .Fortran() calls in other packages, and gives a warning on those found from R itself (which are not part of the API and change without notice: many will be changed for R 2.16.0).

C-LEVEL FACILITIES

  • The limit for R_alloc on 64-bit platforms has been raised to just under 32GB (from just under 16GB).

  • The misuse of .C("name", ..., PACKAGE = foo) where foo is an arbitrary R object is now an error.

    The misuse of .C("name", ..., PACKAGE = "") is now warned about in R CMD check, and will be an error in future.

DEPRECATED AND DEFUNCT

  • Use of array() with a 0-length dim argument is deprecated with a warning (and was contrary to the documentation).

  • Use of tapply() with a 0-length INDEX list is deprecated with a warning.

  • Translation packages are deprecated.

  • Calling rep() or rep.int() on a pairlist is deprecated and will give a warning. In any case, rep() converted a pairlist to a list so you may as well do that explicitly.

  • Entry point rcont2 is no longer part of the API, and will move to package stats in R 2.16.0.

  • The ‘internal’ graphics device invoked by .Call("R_GD_nullDevice", package = "grDevices") is about to be removed: use pdf(file = NULL) instead.

  • eigen(EISPACK = TRUE), chol(pivot = FALSE, LINPACK = TRUE), chol2inv(LINPACK = TRUE), solve(LINPACK = TRUE) and svd(LINPACK = TRUE) are deprecated and give a warning.

    They were provided for compatibility with R 1.7.0 (Mar 2003)!

  • The ‘internal function’ kappa.tri() has been renamed to .kappa_tri() so it is not inadvertently called as a method for class "tri".

  • Functions sessionData() and browseAll() in package methods are on a help page describing them as ‘deprecated’ and are now formally deprecated.
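
For code affected by these deprecations, the replacements are short. The following sketch (illustrative values only) opens a device that writes no file with pdf(file = NULL), as suggested above in place of the internal null device, and converts a pairlist to a list explicitly before calling rep():

```r
## A graphics device that produces no output file
pdf(file = NULL)
plot(1:10)
dev_name <- names(dev.cur())
dev.off()

## Convert a pairlist explicitly before replicating it
pl <- pairlist(a = 1, b = "x")
rl <- rep(as.list(pl), times = 2)
```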

PACKAGE INSTALLATION

  • For a Windows or Mac OS X binary package install, install.packages() will check if a source package is available on the same repositories, and report if it is a later version or there is a source package but no binary package available.

    This check can be suppressed: see the help page.

  • install.packages(type = "both") has been enhanced. In interactive use it will ask whether to choose the source version of a package if the binary version is older and contains compiled code, and also asks if source packages with no binary version should be installed.

INSTALLATION

  • There is a new configure option --with-libtiff (mainly in case the system installation needs to be avoided).

  • LAPACK 3.4.1 does use some Fortran 90 features, so g77 no longer suffices.

  • If an external LAPACK is used, it must be version 3.2 or later.

BUG FIXES

  • On Windows, starting Rterm via R.exe caused Ctrl-C to misbehave. (PR#14948)

  • The tools::latexToUtf8() function missed conversions that were contained within braces.

  • Long timezone specifications (such as a file name preceded by :) could crash as.POSIXlt. (PR#14945)

  • R CMD build --resave-data could fail if there was no data directory but there was an R/sysdata.rda file. (PR#14947)

  • is.na() misbehaved on a 0-column data frame. (PR#14959)

  • anova.lmlist() failed if test was supplied. (PR#14960)

    It was unable to compute Cp tests for objects of class "lm" (it assumed class "glm").

  • The formula method for sunflowerplot() now allows xlab and ylab to be set. (Reported by Gerrit Eichner.)

  • The "POSIXt" and "Date" methods for hist() could fail on Windows where adjustments to the right-hand boundary crossed a DST transition time.

  • On Windows, the code in as.POSIXct() to handle incorrectly specified isdst fields might have resulted in NA being returned.

  • aov() and manova() gave spurious warnings about a singular error model in the multiresponse case.

  • In ns() and bs(), specifying knots = NULL is now equivalent to omitting it, also when df is specified. (PR#14970)

  • sprintf() did not accept numbered arguments ending in zero. (PR#14975)

  • rWishart() could overflow the C stack and maybe crash the R process for dimensions of several hundreds or more. (Reported by Michael Braun on R-sig-mac.)

  • Base package vignettes (e.g. vignette("Sweave")) were not fully installed in builds of R from the tarball.

  • lchoose() and choose() could overflow the C stack and crash R.

  • When given a 0-byte file and asked to keep source references, parse() read input from stdin() instead.

  • pdf(compress = TRUE) did not delete temporary files it created until the end of the R session. (PR#14991)

  • logLik() did not detect the error of applying it to a multiple-response linear model. (PR#15000)

  • file.copy(recursive = TRUE) did not always report FALSE for a failure two or more directories deep.

  • qgeom() could return -1 for extremely small q. (PR#14967.)

  • smooth.spline() used DUP = FALSE which allowed its compiled C code to change the function: this was masked by the default byte-compilation. (PR#14965.)

  • On Windows, the GUI preferences for foreground color were not always respected. (Reported by Benjamin Wells.)

  • On OS X, the Quartz versions of the bitmap devices did not respect antialias = "none". (PR#15006.)

  • unique() and similar would infinite-loop if called on a vector of length > \(2^{29}\) (but reported that the vector was too long for \(2^{30}\) or more).

  • parallel::stopCluster() now works with MPI clusters without snow being on the search path.

  • terms.formula() could exhaust the stack, and the stack check did not always catch this before the segfault. (PR#15013)

  • sort.list(method = "radix") could give incorrect results on certain compilers (seen with clang on Mac OS 10.7 and Xcode 4.4.1).

  • backsolve(T, b) gave incorrect results when nrow(b) > ncol(T) and b had more than one column.

    It could segfault or give nonsense if k was specified as more than ncol(T).

  • smooth.spline() did not check that a specified numeric spar was of length 1, and gave corrupt results if it was of length 0.

  • Protection added to do_system. (PR#15025)

  • Printing of vectors with names > 1000 characters now works correctly rather than truncating. (PR#15028)

  • qr() for a complex matrix did not pivot the column names.

  • --with-blas='-framework vecLib' now also works on OS X 10.8.

  • R CMD check no longer fails with an error if a DESCRIPTION file incorrectly contains a blank line. (Reported by Bill Dunlap.)

  • install.packages(type = "both") could call chooseCRANmirror() twice.

  • lm.wfit() could segfault in R 2.15.1 if all the weights were zero. (PR#15044)

  • A malformed package name could cause R CMD INSTALL to write outside the target library.

  • Some of the quality control functions (e.g. tools::checkFF()) were wrongly identifying the source of S4 methods in a package and so not checking them.

  • The default type of display by browseEnv() when using R.app on Mac OS X has been incorrect for a long time.

  • The implementation of importMethodsFrom in a NAMESPACE file could be confused and fail to find generics when importing from multiple packages (reported and fixed by Michael Lawrence).

  • The detection of the C stack direction is better protected against compiler optimization. (PR#15011.)

  • Long custom line types would sometimes segfault on the cairographics-based devices. (PR#15055.)

  • tools::checkPoFile() unprotected too early in its C code and so segfaulted from time to time.

  • The Fortran code underlying nlminb() could infinite-loop if any of the input functions returned NA or NaN. This is now an error for the gradient or Hessian, and a warning for the function (with the value replaced by Inf). (In part, PR#15052.)

  • The code for creating coerce() methods could generate false notes about ambiguous selection; the notes have been suppressed for this function.

  • arima.sim() could give too long an output in some corner cases (in part, PR#15068).

  • anova.glm() with test = "Rao" didn’t work when models included an offset. (Reported by Søren Feodor Nielsen.)

  • as.data.frame.matrix() could return an invalid data frame with no row.names attribute for a 0-row matrix. (Reported by Hervé Pagès.)

  • Compilation with the vecLib or Accelerate frameworks on OS X without using that also for LAPACK is more likely to be successful.
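
Several of these fixes are easy to check interactively. The sketch below uses made-up inputs and assumes R 2.15.2 or later. It covers numbered sprintf() arguments ending in zero, conversion of a 0-row matrix to a data frame, and the error logLik() now signals for a multiple-response linear model.

```r
## Numbered sprintf() arguments up to and including %10$s
fmt <- paste0("%", 1:10, "$s", collapse = " ")
out <- do.call(sprintf, c(list(fmt), as.list(letters[1:10])))

## A 0-row matrix converts to a valid 0-row data frame
df0 <- as.data.frame(matrix(numeric(0), nrow = 0, ncol = 2))

## logLik() signals an error for a multiple-response linear model;
## tryCatch() captures the condition instead of stopping
y   <- cbind(a = c(1, 2, 3, 5), b = c(2, 1, 4, 3))
x   <- 1:4
fit <- lm(y ~ x)
res <- tryCatch(logLik(fit), error = function(e) e)
```

Here out should contain the ten letters in order, df0 should have zero rows and two columns, and res should be a condition object of class "error".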

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

The R Core Team, "Changes in R", The R Journal, 2012

BibTeX citation

@article{RJ-2012-2-r-changes,
  author = {{The R Core Team}},
  title = {Changes in R},
  journal = {The R Journal},
  year = {2012},
  note = {https://rjournal.github.io/},
  volume = {4},
  issue = {2},
  issn = {2073-4859},
  pages = {76-79}
}