Installing and switching to MKL on Fedora

In our last post, we presented the FlexiBLAS library, coming to Fedora 33, and the accompanying flexiblas R package, which enables live switching of the BLAS backend among the various open source options readily available in the Fedora repositories.

In this post, we demonstrate how to install Intel’s Math Kernel Library (MKL), register it with FlexiBLAS, and finally switch to it, all in a few steps. First, we set up a clean environment using Docker:

$ docker run --rm -it fedora:33
$ dnf install 'dnf-command(config-manager)' # install config manager
$ dnf install R-flexiblas # install R and the FlexiBLAS API interface for R

Then we add Intel’s YUM repository, import the public key and install MKL:

$ dnf config-manager --add-repo https://yum.repos.intel.com/mkl/setup/intel-mkl.repo
$ rpm --import https://yum.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
$ dnf install intel-mkl # or a specific version, e.g. intel-mkl-2020.0-088

Then, in an R session:

library(flexiblas)

flexiblas_load_backend("/opt/intel/mkl/lib/intel64/libmkl_rt.so")
#> <flexiblas> BLAS /opt/intel/mkl/lib/intel64/libmkl_rt.so not found in config.
#> <flexiblas> BLAS /opt/intel/mkl/lib/intel64/libmkl_rt.so does not provide an integer size hint. Assuming 4 Byte.
#> [1] 2

backends <- flexiblas_list_loaded()
backends
#> [1] "OPENBLAS-OPENMP"                        
#> [2] "/opt/intel/mkl/lib/intel64/libmkl_rt.so"
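Since the MKL path depends on the installed version and layout, the loading step can be wrapped in a small guard. The following is a minimal sketch, not part of the flexiblas package: `load_backend_if_present` is a hypothetical helper that only calls `flexiblas_load_backend()` when the shared library file exists and the flexiblas package is installed, and returns `NA` otherwise.

```r
# Hypothetical helper (not part of flexiblas): try to load a BLAS backend
# from a shared library path, but only if the file is really there.
load_backend_if_present <- function(path) {
  if (!file.exists(path)) {
    message("Library not found, skipping: ", path)
    return(NA_integer_)
  }
  if (!requireNamespace("flexiblas", quietly = TRUE)) {
    message("The flexiblas package is not installed")
    return(NA_integer_)
  }
  # Same call as above; it returns the index of the loaded backend
  flexiblas::flexiblas_load_backend(path)
}

mkl <- load_backend_if_present("/opt/intel/mkl/lib/intel64/libmkl_rt.so")
# Switch only if loading succeeded
if (!is.na(mkl)) flexiblas::flexiblas_switch(mkl)
```

With a guard like this, the same script can run on machines without MKL and simply stay on the default backend.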

And that’s it: we can now switch between the default backend and MKL. As in our previous post, let’s compare them with a simple GEMM benchmark:

n <- 2000
runs <- 10

A <- matrix(runif(n*n), nrow=n)
B <- matrix(runif(n*n), nrow=n)

# benchmark
timings <- sapply(seq_along(backends), function(i) {
  flexiblas_switch(i)

  # warm-up
  C <- A[1:100, 1:100] %*% B[1:100, 1:100]

  unname(system.time({
    for (j in seq_len(runs))
      C <- A %*% B
  })[3])
})

results <- data.frame(
  backend = backends,
  `timing [s]` = timings,
  # each timing covers `runs` products of about 2 * n^3 operations each
  `performance [GFlops]` = runs * (2 * (n / 1000)^3) / timings,
  check.names = FALSE)

results[order(results[["performance [GFlops]"]]), ]
#>                                   backend timing [s] performance [GFlops]
#> 2 /opt/intel/mkl/lib/intel64/libmkl_rt.so      3.487             45.88471
#> 1                         OPENBLAS-OPENMP      0.754            212.20159

And still OpenBLAS rocks!
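For reference, the GFlops figure follows from the usual operation count for a dense matrix product: multiplying two n × n matrices takes about 2n³ floating-point operations. Here is a minimal sketch of that arithmetic, where `gemm_gflops` is a hypothetical helper name and the sample timing is made up for illustration:

```r
# Hypothetical helper: GFlops achieved by `runs` products of two n x n
# matrices that took `elapsed` seconds in total.
gemm_gflops <- function(n, elapsed, runs = 1) {
  flop <- runs * 2 * n^3   # total floating-point operations
  flop / elapsed / 1e9     # operations per second, in units of 1e9
}

# One 2000 x 2000 product in half a second (illustrative numbers)
gemm_gflops(n = 2000, elapsed = 0.5)
#> [1] 32
```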

For questions, suggestions or issues related to this R interface, please use its issue tracker or the R-SIG-Fedora mailing list. For more general issues, please use Red Hat Bugzilla or the upstream issue tracker.

Switch BLAS/LAPACK without leaving your R session

BLAS and LAPACK comprise all the low-level linear algebra subroutines that handle your matrix operations in R and other software. Fedora ships the reference implementation from Netlib, which is accurate and stable, but slow, as well as several optimized backends, such as ATLAS, BLIS (serial, OpenMP and threaded versions) and OpenBLAS (serial, OpenMP and threaded flavours as well). However, up to version 32, Fedora lacked a proper mechanism to switch between them.

We are excited to announce that this situation changes with the upcoming release, which is already in beta status. Starting with Fedora 33, R (as well as NumPy, Octave and all the other BLAS/LAPACK consumers) is linked against the outstanding FlexiBLAS library, a BLAS/LAPACK wrapper that enables runtime switching of the optimized backend, and the OpenMP version of OpenBLAS is set as the default system-wide backend.

Moreover, the accompanying flexiblas R package enables changing the BLAS/LAPACK provider, as well as setting the number of threads for parallel backends, without leaving the R session. Let’s give this a quick test using Docker:

$ docker run --rm -it fedora:33
$ dnf install R-flexiblas # install R and the FlexiBLAS API interface for R
$ dnf install 'flexiblas-*' # install all available optimized backends (quoted so the shell does not expand the glob)

Then, in an R session we see:

library(flexiblas)

# check whether FlexiBLAS is available
flexiblas_avail()
#> [1] TRUE

# get the current backend
flexiblas_current_backend()
#> [1] "OPENBLAS-OPENMP"

# list all available backends
flexiblas_list()
#> [1] "NETLIB"           "__FALLBACK__"     "BLIS-THREADS"     "OPENBLAS-OPENMP"
#> [5] "BLIS-SERIAL"      "ATLAS"            "OPENBLAS-SERIAL"  "OPENBLAS-THREADS"
#> [9] "BLIS-OPENMP"

# get/set the number of threads
flexiblas_set_num_threads(12)
flexiblas_get_num_threads()
#> [1] 12
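As a rough sketch of how the thread setting could be explored, the snippet below times a single matrix product for a few thread counts using only the calls shown above. `thread_scaling` is a hypothetical helper, the thread counts and matrix size are arbitrary, and the function returns `NULL` when flexiblas is not available.

```r
# Hypothetical helper: time one n x n matrix product for several thread counts.
# Returns NULL when the flexiblas package or library is not available.
thread_scaling <- function(threads = c(1, 2, 4), n = 1000) {
  if (!requireNamespace("flexiblas", quietly = TRUE) ||
      !flexiblas::flexiblas_avail()) {
    return(NULL)
  }
  A <- matrix(runif(n * n), nrow = n)
  B <- matrix(runif(n * n), nrow = n)

  # remember the current setting and restore it when the function exits
  old <- flexiblas::flexiblas_get_num_threads()
  on.exit(flexiblas::flexiblas_set_num_threads(old))

  elapsed <- sapply(threads, function(k) {
    flexiblas::flexiblas_set_num_threads(k)
    unname(system.time(A %*% B)["elapsed"])
  })
  data.frame(threads = threads, `timing [s]` = elapsed, check.names = FALSE)
}

thread_scaling()
```

Serial backends should be unaffected by the thread count, so a flat timing column is expected for them.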

Here is an example GEMM benchmark across all the available backends:

library(flexiblas)

n <- 2000
runs <- 10
ignore <- "__FALLBACK__"

A <- matrix(runif(n*n), nrow=n)
B <- matrix(runif(n*n), nrow=n)

# load backends
backends <- setdiff(flexiblas_list(), ignore)
idx <- flexiblas_load_backend(backends)

# benchmark
timings <- sapply(idx, function(i) {
  flexiblas_switch(i)

  # warm-up
  C <- A[1:100, 1:100] %*% B[1:100, 1:100]

  unname(system.time({
    for (j in seq_len(runs))
      C <- A %*% B
  })[3])
})

results <- data.frame(
  backend = backends,
  `timing [s]` = timings,
  # each timing covers `runs` products of about 2 * n^3 operations each
  `performance [GFlops]` = runs * (2 * (n / 1000)^3) / timings,
  check.names = FALSE)

results[order(results[["performance [GFlops]"]]), ]
#>            backend timing [s] performance [GFlops]
#> 1           NETLIB     56.776             2.818092
#> 5            ATLAS      5.988            26.720107
#> 2     BLIS-THREADS      3.442            46.484602
#> 8      BLIS-OPENMP      3.408            46.948357
#> 4      BLIS-SERIAL      3.395            47.128130
#> 6  OPENBLAS-SERIAL      3.206            49.906425
#> 7 OPENBLAS-THREADS      0.773           206.985770
#> 3  OPENBLAS-OPENMP      0.761           210.249671

For questions, suggestions or issues related to this R interface, please use its issue tracker or the R-SIG-Fedora mailing list. For more general issues, please use Red Hat Bugzilla or the upstream issue tracker. There are a couple of posters by the authors of FlexiBLAS (1, 2) with a similar demo for Octave.

cran2copr: RPM repos with 15k binary R packages

Bringing R packages to Fedora (in fact, to any distro) is a Herculean task, especially considering the rate at which CRAN grows nowadays. So I am happy to announce the cran2copr project, which is an attempt to maintain binary RPM repos for most of CRAN (~15k packages as of Feb. 2020) in an automated way using Fedora Copr.

Are you a Fedora user? Enable the CRAN Copr repo for your system:

$ sudo dnf copr enable iucar/cran

and you are ready to go. Packages are prefixed with R-CRAN-, e.g.:

$ sudo dnf install R-CRAN-rstanarm
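The naming rule is simple enough to script from R. The sketch below only builds the command string and does not run it; `copr_install_cmd` is a hypothetical helper, and it assumes every requested package is actually available in the repo.

```r
# Hypothetical helper: map CRAN package names to the RPM names used by the
# Copr repo and build the corresponding dnf command line (as a string).
copr_install_cmd <- function(pkgs) {
  rpms <- paste0("R-CRAN-", pkgs)
  paste("sudo dnf install", paste(rpms, collapse = " "))
}

copr_install_cmd(c("rstanarm", "simmer"))
#> [1] "sudo dnf install R-CRAN-rstanarm R-CRAN-simmer"
```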

Currently, only x86_64 chroots for supported (non-EOL) versions of Fedora, including rawhide, are enabled. If you are interested in other chroots (from the supported architectures and distros), please open an issue on GitHub to say so, but they are unlikely to be enabled in the short to medium term due to current storage limitations in the Copr infrastructure.

These repos are automatically synchronized with CRAN every day at 00:00 UTC through a GitHub Action that removes archived packages and builds the most recent updates. If you find any issue with any of the supported packages (see details and limitations below), please open an issue on GitHub.

Acknowledgements

Thanks to the authors of cran2deb for the inspiration. Thanks to Red Hat and, particularly, the Copr team for developing this tool and maintaining the Fedora Copr service for the Fedora community. And thanks to AWS too, because they provide a CDN for free.

simmer 4.4.0 on CRAN

The 4.4.0 release of simmer, the Discrete-Event Simulator for R, is on CRAN. This update settles into a pace of a couple of releases per year, which is more appropriate given the maturity that the project has reached.

This release brings us a dozen bug fixes and improvements, including the unification of the leave/renege API, further enhancements to the convenience functions for setting up generators, and performance improvements for the simulation environment definition thanks to the vectorisation of add_resource and add_generator. See below for a complete list of changes.

New features:

  • Add out and keep_seized parameters to leave() with the same behaviour as in renege_in() and renege_if(). Code and documentation of these functions are now integrated under help(renege) (#208, #217).
  • Convenience functions from, to and from_to accept dynamic parameters for arguments start_time, stop_time and every (#219).
  • Activities to interact with sources have been vectorised to modify multiple sources at once (#222).
  • Several generators or resources with the same parameters can be added with a single call to add_generator() and add_resource() respectively if a vector of names is provided (#221).
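To illustrate two of these items, here is a minimal sketch of a made-up clinic model. The resource, source and trajectory names are purely illustrative, the argument names follow our reading of the simmer documentation, and the block does nothing unless simmer 4.4.0 or later is installed.

```r
if (requireNamespace("simmer", quietly = TRUE) &&
    packageVersion("simmer") >= "4.4.0") {
  library(simmer)

  # Sub-trajectory followed by arrivals that leave early
  gave_up <- trajectory("gave up") %>%
    log_("leaving without being served")

  patient <- trajectory("patient") %>%
    # 30% of arrivals leave here and continue in `gave_up`
    leave(prob = 0.3, out = gave_up, keep_seized = FALSE) %>%
    seize("doctor", 1) %>%
    timeout(function() rexp(1, 1 / 5)) %>%
    release("doctor", 1)

  env <- simmer("clinic") %>%
    # one call, two resources with identical parameters
    add_resource(c("doctor", "nurse"), capacity = 1) %>%
    # one call, two generators sharing trajectory and distribution
    add_generator(c("walk_in", "referral"), patient,
                  function() rexp(1, 1 / 10)) %>%
    run(until = 100)

  print(head(get_mon_arrivals(env)))
}
```

Arrivals from both sources show up in the monitoring data under their own name prefixes.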

Minor changes and fixes:

  • Fix get_mon_*() dispatch for named lists (#210).
  • Get/put the RNG state when random numbers are required in the backend (#218).
  • Fix convenience functions from, to and from_to to preserve the environment of the supplied functions (as part of #219).
  • Documentation improvements (#212, #220).
  • Fix queueing in multiple resources after preemption (#224 addressing #206).

simmer 4.3.0 + JSS publication

The 4.3.0 release of simmer, the Discrete-Event Simulator for R, is on CRAN. Along with this update, we are very glad to announce that our homonymous paper has finally appeared in the Journal of Statistical Software. Please use the following reference for citations (see citation("simmer")):

  • Ucar I, Smeets B, Azcorra A (2019). “simmer: Discrete-Event Simulation for R.” Journal of Statistical Software, 90(2), 1-30. doi: 10.18637/jss.v090.i02 (URL: https://doi.org/10.18637/jss.v090.i02).

It took quite a lot of work and time, but we are very proud of the final result. We would like to thank the editorial team for their hard work, with special thanks to the anonymous referee for their thorough reviews and valuable comments, and Norman Matloff for his advice and support. Last but not least, we are very grateful for all the discussion and fruitful ideas that our growing community provides via the simmer-devel mailing list and GitHub.

The new release brings the ability to keep seized resources after reneging, as well as to define a range of arrival priorities that are allowed to access a resource’s queue if there is no room in the server. We moved a lot of activity usage examples that were scattered across a far too long vignette to the appropriate help pages, and of course there is the usual share of bug fixes. See below for a complete list of changes.

Special thanks to Tom Lawton for his contributions to this release, and to Benjamin Sawicki for his generous donation.

New features:

  • Add ability to keep_seized resources after reneging (#204 addressing #200).
  • Add ability to define a range of arrival priorities that are allowed to access a resource’s queue if there is no room in the server (#205 addressing #202).
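Both features can be sketched in a small, made-up shop model. All names here are illustrative, and the `keep_seized` and `queue_priority` argument names are taken from our reading of the simmer help pages, so treat them as assumptions to check against your installed version. The block does nothing unless simmer 4.3.0 or later is installed.

```r
if (requireNamespace("simmer", quietly = TRUE) &&
    packageVersion("simmer") >= "4.3.0") {
  library(simmer)

  # Shoppers who give up queueing still hold their basket and return it
  return_basket <- trajectory("return basket") %>%
    release("basket", 1)

  shopper <- trajectory("shopper") %>%
    seize("basket", 1) %>%
    # renege after 5 time units in the cashier queue, keeping the basket
    renege_in(5, out = return_basket, keep_seized = TRUE) %>%
    seize("cashier", 1) %>%
    renege_abort() %>%
    timeout(3) %>%
    release("cashier", 1) %>%
    release("basket", 1)

  # Callers are rejected if the line is busy and their priority is too low
  caller <- trajectory("caller") %>%
    seize("helpline", 1) %>%
    timeout(10) %>%
    release("helpline", 1)

  shop <- simmer("shop") %>%
    add_resource("basket", capacity = Inf) %>%
    add_resource("cashier", capacity = 1) %>%
    # only priorities in [1, Inf) may queue when the helpline is busy
    add_resource("helpline", capacity = 1, queue_priority = c(1, Inf)) %>%
    add_generator("shopper", shopper, function() rexp(1, 1 / 2)) %>%
    add_generator("regular", caller, function() rexp(1, 1 / 4), priority = 0) %>%
    add_generator("vip", caller, function() rexp(1, 1 / 20), priority = 1) %>%
    run(until = 200)

  print(head(get_mon_arrivals(shop)))
}
```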

Minor changes and fixes:

  • Drop R6 as a dependency (#193 addressing #190).
  • Small fix in from and from_to + documentation update (75a9569).
  • Move activity usage examples to help pages (#194).
  • Fix shortest-queue selection policies (#196).
  • Fix batch triggering (#203).
  • Update JSS paper, CITATION, references and DOI.