AWS S3

Quick how-to:

    # drat::addRepo(account = "Ignacio", alturl = "https://drat.ignacio.website/")
    # install.packages("IMS3")
    library("IMS3")
    ## Loading required package: aws.s3
    set.enviroment()
    bucketlist()
    ## c..ignacios.test.bucket....2019.03.19T13.21.52.000Z..
    ## 1 ignacios-test-bucket
    ## 2 2019-03-19T13:21:52.000Z

    # save an in-memory R object into S3
    s3save(mtcars, bucket = "ignacios-test-bucket", object = "mtcars.Rdata")

    # `load()` R objects from the file
    s3load("mtcars.Rdata", bucket = "ignacios-test-bucket")

Video talking about this:

Continuous integration with R + testthat + gitlab + docker

Why?

I want to make sure that when I make a change to some complicated code, nothing breaks. Moreover, I want to make sure that nothing breaks in a clean install.

Getting gitlab up and running

If you are reading this, you probably know that I like GitLab better than Bitbucket, GitHub, and that awful thing that you are probably using. You can skip this and the next section and just use the hosted version of GitLab, which gives you 2000 minutes per month to do this stuff.

Resources for people that want to go Bayesian

This is my list of resources for people that want to go Bayesian. This list is very incomplete and I plan to update it over the next couple of weeks.

Online videos and courses

- What are Bayesian Methods? - OPRE in 60 Seconds
- Tiny Data, Approximate Bayesian Computation and the Socks of Karl Broman: Less than 20 minutes, and very easy to follow.
- Bayesian Regression Modeling with rstanarm: Very short and simple
- Ben Goodrich’s Bayesian Statistics for the Social Sciences: Semester long, totally worth it
  - Videos
  - Class material
- Richard McElreath’s Statistical Rethinking: Semester long, totally worth it
  - Videos
  - Class material
  - Book

Papers, books, vignettes, and blogs

- Bayesian data analysis for newcomers
- Speaking on Data’s Behalf: What Researchers Say and How Audiences Choose
- Why We (Usually) Don’t Have to Worry About Multiple Comparisons
- Visualization in Bayesian workflow
- Stan User’s Guide
- Bayesian Data Analysis
- Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan
- Data Analysis Using Regression and Multilevel/Hierarchical Models
- What works for whom?

Random number generation with Rcpp and OpenMP

The following code shows how to draw random numbers from a normal and a binomial distribution. Notice that instead of declaring A as a numeric matri

Serial

Double loop

    #include <Rcpp.h>
    using namespace Rcpp;

    // [[Rcpp::export]]
    NumericMatrix my_matrix(int I) {
      NumericMatrix A(I, 2);
      for (int i = 0; i < I; i++) {
        A(i, 0) = R::rnorm(2, 1);     // normal draw: mean 2, sd 1
        A(i, 1) = R::rbinom(1, 0.5);  // binomial draw: size 1, prob 0.5
      }
      colnames(A) = CharacterVector::create("Normal", "Bernoulli");
      return A;
    }

    set.

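For readers who want to try the same serial loop outside of R, here is a minimal sketch that uses only the C++ standard library instead of Rcpp. It is not the code from the post: the `Draw` struct, the `draw_matrix` name, and the `seed` argument are assumptions made for illustration, and `<random>` will not reproduce the same numbers as R's generator.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// One row of the result: a normal draw and a Bernoulli draw
// (hypothetical type, standing in for the two matrix columns).
struct Draw {
    double normal;
    int bernoulli;
};

// Serial loop mirroring the excerpt: Normal(mean 2, sd 1) and
// Binomial(size 1, prob 0.5), i.e. a Bernoulli(0.5).
// The explicit seed is an assumption added for reproducibility.
std::vector<Draw> draw_matrix(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> normal(2.0, 1.0);
    std::bernoulli_distribution bernoulli(0.5);

    std::vector<Draw> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double z = normal(gen);
        const int b = bernoulli(gen) ? 1 : 0;
        out.push_back({z, b});
    }
    return out;
}
```

Calling `draw_matrix(1000, 42)` twice in the same program returns the same draws, because each call builds its own generator from the seed. A parallel version would likely need one generator per thread, which this sketch does not attempt.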
Hello Rcpp

This past weekend I discovered the wonders of C++ thanks to this DataCamp course. Although C++ syntax is different, knowing Fortran made this much easier.

Filling a matrix with C++

The following code creates a function that can be called from R to fill a matrix. One difference from Fortran is that, to make loops more efficient, you have to go right (j) to left (i) instead of left to right.

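The loop-order remark can be illustrated with a small sketch. It assumes a plain C++ matrix stored row-major in one flat vector, where the rightmost index (j) is contiguous in memory and therefore goes in the inner loop. The function name and the fill value `i + j` are hypothetical, and this is not the post's Rcpp function: Rcpp's `NumericMatrix` follows R's column-major layout, so the best order there may differ.

```cpp
#include <cstddef>
#include <vector>

// Fill a rows x cols matrix stored row-major: element (i, j) lives at
// a[i * cols + j]. The inner loop runs over j so that consecutive
// iterations touch adjacent memory.
std::vector<double> fill_row_major(std::size_t rows, std::size_t cols) {
    std::vector<double> a(rows * cols);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            // Placeholder value, chosen only for illustration.
            a[i * cols + j] = static_cast<double>(i + j);
        }
    }
    return a;
}
```

In Fortran the roles are reversed: arrays are column-major, so the leftmost index belongs in the inner loop.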
Hello World: R + Fortran + OpenMP

Why?

I want to fill up a big matrix, and I care about speed and, to a lesser degree, memory efficiency. In practice the matrix will have 4000 rows and K columns, where K is the number of observations for which I want to run my predictive model. For this exercise I will keep K to just 500 because my R approach eats a ton of memory. For this simple exercise, I will compute \(A_{ik} = 1 / (1 + \exp(i^2 + i^3 + k^2 + k^3))\). In practice the operation that I need to do is much more complicated, which will make the difference in run time even bigger.

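As a rough illustration of the formula above, here is a minimal C++ sketch of the fill (the post itself uses R and Fortran). It assumes 1-based indices i and k, as in R and Fortran, and column-major storage in a flat vector. The function name `fill_matrix` is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Compute A_ik = 1 / (1 + exp(i^2 + i^3 + k^2 + k^3)) for
// i = 1..n_rows and k = 1..n_cols. Storage is column-major:
// element (i, k) lives at a[(k - 1) * n_rows + (i - 1)].
std::vector<double> fill_matrix(int n_rows, int n_cols) {
    std::vector<double> a(static_cast<std::size_t>(n_rows) * n_cols);
    // Each column is independent of the others, so this outer loop is
    // a natural candidate for parallelization (for example with OpenMP).
    for (int k = 1; k <= n_cols; ++k) {
        const double dk = k;
        for (int i = 1; i <= n_rows; ++i) {
            const double di = i;
            // Use doubles so the cubes cannot overflow an int.
            const double x = di * di + di * di * di + dk * dk + dk * dk * dk;
            a[static_cast<std::size_t>(k - 1) * n_rows + (i - 1)] =
                1.0 / (1.0 + std::exp(x));
        }
    }
    return a;
}
```

For large i or k the exponent overflows to infinity and the entry becomes exactly 0, so most of a 4000-row matrix is zero under this toy formula. That is fine for a timing exercise but worth knowing when checking results.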
cast-web-api in a snap

After the SD card on my Raspberry Pi died, the prospect of creating a new one so I could connect my Google Cast devices to SmartThings did not sound like fun. Installing cast-web-api is not trivial, so I decided this was a good opportunity to create my first snap. Creating a snap to wrap a Node.js app was super easy. I shared my code on GitHub, but this is the whole thing:

nvidia-docker + greta

Goal: Use greta with nvidia-docker

Docker file:

    ## Based on work by https://github.com/earthlab/dockerfiles/blob/master/r-greta/Dockerfile
    ## https://github.com/rocker-org/ml
    ## rocker
    ##
    FROM nvidia/cuda:9.0-cudnn7-runtime
    MAINTAINER "Ignacio Martinez" ignacio@protonmail.com

    RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections

    ## Prepare R installation from
    RUN sh -c 'echo "deb https://cloud.r-project.org/bin/linux/ubuntu xenial-cran35/" >> /etc/apt/sources.list' \
      && apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9

    RUN apt-get update \
      && apt-get upgrade -y -q \
      && apt-get install -y --no-install-recommends \
        libapparmor1 \
        r-base \
        r-base-dev \
        littler \
        r-cran-littler \
        libxml2-dev \
        libxt-dev \
        libssl-dev \
        libcurl4-openssl-dev \
        imagemagick \
        python-pip \
        libpython2.

Cloud computing with R and AWS

Why?

You want to run R code on the cloud. For whatever reason, you don’t want to use Google or Azure.

Credit

I took most of the code from this gist.

The code

This function takes a list with your instances and the path to your private key, and returns a cluster object that can be used with the future package. I was told that this function will be part of a new package soon.

Embarrassingly Parallel Computing with doAzureParallel

Why?

You want to run 100 regressions, they each take one hour, and the only difference is the data set they are using. This is an embarrassingly parallel problem. For whatever reason, you want to use Azure instead of Google Compute Engine…

Before you start

I will assume that:

- you have an Azure account
- you have correctly installed and configured doAzureParallel

Create some fake data

    library(dplyr)
    library(stringr)
    set.