Java-R-Integration with JRI for On-Demand Predictions

This article gives a short overview of how to call R from within a Java application using JRI. In particular, it shows how to use this technology for on-demand predictions based on R models.

Note: Trivial aspects such as constant definitions or exception handling are omitted in the provided code snippets.

What is JRI?

JRI is a Java/R Interface providing a Java API to R functionality. A JavaDoc specification of this interface (org.rosuda.JRI.Rengine) can be found here. The project homepage describes how to initially set up JRI in various environments.

Typical Use Cases for On-Demand Predictions via JRI

Classification or numeric prediction models embedded in R scripts can originate from legacy implementations or from a conscious decision to use R for a certain use case. Typically, a static set of data is used to train and validate a model, which can then be applied to another static set of unclassified data. However, this approach aims at deriving general insights from data sets rather than at predicting concrete instances. In particular, it is insufficient for systems with real-time user interaction, for example custom welcome screens that depend on the estimated value of a user who has just registered.

Hello R World from Java

After installing R as well as the JRI package, any Java application will be able to instantiate org.rosuda.JRI.Rengine after adding the corresponding JARs to its build path. The following simplistic example demonstrates how we can use the R interface.

import org.rosuda.JRI.Rengine;
import org.rosuda.JRI.REXP;

public class HelloRWorld {
   Rengine rengine; // initialized in constructor or autowired

   public void helloRWorld() {
      rengine.eval(String.format("greeting <- '%s'", "Hello R World"));
      REXP result = rengine.eval("greeting");
      System.out.println("Greeting from R: "+result.asString());
   }
}

Calling Rengine.eval(String) corresponds to typing commands to the R console and hence provides access to any required functionality. Note that even in this trivial example two separate calls share a common context which is maintained throughout the lifecycle of a Rengine instance. Objects of org.rosuda.JRI.REXP encapsulate any output from R to the user. Depending on the evaluated command, other methods than REXP.asString() may be suitable for extracting its result (see JavaDoc).
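Because every command is passed to R as plain text, a value that contains a quote or a backslash would break the generated statement. The following is a minimal sketch of how such values could be escaped before they are formatted into a command. The class RStrings and its methods are hypothetical helpers and not part of JRI, and the sketch only covers backslashes and single quotes (not line breaks or other control characters).

```java
final class RStrings {
    private RStrings() {}

    /** Wraps a Java string in single quotes so it can be embedded in an R command. */
    static String quote(String value) {
        String escaped = value
            .replace("\\", "\\\\")  // escape backslashes first so later escapes are not doubled
            .replace("'", "\\'");   // then escape the quote character itself
        return "'" + escaped + "'";
    }

    /** Builds an R assignment such as: greeting <- 'Hello R World' */
    static String assign(String variable, String value) {
        return variable + " <- " + quote(value);
    }

    public static void main(String[] args) {
        System.out.println(assign("greeting", "Hello R World"));
    }
}
```

The resulting string could then be handed to rengine.eval(...) in place of the String.format call used in the example.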

Running .R scripts from Java

Even though it would be possible to implement large R scripts in Java by passing each statement to Rengine.eval(String), this is much less convenient than writing or even re-using traditional .R scripts. So let’s have a look at how we can achieve the same result with a slightly different solution.

Project structure:

/src/main
      /java
         com.comsysto.jriexample.HelloRWorld2.java
      /resources
         helloWorld.R

helloWorld.R:

greeting <- 'Hello R World'

HelloRWorld2.java:

import org.rosuda.JRI.Rengine;
import org.rosuda.JRI.REXP;
import org.springframework.core.io.ClassPathResource;

public class HelloRWorld2 {
   Rengine rengine; // initialized in constructor or autowired

   public void helloRWorld() {
      ClassPathResource rScript = new ClassPathResource("helloWorld.R");
      rengine.eval(String.format("source('%s')",
         rScript.getFile().getAbsolutePath()));
      REXP result = rengine.eval("greeting");
      System.out.println("Greeting from R: "+result.asString());
   }
}

Any .R script can be executed like this and all variables it adds to the context will be accessible via JRI. However, this code does not work if the Java application is packaged as a JAR or WAR archive because the .R script will not have a valid absolute path. In this case, copying the script to a regular folder (e.g. java.io.tmpdir) at runtime and passing the temporary file to R is a feasible workaround.
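The temporary-file workaround can be sketched with the standard library alone. The class RScriptExtractor below is a hypothetical helper, not part of JRI or Spring. It assumes the script is available as an InputStream (for example from Class.getResourceAsStream) and that the resulting path contains no single quotes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class RScriptExtractor {
    private RScriptExtractor() {}

    /** Copies a script stream to a temporary file that R can read from disk. */
    static Path copyToTempFile(InputStream script) throws IOException {
        Path target = Files.createTempFile("script-", ".R"); // created in the default temp directory
        target.toFile().deleteOnExit();
        try (InputStream in = script) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    /** Builds the R command that sources the file, using forward slashes on every OS. */
    static String sourceCommand(Path scriptFile) {
        String path = scriptFile.toAbsolutePath().toString().replace('\\', '/');
        return String.format("source('%s')", path);
    }
}
```

The command returned by sourceCommand could then be passed to rengine.eval(...) exactly like the source(...) call in HelloRWorld2.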

Training R Models with Java Application Data

Now that we know how to execute .R scripts using JRI we are able to integrate prediction models based on R into a Java application. The only remaining question is: how can R access the required data for training the model? The easiest way is to use the following file-based approach. We will build a linear regression model that predicts y from x1 and x2.

  1. Extract suitable data from the Java persistence layer and store it in a temporary .csv file.
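    import java.io.File;
    import java.io.FileWriter;
    import au.com.bytecode.opencsv.CSVWriter;
    
    private void writeTrainingDataToFile(File file) {
       // the separator is a char and must match sep=";" in the R script
       CSVWriter writer = new CSVWriter(new FileWriter(file), ';');
       writer.writeNext(new String[] {"x1","x2","y"});
       for (Instance i : instances) {
          // writeNext expects strings, so convert each field explicitly
          writer.writeNext(new String[] {String.valueOf(i.x1),
             String.valueOf(i.x2), String.valueOf(i.y)});
       }
       writer.close();
    }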
  2. Pass the location of this file to R using JRI.
    public void trainRModel() {
       File trainingData = new File(TRAINING_DATA_PATH);
       writeTrainingDataToFile(trainingData);
       rengine.eval(String.format("trainingDataPath <- '%s'",
          trainingData.getAbsolutePath()));
       // trigger execution of .R script here
    }
  3. Execute a .R script to build the model. The script needs to be syntactically compatible with the extracted .csv file.
    # trainingDataPath injected from Java
    data <- read.csv(trainingDataPath, header=TRUE, sep=";");
    # use linear regression model as a trivial example
    model <- lm(y ~ x1+x2, data);

After executing this script, the resulting model will be available for any future calls until the entire application or the Rengine instance is re-initialized.
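The contract between the Java side and the R script is small: the header names must equal the variables in the model formula, and the separator must equal the sep argument of read.csv. As an illustration, here is a dependency-free sketch of the export step. The class TrainingCsv and the representation of one instance as a double[] {x1, x2, y} are assumptions made for this example only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class TrainingCsv {
    private TrainingCsv() {}

    /** Renders rows of (x1, x2, y) as semicolon-separated lines with a header. */
    static List<String> toLines(List<double[]> rows) {
        List<String> lines = new ArrayList<>();
        lines.add("x1;x2;y"); // must match the names used in lm(y ~ x1+x2, data)
        for (double[] row : rows) {
            // string concatenation uses Double.toString, which always writes '.' as decimal separator
            lines.add(row[0] + ";" + row[1] + ";" + row[2]);
        }
        return lines;
    }

    /** Writes the rendered lines to the given file. */
    static void write(Path file, List<double[]> rows) throws IOException {
        Files.write(file, toLines(rows), StandardCharsets.UTF_8);
    }
}
```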

On-Demand Predictions using the R Model

With the knowledge we already have, predicting a new instance (x1,x2) with unknown y is now pretty straightforward:

public double predictInstance(int x1, int x2) {
   rengine.eval(String.format("newData <- data.frame(x1=%s,x2=%s)",
      x1, x2));
   REXP result = rengine.eval("predict(model, newData)");
   return result.asDouble();
}
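If the model has more than a handful of features, assembling the data.frame call by hand becomes error-prone. The sketch below builds the command from a map of feature names to values. NewDataCommand is a hypothetical helper, and it assumes that the feature names are valid R identifiers. Values are rendered with Double.toString, so the decimal separator does not depend on the JVM's locale (which it could if %f were used in String.format).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class NewDataCommand {
    private NewDataCommand() {}

    /** Builds a command such as: newData <- data.frame(x1=3.0, x2=4.5) */
    static String build(String variable, Map<String, Double> features) {
        StringJoiner columns = new StringJoiner(", ", variable + " <- data.frame(", ")");
        for (Map.Entry<String, Double> feature : features.entrySet()) {
            columns.add(feature.getKey() + "=" + feature.getValue());
        }
        return columns.toString();
    }

    public static void main(String[] args) {
        Map<String, Double> features = new LinkedHashMap<>(); // keeps insertion order
        features.put("x1", 3.0);
        features.put("x2", 4.5);
        System.out.println(build("newData", features));
    }
}
```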

If you have any feedback, please write to Christian.Kroemer@comsysto.com!


44 thoughts on “Java-R-Integration with JRI for On-Demand Predictions”

  1. Pingback: R Exception Handling in Java | comSysto Blog

      1. Swarga Bera

        I want to integrate R with my web application, which is written in Java using the Struts framework. I am unable to get the integration working.

      2. Swarga Bera

        Actually i have define a java class with two method
        package com.test;

        import org.rosuda.JRI.REXP;
        import org.rosuda.JRI.Rengine;

        public class test3 {
        Rengine rengine; // initialized in constructor or autowired

        public void helloRWorld() {

        rengine.eval(String.format("greeting <- '%s'", "Hello R World"));

        REXP result = rengine.eval("greeting");

        System.out.println("Greeting from R: "+result.asString());

        }

        public double predictInstance(int x1, int x2) {

        rengine.eval(String.format("greeting <- data.frame(x1=%s,x2=%s)",x1, x2));

        REXP result = rengine.eval("predict(model, newData)");

        return result.asDouble();

        }
        }

        Now I want to call either of these methods from another class. When I do, I get a NullPointerException.

      3. chkroemer Post author

        if that’s your actual code I would assume the Rengine object cannot be resolved as it never gets initialized. The comment “// initialized in constructor or autowired” indicates that you need to choose either of these options.

        As a minimal example, this code works for me:

        public static void main(String[] args) {
        org.rosuda.JRI.Rengine rengine = new org.rosuda.JRI.Rengine(new String[]{"--vanilla"}, false, null);
        rengine.eval(String.format("greeting <- '%s'", "Hello R World"));
        org.rosuda.JRI.REXP result = rengine.eval("greeting");
        System.out.println("Greeting from R: "+result.asString());
        }

      4. Swarga Bera

        I have run your code but now it showing
        Cannot find JRI native library!
        Please make sure that the JRI native library is in a directory listed in java.library.path.

        java.lang.UnsatisfiedLinkError: no jri in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1709)
        at java.lang.Runtime.loadLibrary0(Runtime.java:823)
        at java.lang.System.loadLibrary(System.java:1030)
        at org.rosuda.JRI.Rengine.<clinit>(Rengine.java:9)
        at com.test.maintest.main(maintest.java:9)

        But i have already attached JRI.jar,JRIEngine.jar & REngine.jar
        So why this type of exception throws

      5. Swarga Bera

        Yes, I have installed R properly and I have written R programs in my R console. I have also set R_HOME correctly to my R installation path 'C:\Program Files\R\R-3.0.1'.
        But I am unable to understand why the code you sent does not run on my computer.

      6. chkroemer Post author

        I haven’t had the opportunity to test my code on a Windows machine yet, it might be there is additional configuration required.

        How about this note on the JRI website?

        “(Windows): The directory containing R.dll must be in your PATH”

      7. Swarga Bera

        I have changed the R_HOME path to 'C:\Program Files\R\R-3.0.1\bin\i386', where R.dll is present. But the same problem still occurs and the same exception is thrown:

        Cannot find JRI native library!
        Please make sure that the JRI native library is in a directory listed in java.library.path.

        java.lang.UnsatisfiedLinkError: no jri in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1709)
        at java.lang.Runtime.loadLibrary0(Runtime.java:823)
        at java.lang.System.loadLibrary(System.java:1030)
        at org.rosuda.JRI.Rengine.<clinit>(Rengine.java:9)
        at com.test.maintest.main(maintest.java:9)

      8. Swarga Bera

        I have set 'C:\Program Files\R\R-3.0.1\bin\i386\R.dll' in the global system path.
        But the same error occurs.

      9. chkroemer Post author

        Swarga,

        you might experience some OS-specific issues but I’m not sure what to try next. I fear that I can’t suggest anything else at the moment if even the minimal example does not work for you.

        I will post my results as soon as I find the time to test it on a Windows computer.

        Cheers,
        Christian

      10. chkroemer Post author

        Hi Swarga,

        I just played around with a Windows Server 2008 on AWS and maybe I found the missing clue:

        1. In R, run install.packages("rJava") and inspect the console output for the location of the zip file that has been downloaded.
        2. After unpacking this zip file, you should be able to find the file rJava/jri/x64/jri.dll (if you’re not on a 64-bit system choose the other one).
        3. This dll file needs to be somewhere in your path, e.g. in "C:\Program Files\R\R-3.0.1\bin\x64" where you can already find the regular R.dll

        Try again, I hope this helps!

      11. Swarga Bera

        Hi,
        I have installed 'rJava' as you instructed. But now a new exception is thrown:

        Cannot find JRI native library!
        Please make sure that the JRI native library is in a directory listed in java.library.path.

        java.lang.UnsatisfiedLinkError: C:\Program Files\R\R-3.0.1\bin\x64\jri.dll: Can’t load AMD 64-bit .dll on a IA 32-bit platform
        at java.lang.ClassLoader$NativeLibrary.load(Native Method)
        at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1778)
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1703)
        at java.lang.Runtime.loadLibrary0(Runtime.java:823)
        at java.lang.System.loadLibrary(System.java:1030)
        at org.rosuda.JRI.Rengine.<clinit>(Rengine.java:9)
        at com.test.maintest.main(maintest.java:9)

        Is this caused by the R version? My system is 64-bit with Windows 8, but I am using the 32-bit R executable.

      12. chkroemer Post author

        Yeah, you should make sure the versions match (ideally Java, R, and all native R libs). They don’t necessarily have to be the same as your OS.
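When chasing the UnsatisfiedLinkError discussed in this thread, it can help to print what the JVM itself sees before creating the Rengine. The following is only a diagnostic sketch. JriEnvironmentCheck is a hypothetical class name, and the properties it reads are standard JVM system properties, not JRI settings.

```java
final class JriEnvironmentCheck {
    private JriEnvironmentCheck() {}

    /** Collects the settings that typically matter for loading the native jri library. */
    static String describe() {
        return "java.library.path=" + System.getProperty("java.library.path") + "\n"
             + "os.arch=" + System.getProperty("os.arch") + "\n"  // architecture reported by the JVM
             + "R_HOME=" + System.getenv("R_HOME");               // null if the variable is not set
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the directory containing jri.dll is not listed in java.library.path, or the architecture reported by the JVM does not match the architecture of the DLL, errors like the ones quoted above are to be expected.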

  2. chris bedford

    Hi, Christian –

    Very useful article..Thanks for writing it.

    I’ve been trying to get my Java code to execute predictions with R linear models just like you have accomplished. However, when I try to use R’s
    predict facility I just get back zero, rather than what I would get if I ran the same script in R.

    I am using the script below, which relies on ‘prepackaged’ R data which, at least for me is available right out of the box >

    stackloss.lm = lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data=stackloss)
    newdata = data.frame(Air.Flow=72, Water.Temp=20, Acid.Conc.=85)
    x = predict(stackloss.lm, newdata)
    x

    When run in R this gives me the (non-zero) output:

    > x
    1
    24.58173
    >

    I set up a simple test program riffing off the method described in your blog post. I wrote some ‘sanity check’ code to make sure that I was able to do an eval of an R script that returned a floating point number, and that works fine. But when I invoke my test method so that it runs the ‘stackloss’ linear model. I get back zero. I’m really stumped, and I have no idea how to debug this… I get no exception and I don’t know if there are any logs. I don’t want to debug this at the source level because my JNI is not very strong. If you have any ideas, I’d greatly appreciate the advice. My test program is shown below >

    package com.lackey;

    import org.rosuda.JRI.REXP;
    import org.rosuda.JRI.Rengine;

    public class JriTest {
        public static void main(String[] args) {
            String[] engineArgs = new String[1];
            engineArgs[0] = "--vanilla";

            // new R-engine
            Rengine re = new Rengine(engineArgs, false, null);
            if (!re.waitForR()) {
                System.out.println("Cannot load R");
                return;
            }

            // print a random number from uniform distribution - THIS SEEMS TO WORK WELL !
            System.out.println(re.eval("runif(1)").asDouble());

            String script = "";
            if (args[0].equals("return-num")) {
                script = getScript1();
            } else if (args[0].equals("linear-model")) {
                script = getScript2();
                System.out.println("we are going to evaluate: \n" + script);
            } else {
                throw new RuntimeException("invalid argument");
            }

            System.out.println("22 getting ready");
            REXP result = re.eval(script);
            System.out.println("eval'd and got " + result.asDouble());

            // done...
            re.end();
        }

        public static String getScript1() {
            String retval =
                "x = 4.02 + 1\n" +
                "x\n";
            return retval;
        }

        public static String getScript2() {
            String retval =
                "stackloss.lm = lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data=stackloss)\n" +
                "newdata = data.frame(Air.Flow=72, Water.Temp=20, Acid.Conc.=85)\n" +
                "x = predict(stackloss.lm, newdata)\n" +
                "x\n";
            return retval;
        }
    }

    RESULT OF RUNNING the ‘stackloss’ linear model :

    we are going to evaluate:
    stackloss.lm = lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data=stackloss)
    newdata = data.frame(Air.Flow=72, Water.Temp=20, Acid.Conc.=85)
    x = predict(stackloss.lm, newdata)
    x

    22 getting ready
    eval’d and got 0.0 <<<<<—– Note, this is very wrong ;^(

    1. chkroemer Post author

      Chris,

      the problem is that JRI cannot evaluate multiple expressions at once. If you print “result” instead of “result.asDouble()” you will see that R has simply evaluated the first statement of your script. When trying to cast this complex vector to a double it returns 0 (an exception would not be a surprise either).

      This problem can be avoided when providing the full script as a .R file instead of a String as in my example above. Alternatively, this code based on your example works for me:

      import org.rosuda.JRI.REXP;
      import org.rosuda.JRI.Rengine;

      public class JriTest {
          public static void main(String[] args) {
              String[] engineArgs = new String[1];
              engineArgs[0] = "--vanilla";
              Rengine re = new Rengine(engineArgs, false, null);

              re.eval("stackloss.lm = lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data=stackloss)");
              re.eval("newdata = data.frame(Air.Flow=72, Water.Temp=20, Acid.Conc.=85)");
              REXP result = re.eval("predict(stackloss.lm, newdata)");
              System.out.println("eval'd and got " + result.asDouble());

              re.end();
          }
      }

      Hope this helps,
      Christian
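For scripts that are held in a Java String, the statement-by-statement approach can be wrapped in a small helper. This is a sketch under a strong assumption: every line of the script is one complete R statement, so multi-line expressions are not supported. LineByLineEvaluator is a hypothetical class, and the evaluator is passed in as a function so that the sketch does not depend on JRI itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class LineByLineEvaluator {
    private LineByLineEvaluator() {}

    /** Splits a script into statements, skipping blank lines and comment lines. */
    static List<String> statements(String script) {
        List<String> result = new ArrayList<>();
        for (String line : script.split("\\R")) { // \R matches any line break
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                result.add(trimmed);
            }
        }
        return result;
    }

    /** Evaluates each statement in order and returns the result of the last one. */
    static <T> T evalAll(String script, Function<String, T> eval) {
        T last = null;
        for (String statement : statements(script)) {
            last = eval.apply(statement);
        }
        return last;
    }
}
```

With an initialized engine, a call could look like REXP result = LineByLineEvaluator.evalAll(script, re::eval); so that the value of the final statement is returned.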

      1. titus id

        Hi, could you please help me.
        I try to run above code, I already setup the environment and variable in VM, however I still get this following error
        “cannot find system Renviron”
        although I already change the permission using this command
        “sudo chmod 644 /usr/lib/R/etc/Renviron”

        Thanks in advance

    1. Chris Bedford

      Christian – here’s an entry from my blog that builds on the work you did. I do a bit of contrasting of JRI with other solutions as well as provide a maven code example and set up instructions. Hope you find it interesting: buildlackey.com/integrating-r-and-java-with-jrirjava-a-jni-based-bridge/

      1. chkroemer Post author

        thanks for writing and sharing this article Chris! I particularly like how you took the time to explain how this all fits into a bigger context: “data scientists often develop models in R, which are then handed off to data engineers to be re-implemented in languages like C++ or Java” is exactly why we have a look at such solutions here at comSysto.

  3. Dionysios

    Thanks a lot for this. I have an additional question on top of this. Is it possible to load a R file (e.g. something like source(“….R”) which will contain the code for building the model instead of executing each line with r.eval?

    1. chkroemer Post author

      Dionysios,
      yes, executing R source files works and I would highly recommend it in a more practical use case. After all, you probably want to separate your R logic from the Java application wrapped around it. You would use something like this code:


      rengine.parseAndEval(String.format("source('%s')", rSourceFile.getAbsolutePath()));

      This is trivial if you have a proper File reference to the script, if you want to deploy the R script as a resource within your JAR/WAR you will need some sort of workaround like so:


      ClassPathResource script = new ClassPathResource("/com/sample/package/scripts/myScript.R");
      String javaTmpDirPath = System.getProperty("java.io.tmpdir");
      File tmpScriptFile = new File(javaTmpDirPath, "myScript.R"); // two-argument constructor inserts the path separator
      IOUtils.copy(script.getInputStream(), new FileOutputStream(tmpScriptFile)); // import org.apache.commons.io.IOUtils;
      rengine.parseAndEval(String.format("source('%s')", tmpScriptFile.getAbsolutePath()));

      This gives R the opportunity to actually read the file from disk.

      1. Dionysios

        Thanks a lot for the fast reply.
        In my .R program, I create a model with name model.
        However, when I execute the summary of the model or just print the content of the model I get null.

        REXP result = re.eval("summary(model)");
        System.out.println("Model summary: " + result.asString());

        If I test the .R program separately, everything works fine, so I guess this is an issue with the Java part.

      2. chkroemer Post author

        just to make sure there is no easy solution to your problem: have you checked whether trivial R code (without reading it from an external script) works via JRI? Something like storing a value in a variable and reading it again. Furthermore, it is often a good idea to check the “REXP result” object itself without calling any method such as asString() just to make sure you understand what JRI returns.

    1. chkroemer Post author

      thanks for sharing! Handling the result you get from R if it’s more than a single value is often quite complex with JRI. In my opinion this is a major reason why so many people refuse to use it at all…

      So you got your code working now?

  4. Dionysios

    Hi again Christian,

    Did you manage to run the source command for a .R file in the resources folder when you package your code in a .jar? I am struggling to achieve this, and even if I copy the resources folder to the same folder as the jar and do re.eval("source('resources/buildRF.R')"), I get the following:

    Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)

    1. chkroemer Post author

      Dionysios,

      reading files via the standard API of a framework when running in a jar is indeed not trivial. I don’t know if this can be considered a “recommended” solution, but some time ago I fixed a similar issue like this (sorry for ambiguous description, don’t have the code in front of me atm):

      1. get a stream to the resource within your jar (e.g. using Spring’s ClassPathResource)
      2. copy the content of this stream to a new file in Java’s temp folder on the actual hard disk of whatever environment you’re running on (System.getProperty("java.io.tmpdir"))
      3. use the path to this file on the hard disk, pass it to whatever framework (in your case to the REngine)

      not very elegant, but it worked for me and I was fine with it for a POC that only needs to read this particular script once

      hope this helps,
      Christian

      1. Dionysios

        Hi Christian,

        I tried this approach and it worked on my local machine (Java 1.7.11) by using the .jar file.

        However, I tried the same jar on a cluster (Java 1.7.01) that I have access on and I could not run it by getting the same error message. For some reason, it cannot read the created temp file (which I checked that it properly exists) and it throws java.lang.NullPointerException on the following:
        String.format("source('%s')", tempFile.getAbsolutePath())

  5. Dionysios

    Oh God! I made a silly mistake, a library of R was not installed on the new machine and it could not load my file (I thought that I will get some error message back at least). But now everything works fine.

    Thanks a lot again!

  6. Irene

    hey Christian and everyone :)
    I’ve used this “tutorial” for my first baby steps and it helped me so very much (thank you for that!). Everything was working perfectly fine however a few days back that i got back into coding i noticed that the java application has been crashing on Rengine initialization (Rengine re=new Rengine(args, false, new TextConsole()); ) No R input, no nothing. Just on my debugging console:
    “Creating Rengine (with arguments)
    Java Result: 10
    BUILD SUCCESSFUL (total time: 8 seconds)”

    tested both with eclipse and netbeans, same issue…
    Any ideas????

    thanks in advance :)

    1. Irene

      hi, me again (sorry for spamming). I just solved previous issue and just in case someone goes through the same trouble…
      obviously some java update was the root of all evil. So I updated everything (R and JRI). Updating only JRI didn’t do any good in my case. Now everything’s back to normal and order is restored in universe :)
      Cheers.

  7. Pingback: Getting JRI works for R and Java | Ungeek Traveler

  8. kirdape

    Thanks Christian for the great post and so many replies.

    I’m having an issue in Windows 8. I’ve installed R 3.1.0, set R_HOME, installed rJava (with install.packages("rJava")), and set up my Java path.

    When I run your example I get no output, and when debugging I get "terminated, exit value 10".

    Any help?

    Thanks in advance!

    1. chkroemer Post author

      unfortunately I have no experience with R/rJava/JRI in the Windows environment. If even the Hello World example is not working for you it’s definitely a configuration issue, I’m afraid I can’t be of much help there… The release notes here (http://rforge.net/JRI/news.html) indicate that v0.5-5 will fix some x64 issues on Windows, I assume you are on Win64? Have you tried using a 32-bit setup (complete stack, i.e. JDK, R, …)?

      1. kirdape

        Thanks a lot for your quick answer :)

        My main webapp is already production deployed with Java64 so 32-bit will not do.

        I need to run some R macros that read and writes some CSV files.

        RCaller is also not working http://mhsatman.com/rcaller/

        I’ll be trying to call R through the command line and parse the CSV file,

