Introduction
Relation to F# RProvider
Getting set up
The Software Prerequisites page lists the platforms on which R.NET is known to run.
As of version 1.6, R.NET binaries are platform independent. You might need to set up a small workaround on some Linux distributions (CentOS is a known one), but otherwise you can move the R.NET binaries across platforms and use them as they are.
Assuming you have the right Software Prerequisites, you can obtain R.NET binaries from two sources:
- The codeplex web site, by downloading a zip file
- The nuget.org web site
Visual Studio
NuGet is the preferred way to manage dependencies on R.NET, and you should consider giving it a try.
If you are using the binaries from the zip file distribution, unzip the file and copy the content to a location of your choice. Add project references to RDotNet.dll and RDotNet.Native.dll the "usual" way.
If you are using the NuGet packages:
If you have not already done so, first install the NuGet package manager via Tools - Extensions and Updates:
You can add the R.NET package as a dependency to one or more projects in your solution. For one project:
The NuGet system then adds a couple of references.
You can manage several projects in one go at the solution level:
You can find more general information about NuGet in the NuGet documentation.
Xamarin Studio
This section is a placeholder.
Getting started
Version 1.6 of R.NET includes significant changes, notably to alleviate two stumbling blocks that users often ran into: paths to the R shared library, and preventing multiple engine initializations.
The following "Hello World" sample illustrates how the new API is simpler in 90% of use cases on Windows:

    using System;
    using System.Linq;
    using RDotNet;

    class Program
    {
        static void Main(string[] args)
        {
            REngine.SetEnvironmentVariables(); // <-- May be omitted; the next line would call it.
            REngine engine = REngine.GetInstance();
            // A somewhat contrived but customary Hello World:
            CharacterVector charVec = engine.CreateCharacterVector(new[] { "Hello, R world!, .NET speaking" });
            engine.SetSymbol("greetings", charVec);
            engine.Evaluate("str(greetings)"); // print out in the console
            string[] a = engine.Evaluate("'Hi there .NET, from the R engine'").AsCharacter().ToArray();
            Console.WriteLine("R answered: '{0}'", a[0]);
            Console.WriteLine("Press any key to exit the program");
            Console.ReadKey();
            engine.Dispose();
        }
    }
You retrieve a single REngine object instance after setting the necessary environment variables. The call to SetEnvironmentVariables can even be omitted, though we advise keeping it explicit. On Windows, SetEnvironmentVariables looks at the Registry settings set up by the R installer. If need be, you can replace the environment variable setup and the engine initialization with your own steps, as detailed in the Appendix.
On Linux/MacOS, the path to libR.so (for Linux) must be in the environment variable LD_LIBRARY_PATH before the process starts; otherwise the R.NET engine will not initialize properly. If this is not set up, R.NET throws an exception with a detailed message. Read the Appendix at the end of this page if R.NET complains about your LD_LIBRARY_PATH.
Sample code
The following sample code illustrates the most used capabilities. It is extracted from sample code 2 at https://github.com/jmp75/rdotnet-onboarding, as of 2014-04.
You usually interact with the REngine object through the methods Evaluate, GetSymbol, and SetSymbol. To create R vectors and matrices, the REngine object has methods such as CreateNumericVector, CreateCharacterMatrix, etc. Finally, you can invoke R functions in a variety of ways, using Evaluate, and also more directly.
Let's re-emphasize the need to set up the environment variables, create the engine, and initialize it.

    SetupHelper.SetupPath(); // See earlier sample code
    REngine engine = REngine.CreateInstance("RDotNet");
    engine.Initialize();
Basic operations with numeric vectors

    var e = engine.Evaluate("x <- 3");
    // You can now access x defined in the R environment
    NumericVector x = engine.GetSymbol("x").AsNumeric();
    engine.Evaluate("y <- 1:10");
    NumericVector y = engine.GetSymbol("y").AsNumeric();
While you may evaluate function calls by generating a string and calling the Evaluate method, this quickly becomes unwieldy when you pass large amounts of data. The following demonstrates how you may call a function, a bit like how you would invoke a function reflectively in .NET. Note that the sample is designed for R.NET 1.5.5; there are syntactic improvements under development that should be available in the next release.

    // Invoking functions
    // Invoking expand.grid directly would not work as of R.NET 1.5.5,
    // because it has a '...' pairlist argument. We need a wrapper function.
    var expandGrid = engine.Evaluate("function(x, y) { expand.grid(x=x, y=y) }").AsFunction();
    var v1 = engine.CreateIntegerVector(new[] { 1, 2, 3 });
    var v2 = engine.CreateCharacterVector(new[] { "a", "b", "c" });
    var df = expandGrid.Invoke(new SymbolicExpression[] { v1, v2 }).AsDataFrame();
Continuing with the results of our use of expand.grid, the following code illustrates that while R.NET tries to emulate the coercion behavior of R, you should be circumspect with it, notably when you know you have factors as columns in your data frame.

    engine.SetSymbol("cases", df);
    // Not correct: the 'y' column is a "factor". This returns "1", "1", "1", "2", "2", etc.
    var letterCases = engine.Evaluate("cases[,'y']").AsCharacter().ToArray();
    // This returns something more intuitive for C#: 'a','a','a','b','b','b','c', etc.
    letterCases = engine.Evaluate("as.character(cases[,'y'])").AsCharacter().ToArray();
    // In the manner of R.NET, try
    letterCases = engine.Evaluate("cases[,'y']").AsFactor().GetFactors();
To reuse whole scripts, the simplest method is to use the 'source' function in R:

    engine.Evaluate("source('c:/src/path/to/myscript.r')");
Data Types
All expressions in R are represented as SymbolicExpression objects in R.NET. For data access, the following special classes are defined. Note that there is no direct equivalent in .NET for 'NA' values in R. Special values are used for some types, but pay attention to their behaviour so as not to risk incorrect calculations.
Table. Classes in R.NET that bridge between R and the .NET Framework.
R | R.NET | .NET Framework | Note |
---|---|---|---|
character vector | RDotNet.CharacterVector | System.String[] | |
integer vector | RDotNet.IntegerVector | System.Int32[] | The minimum value in R is -2^31+1 while that of .NET Framework is -2^31. Missing values are int.MinValue |
real vector | RDotNet.NumericVector | System.Double[] | Missing values are represented as double.NaN |
complex vector | RDotNet.ComplexVector | System.Numerics.Complex[] | System.Numerics assembly is required for .NET Framework 4. |
raw vector | RDotNet.RawVector | System.Byte[] | |
logical vector | RDotNet.LogicalVector | System.Boolean[] | |
character matrix | RDotNet.CharacterMatrix | System.String[, ] | |
integer matrix | RDotNet.IntegerMatrix | System.Int32[, ] | The minimum value in R is -2^31+1 while that of .NET Framework is -2^31. |
real matrix | RDotNet.NumericMatrix | System.Double[, ] | |
complex matrix | RDotNet.ComplexMatrix | System.Numerics.Complex[, ] | Reference to System.Numerics assembly is required. |
raw matrix | RDotNet.RawMatrix | System.Byte[, ] | |
logical matrix | RDotNet.LogicalMatrix | System.Boolean[, ] | |
list | RDotNet.GenericVector | | From version 1.1. |
data frame | RDotNet.GenericVector | | From version 1.1. RDotNet.DataFrame class is also available (below). |
data frame | RDotNet.DataFrame | | From version 1.3. And from version 1.5.3, DataFrameRowAttribute and DataFrameColumnAttribute are available for data mapping. |
function | RDotNet.Function | | From version 1.4. Including closure, built-in function, and special function. |
factor | RDotNet.Factor | System.Int32[] | From version 1.5.2. |
S4 | RDotNet.S4Object | | Not Available Yet. See S4 branch in the source control. |
Acknowledgements
- Daniel Collins found the workaround for the native library "libdl" loader that was not working on at least some CentOS Linux distributions.
- evolvedmicrobe contributed several features to run on MacOS and Linux, and initiated the changes to make R.NET platform independent.
- Kosei initiated R.NET
- gchapman
Appendices
Updating environment variables on Linux (MacOS?)
The path to libR.so (for Linux) must be in the environment variable LD_LIBRARY_PATH before the process starts; otherwise the R.NET engine will not initialize properly. If this is not set up, R.NET throws an exception with a detailed message.
What you need to do depends on the Linux (MacOS?) machine you are on.
Let's say you needed to compile your own R from source, to get a shared R library:

    LOCAL_DIR=/home/username/local
    JAVAHOME=/apps/java/jdk1.7.0_25
    cd ~/src
    cd R/
    tar zxpvf R-3.0.2.tar.gz
    cd R-3.0.2
    ./configure --prefix=$LOCAL_DIR --enable-R-shlib CFLAGS="-g"
    make
    make install
Then prior to running a project with R.NET, you may need to update your LD_LIBRARY_PATH, and quite possibly PATH (though the latter can be done at runtime too).

    LOCAL_DIR=/home/username/local
    if [ "${LD_LIBRARY_PATH}" != "" ]
    then
        export LD_LIBRARY_PATH=$LOCAL_DIR/lib:$LOCAL_DIR/lib64/R/lib:/usr/local/lib64:${LD_LIBRARY_PATH}
    else
        export LD_LIBRARY_PATH=$LOCAL_DIR/lib:$LOCAL_DIR/lib64/R/lib:/usr/local/lib64
    fi
    # You may as well update the PATH environment variable, though R.NET does update it if need be.
    export PATH=$LOCAL_DIR/bin:$LOCAL_DIR/lib64/R/lib:${PATH}
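
As a rough pre-flight check for this setup, a small shell function can confirm that one of the directories listed in LD_LIBRARY_PATH actually contains libR.so before you launch your application. This is only a sketch: find_libr is a hypothetical helper name, not something shipped with R or R.NET, and it assumes a POSIX-style shell and a library file named exactly libR.so (as on Linux).

```shell
#!/bin/sh
# find_libr (hypothetical helper, not part of R or R.NET):
# print the first directory of a colon-separated search path that
# contains a file named libR.so; return non-zero if none does.
find_libr() {
    search_path=$1
    old_ifs=$IFS
    IFS=:                      # split the search path on ':'
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -f "$dir/libR.so" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Report on the current environment before starting the R.NET application.
if libr_dir=$(find_libr "${LD_LIBRARY_PATH:-}"); then
    echo "libR.so found in $libr_dir"
else
    echo "libR.so not found on LD_LIBRARY_PATH" >&2
fi
```

If the check reports that nothing was found, revisit the LD_LIBRARY_PATH export for your installation prefix. The check only looks at the directories you pass in, so it says nothing about libraries found through the system loader configuration.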