Bisht, Reshu; Martin Marquez, Manuel; Romero Marin, Antonio; Hornick, Mark (2016)
Types: Report
Subjects: CERN openlab summer student
Project Specification

The goal of this Openlab summer student project is to evaluate Oracle R Advanced Analytics for Hadoop (ORAAH) as a platform for CERN Big Data Analytics. ORAAH provides an R interface for manipulating data stored in HDFS, relational databases and file systems, while leveraging the capabilities of CRAN R packages. It also offers Hive transparency capabilities and can map HDFS data as direct input into machine learning algorithms that run as MapReduce jobs or inside an Apache Spark container.

The objective was to determine whether the cryogenic valve degradation analysis and the automatic detection of faulty valves can be done efficiently, in terms of CPU usage and job completion time, using ORAAH and Apache Spark. ORAAH allows you to run R queries on Hadoop against data in HDFS, giving you the benefits of R while taking advantage of the horizontal scalability of Hadoop and MapReduce. ORAAH is designed for parallel reads and writes and has resource management and database connectivity features, so it can be used together with ORE. In addition, components such as the ORAAH Spark MLlib algorithms make analysis, classification and prediction problems easier.

Software Specifications:

  • BigDataLite VM 4.5.0
  • Oracle R Advanced Analytics for Hadoop 2.6.0
      o ORCH package
          - Interaction with HDFS
          - Map-Reduce Jobs
          - Orch.ml.svm (Spark MLlib container for Support Vector Machine Algorithm)
  • Apache Spark 1.6.0
      o Spark SQL
          - com.databricks:spark-csv_2.10:1.4.0
      o Spark MLlib
          - org.apache.spark.mllib.classification.{SVMModel,SVMWithSGD}
          - org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
          - org.apache.spark.mllib.util.MLUtils
  • R 3.3.0
      o caret
      o rpart
      o e1071
      o ggplot2
      o reshape

Abstract

I present an evaluation of Oracle R Advanced Analytics for Hadoop as a Big Data analysis platform for advanced analytics and machine learning. I have used R as the basic modelling tool, as it is one of the most powerful statistical computing languages and offers many predefined functions that make data analysis and testing easy. To provide a comparison and fairly judge the performance of ORAAH, the same approaches were also modelled in Apache Spark. Performance was measured by the time taken to build the model and by the accuracy of the model.

The project studied the applicability of these technologies to real CERN analytics use cases: (a) the degradation analysis of cryogenic valves in LHCb and (b) the prediction of faulty cryogenic valves. Both use cases were run and modelled with the technologies listed above, and the results are very promising for the future use of scalable services as a CERN Big Data Analytics platform.
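The two metrics named in the abstract, model build time and model accuracy, can be illustrated with a small sketch. This is not the report's ORAAH or Spark code: it is a plain-Python linear SVM trained by stochastic gradient descent on synthetic data, and every name in it (`make_data`, `train_linear_svm`, `evaluate`) and every hyperparameter is a hypothetical choice made for illustration only.

```python
import random
import time


def make_data(n, seed=0):
    """Synthetic, linearly separable 2-D points with labels in {-1, +1}.

    Stand-in data only; the report's valve measurements are not reproduced here.
    """
    rng = random.Random(seed)
    data = []
    while len(data) < n:
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        margin = x1 + x2
        if abs(margin) < 0.1:
            continue  # leave a gap around the separating line x1 + x2 = 0
        data.append(((x1, x2), 1 if margin > 0 else -1))
    return data


def train_linear_svm(data, epochs=20, lr=0.1, reg=0.01, seed=0):
    """Fit a hinge-loss linear SVM with plain SGD (assumed hyperparameters)."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    b = 0.0
    rows = list(data)
    for _ in range(epochs):
        rng.shuffle(rows)
        for (x1, x2), y in rows:
            score = w[0] * x1 + w[1] * x2 + b
            # L2 regularisation shrinks the weights on every step.
            w[0] -= lr * reg * w[0]
            w[1] -= lr * reg * w[1]
            # Hinge-loss gradient step only when the margin is violated.
            if y * score < 1:
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
    return w, b


def accuracy(model, data):
    """Fraction of points whose predicted sign matches the label."""
    w, b = model
    correct = 0
    for (x1, x2), y in data:
        predicted = 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
        correct += predicted == y
    return correct / len(data)


def evaluate(train, test):
    """Return the two metrics: seconds to build the model, and test accuracy."""
    start = time.perf_counter()
    model = train_linear_svm(train)
    build_seconds = time.perf_counter() - start
    return {"build_seconds": build_seconds, "accuracy": accuracy(model, test)}


train = make_data(400, seed=1)
test = make_data(200, seed=2)
result = evaluate(train, test)
print(result)
```

Running the same `evaluate` step against two different back ends, with identical training and test splits, would give the kind of side-by-side comparison the abstract describes.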
