Packt Publishing, 2016. — 498 p. — ISBN10: 1786466457, ISBN13: 978-1786466457.
Big Data analytics is the process of examining large and complex data sets that often exceed available computational capabilities. R is a leading programming language of data science, offering powerful functions for tackling problems related to Big Data processing.
The book begins with a brief introduction to the Big Data world and its current industry standards. It then introduces the R language, presenting its development, structure, real-world applications, and shortcomings, and progresses to a revision of the major R functions for data management and transformation. Readers are introduced to Cloud-based Big Data solutions (e.g. Amazon EC2 instances and Amazon RDS, Microsoft Azure and its HDInsight clusters) and given guidance on R connectivity with relational and non-relational databases, including MongoDB and HBase. The book further expands to cover Big Data tools such as the Apache Hadoop ecosystem, including HDFS and the MapReduce framework, as well as other R-compatible tools such as Apache Spark, its machine learning library Spark MLlib, and H2O.