Hello Friends, SQL Server always comes with exciting features in every new version upgrades. Finally; I have been fortunate enough to start working in SQL SERVER 2016.
In this blog, We will see about integration of “R” language in SQL SERVER.I have gone through multiple MSDN chapters, white papers and learning videos about R-Services, R-language, statistical mathematics including regression analysis and found that its really awesome to have “R” with SQL SERVER.
As we all know, now a day’s data analytics is very exciting stuff to work on and “R” is a very powerful tool to do statistical analysis. The integration of “R” with SQL SERVER is a very powerful combo in terms of data processing and statistical analysis on heavy data.
I would like to break this topic in following points:
- Introduction of R language.
- What is Statistical analysis?
- How to call “R” scripts in SQL Server 2016.
- Launchpad Service.
- Different flavors of “R” in SQL Server 2016
R language : “R” is an open source, statistical programming software, which is used for statistical analysis. It can be download from software repository CRAN. “R” is widely used by data scientists for data analysis and also have very wide range of user group. “R” version used in SQL SERVER 2016 is R-Open and called R-Services in SQL SERVER.
Statistical Analysis: Statistical analysis is a mathematical way for data sampling & data analysis. Using it we try to build relationship between dependent and independent variables and find a valid pattern so we can predict the future consequences.
Dependent variable: For which we are concerned about OR we wants to predict.
Independent variable: On which dependent variable depends.
for e.g. Consider a simple data-set of sales and purchase as follows
Revenue | Production_Cost | Advertisement_Cost
$10,000 | $5000 | $3000
$20,000 | $1000 | $4000
Here; “Revenue” depends upon “Production_Cost” & “Advertisement_Cost”. So “Revenue” is dependent variable and “Production_Cost” & “Advertisement_Cost” are independent variable. In other words we can say that we can control the independent variable but we can’t control dependent variable; that’s why we wants to predict it…Right!!!
Launchpad Service: Its a new service introduced in SQL Server 2016 to launch external scripts. As of now its “R” and most probably “python” in SQL 2017. You can see a new service in SQL Configuration Manager called “SQL Server Launchpad (<Instance Name>)”
It has introduced to reduce main SQL engine work load while executing external scripts from SSMS. Some external commands can be executed using xp_cmdshell like powershell scripts and shares the resource of SQL SERVER. So Microsoft has introduced new Launchpad Services to execute such external scripts.
SP_EXECUTE_EXTERNAL_SCRIPT: Sp_Execute_External_Script system stored procedure helps to execute external scripts (currently, R-Services scripts) from SSMS. It can be enabled using “sp_configure ‘external scripts enabled’, 1;”. A sample code is as follows:
EXEC sp_execute_external_script @language = N'R' , @script = N'<R Script>' , @input_data_1 = N'<SELECT Query from database>' , @output_data_1_name = N'output dataset name' WITH RESULT SETS ((<OUTPUT dataset''s column name and its datatype>));
Flavors of “R-Services” in SQL SERVER 2016: Its has 2 server components and 1 client component as follows
- R Services (In-Database) : Specially designed for SQL SERVER data-sources.
- Microsoft R Server (Standalone) : It supports multiple data-sources along with SQL SERVER.
- Microsoft R Client: Its client environment to build code.
Note: You can install “R Services (In-Database)” and “Microsoft R Server (Standalone)” in same instance and it will work as well but Microsoft recommends, not to install both features in a single instance.
I have done some practical with R-Services (In-database) and its really amazing; will share in my future posts.
Hope! you like this introduction of R-Services in SQL 2016.