Friday, August 11, 2017

Execute a R Script in SQL Server 2016


     SQL Server started to support R Scripts with SQL Server 2016. To be able to run R Scripts in SQL Server you need to install R Services in SQL Server setup. R Services is renamed to Machine Learning Services in SQL Server 2017. By Default R Services is not enabled in SQL Server even if its installed. You need to enable the external scripts option by using sp_configure.

EXEC sp_configure 'external scripts enabled', 1
RECONFIGRE WITH OVERRRIDE

     Now, you need to restart your SQL Server, SSMS has the easiest option to do this. Right click on the server and click on Restart. When restart is completed, you should be good to go. To check if the external scripts option is enabled, you can run the following statement and it should return 1.

EXEC sp_configure 'external scripts enabled'

    To run R scripts in SQL Server, you need to use a stored procedure named sp_execute_external_script. Here is the syntax of this magic stored procedure.

sp_execute_external_script   
    @language = N'language' ,   
    @script = N'script',  

    [   @input_data_1 = ] 'input_data_1'   
    [ , @input_data_1_name = ] N'input_data_1_name' ]   
    [ , @output_data_1_name = 'output_data_1_name' ]  
    [ , @parallel = 0 | 1 ]
    [ WITH [execute_option] ]  

@language : This is the supported language option. In SQL Server 2016, your only option is R if you have 2017 you can use R or Python for this parameter.

@script : This is the text that contains your R or Python script. It's datatype is nvarchar(max)

@input_data_1 : This is the data you want to pass to your script, It's data type is nvarchar(max). This is your SQL statement that return some kind of data which will pass to your script.This is not a required field, you might not need to pass any data to your script.

@input_data_1_name : This is the name of the variable in the R script which contains the result set of the input query (@input_data_1). The variable in the R script must be data frame type. This is an optional parameter and its default value is inputDataSet

@output_data_1_name: This is the name of the variable in the R script which contains the output data. Again its type in the R script must be data frame type. It is optional parameter and its default value is outputDataSet

@parallel: R Scripts does not use parallelism as default setting. To enable it you can pass 1 to this parameter.

WITH RESULTS SETS You need to use WITH RESULTS SETS clause at the end of the stored procedure if you are returning any result set from R. The specified data types in WITH RESULTS SETS must be one of the data types supported in R.
                                bit              int              float              datetime              varchar


    Following data types are not supported in the input query (input_data_1). To use them in R, you need to convert them to varchar type if you can.
  • cursor
  • timestamp
  • datetime2/datetimeoffset/time
  • sql_variant
  • text/image
  • xml
  • hierarchyid/geometry/geography
  • CLR Types

      I think we are ready to execute a R Script in SQL Server first time, I am going to use WideWorldImporters database in this demo. I am going to pass a list of sales person names and their total orders to my R script. Then In my R script, I am going to sum the all orders and return it back to SQL Server. It is simple and better than useless Hello World examples.

EXECUTE sp_execute_external_script
@language =N'R',
@script = N'OutputDataSet <- InputDataSet;
OutputDataSet <- data.frame(sum(OutputDataSet$Orders));',
@input_data_1 = N'Select p.FullName, count(o.OrderId) as Orders
 from Sales.Orders as o 
join Application.People as p on o.SalespersonPersonId = p.PersonId
 group by p.fullname'
 WITH RESULT SETS (([total] int));

   Don't forget OutputDataSet variable must be data frame type. You can pass the OutputDataSet back to SQL Server as it is.

EXECUTE sp_execute_external_script
@language =N'R',
@script = N'OutputDataSet <- InputDataSet;',
@input_data_1 = N'Select p.FullName, count(o.OrderId) as Orders
 from Sales.Orders as o 
join Application.People as p on o.SalespersonPersonId = p.PersonId
 group by p.fullname'
 WITH RESULT SETS ((Name varchar(50), Total int));


No comments:

Post a Comment