There are many ways to access SQL from C#. I used the internally developed Data Access Layer for this. Here is the preparation of the SQL statement:
string startSql = EscapeQuote(string.Format(@"Select words = '{0}', emailFrom = '{1}', emailTo = '{2}', emailCc =
'{3}', emailFileRef = '{4}'", CheckNullOrEmpty(d.Words),
CheckNullOrEmpty(d.EmailFrom), CheckNullOrEmpty(d.EmailTo),
CheckNullOrEmpty(d.EmailCC), CheckNullOrEmpty(d.FileName)));
//Apply
learning model:
string pythonScript = GetPythonScript();
string sql = string.Format(@"EXECUTE sp_execute_external_script
@language
= N'Python'
,
@script = N'{0}'
,
@input_data_1 = N'{1}'
,
@input_data_1_name = N'doc_data'
WITH RESULT SETS(([Matter] CHAR(100) NOT NULL));
(one has to be sure to also escape the quote in the Python script string)
It appeared that for accessing Python from SQL and back, the 'panda' structure is used:
import
pandas as pd
…
#Input
params:
words=str(doc_data.iat[0,0])
emailFrom=str(doc_data.iat[0,1])
emailTo=str(doc_data.iat[0,2])
emailCc=str(doc_data.iat[0,3])
fileName=str(doc_data.iat[0,4])
…
OutputDataSet
= pd.DataFrame(best5idx)
Somehow assigning the results to the OutputDataSet is enough to return it to SQL. The 'WITH RESULT SETS' will assign headers to the output table.
However I encountered 2 issues:
- When deployed on the server I got a 'memory error'. The model has to load more than 600MB of weight-values. I found that when more then 70% of the servers memory is used, one can get these kind of errors. The problem could possibly be resolved just by ordering more memory for the virtual server (7,50 euro/month).
- It looked like for each prediction the model had to be recompiled, and the weights-file reloaded, because the SQL server starts the Python script, executes it, and then closes the session after getting the results. A sort-of solution for this could have been to speed up the process by declaring a virtual disk.
The developed system involved going from the rest service in C# to SQL to Python and back. Apart from the 2 issues it seemed overcomplicated, so I decided to try another approach: the Python Microservice.
Geen opmerkingen:
Een reactie posten