Tuesday, December 28, 2010

Retrieve data from Google spreadsheet and insert data into a database using WSO2 Data Service Server

WSO2 Data Service Server allows to expose the data stored in data sources such as relational databases, CSV files , MS-Excel files and Google spreadsheets as web services.

Think a situation where you have added thousands of data in some google spreadsheets and now you want to insert these data into a database for the propose of using data for multiple applications. If you are going to do this by manually then you will be in a big trouble. You can simply write a data service using the WSO2 Data Service Server which will automate this task. Here no need to have any programming language, we can do it by using some MySQL and data service.

You will be using following features provide by WSO2 Data Service Server when writing this data service.
    - using a Google Spreadsheet as a data source. 
    - nested query feature.
    - call MySQL Stored Procedure from Data Service

The spread sheet that contains data is,
https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&hl=en#gid=0
Column titles in the spread sheet is important when writing the query to retrieve data.



And database that going to insert data is patch_details and we are going to insert data into two tables patch & customer.

mysql> create database patch_details;
mysql> use patch_details;
mysql> CREATE TABLE patch(PATCH_NAME VARCHAR(256), SVN_REVISION VARCHAR(30), DESCRIPTION VARCHAR(256), RELEASE_DATE DATE, PRIMARY KEY(PATCH_NAME));
mysql> CREATE TABLE customer(CUSTOMER_ID INT NOT NULL AUTO_INCREMENT, CUSTOMER_NAME VARCHAR(256),PRIMARY KEY(CUSTOMER_ID)) AUTO_INCREMENT=1 ;

Since there is more than one table to insert data, we can write a stored procedure that will insert the data into tables at once.
If data taken from the google spreadsheet is going to insert into a one table, then no need to write a stored procedure.

CREATE PROCEDURE `insertPatchData`(IN patchName varchar(256), IN svnRevision varchar(256), IN releaseDate varchar(256), IN customerName varchar(256))
BEGIN
    DECLARE customerAvailable INT DEFAULT 0;
    DECLARE patchAvailable INT DEFAULT 0;
    SET customeravAilable = 0;
    SET patchAvailable = 0;

    SELECT STR_TO_DATE(releaseDate, '%m/%d/%Y') INTO @date;
        
    SELECT 1 INTO patchAvailable FROM patch WHERE PATCH_NAME = patchName;
    IF patchAvailable !=1     
    THEN
        INSERT INTO patch (PATCH_NAME, SVN_REVISION, DESCRIPTION, RELEASE_DATE) VALUES (patchName, svnRevision,description, @date);
    END IF;
         
    IF NOT customerName IS NULL
    THEN 
        SELECT 1 INTO customerAvailable FROM customer WHERE CUSTOMER_NAME = customerName;
        IF customerAvailable != 1
        THEN 
            INSERT INTO customer (CUSTOMER_NAME) values (customerName);
        END IF;
    END IF;
END


Now create the data service:
Add google spreadsheet as a data source

<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
</data>

Add patch_details database as a data source.

<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
   <config id="PatchDB">
      <property name="org.wso2.ws.dataservice.driver">com.mysql.jdbc.Driver</property>
      <property name="org.wso2.ws.dataservice.protocol">jdbc:mysql://localhost:3306/patch_details</property>
      <property name="org.wso2.ws.dataservice.user">root</property>
      <property name="org.wso2.ws.dataservice.password">root</property>
      <property name="org.wso2.ws.dataservice.minpoolsize"></property>
      <property name="org.wso2.ws.dataservice.maxpoolsize"></property>
      <property name="org.wso2.ws.dataservice.validation_query"></property>
   </config>
</data>

Create a query that retrieve the data from google spread sheet.
In this query the column names of the output mappings should be same as the column titles in spreadsheet.

<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">  
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
   <config id="PatchDB">     
      <property name="org.wso2.ws.dataservice.driver">com.mysql.jdbc.Driver</property>     
      <property name="org.wso2.ws.dataservice.protocol">jdbc:mysql://localhost:3306/patch_details</property>     
      <property name="org.wso2.ws.dataservice.user">root</property>     
      <property name="org.wso2.ws.dataservice.password">root</property>     
      <property name="org.wso2.ws.dataservice.minpoolsize"></property>     
      <property name="org.wso2.ws.dataservice.maxpoolsize"></property>     
      <property name="org.wso2.ws.dataservice.validation_query"></property>  
   </config>  
   <query id="getPatches" useConfig="Gspread">     
      <gspread>        
         <worksheetnumber>1</worksheetnumber>        
         <startingrow>2</startingrow>        
         <maxrowcount>-1</maxrowcount>        
         <hasheader>false</hasheader>     
      </gspread>     
      <result element="patches" rowName="patch">        
         <element name="PatchName" column="Patch Name" xsdType="xs:string" />        
         <element name="Client" column="Client" xsdType="xs:string" />        
         <element name="Date" column="Date" xsdType="xs:date" />        
         <element name="Description" column="JIRA/Issue Description" xsdType="xs:string" />        
         <element name="SVNRevision" column="SVN Revision" xsdType="xs:string" />        
         <element name="Fixedby" column="Fixed by" xsdType="xs:string" />        
         <element name="Notes" column="Additional Comments" xsdType="xs:string" />        
         <element name="Jars" column="Jars " xsdType="xs:string" />     
      </result>  
   </query>
</data>

Then create a operation that use the above query. Using that operation we can test whether query works properly.

<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">  
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
   <config id="PatchDB">     
      <property name="org.wso2.ws.dataservice.driver">com.mysql.jdbc.Driver</property>     
      <property name="org.wso2.ws.dataservice.protocol">jdbc:mysql://localhost:3306/patch_details</property>     
      <property name="org.wso2.ws.dataservice.user">root</property>     
      <property name="org.wso2.ws.dataservice.password">root</property>     
      <property name="org.wso2.ws.dataservice.minpoolsize"></property>     
      <property name="org.wso2.ws.dataservice.maxpoolsize"></property>     
      <property name="org.wso2.ws.dataservice.validation_query"></property>  
   </config>  
   <query id="getPatches" useConfig="Gspread">     
      <gspread>        
         <worksheetnumber>1</worksheetnumber>        
         <startingrow>2</startingrow>        
         <maxrowcount>-1</maxrowcount>        
         <hasheader>false</hasheader>     
      </gspread>     
      <result element="patches" rowName="patch">        
         <element name="PatchName" column="Patch Name" xsdType="xs:string" />        
         <element name="Client" column="Client" xsdType="xs:string" />        
         <element name="Date" column="Date" xsdType="xs:date" />        
         <element name="Description" column="JIRA/Issue Description" xsdType="xs:string" />        
         <element name="SVNRevision" column="SVN Revision" xsdType="xs:string" />        
         <element name="Fixedby" column="Fixed by" xsdType="xs:string" />        
         <element name="Notes" column="Additional Comments" xsdType="xs:string" />        
         <element name="Jars" column="Jars " xsdType="xs:string" />     
      </result>  
   </query>  
   <operation name="getPatchData">     
      <description></description>     
      <call-query href="getPatches" />  
   </operation>
</data>

Click on the try it and check whether operation works properly.
It should return the all patch details when you click on the operation.

Create a query that calls to the 'insertPatchData' stored procedure which is used to insert data into tables.
<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
   <config id="PatchDB">
      <property name="org.wso2.ws.dataservice.driver">com.mysql.jdbc.Driver</property>
      <property name="org.wso2.ws.dataservice.protocol">jdbc:mysql://localhost:3306/patch_details</property>
      <property name="org.wso2.ws.dataservice.user">root</property>
      <property name="org.wso2.ws.dataservice.password">root</property>
      <property name="org.wso2.ws.dataservice.minpoolsize"></property>
      <property name="org.wso2.ws.dataservice.maxpoolsize"></property>
      <property name="org.wso2.ws.dataservice.validation_query"></property>
   </config>
   <query id="getPatches" useConfig="Gspread">
      <gspread>
         <worksheetnumber>1</worksheetnumber>
         <startingrow>2</startingrow>
         <maxrowcount>-1</maxrowcount>
         <hasheader>true</hasheader>
      </gspread>
      <result element="patches" rowName="patch">
         <element name="Client" column="Client" xsdType="xs:string" />
         <element name="Date" column="Date" xsdType="xs:date" />
         <element name="Description" column="JIRA/Issue Description" xsdType="xs:string" />
         <element name="SVNRevision" column="SVN Revision" xsdType="xs:string" />
         <element name="Fixedby" column="Fixed by" xsdType="xs:string" />
         <element name="Notes" column="Additional Comments" xsdType="xs:string" />
         <element name="Jars" column="Jars " xsdType="xs:string" />
         <element name="PatchName" column="Patch Name" xsdType="xs:string" />
      </result>
   </query>
   <query id="insertPatchTodb" useConfig="PatchDB">
      <sql>CALL insertPatchData(?, ?, ?, ?)</sql>
      <param name="patchName" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="1" />
      <param name="svnRevision" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="2" />
      <param name="releaseDate" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="3" />
      <param name="customerName" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="4" />
   </query>
   <operation name="getPatchData">
      <description></description>
      <call-query href="getPatches" />
   </operation>
</data>


Then you need to give the data retrieved from the google spreadsheet as input to the "insertPatchTodb" query.
To do this use Data Service nested query feature and call insertPatchTodb query inside getPatches query.

<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
   <config id="PatchDB">
      <property name="org.wso2.ws.dataservice.driver">com.mysql.jdbc.Driver</property>
      <property name="org.wso2.ws.dataservice.protocol">jdbc:mysql://localhost:3306/patch_details</property>
      <property name="org.wso2.ws.dataservice.user">root</property>
      <property name="org.wso2.ws.dataservice.password">root</property>
      <property name="org.wso2.ws.dataservice.minpoolsize"></property>
      <property name="org.wso2.ws.dataservice.maxpoolsize"></property>
      <property name="org.wso2.ws.dataservice.validation_query"></property>
   </config>
   <query id="getPatches" useConfig="Gspread">
      <gspread>
         <worksheetnumber>1</worksheetnumber>
         <startingrow>2</startingrow>
         <maxrowcount>-1</maxrowcount>
         <hasheader>true</hasheader>
      </gspread>
      <result element="patches" rowName="patch">
         <element name="Client" column="Client" xsdType="xs:string" />
         <element name="Date" column="Date" xsdType="xs:date" />
         <element name="Description" column="JIRA/Issue Description" xsdType="xs:string" />
         <element name="SVNRevision" column="SVN Revision" xsdType="xs:string" />
         <element name="Fixedby" column="Fixed by" xsdType="xs:string" />
         <element name="Notes" column="Additional Comments" xsdType="xs:string" />
         <element name="Jars" column="Jars " xsdType="xs:string" />
         <element name="PatchName" column="Patch Name" xsdType="xs:string" />
         <call-query href="insertPatchTodb">
            <with-param name="patchName" query-param="Patch Name" />
            <with-param name="svnRevision" query-param="SVN Revision" />
            <with-param name="releaseDate" query-param="Date" />
            <with-param name="customerName" query-param="Client" />
         </call-query>
      </result>
   </query>
   <query id="insertPatchTodb" useConfig="PatchDB">
      <sql>CALL insertPatchData(?, ?, ?, ?)</sql>
      <param name="patchName" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="1" />
      <param name="svnRevision" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="2" />
      <param name="releaseDate" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="3" />
      <param name="customerName" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="4" />
   </query>
   <operation name="getPatchData">
      <description></description>
      <call-query href="getPatches" />
   </operation>
</data>

Now when you execute the 'getPatchData' operation both data retrieving from google spreadsheet and inserting data to tables is happenning.
Check the two tables.Data should have entered.
mysql> select * from patch;
mysql> select * from customer;

Monday, December 27, 2010

Split Strings in MySQL

MySql does not include a built in function to split a comma seperated string. But it is importantat to have such a function in some cases.

Think there is a string that contains several names seperated by commas and you want to insert each of them into a tabel.

You can write your own mysql function to do this.

Syntax for create a function is as follows,

CREATE FUNCTION function_name ([parameterlist]) RETURNS datatype
BEGIN
     RETURN value_to_return;
END

Follwing function takes three arguments and it returns the pointed element in given string.

CREATE FUNCTION split_string(
stringToSplit VARCHAR(256), sign VARCHAR(12), position INT) RETURNS VARCHAR(256)
BEGIN
        RETURN REPLACE(SUBSTRING(SUBSTRING_INDEX(stringToSplit, sign, position),LENGTH(SUBSTRING_INDEX(stringToSplit, sign, position -1)) + 1), sign, '');
END


Example
Given String = 'Dinusha,Nuwan,Nirosh'
Need to take second name seperated by comma. For that  function call will look likes follows,

mysql> select split_string('Dinusha,Nuwan,Nirosh',',', 2) as name;
+-----------+
| name     |
+-----------+
| Nuwan  |
+-----------+

This query can be used to get how many comma signs are in the input string

SELECT LENGTH(stringToSplit) - LENGTH(REPLACE(stringToSplit, ',', '')) INTO noOfCommas;

Example
mysql> SELECT LENGTH('Dinusha,Nuwan,Nirosh') - LENGTH(REPLACE('Dinusha,Nuwan,Nirosh', ',', '')) INTO @noOfCommas;
Query OK, 1 row affected (0.00 sec)

mysql> select @noOfCommas;
+--------------------+
| @noOfCommas|
+--------------------+
|           2             |
+--------------------+

Using above query and split_string function, we can write a stored procedure that do some SQL operations on each element of given string.

Following stored procedure will split the string and insert the slpited strings into a table.

CREATE PROCEDURE insertEngineer( IN engineerName varchar(256))
BEGIN
    CREATE TABLE IF NOT EXISTS `engineer` (`ID` int(11) NOT NULL auto_increment,`NAME` varchar(256) NOT NULL,PRIMARY KEY (`ID`)) AUTO_INCREMENT=1 ;
 
    DECLARE x INT DEFAULT 0;
    DECLARE y INT DEFAULT 0;
    SET y = 1; 
 
    IF NOT engineerName IS NULL
    THEN
           SELECT LENGTH(engineerName) - LENGTH(REPLACE(engineerName, ',', '')) INTO @noOfCommas;
     
           IF  @noOfCommas = 0
          THEN
                 INSERT INTO engineer(NAME) VALUES(engineerName);
          ELSE
                SET x = @noOfCommas + 1;
                WHILE y  <=  x DO
                   SELECT split_string(engineerName, ',', y) INTO @engName;
                   INSERT INTO engineer(NAME) VALUES(@engName);
                   SET  y = y + 1;
                END WHILE;
        END IF;
    END IF;
END


Example of using stored procedure

mysql> call insertEngineer('dinusha,nuwan,nirosh');
Query OK, 1 row affected (0.12 sec)

mysql> select * from engineer;
+----+----------+
| ID | NAME  |
+----+----------+
|  1 | dinusha  |
|  2 | nuwan   |
|  3 | nirosh    |
+----+----------+