Tuesday, December 28, 2010

Retrieve data from Google spreadsheet and insert data into a database using WSO2 Data Service Server

WSO2 Data Service Server allows to expose the data stored in data sources such as relational databases, CSV files , MS-Excel files and Google spreadsheets as web services.

Think a situation where you have added thousands of data in some google spreadsheets and now you want to insert these data into a database for the propose of using data for multiple applications. If you are going to do this by manually then you will be in a big trouble. You can simply write a data service using the WSO2 Data Service Server which will automate this task. Here no need to have any programming language, we can do it by using some MySQL and data service.

You will be using following features provide by WSO2 Data Service Server when writing this data service.
    - using a Google Spreadsheet as a data source. 
    - nested query feature.
    - call MySQL Stored Procedure from Data Service

The spread sheet that contains data is,
https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&hl=en#gid=0
Column titles in the spread sheet is important when writing the query to retrieve data.



And database that going to insert data is patch_details and we are going to insert data into two tables patch & customer.

mysql> create database patch_details;
mysql> use patch_details;
mysql> CREATE TABLE patch(PATCH_NAME VARCHAR(256), SVN_REVISION VARCHAR(30), DESCRIPTION VARCHAR(256), RELEASE_DATE DATE, PRIMARY KEY(PATCH_NAME));
mysql> CREATE TABLE customer(CUSTOMER_ID INT NOT NULL AUTO_INCREMENT, CUSTOMER_NAME VARCHAR(256),PRIMARY KEY(CUSTOMER_ID)) AUTO_INCREMENT=1 ;

Since there is more than one table to insert data, we can write a stored procedure that will insert the data into tables at once.
If data taken from the google spreadsheet is going to insert into a one table, then no need to write a stored procedure.

CREATE PROCEDURE `insertPatchData`(IN patchName varchar(256), IN svnRevision varchar(256), IN releaseDate varchar(256), IN customerName varchar(256))
BEGIN
    DECLARE customerAvailable INT DEFAULT 0;
    DECLARE patchAvailable INT DEFAULT 0;
    SET customeravAilable = 0;
    SET patchAvailable = 0;

    SELECT STR_TO_DATE(releaseDate, '%m/%d/%Y') INTO @date;
        
    SELECT 1 INTO patchAvailable FROM patch WHERE PATCH_NAME = patchName;
    IF patchAvailable !=1     
    THEN
        INSERT INTO patch (PATCH_NAME, SVN_REVISION, DESCRIPTION, RELEASE_DATE) VALUES (patchName, svnRevision,description, @date);
    END IF;
         
    IF NOT customerName IS NULL
    THEN 
        SELECT 1 INTO customerAvailable FROM customer WHERE CUSTOMER_NAME = customerName;
        IF customerAvailable != 1
        THEN 
            INSERT INTO customer (CUSTOMER_NAME) values (customerName);
        END IF;
    END IF;
END


Now create the data service:
Add google spreadsheet as a data source

<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
</data>

Add patch_details database as a data source.

<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
   <config id="PatchDB">
      <property name="org.wso2.ws.dataservice.driver">com.mysql.jdbc.Driver</property>
      <property name="org.wso2.ws.dataservice.protocol">jdbc:mysql://localhost:3306/patch_details</property>
      <property name="org.wso2.ws.dataservice.user">root</property>
      <property name="org.wso2.ws.dataservice.password">root</property>
      <property name="org.wso2.ws.dataservice.minpoolsize"></property>
      <property name="org.wso2.ws.dataservice.maxpoolsize"></property>
      <property name="org.wso2.ws.dataservice.validation_query"></property>
   </config>
</data>

Create a query that retrieve the data from google spread sheet.
In this query the column names of the output mappings should be same as the column titles in spreadsheet.

<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">  
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
   <config id="PatchDB">     
      <property name="org.wso2.ws.dataservice.driver">com.mysql.jdbc.Driver</property>     
      <property name="org.wso2.ws.dataservice.protocol">jdbc:mysql://localhost:3306/patch_details</property>     
      <property name="org.wso2.ws.dataservice.user">root</property>     
      <property name="org.wso2.ws.dataservice.password">root</property>     
      <property name="org.wso2.ws.dataservice.minpoolsize"></property>     
      <property name="org.wso2.ws.dataservice.maxpoolsize"></property>     
      <property name="org.wso2.ws.dataservice.validation_query"></property>  
   </config>  
   <query id="getPatches" useConfig="Gspread">     
      <gspread>        
         <worksheetnumber>1</worksheetnumber>        
         <startingrow>2</startingrow>        
         <maxrowcount>-1</maxrowcount>        
         <hasheader>false</hasheader>     
      </gspread>     
      <result element="patches" rowName="patch">        
         <element name="PatchName" column="Patch Name" xsdType="xs:string" />        
         <element name="Client" column="Client" xsdType="xs:string" />        
         <element name="Date" column="Date" xsdType="xs:date" />        
         <element name="Description" column="JIRA/Issue Description" xsdType="xs:string" />        
         <element name="SVNRevision" column="SVN Revision" xsdType="xs:string" />        
         <element name="Fixedby" column="Fixed by" xsdType="xs:string" />        
         <element name="Notes" column="Additional Comments" xsdType="xs:string" />        
         <element name="Jars" column="Jars " xsdType="xs:string" />     
      </result>  
   </query>
</data>

Then create a operation that use the above query. Using that operation we can test whether query works properly.

<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">  
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
   <config id="PatchDB">     
      <property name="org.wso2.ws.dataservice.driver">com.mysql.jdbc.Driver</property>     
      <property name="org.wso2.ws.dataservice.protocol">jdbc:mysql://localhost:3306/patch_details</property>     
      <property name="org.wso2.ws.dataservice.user">root</property>     
      <property name="org.wso2.ws.dataservice.password">root</property>     
      <property name="org.wso2.ws.dataservice.minpoolsize"></property>     
      <property name="org.wso2.ws.dataservice.maxpoolsize"></property>     
      <property name="org.wso2.ws.dataservice.validation_query"></property>  
   </config>  
   <query id="getPatches" useConfig="Gspread">     
      <gspread>        
         <worksheetnumber>1</worksheetnumber>        
         <startingrow>2</startingrow>        
         <maxrowcount>-1</maxrowcount>        
         <hasheader>false</hasheader>     
      </gspread>     
      <result element="patches" rowName="patch">        
         <element name="PatchName" column="Patch Name" xsdType="xs:string" />        
         <element name="Client" column="Client" xsdType="xs:string" />        
         <element name="Date" column="Date" xsdType="xs:date" />        
         <element name="Description" column="JIRA/Issue Description" xsdType="xs:string" />        
         <element name="SVNRevision" column="SVN Revision" xsdType="xs:string" />        
         <element name="Fixedby" column="Fixed by" xsdType="xs:string" />        
         <element name="Notes" column="Additional Comments" xsdType="xs:string" />        
         <element name="Jars" column="Jars " xsdType="xs:string" />     
      </result>  
   </query>  
   <operation name="getPatchData">     
      <description></description>     
      <call-query href="getPatches" />  
   </operation>
</data>

Click on the try it and check whether operation works properly.
It should return the all patch details when you click on the operation.

Create a query that calls to the 'insertPatchData' stored procedure which is used to insert data into tables.
<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
   <config id="PatchDB">
      <property name="org.wso2.ws.dataservice.driver">com.mysql.jdbc.Driver</property>
      <property name="org.wso2.ws.dataservice.protocol">jdbc:mysql://localhost:3306/patch_details</property>
      <property name="org.wso2.ws.dataservice.user">root</property>
      <property name="org.wso2.ws.dataservice.password">root</property>
      <property name="org.wso2.ws.dataservice.minpoolsize"></property>
      <property name="org.wso2.ws.dataservice.maxpoolsize"></property>
      <property name="org.wso2.ws.dataservice.validation_query"></property>
   </config>
   <query id="getPatches" useConfig="Gspread">
      <gspread>
         <worksheetnumber>1</worksheetnumber>
         <startingrow>2</startingrow>
         <maxrowcount>-1</maxrowcount>
         <hasheader>true</hasheader>
      </gspread>
      <result element="patches" rowName="patch">
         <element name="Client" column="Client" xsdType="xs:string" />
         <element name="Date" column="Date" xsdType="xs:date" />
         <element name="Description" column="JIRA/Issue Description" xsdType="xs:string" />
         <element name="SVNRevision" column="SVN Revision" xsdType="xs:string" />
         <element name="Fixedby" column="Fixed by" xsdType="xs:string" />
         <element name="Notes" column="Additional Comments" xsdType="xs:string" />
         <element name="Jars" column="Jars " xsdType="xs:string" />
         <element name="PatchName" column="Patch Name" xsdType="xs:string" />
      </result>
   </query>
   <query id="insertPatchTodb" useConfig="PatchDB">
      <sql>CALL insertPatchData(?, ?, ?, ?)</sql>
      <param name="patchName" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="1" />
      <param name="svnRevision" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="2" />
      <param name="releaseDate" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="3" />
      <param name="customerName" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="4" />
   </query>
   <operation name="getPatchData">
      <description></description>
      <call-query href="getPatches" />
   </operation>
</data>


Then you need to give the data retrieved from the google spreadsheet as input to the "insertPatchTodb" query.
To do this use Data Service nested query feature and call insertPatchTodb query inside getPatches query.

<data name="PatchDetailsDataService" enableBatchRequests="false" enableBoxcarring="false" serviceStatus="active">
   <config id="Gspread">
      <property name="gspread_datasource">https://spreadsheets0.google.com/ccc?key=tXMHzbjQmQy8KHiaSDR8_PA&amp;hl=en#gid=0</property>
      <property name="gspread_visibility">private</property>
      <property name="gspread_username">user@gmail.com</property>
      <property name="gspread_password">password</property>
   </config>
   <config id="PatchDB">
      <property name="org.wso2.ws.dataservice.driver">com.mysql.jdbc.Driver</property>
      <property name="org.wso2.ws.dataservice.protocol">jdbc:mysql://localhost:3306/patch_details</property>
      <property name="org.wso2.ws.dataservice.user">root</property>
      <property name="org.wso2.ws.dataservice.password">root</property>
      <property name="org.wso2.ws.dataservice.minpoolsize"></property>
      <property name="org.wso2.ws.dataservice.maxpoolsize"></property>
      <property name="org.wso2.ws.dataservice.validation_query"></property>
   </config>
   <query id="getPatches" useConfig="Gspread">
      <gspread>
         <worksheetnumber>1</worksheetnumber>
         <startingrow>2</startingrow>
         <maxrowcount>-1</maxrowcount>
         <hasheader>true</hasheader>
      </gspread>
      <result element="patches" rowName="patch">
         <element name="Client" column="Client" xsdType="xs:string" />
         <element name="Date" column="Date" xsdType="xs:date" />
         <element name="Description" column="JIRA/Issue Description" xsdType="xs:string" />
         <element name="SVNRevision" column="SVN Revision" xsdType="xs:string" />
         <element name="Fixedby" column="Fixed by" xsdType="xs:string" />
         <element name="Notes" column="Additional Comments" xsdType="xs:string" />
         <element name="Jars" column="Jars " xsdType="xs:string" />
         <element name="PatchName" column="Patch Name" xsdType="xs:string" />
         <call-query href="insertPatchTodb">
            <with-param name="patchName" query-param="Patch Name" />
            <with-param name="svnRevision" query-param="SVN Revision" />
            <with-param name="releaseDate" query-param="Date" />
            <with-param name="customerName" query-param="Client" />
         </call-query>
      </result>
   </query>
   <query id="insertPatchTodb" useConfig="PatchDB">
      <sql>CALL insertPatchData(?, ?, ?, ?)</sql>
      <param name="patchName" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="1" />
      <param name="svnRevision" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="2" />
      <param name="releaseDate" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="3" />
      <param name="customerName" paramType="SCALAR" sqlType="STRING" type="IN" ordinal="4" />
   </query>
   <operation name="getPatchData">
      <description></description>
      <call-query href="getPatches" />
   </operation>
</data>

Now when you execute the 'getPatchData' operation both data retrieving from google spreadsheet and inserting data to tables is happenning.
Check the two tables.Data should have entered.
mysql> select * from patch;
mysql> select * from customer;

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.