Friday, May 13, 2016

Invoke a BPEL workflow from WSO2 ESB proxy service


In this blog post I will illustrate how to invoke a BPEL workflow from a proxy service using WSO2 ESB.

1. Deploy a BPEL process on the WSO2 BPS Server. (Log in to the BPS Management Console, go to Processes -> Add, then select the BPEL archive (zip) file and upload it. If your BPEL process invokes external web services, you can host those web services on WSO2 App Server or axis2Server.)

2. Start the WSO2 ESB Server and create a custom proxy as below. (Change the port offset to avoid conflicts with the WSO2 BPS Server.)

<?xml version="1.0" encoding="UTF-8"?>
<proxy xmlns="http://ws.apache.org/ns/synapse"
       name="bpel_factory"
       transports="https,http"
       statistics="disable"
       trace="disable"
       startOnLoad="true">
   <target>
      <inSequence>
         <payloadFactory media-type="xml">
            <format>
               <p:MultiOperatorServiceRequest xmlns:p="http://wso2.org/bps/operator"><!--Exactly 1 occurrence--><x xmlns="http://wso2.org/bps/operator">$1</x>
                  <!--Exactly 1 occurrence--><y xmlns="http://wso2.org/bps/operator">$2</y>
               </p:MultiOperatorServiceRequest>
            </format>
            <args>
               <arg xmlns:m="http://wso2.org/bps/operator"
                    evaluator="xml"
                    expression="//m:MultiOperatorServiceRequest/x"/>
               <arg xmlns:m="http://wso2.org/bps/operator"
                    evaluator="xml"
                    expression="//m:MultiOperatorServiceRequest/y"/>
            </args>
         </payloadFactory>
         <send>
            <endpoint>
               <address uri="http://10.100.7.75:9763/services/MultiOperatorService.MultiOperatorServicehttpMultiOperatorServiceBindingEndpoint/"/>
            </endpoint>
         </send>
      </inSequence>
      <outSequence>
         <log level="full"/>
         <respond/>
      </outSequence>
   </target>
   <description/>
</proxy>
                     
You can find the address URL in SOAP-UI after clicking on the TryIt wizard in the WSO2 BPS Server. The payload factory mediator transforms the payload of the incoming request into the format that the BPEL process expects for its invocation.

You can copy the request body from SOAP-UI into the format element of the payload factory mediator and replace each ? with a placeholder ($1, $2, ...). Each placeholder is filled, in order, with the corresponding argument evaluated against the message context.
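The placeholder substitution performed by the payload factory mediator can be pictured with a small JavaScript sketch. This is only an illustration of the idea and not ESB code: fillFormat is a hypothetical helper, and leaving an unmatched placeholder untouched is an assumption of the sketch.

```javascript
// Hypothetical helper: mimics how $1, $2, ... in a payload factory format
// are replaced, in order, by the evaluated <arg> values.
function fillFormat(format, args) {
  return format.replace(/\$(\d+)/g, function (match, n) {
    var value = args[Number(n) - 1];
    // Assumption of this sketch: keep the placeholder if no argument exists.
    return value === undefined ? match : String(value);
  });
}

var format =
  '<p:MultiOperatorServiceRequest xmlns:p="http://wso2.org/bps/operator">' +
  '<x xmlns="http://wso2.org/bps/operator">$1</x>' +
  '<y xmlns="http://wso2.org/bps/operator">$2</y>' +
  '</p:MultiOperatorServiceRequest>';

// The two values stand in for the results of the two XPath expressions.
var payload = fillFormat(format, [5, 3]);
console.log(payload);
```

In the proxy above, the values come from the two arg elements, so the order of the arg elements decides which value lands in $1 and which in $2.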

3. Next, send a POST request with the payload to the proxy service and check the response, as shown in the following figure.


In this example I used a BPEL process to evaluate the formula [(x * y) - (x + y)]^2.
e.g. x=5, y=3 => [(5 * 3) - (5 + 3)]^2 => 49.0
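To check a response from the proxy by hand, the same formula can be written as a few lines of JavaScript. The function name multiOperator is made up for this sketch; the real calculation is done by the BPEL process on the BPS Server.

```javascript
// Evaluates [(x * y) - (x + y)]^2, the formula the sample BPEL process solves.
function multiOperator(x, y) {
  var product = x * y; // x * y
  var sum = x + y;     // x + y
  return Math.pow(product - sum, 2);
}

console.log(multiOperator(5, 3)); // 49
```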


Wednesday, May 11, 2016

Send SMS through WSO2 Twilio Connector

Introduction


The WSO2 Twilio connector allows us to connect to Twilio, an online service that lets us embed phones, VoIP, and messaging in web, desktop, and mobile software. The connector uses the Twilio REST API to connect to Twilio and work with accounts, applications, calls, messages, and more. The underlying Java classes in the connector use the Twilio Java Helper Library to make the HTTP requests to the Twilio API.

How to enable Twilio Connector ?


1. Download the Connector from here.
2. Upload the connector to the ESB instance as below.


3. Once the connector is successfully uploaded, you have to enable it in order to activate it. To do that, click on List under Connectors in the Main tab and click on disabled in the Status column.


Example


This example describes how to send an SMS through Twilio using the WSO2 Twilio Connector. First, you need a Twilio trial account to test this scenario; if you don't have an account yet, please create one. Once you have created an account, you can get the Account SID and Auth Token from the Dashboard Console. These credentials are essential to initialize the connector in your proxy service.

To enable a phone number to send and receive SMS in Twilio, click here. Please refer to this for more information.

To use the Twilio connector, add the <twilio.init> element in your configuration before any other Twilio operations. This Twilio configuration authenticates with Twilio by specifying the SID and auth token of your master Twilio account. You can find your SID and token by logging into your Twilio account and going to the API Credentials section on the dashboard.

<twilio.init>
    <accountSid>ACba8bc05eacf94afdae398e642c9cc32d</accountSid>
    <authToken>AC5ef8732a3c49700934481addd5ce1659</authToken>
</twilio.init>

For best results, save the Twilio configuration as a local entry. You can then easily reference it with the configKey attribute in your Twilio operations. For example, if you save the above <twilio.init> entry as a local entry named MyTwilioConfig, you can reference it from an operation with configKey="MyTwilioConfig". The steps are as follows.

1. Go to Local Entries under Service Bus in Main tab and click on Add Local Entries.
2. Then Click on Add in-lined XML Entry and give configuration as below.


3. Now click on Save button to save the content.

4. Configure your proxy as below.

<?xml version="1.0" encoding="UTF-8"?>
<proxy xmlns="http://ws.apache.org/ns/synapse"
       name="testTwillio"
       transports="https http"
       startOnLoad="true"
       trace="disable">
   <description/>
   <target>
      <inSequence>
         <property name="name"
                   expression="get-property('myName')"
                   scope="default"
                   type="STRING"/>
         <property name="phone"
                   expression="get-property('myPhone')"
                   scope="default"
                   type="STRING"/>
         <property name="smsContent"
                   expression="fn:concat('WSO2 Twilio Connector : SMS Generated! - Name : ', get-property('name'))"
                   scope="default"
                   type="STRING"/>
         <twilio.sendSms configKey="MyTwilioConfig">
            <body>{get-property('smsContent')}</body>
            <to>{get-property('phone')}</to>
            <from>+14843160720</from>
         </twilio.sendSms>
         <respond/>
      </inSequence>
   </target>
</proxy>

If the configKey is not saved or updated, go to the <ESB_HOME>/repository/deployment/server/synapse-configs/default/proxy-services directory, open your proxy service XML file, apply the changes there and save it.

5. To test the proxy service, click on Try this service and send the message via SOAP UI.



6. Finally you can see the server output as below.



Tuesday, May 10, 2016

Task scheduling through WSO2 ESB 4.9.0

Introduction

WSO2 ESB is capable of scheduling and executing tasks periodically. A task can be scheduled to run n times within a given time duration t. We can also schedule a task to run just once after WSO2 ESB starts. If we want more control over task scheduling, we can use a cron expression. For example, we can schedule a task to run every day at 5 PM, or on the 25th of every month at 5 PM.

Example

As a first step we have to create a sample back-end service. Here I use a sample service which exists in WSO2 ESB samples folder.

Step 1


If you go to the <ESB_HOME>/samples/axis2Server/src directory, you will see that several back-end samples are available there. They can be built and deployed using Ant from each service directory. You can do this by typing "ant" without quotes on a console from the selected sample directory. Here I will choose the SimpleStockQuoteService service.


Step 2 


As the next step, go to the <ESB_HOME>/samples/axis2Server directory and execute axis2server.sh (for Linux) to start the Axis2 server. This starts the Axis2 server with the HTTP transport listener on port 9000 and the HTTPS listener on port 9002.

Step 3 


Now add a sample sequence in the WSO2 ESB as below.
Click on Sequences under Service Bus in the Manage menu and click Add Sequences.



Then switch to the source view by clicking on switch to source view, replace the content with the following, and click the save and close button to save it.

<?xml version="1.0" encoding="UTF-8"?>
<sequence name="iterateSequence" xmlns="http://ws.apache.org/ns/synapse">
    <iterate attachPath="//m0:getQuote"
        expression="//m0:getQuote/m0:request" preservePayload="true"
        xmlns:m0="http://services.samples"
        xmlns:ns="http://org.apache.synapse/xsd" xmlns:ns3="http://org.apache.synapse/xsd">
        <target>
            <sequence>
                <call>
                    <endpoint>
                        <address uri="http://localhost:9000/services/SimpleStockQuoteService"/>
                    </endpoint>
                </call>
                <log level="custom">
                    <property
                        expression="//ns:return/ax21:lastTradeTimestamp/child::text()"
                        name="Stock_Quote_on" xmlns:ax21="http://services.samples/xsd"/>
                    <property
                        expression="//ns:return/ax21:name/child::text()"
                        name="For_the_organization" xmlns:ax21="http://services.samples/xsd"/>
                    <property
                        expression="//ns:return/ax21:last/child::text()"
                        name="Last_Value" xmlns:ax21="http://services.samples/xsd"/>
                </log>
            </sequence>
        </target>
    </iterate>
</sequence>



Step 4 


As the next step, add a Scheduled Task by clicking on Scheduled Tasks under Service Bus in the Manage menu, and set its configuration as below.


Step 5


As the final step, click on the Schedule button and the task will start executing according to the Interval. In this example the task will run 50 times and start in 10 seconds.

Friday, May 6, 2016

Service Task Example in WSO2 Business Process Server

This tutorial illustrates how to model a business process with Service Tasks. Here as an example I will describe a sample bonus payment process.

Steps of the use-case


1. Obtaining the customer information with their salary and working period
2. Adding (working period * random number) to the salary of the employee
3. Manager approval (approve/reject) of the bonus amount

The above steps can be matched with the components in the Activiti palette section as below.

Start Event : Fills in the details about the employee such as employee id, name, salary and the working period.
Service Task : Used to automate the calculation mentioned in step 2.
User Task : Used to approve or reject the bonus amount for the given employee.

Prerequisites


  • Java
  • WSO2 BPS
  • Eclipse activiti-designer plugin

Implementing the java class  to be used in Service Task


First create a maven project in eclipse and apply the following dependency to the pom.xml file.
(The jar file should be copied into <BPS_HOME>/repository/components/lib folder before starting the BPS Server)

<dependency>
<groupId>org.activiti</groupId>
<artifactId>activiti-engine</artifactId>
<version>5.19.0</version>
</dependency>

Java class implementation.

package org.wso2.bps.serviceTask;

import java.util.Random;

import org.activiti.engine.delegate.DelegateExecution;
import org.activiti.engine.delegate.JavaDelegate;

/**
 * Service task to calculate the bonus for employees
 */
public class App implements JavaDelegate {

    public void execute(DelegateExecution execution) throws Exception {
        // The casts assume that the form variables are stored as strings
        int salary = Integer.parseInt((String) execution.getVariable("employeeSalary"));
        int numOfWorkingDays = Integer.parseInt((String) execution.getVariable("workingPeriod"));

        // Random multiplier between 0 and 9 (inclusive)
        Random randomGenerator = new Random();
        int value = randomGenerator.nextInt(10);

        // Salary plus bonus, exposed to the following tasks as "result"
        int result = salary + (numOfWorkingDays * value);
        execution.setVariable("result", result);
    }
}

Modeling the process utilizing eclipse activiti-designer tool


First create an activiti project in eclipse (See the steps below)

  1. Go to File -> New -> Other -> Activiti and select Activiti Project.
  2. Then enter activiti project name and click Finish.
  3. Right click on the created project and select New -> Other -> Activiti Diagram and click Next.
  4. Give a name for the process and click Finish.


Design the use-case by dragging and dropping the palette components onto the canvas in the tool, as shown in the following figure.

Configuration of each component


Configure properties tab of the start event as below.

Form tab - Define the variables to be initialized in the start event


Main config tab -  Define the initiator of the process




 Configure properties tab of the service task as below.

Main config tab - The Java class that contains the logic should be specified here. You have to create the Java class before configuring the main config of the service task. (The class implementation is described above.)



Configure the properties tab of the user task as below.

Form tab
(Here the result variable comes from the Service Task.)


Main config tab - Assign a candidate group for the user task. To claim the task, you have to create a user, assign the candidate group role to that user, and log in as the newly created user to approve the bonus amount when the process execution reaches the user task.



Save the project, right click on the activiti project in the package explorer in eclipse and click Create Deployment Artifacts.

Now you can see that a .bar file has been generated inside the deployment folder of the project.


Deploying and testing the bonus payment process using WSO2 BPS 


Start the BPS Server
Log in to the management console and navigate to Home>Manage>Add>BPMN. Upload the .bar file to deploy it as seen below.



Create a new user called kermit and assign him to the bonusApproval role.







Login to the BPMN-explorer using admin/admin credentials (default admin user credentials) and start the bonus payment Process.






Now login to the BPMN-explorer using kermit/kermit credentials to approve the bonus payment.




Wednesday, December 30, 2015

Data visualization with D3.js

What is Data Visualization ?

Data visualization is the presentation of data in a pictorial or graphical format, which helps people understand the significance of data in a visual context. This is crucial because data on its own can be very hard to understand and analyze.

Why Data Visualization ?

As large volumes of data are collected and analyzed, decision makers use data visualization tools that enable them to see analytical results presented visually, find relationships among the variables, communicate concepts and hypotheses to others and even predict future results. Because of the way the human brain processes information, it is faster for people to grasp the significance of many data points when they are displayed in charts and graphs than when they are spread over piles of spreadsheets, flat files or tables in reports. This makes the data easier to interpret, saving time and energy.

What is D3 ?

D3 is a JavaScript library which is used to manipulate documents based on data (interactive visualization). D3 helps bring data to life using HTML, SVG, and CSS. D3 stands for Data Driven Documents. Here documents refer to the DOM (Document Object Model) structure in html. It allows developers to bind arbitrary data to a DOM, and then apply data-driven transformations to the document.

Selections in D3

Before moving into the details, first look at the initial version of our html document below. (I'm referring to a local copy of the D3.js library here.)

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Example in D3</title>
    <script src="d3.min.js"></script>
</head>
<body>
<script>
    //D3 code goes here
</script>
</body>
</html>

Similar to jQuery, D3 allows us to select elements from the DOM based on CSS selectors, for instance by id, class attribute or tag name. The result of a select operation is an array of selected elements.

Using select() in D3, we can select a single element from the DOM. For example, if we want to color the background of the body tag blue, we can do it as follows.

d3.select("body").style("background-color", "blue");

We can also append a new DOM element, given by its element name, to the selected element and set its content. An example is shown below.

d3.select("body").append("h1").html("Support Vector Machines");

In D3.js the selectAll() method uses CSS3 selectors to grab DOM elements. Unlike the select() method mentioned previously, selectAll() selects all the elements in the DOM that match the given selector string.

d3.selectAll("p").style("background-color","blue");

In the above example it selects all the <p> elements available on the page. If there are none, it returns an empty selection. The most important point is that we don't need to loop over our set of elements in order to modify them. Instead, we apply the style operator to the selection, and D3 takes care of invoking it on every single element within.

Scales in D3

Scales transform numbers or discrete values in a certain interval (called the domain) into numbers in another interval (called the range). For instance, let's suppose we have a dataset whose values are always over 100 and always below 800. We would like to plot it, say, in a bar chart that can be at most 100 pixels long.

The domain (data space) represents the boundaries within which our data lies. For example, if I have an array of numbers with no number smaller than 1 and no number larger than 1000, my domain would be 1 to 1000.

There will not always be a direct mapping between the data points and actual pixels on the screen. For example, if we are plotting a graph of sales and the sales are in the tens of thousands, it is unlikely that we will be able to have a bar graph with the same pixel length as the data. In that case, we need to specify the boundaries within which the original data can be transformed. These boundaries are called the range.

The most common types of scales are quantitative scales and ordinal scales. Quantitative scale functions translate one numeric value into another numeric value using different types of equations such as linear, logarithmic etc. Data may not always be in a numeric format. It may contain ordinal/categorical/discrete values, for example the letters of the alphabet. Letters are ordinal values (categorical with a clear ordering), i.e. they can be arranged in an order, but unlike numbers you cannot derive one letter from another.

d3.scale.linear() - It transforms numeric data in a given dataset into pixel space.
eg: d3.scale.linear().domain([0,1000]).range([0, 100]); 

d3.scale.ordinal() -  It transforms data that has discrete values into pixel space.
eg: d3.scale.ordinal().domain(["A", "B", "C", "D"]).rangePoints([0, 100]);

If you assign the ordinal scale above to a variable named scale and write the output of scale("A"), scale("B"), scale("C") and scale("D") to the console (eg: console.log(scale("A"))), it prints approximately 0, 33.33, 66.67 and 100 respectively.

With rangePoints(interval), D3 fits n points (one per category in the domain) within the interval. The value of the first point is the beginning of the interval, and that of the last point is the end of the interval.

With rangeBands(interval), D3 fits n bands within the interval. Here, the value of the last item in the domain is less than the upper bound of the interval.

If we use rangeRound() instead of range(), the output of the scale is guaranteed to be an integer, which is better for positioning marks on the screen with pixel precision than numbers with decimals.
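The arithmetic behind these scales can be sketched in plain JavaScript without D3. The helpers linearScale, pointScale and bandScale are hypothetical names for this illustration; they ignore padding and rounding, so they are a simplification of what d3.scale.linear(), rangePoints() and rangeBands() do rather than a copy of them.

```javascript
// Linear scale: maps a number from the domain interval onto the range interval.
function linearScale(domain, range) {
  return function (value) {
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

// Point scale: n categories spread evenly, the first at the start of the
// interval and the last at its end.
function pointScale(categories, interval) {
  var last = categories.length - 1;
  return function (category) {
    var i = categories.indexOf(category);
    var t = last > 0 ? i / last : 0;
    return interval[0] + t * (interval[1] - interval[0]);
  };
}

// Band scale: n equally wide bands; returns the start of the category's band.
function bandScale(categories, interval) {
  var bandWidth = (interval[1] - interval[0]) / categories.length;
  return function (category) {
    return interval[0] + bandWidth * categories.indexOf(category);
  };
}

var linear = linearScale([0, 1000], [0, 100]);
var points = pointScale(['A', 'B', 'C', 'D'], [0, 100]);
var bands = bandScale(['A', 'B', 'C', 'D'], [0, 100]);

console.log(linear(500));  // 50
console.log(points('D'));  // 100, the end of the interval
console.log(bands('D'));   // 75, below the upper bound of the interval
```

The last two lines show the difference described above: the last point sits exactly on the end of the interval, while the last band only starts below it.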

How to visualize dataset using D3 ?

In this section I will describe how to plot a horizontal bar-chart for the following sample dataset.

 [{
    "tableName": "PROCESS_USAGE_SUMMARY_DATA",
    "timestamp": 1447495522776,
    "values": {
        "processDefKey": ["manualTaskProcess:1:14"],
        "avgExecutionTime": 31066
    }
}, {
    "tableName": "PROCESS_USAGE_SUMMARY_DATA",
    "timestamp": 1447495522890,
    "values": {
        "processDefKey": ["VacationRequest:1:22"],
        "avgExecutionTime": 16215.5
    }
}, {
    "tableName": "PROCESS_USAGE_SUMMARY_DATA",
    "timestamp": 1447495522987,
    "values": {
        "processDefKey": ["OrderProcess:1:10"],
        "avgExecutionTime": 54892
    }
}, {
    "tableName": "PROCESS_USAGE_SUMMARY_DATA",
    "timestamp": 1447495523074,
    "values": {
        "processDefKey": ["LoanProcess:1:6"],
        "avgExecutionTime": 25149
    }
}, {
    "tableName": "PROCESS_USAGE_SUMMARY_DATA",
    "timestamp": 1447495523145,
    "values": {
        "processDefKey": ["SubProcess:1:18"],
        "avgExecutionTime": 54145
    }
}]


In this example you can see a set of five data items, which was collected from the WSO2 DAS (Data Analytics Server) analytics REST API. To display the dataset in a horizontal bar chart, we need to connect each datum to a bar that represents it by its length. In D3, we can achieve this by applying the data() operator on the selection of bars. Here I display the average execution time of the processes against the process definition key. Therefore, before applying any D3 functionality, I first have to bring the data into an appropriate format as follows. (The array called data holds the above dataset.)

// Keep only the two fields that the chart needs
var dataset = [];
for (var i = 0; i < data.length; i++) {
    dataset.push({
        "processDefKey": data[i].values.processDefKey,
        "avgExecutionTime": data[i].values.avgExecutionTime
    });
}


Now you can see that dataset variable holds the following JSON array.

[{
    "processDefKey": ["manualTaskProcess:1:14"],
    "avgExecutionTime": 31066
}, {
    "processDefKey": ["VacationRequest:1:22"],
    "avgExecutionTime": 16215.5
}, {
    "processDefKey": ["OrderProcess:1:10"],
    "avgExecutionTime": 54892
}, {
    "processDefKey": ["LoanProcess:1:6"],
    "avgExecutionTime": 25149
}, {
    "processDefKey": ["SubProcess:1:18"],
    "avgExecutionTime": 54145
}]
 
Now create a div element in the .html file like below to render the bar-chart.

<div class="main" style="width: 850px;height: 400px;border-style: solid">
    <h2 style="text-align: center;">Process Id VS Average execution time</h2>
</div>


Before we can add the x and y axes in D3, we need to clear some space in the margins. Margins in D3 are specified as an object with top, right, bottom and left properties (you can see it below). The outer size of the bar-chart area, which includes the margins, is then used to compute the inner size available for the graphical region by subtracting the margins. For example, values for a 700×400 chart are:

var margins = {top: 30, right: 100, bottom: 30, left: 100};
// Inner width: outer width minus the left and right margins
var width = 700 - margins.left - margins.right;
// Inner height: outer height minus the top and bottom margins
var height = 400 - margins.top - margins.bottom;
var barPadding = 5;


Here the barPadding variable keeps the space between two rectangles in the bar-chart. 700 and 400 are the outer width and height respectively, while the computed inner width and height are 500 and 340. These inner dimensions can be used to initialize the scale ranges. To apply the margins to the SVG container, I set the width and height of the SVG element to the outer dimensions, and add a group element (the g tag in SVG) to offset the origin of the chart area by the top-left margin.

var chart = d3.select('.main')
              .append('svg')
              .attr('width', width + margins.left + margins.right)
              .attr('height', height + margins.top + margins.bottom)
              .append('g')
              .attr('transform', 'translate(' + margins.left + ',' + margins.top + ')');


The next step is adding the x and y axes and labelling them for readability. We define the x and y axes by binding them to the x-scale and y-scale and declaring one of the four orientations. Since the x-axis will appear below the bars, we use the bottom orientation. For the y-axis, we use the left orientation.

In the domain function we use a helper called d3.max(), which looks at our dataset and figures out the largest value. d3.max() iterates over the entire dataset (array) for us, using the accessor function to pick the value to compare.
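What d3.max() does with an accessor can be pictured with the plain JavaScript sketch below. The helper maxBy is a hypothetical stand-in written for this illustration, not D3's implementation, and the array repeats the avgExecutionTime values of the sample dataset.

```javascript
// Returns the largest value produced by the accessor, or undefined for an
// empty array. Values that are missing or not a number are skipped.
function maxBy(items, accessor) {
  var max;
  items.forEach(function (item) {
    var value = accessor(item);
    if (value == null || isNaN(value)) return;
    if (max === undefined || value > max) max = value;
  });
  return max;
}

var sample = [
  { processDefKey: ['manualTaskProcess:1:14'], avgExecutionTime: 31066 },
  { processDefKey: ['VacationRequest:1:22'], avgExecutionTime: 16215.5 },
  { processDefKey: ['OrderProcess:1:10'], avgExecutionTime: 54892 },
  { processDefKey: ['LoanProcess:1:6'], avgExecutionTime: 25149 },
  { processDefKey: ['SubProcess:1:18'], avgExecutionTime: 54145 }
];

var largest = maxBy(sample, function (d) { return d.avgExecutionTime; });
console.log(largest); // 54892
```

With this dataset the x-scale domain therefore becomes [0, 54892].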

// Create a scale for the x-axis based on data
// Domain - min and max values in the dataset
// Range - physical range of the scale


var xScale = d3.scale.linear()
               .domain([0, d3.max(dataset, function(d){
                    return d.avgExecutionTime;
                })]).range([0, width]);

// Implements the scale as an actual axis
// Orient - places the axis on the bottom of the graph
// Ticks - number of points on the axis, automated


var xAxis = d3.svg.axis()
              .scale(xScale)
              .orient('bottom')
              .ticks(10);


Here the ticks(n) method asks D3 to split the domain of the given axis into roughly n points and show them on the axis.

// Creates a scale for the y-axis based on process definition keys

var yScale = d3.scale.ordinal()
               .domain(dataset.map(function(d){
                    return d.processDefKey;
               })).rangeRoundBands([height, 0]);

// Creates an axis based off the yScale properties


var yAxis = d3.svg.axis()
              .scale(yScale)
              .orient('left');



Now define a tooltip to show additional information (in this case the average execution time) when the mouse pointer moves over a particular rectangle element.

//add tooltip

var tooltip = d3.select(".main").append("div").attr("class", "d3-tip");
tooltip.append('div').attr('class', 'label');
tooltip.append('div').attr('class', 'contentBox');


The next thing is mapping data to the rectangles in the bar-chart. For that we can use the selectAll() method, which selects all the existing rectangle elements (in SVG a rectangle is a "rect" element) in the chart. At the beginning there are no rect elements in the chart; we only have the data array. When we invoke the enter() method, it gives us a virtual selection of placeholders. Everything chained after enter() executes only for data elements that have no corresponding DOM element yet, i.e. there is data but no rect. (That means data elements are entering the picture.)


// Step 1: selectAll.data.enter.append
// Loops through the dataset and appends a rectangle for each value


chart.selectAll('rect')
     .data(dataset)
     .enter()
     .append('rect')

// Step 2: X & Y
// X - Every bar starts at the left edge of the chart area
// Y - Places each bar vertically based on the y-scale


     .attr('x', 0)
     .attr('y', function(d){
                    return yScale(d.processDefKey);
            })

// Step 3: Height & Width
// Height - Based on barPadding and the number of points in the dataset
// Width - Scaled using avgExecutionTime and the width of the chart area


     .attr('height', (height / dataset.length) - barPadding)
     .attr('width', function(d){
                    return xScale(d.avgExecutionTime);
                })
     .attr('fill', 'steelblue')

// Step 4: Info for hover interaction


     .attr('class', function(d){
                    return d.processDefKey;

                })
     .attr('id', function(d){
                    return d.avgExecutionTime;
                })
                .on("mouseover", function(d) {
                    var pos = d3.mouse(this);
                    console.log(pos);
                    tooltip.transition()
                           .duration(200)
                           .style("left", (d3.event.pageX) + "px")
                           .style("top", (d3.event.pageY - 30) + "px");
                    tooltip.select('.label').html('AVG Execution Time');
                    tooltip.select('.contentBox').html(d.avgExecutionTime + ' ms');
                    tooltip.style('display', 'block');
                })
                .on("mouseout", function() {
                    tooltip.style('display', 'none');
                });
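The role of enter() in the chain above can be pictured with a plain JavaScript sketch of a join by index. The helper joinByIndex is a hypothetical name for this illustration and only mimics the idea of pairing existing elements with data; it is not how D3 is implemented.

```javascript
// Pairs existing elements with data items by position and reports which data
// items are new (enter), which are already paired (update), and which
// elements have no data left (exit).
function joinByIndex(elements, data) {
  var shared = Math.min(elements.length, data.length);
  return {
    update: data.slice(0, shared),
    enter: data.slice(shared),
    exit: elements.slice(shared)
  };
}

// First render: no rect elements exist yet, so every datum enters.
var firstRender = joinByIndex([], [31066, 16215.5, 54892]);
console.log(firstRender.enter.length); // 3

// Later render with fewer data items: one rect is left without data.
var laterRender = joinByIndex(['rect', 'rect', 'rect'], [31066, 16215.5]);
console.log(laterRender.exit.length); // 1
```

In the bar chart, the first case applies: the chart starts without any rect elements, so append('rect') runs once for each of the five data items.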


As the final step, we can render the x axis as well as the y axis once the chart is finished. To avoid overlapping the rectangles, move the y-axis 10 pixels to the left, and add the x and y axis labels as below.

// Renders the yAxis once the chart is finished
// Moves it to the left 10 pixels so it doesn't overlap


chart.append('g')
     .attr('class', 'axis')
     .attr('transform', 'translate(-10, 0)')
     .call(yAxis);

// Appends the xAxis


chart.append('g')
     .attr('class', 'axis')
     .attr('transform', 'translate(0,' + (height + 10) + ')')
     .call(xAxis);

// Adds xAxis title


chart.append('text')
     .text('AVG Execution Time (ms)')
     .attr('transform', 'translate('+(width/2 - 50)+', ' + (height + 50) + ')');

// Add yAxis title


chart.append('text')
     .text('Process Definition Key')
     .attr('transform', 'translate(-70, -20)');


The CSS code below shows the styles applied to the tooltip, SVG, x and y axes and the div element.

.main {
       margin: 0px 25px;
}

svg {
    padding: 20px 40px;
}

.axis path,
.axis line {
      fill: none;
      stroke: black;
      shape-rendering: crispEdges;
}

text,
.axis text {
      font-size: 11px;
}

rect:hover {
      fill: orange;
}

.d3-tip {
        background: #eee;
        border-radius: 10px;
        box-shadow: 0 0 5px #999999;
        color: #333;
        display: none;
        font-size: 11px;
        left: 130px;
        padding: 12px;
        position: absolute;
        text-align: center;
        top: 95px;
        height: 20px;
        width: 100px;
        z-index: 10;
}


The resulting bar chart is shown below: five bars representing the five items in our dataset.



References

[1] https://medium.com/@c_behrens/enter-update-exit-6cafc6014c36#.ppi08m9d9
[2] http://www.jeromecukier.net/blog/2011/08/11/d3-scales-and-color/
[3] http://bost.ocks.org/mike/bar/3/

Saturday, November 14, 2015

Indexing and Searching through Lucene

Why Lucene in WSO2 Data Analytics Server ?

A common use-case for Lucene indexing in the Data Analytics Server (DAS) is to perform a full-text search on the data of one or more persisted event streams. DAS provides interactive data analysis (meaning a stored dataset can be queried in an ad-hoc manner to find useful information more quickly and accurately), allowing you to search for persisted events using the Data Explorer.

What is Lucene ?

Lucene is an extremely rich and powerful full-text search (information retrieval) library which is written in Java. You can use Lucene to provide full-text indexing across both database objects and documents in various formats. Lucene provides search over documents. A document is essentially a collection of fields, where a field supplies a field name and value (name-value pair).

The basic concept behind Lucene is to take a dataset and place it in fields that are either stored, indexed, or both indexed and stored. Indexed means you can search against that field; stored (without being indexed) means you cannot search against the field but you can retrieve its contents. There are also non-stored and non-indexed fields, but they are primarily used for the storage of metadata.

You can retrieve the dataset stored in the database, put it into fields (as name-value pairs), put those fields into a "document", and then add the document to the indexing process. The index is a set of files on disk or in memory. There are multiple files contained in an index and the files are platform independent.

Searching and Indexing through Lucene

Lucene is able to retrieve information quickly and efficiently because, instead of searching the text directly, it searches an index. This is the equivalent of finding the pages in a book related to a keyword by looking it up in the index at the back of the book, as opposed to searching the words on each page of the book.

What actually gets indexed is a set of terms. A term (eg:- title:"Modern") combines a field name with a token that may be used for search. For instance, a title field like Modern Operating Systems, 2nd Edition might yield the tokens modern, operat, 2, and edition after case normalization, stemming and stoplisting. The index structure provides the reverse mapping from terms, consisting of field names and tokens, back to documents. This type of index is called an inverted index, because it inverts a page-centric data structure (page -> words) to a keyword-centric data structure (word -> pages). 
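
To make the page -> words versus word -> pages inversion concrete, here is a minimal sketch of an inverted index in plain Java. It is not Lucene code: the class InvertedIndexSketch and its methods are hypothetical names used only for illustration, and the tokenizer only lower-cases and splits the text (no stemming or stop-word removal).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration class, not part of Lucene.
class InvertedIndexSketch {
    // Maps a term ("field:token") to the ids of the documents that contain it.
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Lower-cases the value and splits it on anything that is not a letter or digit.
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Indexes one field of one document: every token becomes a term pointing back to docId.
    void add(int docId, String field, String value) {
        for (String token : tokenize(value)) {
            postings.computeIfAbsent(field + ":" + token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Looks up a single term and returns the matching document ids (empty if none).
    SortedSet<Integer> search(String field, String token) {
        String term = field + ":" + token.toLowerCase(Locale.ROOT);
        return postings.getOrDefault(term, new TreeSet<>());
    }
}
```

A search for the term title:modern is then a single map lookup that returns the ids of all documents whose title contains that token, without scanning any of the original text.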

The following diagram shows how the indexing process happens in Lucene.
In WSO2 DAS, events published by data agents through event receivers can be persisted in an RDBMS such as MySQL, and the RDBMS tables are denormalized into Lucene documents when Lucene indexing is performed.

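The pseudo code will look something like this (stmt is an open java.sql.Statement and writer is a Lucene IndexWriter, both created elsewhere; Field.Index.TOKENIZED and Field.Index.UN_TOKENIZED are the constant names used by older Lucene 2.x releases):

import java.sql.ResultSet;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

// The SQL query to be performed
String sql = "SELECT DISTINCT processInstanceId, duration FROM PROCESS_USAGE_SUMMARY";
// ResultSet to hold the data retrieved from the database
ResultSet rs = stmt.executeQuery(sql);
while (rs.next()) {
    // One Lucene document per row, one field per column
    Document doc = new Document();
    doc.add(new Field("processInstanceId", rs.getString("processInstanceId"), Field.Store.YES, Field.Index.TOKENIZED));
    // Field values are strings, so the numeric duration is converted before it is added
    doc.add(new Field("duration", String.valueOf(rs.getLong("duration")), Field.Store.YES, Field.Index.UN_TOKENIZED));
    // ... repeat for each column in result set
    writer.addDocument(doc);
}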

When you perform a search, you create a Query (usually via a QueryParser) and hand it to an IndexSearcher, which returns a list of Hits. In other words, the search returns the set of documents that match the query; you then extract the information you need from those documents and display the results. In WSO2 DAS you can build the query as a JSON string in the format DAS expects and pass it to the DAS REST API, which returns the result in JSON format.

The Lucene query language allows the user to specify which field or fields to search on and which fields to give more weight, to perform boolean queries (AND, OR, NOT), and more. For further details, see the Lucene query parser syntax documentation.

Saturday, October 10, 2015

Spark SQL User Defined Functions (UDFs) for WSO2 Data Analytics Server

What are UDFs?

Spark SQL ships with a number of built-in functions that can be used in a Spark script without writing any extra code. Sometimes, however, the built-in functions do not cover a requirement. In that case you can write your own custom functions, called user-defined functions (UDFs). UDFs operate on distributed DataFrames and work row by row, unless you are creating a user-defined aggregate function. WSO2 DAS has an abstraction layer for generic Spark UDFs, which makes it convenient to introduce UDFs to the server.

Here I will describe how to write a Spark SQL UDF in Java, using a simple example.

Simple UDF to convert the date into the given date format

Step 1: Create the POJO class

The UDF POJO in this example converts a date in a format such as Thu Sep 24 09:35:56 IST 2015 to the format yyyy-MM-dd. The name of the Spark UDF is the name of the method defined in the class (dateStr in this example), and this is the name used when invoking the UDF through Spark SQL. Here dateStr("Thu Sep 24 09:35:56 IST 2015") returns the string "2015-09-24". (POJO class name: AnalyticsUDF)
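
A minimal sketch of what such a POJO could look like is given here. It is an illustration under assumptions, not the original class from this post: it assumes Java 8 (java.time), it takes the calendar date exactly as written in the input instead of converting between time zones, and it relies on the JDK recognising the zone abbreviation in the input. The package declaration is left out; in a deployment the class would be declared public in its own file inside the package mentioned in Step 3.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Sketch of the UDF POJO; class and method names follow the post, the body is an assumption.
class AnalyticsUDF {

    // Input layout, e.g. "Thu Sep 24 09:35:56 UTC 2015" (English day and month names).
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss zzz yyyy", Locale.ENGLISH);

    // Output layout, e.g. "2015-09-24".
    private static final DateTimeFormatter OUTPUT = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    // The method name is the name the UDF is invoked by in a Spark SQL query.
    public String dateStr(String timestamp) {
        // toLocalDate() keeps the date as written in the input; no zone conversion is applied.
        return ZonedDateTime.parse(timestamp, INPUT).toLocalDate().format(OUTPUT);
    }
}
```

Because the method only reformats the date that appears in the string, the result does not depend on the time zone of the machine running the Spark job. A malformed input makes ZonedDateTime.parse throw a DateTimeParseException, which a production UDF would probably want to handle.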

Step 2: Package the class as a JAR

The custom UDF class you created should be bundled as a JAR and added to the <DAS_HOME>/repository/components/lib directory.

Step 3: Update the Spark UDF configuration file

Add the newly created custom UDF to the <DAS_HOME>/repository/conf/analytics/spark/spark-udf-config.xml file as shown in the example below.


(Here org.wso2.carbon.pc.spark.udfs is the package name of the class).