Monday, June 30, 2014

Unstructured Data and Business Context

The next time I am asked for an example of unstructured data and how it has been used in a BI solution, I am going to point to this blog.

The data sets which are (to my knowledge) labelled as unstructured are:

1. Text
2. Images
3. Audio
4. Video

I have been part of Business Intelligence projects in the past which used the first two and, to some extent, the third: audio.

TEXT
I am going to subcategorize the "text" data set into two.

  1. Machine Generated. This is data generated by machines such as a digital signage player (which keeps playing content) or a wireless sensor (which captures temperature). The captured data can be given a structure once we know the metadata the machine uses to generate it. The metadata then provides the context for the new data element.
  2. Interaction Generated. These are the data streams generated by people when they interact with social networking sites, feedback surveys and product/service feedback forms. This is text which people enter in open fields and which brings out their likes, dislikes, sentiments, behavior, etc. This data can be brought into a structured form by the use of sentiment dimensions, which provide the ability to score the text stream.
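
The two sub categories above can be sketched in a few lines of Python. This is only an illustration, not the approach used in our projects: the pipe-delimited sensor layout, the field names in SENSOR_FIELDS and the tiny word list in SENTIMENT_LEXICON are all assumptions made up for the example. A real solution would take the layout from the device vendor's metadata and use a much richer sentiment model.

```python
# Hypothetical metadata for a temperature sensor: (field name, type) in record order.
SENSOR_FIELDS = [("sensor_id", str), ("timestamp", str), ("temperature_c", float)]


def structure_machine_record(raw_line, fields=SENSOR_FIELDS, sep="|"):
    """Turn one raw machine-generated line into a dict using the known metadata."""
    parts = raw_line.strip().split(sep)
    if len(parts) != len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(parts)}")
    return {name: cast(value) for (name, cast), value in zip(fields, parts)}


# Hypothetical sentiment dimension: a small lexicon mapping words to scores.
SENTIMENT_LEXICON = {"love": 2, "great": 1, "good": 1, "slow": -1, "bad": -1, "hate": -2}


def score_text(text, lexicon=SENTIMENT_LEXICON):
    """Score free text by summing the lexicon values of the words it contains."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return sum(lexicon.get(w, 0) for w in words)


if __name__ == "__main__":
    print(structure_machine_record("S-17|2014-06-30T10:15:00|21.5"))
    print(score_text("Love the coffee, but the service was slow!"))
```

In both cases the output is an ordinary structured record or number that can be loaded into a dimension or fact table alongside the traditional data.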

IMAGES
There is no dearth of images being captured, especially when mobile devices are part of the business process. Images by themselves do not provide any business context. Ways and means have to be found to bring a structure to the data element. We have used images to calculate the visual appeal of product packaging. In our case, a separate algorithm was run against the image to score it for visual appeal, and this score provided the structure and the context for the image.

AUDIO
Audio is increasingly produced as smart devices come to support business processes. Audio by itself does not provide any business context; it has to be processed to extract attributes that support the business. Though we have used audio to time stamp the "greet time" to measure service levels in QSR solutions, there are many more possibilities for using audio data. One solution we considered is to use voice to text conversion and then apply the interaction text analysis. But if we have to capture the mood of the speaker (e.g. in a contact center), then we can apply a separate set of interaction voice analyses.
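
As a rough illustration of the "greet time" idea, the sketch below turns two time stamps into a service level measure. It assumes the audio processing has already produced a time stamp for the first greeting and that an arrival time stamp is available from some other source; the function names, the time stamp format and the 30 second target are hypothetical and not taken from our QSR solution.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format for both time stamps


def greet_time_seconds(arrival, greeting):
    """Seconds between the customer's arrival and the first greeting in the audio."""
    delta = datetime.strptime(greeting, TIMESTAMP_FORMAT) - datetime.strptime(
        arrival, TIMESTAMP_FORMAT
    )
    return delta.total_seconds()


def within_service_level(arrival, greeting, target_seconds=30.0):
    """True when the greeting came after arrival and within the (assumed) target."""
    return 0 <= greet_time_seconds(arrival, greeting) <= target_seconds


if __name__ == "__main__":
    print(greet_time_seconds("2014-06-30 12:00:05", "2014-06-30 12:00:17"))
    print(within_service_level("2014-06-30 12:00:05", "2014-06-30 12:00:17"))
```

Once the audio is reduced to a number like this, it can be reported against store, shift or employee dimensions like any other measure.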

To conclude, it is clear from the above that there are new forms of data, beyond the traditional structured data, which can be leveraged in a Business Intelligence or Big Data solution. There is a need to create new data transformation techniques to add the business context and extract the value out of these new data types.

Monday, February 10, 2014

SharePoint Foundation

SharePoint Foundation is a free version of Office SharePoint Server. Is it really free? The answer, I think, depends on the use case.
If you plan to use it as an intranet and authenticate your users against your Active Directory, then your licensing need is indirectly addressed, as you will have a valid Windows Server license anyway.
If you plan to use it as an extranet or as an external website and want to provide access to anonymous users, customers and vendors, then you will have to look at Microsoft licensing, as you will need either CALs or a Windows Server External Connector license.
So is it really free? I think so - as long as you are properly licensed for Microsoft Windows Server and want to use it as an intranet. With that settled, the next question is which features the product has and which it lacks. By far the best feature comparison I have seen for SharePoint is here.
http://blog.blksthl.com/2013/01/14/sharepoint-2013-feature-comparison-chart-all-editions/ 
In my experience it is a great free tool for document management and collaboration. In fact, our entire practice and project repository resides on a SharePoint Foundation implementation with a little customisation.

Thursday, January 2, 2014

Successful Analytics Practice

How do you start and grow a successful analytics practice? In my experience it is best to start with a team of three members (or multiples of three) and then grow it based on the push and pull of the project pipeline. The roles of the three (described below) differ but come together for a successful implementation.

  1. Data Management Specialist 
    • Should possess good data management skills and excellent SQL skills
    • Should have worked on a data warehouse project
    • Should have worked with major database vendor products and should have exposure to Hadoop, MapReduce, HBase and Cassandra
    • Responsible for creation of the data foundation layer 
  2. Data Modeler
    • Should have a statistical background and a flair for number crunching
    • Should be able to create algorithms and data mining models.
    • Responsible for exploitation of the data foundation layer
  3. Business Analyst 
    • Should be able to interpret model output against the business scenario and come up with actionable insight
    • Should be able to work with analytical COTS products and have basic SQL knowledge
    • Responsible for insight delivery from the exploitation layer

The three roles come together for a successful data analytics project. The choice of tool set will further determine the team composition, but in essence the structure should be built around the three core roles.

Cost conscious companies that are unable, or sometimes unwilling, to build a team would argue for, and scout for, the often difficult to find single resource who can do all three things. But to me that is not the right way to lay the foundation for growth in an emerging service area.