Fun With Big Data Using the Microsoft “Data Explorer” Pt. 1

Microsoft has released a Community Technology Preview (CTP) of a new tool for working with “Big Data”. Just what is Big Data? Put simply, it is the compiled statistical information that is available on the internet and in the cloud. A vast amount of data is available today, and it is being collected and stored at a rate never seen before. Much of this data, if not most, is locked into specific applications or formats and is difficult to access or to integrate into new uses. “Data Explorer” is a tool that allows you to start exploring these sources.

Data comes from a number of different sources, including:

SQL databases, web page content (including RSS feeds), XML-formatted metadata sources such as OData feeds, SharePoint repositories, and others.

Windows Azure Data Marketplace

The Windows Azure™ Marketplace is an online market for buying and selling finished Software as a Service (SaaS) applications and premium datasets. The Windows Azure Marketplace helps connect companies seeking innovative cloud-based solutions with partners who have developed solutions that are ready to use.

SharePoint

Every Microsoft SharePoint list and library in a site has a corresponding data source connection in the Data Source Library. To add a SharePoint list or library to the Data Source Library, you can either create a new list or library or create a new connection to an existing list or library.

Any SharePoint lists or libraries that you create will also automatically have a corresponding data source connection in the Data Source Library.

OData

The Open Data Protocol (OData) is a Web protocol for querying and updating data that provides a way to unlock your data and free it from silos that exist in applications today. OData does this by applying and building upon Web technologies such as HTTP, Atom Publishing Protocol (AtomPub) and JSON to provide access to information from a variety of applications, services, and stores. The protocol emerged from experiences implementing AtomPub clients and servers in a variety of products over the past several years. OData is being used to expose and access information from a variety of sources including, but not limited to, relational databases, file systems, content management systems and traditional Web sites.
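To make the "Web protocol for querying" idea a little more concrete, here is a minimal sketch of how an OData query is just an HTTP URL with a few `$`-prefixed options. The service root is the Netflix one used later in this post; the `build_odata_url` helper and the `ReleaseYear` property are my own assumptions for illustration, and nothing here actually contacts the server.

```python
from urllib.parse import urlencode, quote

# Service root taken from the walkthrough later in this post.
SERVICE_ROOT = "http://odata.netflix.com/Catalog/"


def build_odata_url(service_root, entity_set, **options):
    """Build an OData query URL (hypothetical helper, not part of any library).

    Keyword arguments become OData system query options, e.g. top=5 -> $top=5.
    """
    # OData system query options carry a "$" prefix in the URL.
    params = {"$" + name: str(value) for name, value in options.items()}
    # Leave "$" and "," readable and encode spaces as %20 instead of "+".
    query = urlencode(params, safe="$,", quote_via=quote)
    url = service_root.rstrip("/") + "/" + entity_set
    return url + ("?" + query if query else "")


# "ReleaseYear" is an assumed property name, used only for illustration.
url = build_odata_url(SERVICE_ROOT, "Titles", top=5, filter="ReleaseYear eq 2010")
print(url)
```

The point is only that an OData request can be read and written by hand: a resource path plus a handful of query options.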

SQL Databases

SQL (Structured Query Language) is a programming language designed for managing data in relational database management systems (RDBMS). It was initially developed at IBM by Donald D. Chamberlin and Raymond F. Boyce in the early 1970s. SQL databases have long been the standard way to store and query structured data.
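For anyone who has not seen SQL before, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The table, its columns, and the movie names are made up for illustration and have nothing to do with Data Explorer itself.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (name TEXT, release_year INTEGER)")

# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO titles (name, release_year) VALUES (?, ?)",
    [("Example Movie A", 2009), ("Example Movie B", 2010), ("Example Movie C", 2010)],
)

# A declarative query: describe the rows you want, not how to find them.
rows = conn.execute(
    "SELECT name FROM titles WHERE release_year = ? ORDER BY name", (2010,)
).fetchall()
print(rows)
conn.close()
```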

Connection Walk-Through

Let’s walk through a short sample of connecting to a data source. I will choose Netflix’s OData as a source, to make this example fun.

First, press the plus (+) button in the top menu to create a new mashup.

In the dialog box, we will type in a name for our new mashup: “NetflixMashup”.

Next, we will add our data from the Netflix OData server. Clicking the “Data Feed” icon lets us create our new data source.

Our next step will be to add our Netflix feed URL.

For this example I will use one of the feeds that is available as a top-level resource, in this case the Netflix Catalog Titles: http://odata.netflix.com/Catalog/

Next, it will ask us how to connect to the feed. We enter the feed URL.

Add the URL to the mashup workflow wizard and click ‘Done’.
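If you are wondering what a "top-level resource" looks like behind the wizard: an OData service root typically answers with an AtomPub service document listing its collections. Here is a rough sketch of reading one with Python's standard library. The XML below is a hand-written sample shaped like such a document, not the real Netflix response, and the `Genres` entry and the `list_collections` helper are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written sample shaped like an AtomPub service document.
# It is NOT the actual response from the Netflix service.
SAMPLE_SERVICE_DOC = """
<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Default</atom:title>
    <collection href="Genres"><atom:title>Genres</atom:title></collection>
    <collection href="Titles"><atom:title>Titles</atom:title></collection>
  </workspace>
</service>
""".strip()

# Namespace prefixes used in the search path below.
NS = {"app": "http://www.w3.org/2007/app"}


def list_collections(service_doc_xml):
    """Return the href of every collection in a service document."""
    root = ET.fromstring(service_doc_xml)
    return [c.get("href") for c in root.findall("app:workspace/app:collection", NS)]


collections = list_collections(SAMPLE_SERVICE_DOC)
print(collections)
```

Each href is relative to the service root, so ‘Titles’ would live at the root URL followed by `Titles`.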

Authentication Notes:

If the feed requires Windows authentication, a user name and password, or an OData feed key, you will have an opportunity to enter it when setting the feed security options. Since the feed we are connecting to is public and requires none of these, we will make sure the ‘Use anonymous access’ radio button is selected and press ‘continue’ to connect.
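Outside the tool, the difference between anonymous access and a name and password comes down to whether the HTTP request carries credentials. The sketch below builds (but does not send) a request either way. It assumes HTTP Basic authentication for the name/password case, which may not be what Data Explorer does internally, and `build_feed_request` is a hypothetical helper.

```python
import base64
from urllib.request import Request


def build_feed_request(url, username=None, password=None):
    """Build an HTTP request for a feed (hypothetical helper).

    With no credentials, the request is anonymous. With both supplied,
    an HTTP Basic Authorization header is added (an assumption here).
    """
    request = Request(url, headers={"Accept": "application/atom+xml"})
    if username is not None and password is not None:
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", "Basic " + token)
    return request


# The public feed from this walkthrough needs no credentials.
anonymous = build_feed_request("http://odata.netflix.com/Catalog/")
print(anonymous.has_header("Authorization"))
```

Windows authentication and feed keys work differently and are not covered by this sketch.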

When the data feed is successfully parsed, we will see the feed along with its schema. Then we can click ‘Done’ to continue.

Removing fields we won’t use

Once the fields of the data feed have been parsed, we can right-click the fields we won’t use, select “remove fields” for each of them, and then click the ‘Done’ button.

Next, we select the fields we will use to gather our data. For this demonstration I am going to select just ‘Titles’.
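Removing fields is what database people call a projection: every record stays, but only the columns you care about are kept. A minimal sketch of the same idea in plain Python follows. The records and field names (`Name`, `ReleaseYear`, `Runtime`) are invented for illustration and are not taken from the actual feed.

```python
def keep_fields(records, fields):
    """Return copies of the records that contain only the listed fields."""
    return [
        {name: record[name] for name in fields if name in record}
        for record in records
    ]


# Invented sample records, for illustration only.
titles = [
    {"Name": "Example Movie A", "ReleaseYear": 2009, "Runtime": 5400},
    {"Name": "Example Movie B", "ReleaseYear": 2010, "Runtime": 6000},
]

trimmed = keep_fields(titles, ["Name", "ReleaseYear"])
print(trimmed)
```

Dropping unused fields early keeps every later step of a mashup smaller and easier to read.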

Next Steps

Now we can click on “more tools”.

This will expose more menus.

Click or double-click the Select Fields icon in the toolbar and the screen should change.

Check the checkboxes for fields to include (I am selecting all of them) and click ‘Done’.

To Be Continued…

In part two we will output to a table and look at some results, then finish up with an example that uses the Azure Data Marketplace with data.gov to do some statistical analysis.

More Information:

For another look at using this product, check out Lynn Langit’s blog post on Data Explorer:

http://lynnlangit.wordpress.com/2011/12/10/exploring-microsoft-data-explorer/

About Don Burnett

Changing how people interact with software

Posted on December 12, 2011, in Uncategorized.
