Category Archives: Machine Learning

Moving from Expression Web to Visual Studio 2012 for client-side HTML 5 Web Design by Don Burnett

A lot has happened this year in the world of Microsoft. Expression Studio has come to its end of life. This means that Expression Web is now a community-supported product. This notice showed up on the website:

“The web is now about applications as well as traditional web sites, and this requires a new set of tools. Microsoft is committed to offering a unified approach to focus on web design and development features in Microsoft Visual Studio 2012.

As part of this consolidation, Microsoft Visual Studio 2012 provides the leading web development tool, which enables you to design, develop, and maintain websites and web applications. Visual Studio 2012 makes it easy to build CSS-based websites from the ground up with new CSS layouts, HTML5 support and full featured capabilities for working with and debugging JavaScript. Learn more about Visual Studio Express 2012 for Web and WebMatrix 2.

Expression Web is now available as a free download from the Microsoft Download Center, and no new versions will be developed. Customers who previously purchased Expression Web will receive support through the established support lifecycle. Expression SuperPreview Remote Beta will continue running as a service through June 30, 2013.”

Download Expression Web 4 SP2, Free Version

Many Expression Community members were dismayed about this, as they really love using Expression Web for their website design projects. Many thought that Microsoft was abandoning client-side web development. The good news is that this really is not the case: Microsoft is just moving to support web apps over websites.

Pure HTML Template for Visual Studio


Pure HTML is a basic Visual Studio project template for front-end web development using only HTML5, CSS3, and JavaScript. It can be downloaded from CodePlex. Using this template gives you a purely client-oriented web site, ready for you to extend, with just a few clicks.

The template works with the Professional, Premium, Ultimate, and Express for Web editions of Visual Studio 2012. You can get Visual Studio 2012 Express for Web for free too.
The template is installed as a Visual Studio extension. After you have installed it, you can create a new web site:

If you use VB:

  • File > New > Project > Visual Basic > Pure HTML Web Site
  • File > New > Web Site > Visual Basic > Pure HTML Web Site

If you use C#:

  • File > New > Project > Visual C# > Pure HTML Web Site
  • File > New > Web Site > Visual C# > Pure HTML Web Site


Expression Web Features versus Visual Studio 2012

While not quite the same, the user interfaces of the two products have a lot in common.




Feature comparisons: Publishing

Copy Website (FTP and site connectivity) in Visual Studio 2012


Expression Web


Toolbox Comparison

[Screenshots: Expression Web Toolbox | Visual Studio 2012]

CSS Comparison Features (not a direct side-by-side comparison)

[Screenshots: Expression Web | Visual Studio 2012]

Code Editor Comparison: Expression Web versus Visual Studio 2012 (with Web Essentials installed)

While they look much the same, the Visual Studio IDE is considerably enhanced and can be extended further with add-ons from other companies. You will find yourself right at home in Visual Studio 2012, with a passel of new features added to it.

Microsoft Web Essentials


Another useful add-on is Microsoft’s Web Essentials, which you can download here.

Quoting the link:


“Full support for TypeScript preview and compilation. Remember to install the official plug-in for Visual Studio to take full advantage of TypeScript.

When a TypeScript file (.ts) is saved in Visual Studio, Web Essentials will compile it automatically and generate a preview.

TypeScript regions

Some people hate them, other people love them. This is a feature that was in the original Web Essentials 2010 and by popular request has now made it into the 2012 version.


Source Maps

You can produce Source Map (.js.map) files automatically by enabling it in Tools -> Options.

Compiler settings

You can set all the compiler settings from Tools -> Options


When a TypeScript file is compiled, it can now also be minified to produce a much smaller JavaScript file.


Option dialog

The most important features in Web Essentials can be turned on/off.


From version 1.9, options can be applied either globally or to individual solutions.

Vendor specific property generation

A lot of the new CSS 3 properties only work cross-browser if vendor specific properties are added. These include -moz, -webkit, -ms and -o.

The result is the insertion of the missing vendor specific properties in the right order.

If one or more of the vendor specific properties are already present, then only the missing ones are added.
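The behaviour described above (insert only the missing vendor-specific properties, keeping the standard property last) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Web Essentials is implemented; the function name `add_vendor_properties` and the tuple-based representation of a CSS rule are assumptions made for this example.

```python
# Hypothetical sketch: NOT Web Essentials' actual implementation.
# The prefix list comes from the text above (-moz, -webkit, -ms, -o).
VENDOR_PREFIXES = ["-moz-", "-webkit-", "-ms-", "-o-"]


def add_vendor_properties(declarations, prop):
    """Return a new declaration list in which every missing vendor-prefixed
    copy of `prop` is inserted directly before the standard property.

    `declarations` is a list of (name, value) tuples in source order.
    """
    existing = {name for name, _ in declarations}
    result = []
    for name, value in declarations:
        if name == prop:
            for prefix in VENDOR_PREFIXES:
                # Only add the prefixed versions that are not already present
                if prefix + prop not in existing:
                    result.append((prefix + prop, value))
        result.append((name, value))
    return result


rule = [
    ("color", "red"),
    ("-webkit-transition", "all 1s"),
    ("transition", "all 1s"),
]
expanded = add_vendor_properties(rule, "transition")
for name, value in expanded:
    print(f"{name}: {value};")
```

The existing `-webkit-transition` declaration is left where it was, and the standard `transition` property stays last so that it wins in browsers that support it.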

Add missing standard properties

Invoke the Smart Tag to automatically insert any missing standard properties.

Keep vendor specific property values in sync while typing
Display browser support for properties and selectors

Just hover over any property name, value, pseudo or @-directive to get the relevant browser support matrix.

Modernizr support

Modernizr class names will now be bolded in the CSS editor, but more importantly, they will also be respected by the automatic hierarchical indentation feature of VS2012.


Intellisense for !important

By popular demand, here you go.

Always up-to-date with W3C and browsers

Web Essentials will automatically download the latest CSS schema files used to drive Intellisense and validation. It happens in the background automatically.

Intellisense for “Add region…”

Regions are supported in the VS2012 CSS editor, but now it’s even easier to add them.

Choosing Add region… results in this snippet being inserted.

Intellisense for custom fonts


Intellisense for handling iOS scrollbars

VS2012 supports the different pseudo elements for customizing the iOS scrollbars. It can, however, be a little difficult to work with unless you know how to chain the pseudos correctly. That’s no longer a problem.


Intellisense for CSS 3 animation names

Inline URL picker for the url() function

Just start typing and the file system shows up in Intellisense.

Warning list guides for handling best practices
Warnings for browser compatibility issues

Is your stylesheet browser compliant? Let Web Essentials tell you.

More precise error messages
Removes warnings for using the \9 CSS hack
Document-wide remove duplicate properties

Intellisense for CSS gradients (all vendor specifics included)

Gradients are really difficult to write, so now examples are automatically inserted for all the different types of gradients, including the various vendor specific ones.

Option to hide unsupported CSS properties

Some of the CSS properties, such as those in the CSS 3 FlexBox Module, are not supported by any browser yet. Now you can turn all unsupported properties and pseudos off.

CSS specificity tooltip

In case you’ve been wondering why certain styles are never applied in the browser, you can now see the specificity for each individual selector by hovering the mouse over them.
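To make the idea of specificity concrete, here is a rough Python sketch that counts the three components (IDs; classes, attributes and pseudo-classes; elements and pseudo-elements) for a selector. It is a simplification written for this article, not the tooltip's real logic: the function name `specificity` is made up, and it ignores functional pseudo-classes such as :not() and treats legacy single-colon pseudo-elements as pseudo-classes.

```python
import re


def specificity(selector):
    """Return (ids, classes, elements) for a CSS selector.

    Simplified sketch: each kind of simple selector is counted and then
    blanked out so that it is not counted twice.
    """
    attrs = re.findall(r"\[[^\]]*\]", selector)
    rest = re.sub(r"\[[^\]]*\]", " ", selector)

    pseudo_elements = re.findall(r"::[\w-]+", rest)
    rest = re.sub(r"::[\w-]+", " ", rest)

    ids = re.findall(r"#[\w-]+", rest)
    rest = re.sub(r"#[\w-]+", " ", rest)

    classes = re.findall(r"\.[\w-]+", rest)
    rest = re.sub(r"\.[\w-]+", " ", rest)

    pseudo_classes = re.findall(r":[\w-]+", rest)
    rest = re.sub(r":[\w-]+", " ", rest)

    # Whatever identifiers are left are element (type) selectors
    elements = re.findall(r"[a-zA-Z][\w-]*", rest)

    return (
        len(ids),
        len(classes) + len(attrs) + len(pseudo_classes),
        len(elements) + len(pseudo_elements),
    )


for sel in ["a", "#main .item", "ul#nav li.active a:hover"]:
    print(sel, specificity(sel))
```

Because Python compares tuples left to right, `specificity("#main .item") > specificity("ul li a")` mirrors the way a browser decides which rule wins: one ID outweighs any number of element selectors.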

Option to hide “inherit” and “initial” global property values

Sometimes it can feel like these two properties are too noisy in Intellisense. Though they are completely valid, you might just want to hide them.

Easily darken and lighten color values

Place the cursor in a hex color value and hit SHIFT+CTRL+ARROW UP/DOWN to darken or lighten the color.
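The page does not say how the color is adjusted, so the following Python sketch simply assumes one plausible approach: shift each RGB channel by a fixed step and clamp the result to the valid range. The function name `adjust_hex_color` and the step size are hypothetical.

```python
def adjust_hex_color(hex_color, step):
    """Shift each RGB channel by `step` (positive lightens, negative darkens).

    Assumption for illustration only: a fixed per-channel shift, clamped
    to the range 0..255. Accepts #rgb and #rrggbb notation.
    """
    digits = hex_color.lstrip("#")
    if len(digits) == 3:
        # Expand shorthand such as #abc to #aabbcc
        digits = "".join(ch * 2 for ch in digits)
    channels = [int(digits[i:i + 2], 16) for i in (0, 2, 4)]
    shifted = [max(0, min(255, c + step)) for c in channels]
    return "#" + "".join(f"{c:02x}" for c in shifted)


print(adjust_hex_color("#336699", 16))   # one step lighter
print(adjust_hex_color("#336699", -16))  # one step darker
```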

Move properties up and down

Place the cursor in a property and hit SHIFT+CTRL+ARROW UP/DOWN

F1 help opens the relevant specification

Uses an external reference to provide more accurate information than the W3C specifications.

Supports team-wide color schemes

More info on working with custom color palettes coming…

Up- and down arrows control numeric values

When the cursor is in or next to a numeric value such as 5px, .6em, 15% or just 23, you can use CTRL+SHIFT+UP to increase the number and CTRL+SHIFT+DOWN to decrease it. The feature is known from Firebug.

This works for CSS, Sass and LESS files.

CoffeeScript and LESS preview window

Both LESS and CoffeeScript come with a preview window located at the right side of the editor. It shows the compiled output every time you save the document.

If any LESS file name is prefixed with an underscore (_file.less), then it won’t generate a .css file automatically.

Embed url() references as base64 strings

This will take the referenced image and base64 encode it directly into your stylesheet. You have then eliminated an HTTP request.

If the base64 string becomes too long, you can easily collapse it.

Remember to optimize your image files before embedding them. Use the Image Optimizer extension to make it effortless.
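For readers who have not seen an embedded image before, this small Python sketch shows what the resulting url() value looks like. It uses only the standard library; the function name `to_data_uri` is made up for this example, and the six bytes used as "image data" are just a GIF file signature, not a complete image.

```python
import base64
import mimetypes


def to_data_uri(image_bytes, filename):
    """Build a CSS url() value that embeds the image as a base64 data URI."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        # Fall back to a generic type when the extension is unknown
        mime = "application/octet-stream"
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"url(data:{mime};base64,{encoded})"


# Stand-in bytes for illustration; a real stylesheet would embed the whole file.
sample = b"GIF89a"
print("background-image: " + to_data_uri(sample, "dot.gif") + ";")
```

Base64 makes the data roughly a third larger than the original file, which is one reason optimizing the image first pays off.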

Color swatches

All color types are supported, including the new CSS 3 formats.


Right-click any CSS file in Solution Explorer to produce a *.min.css file. Whenever the source .css file is changed, the .min.css file is updated accordingly.


You can also minify the selected text in the editor by right-clicking the selection.

Font preview on mouse hover

Image preview on mouse hover

Sort properties

A Smart Tag on every selector enables you to easily sort all the properties within the rule.

As of version 1.9, the sorting is no longer alphabetical but instead uses the order specified by the CssComb project.

Drag and drop support for images and fonts

Drag an image onto the editor from either Solution Explorer or your desktop and a background-image property will be inserted with the relative path to the image file.

Do the same with a font file, but in this case all font files with the same name but different extensions (.ttf, .eot, .woff, .otf) will be added to the same @font-face rule.

Convert easily between hex, rgb and named color values

Adds SmartTags to selectors for targeting specific IE versions

Specific hacks can be used to target specific versions of IE on a selector level. These are all valid according to the W3C.

Selector Intellisense for HTML elements, classes and IDs

CSS/LESS document outline

Get a sneak-peek inside any CSS or LESS file directly from Solution Explorer.



JSHint for JavaScript

JSHint is a really good way of making sure your JavaScript follows certain coding guidelines and best practices. The default settings are very relaxed, but you can turn on more rules through the new options dialog.


The error window updates as you type, so you don’t have to right-click the .js file to kick off JSHint. It happens as you write.

Each individual JavaScript file can override the global settings by using the official JSHint comment format described in the JSHint documentation.

In version 1.8 you can also enable JSHint to run on build.

JavaScript regions

Some people hate them, other people love them. This is a feature that was in the original Web Essentials 2010 and by popular request has now made it into the 2012 version.


ZenCoding for HTML

Watch this short demo video and read more about the ZenCoding syntax.

Lorem Ipsum generator

As part of ZenCoding, you can now generate Lorem Ipsum code directly in the HTML editor. Simply type “lorem” and hit TAB and a 30 word Lorem Ipsum text is inserted. Type “lorem10” and a 10 word Lorem Ipsum text is inserted.

This can be used in conjunction with ZenCoding like so: ul>li*5>lorem3


See the compiled markdown in a preview window inside Visual Studio.”

February Meeting Announcement



● Improve Developer Technology Development
● Provide a unique forum to build sustainable technology-driven developer communities and networks with other developers worldwide.

● Help us understand the role and benefits of technology and the shortening timeframes between game-changing innovations.
● Provide an apolitical environment for exploring technology across a spectrum of vendors.

Group Photo

MeetUp to Discuss Cascading, A Streaming Map Reduce Workflow API
Thursday, February 23, 2012 at 7:00PM

Introducing Cascading: a Streaming Map Reduce Workflow API. Tuesday, February 28, 2012 at 7:00 PM, Richard W. Bailey Library, Washtenaw Community College, Garden Level Lounge, 1st floor.

Gunder Myran, Washtenaw Community College, Ypsilanti Township, MI 48197

See the full event details at

A look at Microsoft Research’s Infer.NET

By Don Burnett

What is Infer.NET and how is it useful?

Computers make decisions for us every day, whether we are using a search engine such as Microsoft’s Bing or being shown advertising on a web site that may be relevant to our interests. Behind what we see are powerful decision-making programs that allow the computer to be “smart” about the choices being made (in other words: less wrong). They are not perfect, and “artificial intelligence” (also known as A.I.) has been around for a very long time. These programs are usually hidden away under computer science topics such as “cognitive sciences and machine intelligence”.

My first encounter with such a system was in the 1980s. It was called an “Expert System” at the time, and it was named Magellan. It was developed by a local Ann Arbor-based company called Emerald Intelligence, which went on to have much industry success. Over the years the technology has improved as the internet has exploded. At the heart of these systems there is something called an “Inference Engine”.

According to Wikipedia:
“an inference engine is a computer program that tries to derive answers from a knowledge base. It is the “brain” that expert systems use to reason about the information in the knowledge base for the ultimate purpose of formulating new conclusions. Inference engines are considered to be a special case of reasoning engines, which can use more general methods of reasoning.”
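To give a feel for what “deriving answers from a knowledge base” means in a classic expert system, here is a toy forward-chaining loop in Python. It is a deliberately tiny sketch written for this article: the facts and rules are invented, and it has nothing to do with how Magellan worked or with the Bayesian inference that Infer.NET performs.

```python
def forward_chain(facts, rules):
    """Apply rules of the form (premises, conclusion) until nothing new
    can be derived, and return the full set of known facts."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire the rule when all premises are known and it adds something new
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Invented knowledge base, for illustration only
rules = [
    (["has_fur", "gives_milk"], "is_mammal"),
    (["is_mammal", "eats_meat"], "is_carnivore"),
]
derived = forward_chain(["has_fur", "gives_milk", "eats_meat"], rules)
print(sorted(derived))
```

The second rule can only fire after the first has added `is_mammal`, which is the “formulating new conclusions” part of the definition above.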

Machine Intelligence at Microsoft Research

As we all know, there can be many applications of such an engine or framework for problem solving. The folks at Microsoft Research have been working on such an engine that you can use today for non-commercial projects. It’s called Infer.NET.

“Infer.NET is a framework for running Bayesian inference in graphical models. It can also be used for probabilistic programming.

You can use Infer.NET to solve many different kinds of machine learning problems, from standard problems like classification or clustering through to customized solutions to domain-specific problems. Infer.NET has been used in a wide variety of domains including information retrieval, bioinformatics, epidemiology, vision, and many others.”

Let’s take a look at how Infer.NET works and what it does for you.

The user creates a model definition using the modeling API and specifies a set of inference queries relating to the model. The user then passes the model definition and inference queries to the model compiler, which creates the source code needed to perform those queries on the model, using the specified inference algorithm. The source code may be written to a file and used directly if you need to do so.

The source code is then compiled to create a compiled algorithm. This can be executed manually, for fine-grained control over how inference is performed, or executed implicitly through the Infer method. Given a set of observed values (arrays of data), the inference engine executes the compiled algorithm to produce the marginal distributions requested in your queries. This can be repeated for different settings of the observed values without recompiling the algorithm.

From the documentation:



What can we use this for and how is it useful?


Problems that Infer.NET can solve for you: the “Click Model Example”

One of the samples provided by Microsoft Research, for instance, shows how human relevance judgments can be reconciled with document click counts. These example models let us calibrate human judgment data against click data, using query/document pairs for which we have both kinds of observation.

This can be used to identify data for which click data and human judgment data are inconsistent and need cleaning up before a ranking model can be useful. The predicted labels or scores could also be used to supplement the human judgment training data.

The problem: 
A user submits a query to a search engine, and the search engine returns a list of document hyperlinks to the user, along with a title and query-related snippet extracted from each document. The user looks at the list and, based on title and snippet, decides whether to click on a document in the list or to pass over it. These decisions are recorded in a click log. The decision of a user to click or not click on a document in the list gives an indication as to whether the document is relevant or not.

The relevance of a document to a given query can also be determined by human judgments.
Judgments are usually in the form of a set of labels with associated numeric values:

  1. Not Relevant
  2. Possibly Relevant
  3. Relevant

Building a successful search engine requires the collection of many human relevance judgments to create a valid document ranking system. These tend  to be much more expensive to collect and more valuable than the logs themselves in the grand scheme of things.
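Before reading the walkthrough, it helps to see how raw click counts become the “uncertain Gaussian observations” that the C# code below feeds into the model. The sample builds a Beta(1 + clicks, 1 + non-clicks) distribution for each query/document pair and then takes its mean and variance as the parameters of a Gaussian. The Python sketch below reproduces just that arithmetic using the standard Beta mean and variance formulas; the function name `click_observation` is made up, and the rest of the model is not reproduced here.

```python
def click_observation(clicks, exams):
    """Return (mean, variance) of Beta(1 + clicks, 1 + non_clicks).

    These two moments are what the walkthrough passes to
    Gaussian.FromMeanAndVariance for each query/document pair.
    """
    a = 1.0 + clicks             # pseudo-count of clicks
    b = 1.0 + (exams - clicks)   # pseudo-count of non-clicks
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1.0))
    return mean, variance


# The first three click/examination pairs from the walkthrough's test data
for clicks, exams in [(10, 20), (100, 200), (1000, 2000)]:
    mean, variance = click_observation(clicks, exams)
    print(f"{clicks}/{exams}: mean={mean:.4f} variance={variance:.6f}")
```

All three pairs have the same click rate, so the mean is 0.5 each time, but the variance shrinks as the number of examinations grows. That is how the model knows to trust 1000 clicks out of 2000 more than 10 out of 20.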

Code for Walk Through:

using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Models;
using MicrosoftResearch.Infer.Distributions;

namespace MicrosoftResearch.Infer.Tutorials
{
    public class ClickModel
    {
        public void Run()
        {
            // Number of label classes for this example
            int numLabels = 3;

            // Train the model
            ClickModelMarginals marginals = Model1(numLabels, false);
            if (marginals == null)
                return;

            //-----------------------------------------------------------------------------
            // The prediction model
            //-----------------------------------------------------------------------------

            // The observations will be in the form of an array of distributions
            Variable<int> numberOfObservations = Variable.New<int>().Named("NumObs");
            Range r = new Range(numberOfObservations).Named("N");
            VariableArray<Gaussian> observationDistribs = Variable.Array<Gaussian>(r).Named("Obs");
            // Use the marginals from the trained model
            Variable<double> scoreMean = Variable.Random<double>(marginals.marginalScoreMean).Named("scoreMean");
            Variable<double> scorePrec = Variable.Random<double>(marginals.marginalScorePrec).Named("scorePrec");
            Variable<double> judgePrec = Variable.Random<double>(marginals.marginalJudgePrec).Named("judgePrec");
            Variable<double> clickPrec = Variable.Random<double>(marginals.marginalClickPrec).Named("clickPrec");
            Variable<double>[] thresholds = new Variable<double>[numLabels + 1];

            // Variables for each observation
            VariableArray<double> scores = Variable.Array<double>(r).Named("Scores");
            VariableArray<double> scoresJ = Variable.Array<double>(r).Named("ScoresJ");
            VariableArray<double> scoresC = Variable.Array<double>(r).Named("ScoresC");
            scores[r] = Variable.GaussianFromMeanAndPrecision(scoreMean, scorePrec).ForEach(r);
            scoresJ[r] = Variable.GaussianFromMeanAndPrecision(scores[r], judgePrec);
            scoresC[r] = Variable.GaussianFromMeanAndPrecision(scores[r], clickPrec);
            // Constrain to the click observation
            Variable.ConstrainEqualRandom(scoresC[r], observationDistribs[r]);
            // The threshold variables
            thresholds[0] = Variable.GaussianFromMeanAndVariance(Double.NegativeInfinity, 0.0).Named("thresholds0");
            for (int i = 1; i < thresholds.Length - 1; i++)
                thresholds[i] = Variable.Random(marginals.marginalThresh[i]).Named("thresholds"+i);
            thresholds[thresholds.Length - 1] = Variable.GaussianFromMeanAndVariance(Double.PositiveInfinity, 0.0).Named("thresholds"+(thresholds.Length-1));
            // Boolean label variables
            VariableArray<bool>[] testLabels = new VariableArray<bool>[numLabels];
            for (int j = 0; j < numLabels; j++) {
                testLabels[j] = Variable.Array<bool>(r).Named("TestLabels" + j);
                testLabels[j][r] = Variable.IsBetween(scoresJ[r], thresholds[j], thresholds[j + 1]);
            }

            //--------------------------------------------------------------------
            // Running the prediction model
            //--------------------------------------------------------------------
            int[] clicks = { 10, 100, 1000, 9, 99, 999, 10, 10, 10 };
            int[] exams = { 20, 200, 2000, 10, 100, 1000, 100, 1000, 10000 };
            Gaussian[] obs = new Gaussian[clicks.Length];
            for (int i = 0; i < clicks.Length; i++) {
                int nC = clicks[i];    // Number of clicks 
                int nE = exams[i];     // Number of examinations
                int nNC = nE - nC;     // Number of non-clicks
                Beta b = new Beta(1.0 + nC, 1.0 + nNC);
                double m, v;
                b.GetMeanAndVariance(out m, out v);
                obs[i] = Gaussian.FromMeanAndVariance(m, v);
            }

            numberOfObservations.ObservedValue = obs.Length;
            observationDistribs.ObservedValue = obs;
            InferenceEngine engine = new InferenceEngine();
            Gaussian[] latentScore = engine.Infer<Gaussian[]>(scores);
            Bernoulli[][] predictedLabels = new Bernoulli[numLabels][];
            for (int j = 0; j < numLabels; j++)
                predictedLabels[j] = engine.Infer<Bernoulli[]>(testLabels[j]);

            Console.WriteLine("\n******   Some Predictions  ******\n");
            Console.WriteLine("Clicks\tExams\t\tScore\t\tLabel0\t\tLabel1\t\tLabel2");
            for (int i = 0; i < clicks.Length; i++) {
                Console.WriteLine("{0}\t{1}\t\t{2}\t\t{3}\t\t{4}\t\t{5}",
                        clicks[i], exams[i], latentScore[i].GetMean().ToString("F4"),
                        predictedLabels[0][i].GetProbTrue().ToString("F4"),
                        predictedLabels[1][i].GetProbTrue().ToString("F4"),
                        predictedLabels[2][i].GetProbTrue().ToString("F4"));
            }
        }


        static private ClickModelMarginals Model1(int numLabels, bool allowNoExams)
        {
            // Inference engine must be EP because of the ConstrainBetween constraint
            InferenceEngine engine = new InferenceEngine();
            if (!(engine.Algorithm is ExpectationPropagation))
            {
                Console.WriteLine("This example only runs with Expectation Propagation");
                return null;
            }
            engine.NumberOfIterations = 10;  // Restrict the number of iterations

            // Includes lower and upper bounds
            int numThresholds = numLabels + 1;

            //-------------------------------------------------------------
            // Specify prior distributions
            //-------------------------------------------------------------
            Gaussian priorScoreMean = Gaussian.FromMeanAndVariance(0.5, 1.0);
            Gamma priorScorePrec = Gamma.FromMeanAndVariance(2.0, 0.0);
            Gamma priorJudgePrec = Gamma.FromMeanAndVariance(2.0, 1.0);
            Gamma priorClickPrec = Gamma.FromMeanAndVariance(2.0, 1.0);
            Gaussian[] priorThresholds;
            CreateThresholdPriors(numLabels, out priorThresholds);

            //-------------------------------------------------------------
            // Variables to infer
            //-------------------------------------------------------------
            Variable<double> scoreMean = Variable.Random(priorScoreMean).Named("scoreMean");
            Variable<double> scorePrec = Variable.Random(priorScorePrec).Named("scorePrec");
            Variable<double> judgePrec = Variable.Random(priorJudgePrec).Named("judgePrec");
            Variable<double> clickPrec = Variable.Random(priorClickPrec).Named("clickPrec");
            Variable<double>[] thresholds = new Variable<double>[numLabels + 1];
            for (int i = 0; i < thresholds.Length; i++)
                thresholds[i] = Variable.Random(priorThresholds[i]).Named("thresholds"+i);

            //----------------------------------------------------------------------------------
            // The model
            //----------------------------------------------------------------------------------
            VariableArray<Gaussian>[] observationDistribs = new VariableArray<Gaussian>[numLabels];
            Variable<int>[] numberOfObservations = new Variable<int>[numLabels];
            for (int i = 0; i < numLabels; i++) {
                numberOfObservations[i] = Variable.New<int>().Named("NumObs" + i);
                Range r = new Range(numberOfObservations[i]).Named("N" + i);
                //r.AddAttribute(new Sequential()); // option to get faster convergence
                observationDistribs[i] = Variable.Array<Gaussian>(r).Named("Obs" + i);
                VariableArray<double> scores = Variable.Array<double>(r).Named("Scores" + i);
                VariableArray<double> scoresJ = Variable.Array<double>(r).Named("ScoresJ" + i);
                VariableArray<double> scoresC = Variable.Array<double>(r).Named("ScoresC" + i);
                scores[r] = Variable.GaussianFromMeanAndPrecision(scoreMean, scorePrec).ForEach(r);
                scoresJ[r] = Variable.GaussianFromMeanAndPrecision(scores[r], judgePrec);
                scoresC[r] = Variable.GaussianFromMeanAndPrecision(scores[r], clickPrec);
                Variable.ConstrainBetween(scoresJ[r], thresholds[i], thresholds[i + 1]);
                Variable.ConstrainEqualRandom(scoresC[r], observationDistribs[i][r]);
            }

            // Get the arrays of human judgement labels, clicks, and examinations
            int[] labels;
            int[] clicks;
            int[] exams;
            LoadData(@"data/ClickModel.txt", allowNoExams, out labels, out clicks, out exams);
            // Convert the raw click data into uncertain Gaussian observations chunk-by-chunk
            Gaussian[][] allObs = getClickObservations(numLabels, labels, clicks, exams);
            // (a) Set the observation and observation count parameters in the model
            for (int i = 0; i < numLabels; i++) {
                numberOfObservations[i].ObservedValue = allObs[i].Length;
                observationDistribs[i].ObservedValue = allObs[i];
            }
            // (b) Request the marginals
            ClickModelMarginals marginals = new ClickModelMarginals(numLabels);
            marginals.marginalScoreMean = engine.Infer<Gaussian>(scoreMean);
            marginals.marginalScorePrec = engine.Infer<Gamma>(scorePrec);
            marginals.marginalJudgePrec = engine.Infer<Gamma>(judgePrec);
            marginals.marginalClickPrec = engine.Infer<Gamma>(clickPrec);
            for (int i = 0; i < numThresholds; i++)
                marginals.marginalThresh[i] = engine.Infer<Gaussian>(thresholds[i]);

            Console.WriteLine("Training: sample size: " + labels.Length + "\n");
            Console.WriteLine("scoreMean = {0}", marginals.marginalScoreMean);
            Console.WriteLine("scorePrec = {0}", marginals.marginalScorePrec);
            Console.WriteLine("judgePrec = {0}", marginals.marginalJudgePrec);
            Console.WriteLine("clickPrec = {0}", marginals.marginalClickPrec);
            for (int t = 0; t < numThresholds; t++)
                Console.WriteLine("threshMean {0} = {1}", t, marginals.marginalThresh[t]);

            return marginals;
        }


 182:         static private ClickModelMarginals Model2(int numLabels, bool allowNoExams)

 183:         {

 184:             // Inference engine must be EP because of the ConstrainBetween constraint

 185:             InferenceEngine engine = new InferenceEngine();

 186:             if (!(engine.Algorithm is ExpectationPropagation))

 187:             {

 188:                 Console.WriteLine("This example only runs with Expectation Propagation");

 189:                 return null;

 190:             }

 191:             engine.NumberOfIterations = 10;


 193:             // Includes lower and upper bounds

 194:             int numThresholds = numLabels + 1;

 195:             // Partition the dat into chunks to improve the schedule

 196:             int chunkSize = 200;

 197:             // Maximum number of passes through the data

 198:             int maxPasses = 5;

 199:             // The marginals at any given stage.

 200:             ClickModelMarginals marginals = new ClickModelMarginals(numLabels);

 201:             // Compare the marginals with the previous marginals to create

 202:             // a convergence criterion

 203:             Gaussian prevMargScoreMean;

 204:             Gamma prevMargJudgePrec;

 205:             Gamma prevMargClickPrec;

 206:             double convergenceThresh = 0.01;


 208:             // Get the arrays of human judgement labels, clicks, and examinations

 209:             int[] labels;

 210:             int[] clicks;

 211:             int[] exams;

 212:             LoadData(@"data/ClickModel.txt", allowNoExams, out labels, out clicks, out exams);

 213:             // Convert the raw click data into uncertain Gaussian observations chunk-by-chunk

 214:             Gaussian[][][] allObs = getClickObservations(numLabels, chunkSize, labels, clicks, exams);

 215:             int numChunks = allObs.Length;


            //-------------------------------------------------------------
            // Specify prior distributions
            //-------------------------------------------------------------
            Gaussian priorScoreMean = Gaussian.FromMeanAndVariance(0.5, 1.0);
            Gamma priorScorePrec = Gamma.FromMeanAndVariance(2.0, 0.0);
            Gamma priorJudgePrec = Gamma.FromMeanAndVariance(2.0, 1.0);
            Gamma priorClickPrec = Gamma.FromMeanAndVariance(2.0, 1.0);
            Gaussian[] priorThresholds;
            CreateThresholdPriors(numLabels, out priorThresholds);
            //-----------------------------------------------------
            // Create shared variables - these are the variables
            // which are shared between all chunks
            //-----------------------------------------------------
            Model model = new Model(numChunks);
            SharedVariable<double> scoreMean = SharedVariable<double>.Random(priorScoreMean).Named("scoreMean");
            SharedVariable<double> scorePrec = SharedVariable<double>.Random(priorScorePrec).Named("scorePrec");
            SharedVariable<double> judgePrec = SharedVariable<double>.Random(priorJudgePrec).Named("judgePrec");
            SharedVariable<double> clickPrec = SharedVariable<double>.Random(priorClickPrec).Named("clickPrec");
            SharedVariable<double>[] thresholds = new SharedVariable<double>[numThresholds];
            for (int t = 0; t < numThresholds; t++) {
                thresholds[t] = SharedVariable<double>.Random(priorThresholds[t]).Named("threshold" + t);
            }


            //----------------------------------------------------------------------------------
            // The model
            //----------------------------------------------------------------------------------

            // Gaussian click observations are given to the model - one set of observations
            // per label class. Also the number of observations per label class is given to the model
            VariableArray<Gaussian>[] observationDistribs = new VariableArray<Gaussian>[numLabels];
            Variable<int>[] numberOfObservations = new Variable<int>[numLabels];
            // For each label, and each observation (consisting of a human judgement and
            // a Gaussian click observation), there is a latent score variable, a judgement
            // score variable, and a click score variable
            for (int i = 0; i < numLabels; i++) {
                numberOfObservations[i] = Variable.New<int>().Named("NumObs" + i);
                Range r = new Range(numberOfObservations[i]).Named("N" + i);
                observationDistribs[i] = Variable.Array<Gaussian>(r).Named("Obs" + i);
                VariableArray<double> scores = Variable.Array<double>(r).Named("Scores" + i);
                VariableArray<double> scoresJ = Variable.Array<double>(r).Named("ScoresJ" + i);
                VariableArray<double> scoresC = Variable.Array<double>(r).Named("ScoresC" + i);
                scores[r] = Variable.GaussianFromMeanAndPrecision(scoreMean.GetCopyFor(model), scorePrec.GetCopyFor(model)).ForEach(r);
                scoresJ[r] = Variable.GaussianFromMeanAndPrecision(scores[r], judgePrec.GetCopyFor(model));
                scoresC[r] = Variable.GaussianFromMeanAndPrecision(scores[r], clickPrec.GetCopyFor(model));
                Variable.ConstrainEqualRandom(scoresC[r], observationDistribs[i][r]);
                Variable.ConstrainBetween(scoresJ[r], thresholds[i].GetCopyFor(model), thresholds[i + 1].GetCopyFor(model));
            }


            //----------------------------------------------------------
            // Outer loop iterates over a number of passes
            // Inner loop iterates over the unique labels
            //----------------------------------------------------------
            Console.WriteLine("Training: sample size: " + labels.Length + "\n");
            for (int pass = 0; pass < maxPasses; pass++) {
                prevMargScoreMean = marginals.marginalScoreMean;
                prevMargJudgePrec = marginals.marginalJudgePrec;
                prevMargClickPrec = marginals.marginalClickPrec;
                for (int c = 0; c < numChunks; c++) {
                    for (int i = 0; i < numLabels; i++) {
                        numberOfObservations[i].ObservedValue = allObs[c][i].Length;
                        observationDistribs[i].ObservedValue = allObs[c][i];
                    }

                    model.InferShared(engine, c);

                    // Retrieve marginals
                    marginals.marginalScoreMean = scoreMean.Marginal<Gaussian>();
                    marginals.marginalScorePrec = scorePrec.Marginal<Gamma>();
                    marginals.marginalJudgePrec = judgePrec.Marginal<Gamma>();
                    marginals.marginalClickPrec = clickPrec.Marginal<Gamma>();
                    for (int i = 0; i < numThresholds; i++)
                        marginals.marginalThresh[i] = thresholds[i].Marginal<Gaussian>();

                    Console.WriteLine("\n****** Pass {0}, chunk {1} ******", pass, c);
                    Console.WriteLine("----- Marginals -----");
                    Console.WriteLine("scoreMean = {0}", marginals.marginalScoreMean);
                    Console.WriteLine("scorePrec = {0}", marginals.marginalScorePrec);
                    Console.WriteLine("judgePrec = {0}", marginals.marginalJudgePrec);
                    Console.WriteLine("clickPrec = {0}", marginals.marginalClickPrec);
                    for (int t = 0; t < numThresholds; t++)
                        Console.WriteLine("threshMean {0} = {1}", t, marginals.marginalThresh[t]);
                }
                // Test for convergence
                if (marginals.marginalScoreMean.MaxDiff(prevMargScoreMean) < convergenceThresh &&
                        marginals.marginalJudgePrec.MaxDiff(prevMargJudgePrec) < convergenceThresh &&
                        marginals.marginalClickPrec.MaxDiff(prevMargClickPrec) < convergenceThresh) {
                    Console.WriteLine("\n****** Inference converged ******\n");
                    break;
                }
            }
            return marginals;
        }


        // Method to read click data. This assumes a header row
        // followed by data rows with tab or comma separated text
        static private void LoadData(
                string ifn,         // The file name
                bool allowNoExams,  // Allow records with no examinations
                out int[] labels,   // Labels
                out int[] clicks,   // Clicks
                out int[] exams)    // Examinations
        {
            // File is assumed to have a header row, followed by
            // tab or comma separated label, clicks, exams
            labels = null;
            clicks = null;
            exams = null;
            int totalDocs = 0;
            string myStr;
            StreamReader mySR;
            char[] sep = { '\t', ',' };

            for (int pass = 0; pass < 2; pass++) {
                if (1 == pass) {
                    labels = new int[totalDocs];
                    clicks = new int[totalDocs];
                    exams = new int[totalDocs];
                    totalDocs = 0;
                }
                mySR = new StreamReader(ifn);
                mySR.ReadLine(); // Skip over header line
                while ((myStr = mySR.ReadLine()) != null) {
                    string[] mySplitStr = myStr.Split(sep);
                    int exm = int.Parse(mySplitStr[2]);
                    // Only include data with non-zero examinations
                    if (0 != exm || allowNoExams) {
                        if (1 == pass) {
                            int lab = int.Parse(mySplitStr[0]);
                            int clk = int.Parse(mySplitStr[1]);
                            labels[totalDocs] = lab;
                            clicks[totalDocs] = clk;
                            exams[totalDocs] = exm;
                        }
                        totalDocs++;
                    }
                }
                mySR.Close();
            }
        }


        // Count the number of documents for each label
        static private int[] getLabelCounts(int numLabs, int[] labels)
        {
            return getLabelCounts(numLabs, labels, 0, labels.Length);
        }

        // Count the number of documents for each label for a given chunk
        static private int[] getLabelCounts(int numLabs, int[] labels, int startX, int endX)
        {
            int[] cnt = new int[numLabs];
            for (int l = 0; l < numLabs; l++)
                cnt[l] = 0;

            if (startX < 0)
                startX = 0;
            if (startX >= labels.Length)
                startX = labels.Length - 1;
            if (endX < 0)
                endX = 0;
            if (endX > labels.Length)
                endX = labels.Length;

            for (int d = startX; d < endX; d++) {
                cnt[labels[d]]++;
            }
            return cnt;
        }


        // Get click observations for each label class
        static private Gaussian[][] getClickObservations(int numLabs, int[] labels, int[] clicks, int[] exams)
        {
            Gaussian[][][] obs = getClickObservations(numLabs, labels.Length, labels, clicks, exams);
            return obs[0];
        }

        // Get click observations for each chunk and label class
        static private Gaussian[][][] getClickObservations(int numLabs, int chunkSize, int[] labels, int[] clicks, int[] exams)
        {

            int nData = labels.Length;
            int numChunks = (nData + chunkSize - 1) / chunkSize;
            Gaussian[][][] chunks = new Gaussian[numChunks][][];
            int[] obsX = new int[numLabs];

            int startChunk = 0;
            int endChunk = 0;
            for (int c = 0; c < numChunks; c++) {
                startChunk = endChunk;
                endChunk = startChunk + chunkSize;
                if (endChunk > nData)
                    endChunk = nData;

                int[] labCnts = getLabelCounts(numLabs, labels, startChunk, endChunk);
                chunks[c] = new Gaussian[numLabs][];
                Gaussian[][] currChunk = chunks[c];
                for (int l = 0; l < numLabs; l++) {
                    currChunk[l] = new Gaussian[labCnts[l]];
                    obsX[l] = 0;
                }

                for (int d = startChunk; d < endChunk; d++) {
                    int lab = labels[d];
                    int nC = clicks[d];
                    int nE = exams[d];
                    int nNC = nE - nC;
                    double b0 = 1.0 + nC;  // Observations of clicks
                    double b1 = 1.0 + nNC;   // Observations of no clicks
                    Beta b = new Beta(b0, b1);
                    double m, v;
                    b.GetMeanAndVariance(out m, out v);
                    Gaussian g = new Gaussian();
                    g.SetMeanAndVariance(m, v);
                    currChunk[lab][obsX[lab]++] = g;
                }
            }
            return chunks;
        }


        // This method creates threshold priors - they are
        // set at regular intervals between 0 and 1 with overlapping
        // distributions. The lower and upper bounds are fixed to
        // 0 and 1 respectively
        static private void CreateThresholdPriors(
                int numLabels,
                out Gaussian[] priorThresholds)
        {
            double invNumLabs = 1.0 / ((double)numLabels);
            double prec = (double)(numLabels * numLabels);
            double mean = invNumLabs;
            int numThresholds = numLabels + 1;
            priorThresholds = new Gaussian[numThresholds];
            priorThresholds[0] = Gaussian.PointMass(0);
            for (int t = 1; t < numThresholds - 1; t++) {
                priorThresholds[t] = new Gaussian();
                priorThresholds[t].SetMeanAndPrecision(mean, prec);
                mean += invNumLabs;
            }
            priorThresholds[numThresholds - 1] = Gaussian.PointMass(1);
        }
    }
    public class ClickModelMarginals
    {
        public Gaussian marginalScoreMean;
        public Gamma marginalScorePrec;
        public Gamma marginalJudgePrec;
        public Gamma marginalClickPrec;
        public Gaussian[] marginalThresh;

        public ClickModelMarginals(int numLabels)
        {
            marginalScoreMean = new Gaussian();
            marginalScorePrec = new Gamma();
            marginalJudgePrec = new Gamma();
            marginalClickPrec = new Gamma();
            marginalThresh = new Gaussian[numLabels + 1];
        }
    }
}

Output from Infer.NET

====== Output from ClickModel ======
Compiling model...done.
.........| 10
Training: sample size: 522

scoreMean = Gaussian(0.4612, 0.001029)
scorePrec = Gamma.PointMass(2)
judgePrec = Gamma(89.34, 0.2212)[mean=19.76]
clickPrec = Gamma(88.32, 0.2263)[mean=19.99]
threshMean 0 = Gaussian.PointMass(0)
threshMean 1 = Gaussian(0.2013, 0.0002222)
threshMean 2 = Gaussian(0.8018, 0.0002837)
threshMean 3 = Gaussian.PointMass(1)
Compiling model...done.
.........|.........|.........|.........|.........| 50

****** Some Predictions ******

Clicks  Exams   Score   Label0  Label1  Label2
10      20      0.4958  0.1828  0.6435  0.1737
100     200     0.4964  0.1731  0.6619  0.1650
1000    2000    0.4965  0.1720  0.6641  0.1639
9       10      0.7929  0.0345  0.4764  0.4891
99      100     0.9328  0.0095  0.3278  0.6626
999     1000    0.9489  0.0082  0.3103  0.6815
10      100     0.1408  0.5767  0.4059  0.0174
10      1000    0.0522  0.6837  0.3081  0.0081
10      10000   0.0433  0.6939  0.2986  0.0075


How to solve this problem?

In this example two models are built to solve the problem. These models are the same except that the second one uses shared variables. The two models should give identical results provided the inference converges.

Building the First Model
In the first model of the example, each click or non-click provides evidence about the relevance of the query/document pair. The more examinations performed, the more believable the evidence is.

As suggested by the example author:

“We could think of the set of click/non-click events as the outcome of a binomial experiment – the probability of observing m clicks given N examinations is given by the binomial distribution Bin(m|N, μ) where μ is a parameter that we need to infer.”

Infer.NET does not provide built-in support for binomial distributions.

“We could add binomials in ourselves, or we could consider each click/non-click event as outcomes of individual Bernoulli experiments, and include each click or non-click as an individual observed variable. However, this would create a large number of variables for each query/document pair, and might be impractical in a very large scale application...”

“Instead, we adopt a practical approach where the posterior for μ is calculated outside the model. This posterior can be analytically and simply calculated as a beta distribution. We then use moment-matching to project this distribution onto a Gaussian distribution (the reason for this is that we will later be introducing a Gaussian score variable corresponding to this observation). All of this can be very simply done using the Infer.NET class libraries. For simplicity, we just assume for now that the observation distributions are in a single array, though this will change later.”
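
The moment-matching step described in the quote can be sketched without Infer.NET at all. The class and method names below (ClickObservationSketch, BetaMoments) are hypothetical and only illustrate the arithmetic; the listing above does the same job with the library's Beta and Gaussian classes. The sketch assumes a uniform Beta(1, 1) prior, which is what adding 1.0 to the click and non-click counts amounts to.

```csharp
using System;

// Hypothetical helper, not part of Infer.NET: computes the mean and variance
// of the Beta posterior over the click-through rate. These two moments are
// what the Gaussian observation is built from.
public static class ClickObservationSketch
{
    // Posterior is Beta(1 + clicks, 1 + nonClicks), assuming a uniform prior.
    public static void BetaMoments(int clicks, int exams, out double mean, out double variance)
    {
        double a = 1.0 + clicks;            // pseudo-count of clicks
        double b = 1.0 + (exams - clicks);  // pseudo-count of non-clicks
        double total = a + b;
        mean = a / total;
        variance = (a * b) / (total * total * (total + 1.0));
    }

    // Prints the moments for a few click/examination pairs.
    public static void PrintDemo()
    {
        int[] clicks = { 10, 100, 1000 };
        int[] exams = { 20, 200, 2000 };
        for (int i = 0; i < clicks.Length; i++)
        {
            double m, v;
            BetaMoments(clicks[i], exams[i], out m, out v);
            Console.WriteLine("{0}/{1}: mean = {2:F4}, variance = {3:E3}", clicks[i], exams[i], m, v);
        }
    }
}
```

All three pairs have the same mean of 0.5, but the variance shrinks from roughly 0.0109 for 10 clicks in 20 examinations to roughly 0.000125 for 1000 in 2000, which is how more examinations translate into more believable evidence.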

Understanding the Second Model and How It Differs from the First

The second model takes care of the plumbing needed for sharing information between models. The SharedVariable class is a convenient wrapper class used to specify the variables that are shared between the models. Let’s now skip ahead to look at the differences between model one and model two.

“Here we implement the same model as in click model 1 but with shared variables. There are a number of reasons why one might want to use shared variables including memory problems, parallelization, and more control over the schedule which might be necessary if there are convergence problems. Infer.NET provides a SharedVariable class and a Model class which ensure that the correct messages get marshaled between the different models. This model is available as Model2 in the example code. It mirrors the Model1 code except for the following:

  • SharedVariable objects are created in place of Variable objects for all variables that we want to infer; these are initialised with the priors.
  • Model code must be changed to refer to the instance of the SharedVariable for the current chunk.
  • The data is divided into identically sized chunks.
  • We explicitly loop over chunks, and do inference on each chunk. We need to loop over all chunks several times, checking marginals between each pass to test for convergence.
  • For each chunk, we use SharedVariable and Model class methods to obtain the variables for each sub model, and to perform inference on these variables, respectively.”
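
The chunking step can be sketched in isolation. ChunkingSketch and ChunkBounds are hypothetical names, and the chunk size of 200 in the note below is an assumed value chosen only for illustration. Like the listing above, the sketch lets the last chunk be shorter when the data does not divide evenly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Infer.NET: splits nData items into
// consecutive chunks of at most chunkSize items each.
public static class ChunkingSketch
{
    // Returns one {start, end} pair per chunk, with end exclusive.
    public static List<int[]> ChunkBounds(int nData, int chunkSize)
    {
        if (chunkSize <= 0)
            throw new ArgumentOutOfRangeException("chunkSize");

        var bounds = new List<int[]>();
        for (int start = 0; start < nData; start += chunkSize)
        {
            // The final chunk is clipped to the end of the data
            int end = Math.Min(start + chunkSize, nData);
            bounds.Add(new[] { start, end });
        }
        return bounds;
    }
}
```

For the 522 training samples reported in the output and an assumed chunk size of 200, ChunkBounds(522, 200) yields three chunks: [0, 200), [200, 400) and [400, 522). Each training pass would then visit these chunks in turn before the convergence test.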

Using the models in Prediction

// The prediction model

// The observations will be in the form of an array of distributions
Variable<int> numberOfObservations = Variable.New<int>().Named("NumObs");
Range r = new Range(numberOfObservations).Named("N");
VariableArray<Gaussian> observationDistribs = Variable.Array<Gaussian>(r).Named("Obs");
// Use the marginals from the trained model
Variable<double> scoreMean = Variable.Random<double>(marginals.marginalScoreMean).Named("scoreMean");
Variable<double> scorePrec = Variable.Random<double>(marginals.marginalScorePrec).Named("scorePrec");
Variable<double> judgePrec = Variable.Random<double>(marginals.marginalJudgePrec).Named("judgePrec");
Variable<double> clickPrec = Variable.Random<double>(marginals.marginalClickPrec).Named("clickPrec");
Variable<double>[] thresholds = new Variable<double>[numLabels + 1];

// Variables for each observation
VariableArray<double> scores = Variable.Array<double>(r).Named("Scores");
VariableArray<double> scoresJ = Variable.Array<double>(r).Named("ScoresJ");
VariableArray<double> scoresC = Variable.Array<double>(r).Named("ScoresC");
scores[r] = Variable.GaussianFromMeanAndPrecision(scoreMean, scorePrec).ForEach(r);
scoresJ[r] = Variable.GaussianFromMeanAndPrecision(scores[r], judgePrec);
scoresC[r] = Variable.GaussianFromMeanAndPrecision(scores[r], clickPrec);
// Constrain to the click observation
Variable.ConstrainEqualRandom(scoresC[r], observationDistribs[r]);
// The threshold variables
thresholds[0] = Variable.GaussianFromMeanAndVariance(Double.NegativeInfinity, 0.0).Named("thresholds0");
for (int i = 1; i < thresholds.Length - 1; i++)
    thresholds[i] = Variable.Random(marginals.marginalThresh[i]).Named("thresholds"+i);
thresholds[thresholds.Length - 1] = Variable.GaussianFromMeanAndVariance(Double.PositiveInfinity, 0.0).Named("thresholds"+(thresholds.Length-1));
// Boolean label variables
VariableArray<bool>[] testLabels = new VariableArray<bool>[numLabels];
for (int j = 0; j < numLabels; j++) {
    testLabels[j] = Variable.Array<bool>(r).Named("TestLabels" + j);
    testLabels[j][r] = Variable.IsBetween(scoresJ[r], thresholds[j], thresholds[j + 1]);
}

The training models used above were structured according to label class. For prediction there is usually no label information, but many of the components of the model are the same as for training.

The differences

  • Inferred variables from the trained model are used as the priors for the prediction model.
  • Data is not partitioned according to label, because there are no labels, so there is no loop over labels.
  • The lower and upper bound thresholds are set to negative infinity and positive infinity rather than 0.0 and 1.0, so that the label probabilities output by the model sum to 1.0.
  • An array of bool variables is set up. The marginal distributions of these, as Bernoulli distributions, give the probability of each label.
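
To see why the infinite bounds matter, here is a simplified sketch. It is not the Infer.NET model: it treats the thresholds as fixed numbers and the score as a single Gaussian, and the names LabelProbabilitySketch, NormalCdf and LabelProbabilities are hypothetical. The normal CDF uses the Abramowitz and Stegun approximation of erf, because the .NET base library of that era has no built-in error function.

```csharp
using System;

// Hypothetical illustration, not part of Infer.NET: probability that a
// Gaussian score falls between each pair of neighbouring fixed thresholds.
public static class LabelProbabilitySketch
{
    // Standard normal CDF via the Abramowitz-Stegun 7.1.26 approximation of erf
    public static double NormalCdf(double z)
    {
        if (double.IsNegativeInfinity(z)) return 0.0;
        if (double.IsPositiveInfinity(z)) return 1.0;
        double x = Math.Abs(z) / Math.Sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
                        - 0.284496736) * t + 0.254829592) * t;
        double erf = 1.0 - poly * Math.Exp(-x * x);
        return z >= 0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    // thresholds must be in increasing order; returns one probability per label
    public static double[] LabelProbabilities(double mean, double variance, double[] thresholds)
    {
        double sd = Math.Sqrt(variance);
        var probs = new double[thresholds.Length - 1];
        for (int j = 0; j < probs.Length; j++)
        {
            double upper = NormalCdf((thresholds[j + 1] - mean) / sd);
            double lower = NormalCdf((thresholds[j] - mean) / sd);
            probs[j] = upper - lower;
        }
        return probs;
    }
}
```

Calling LabelProbabilities with thresholds { -infinity, 0.2013, 0.8018, +infinity } (the inner values are the threshold means from the training output) gives three probabilities that sum to 1 for any score mean and variance. With the outer bounds left at 0.0 and 1.0, any probability mass the score places outside [0, 1] is lost and the sum falls below 1.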

Running the Prediction from the Models


// Running the prediction model
int[] clicks = { 10, 100, 1000, 9, 99, 999, 10, 10, 10 };
int[] exams = { 20, 200, 2000, 10, 100, 1000, 100, 1000, 10000 };
Gaussian[] obs = new Gaussian[clicks.Length];
for (int i = 0; i < clicks.Length; i++) {
    int nC = clicks[i];    // Number of clicks
    int nE = exams[i];     // Number of examinations
    int nNC = nE - nC;     // Number of non-clicks
    Beta b = new Beta(1.0 + nC, 1.0 + nNC);
    double m, v;
    b.GetMeanAndVariance(out m, out v);
    obs[i] = Gaussian.FromMeanAndVariance(m, v);
}

numberOfObservations.ObservedValue = obs.Length;
observationDistribs.ObservedValue = obs;
InferenceEngine engine = new InferenceEngine();
Gaussian[] latentScore = engine.Infer<Gaussian[]>(scores);
Bernoulli[][] predictedLabels = new Bernoulli[numLabels][];
for (int j = 0; j < numLabels; j++)
    predictedLabels[j] = engine.Infer<Bernoulli[]>(testLabels[j]);

Console.WriteLine("\n******   Some Predictions  ******\n");
// The header and format string below are reconstructed from the output shown
// earlier, which has three label columns
Console.WriteLine("Clicks\tExams\tScore\tLabel0\tLabel1\tLabel2");
for (int i = 0; i < clicks.Length; i++) {
    Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}\t{5}",
        clicks[i], exams[i], latentScore[i].GetMean().ToString("F4"),
        predictedLabels[0][i].GetProbTrue().ToString("F4"),
        predictedLabels[1][i].GetProbTrue().ToString("F4"),
        predictedLabels[2][i].GetProbTrue().ToString("F4"));
}

You will notice that the click data is provided as arrays of click counts and examination counts. These are converted into Gaussian observations in the same way as for the training model (though not partitioned by label), and the resulting array of distributions is set as the observed value of observationDistribs. The marginals are then requested from the inference engine. From the results we can see that the more examinations there are, the more confident the model is of the labeling.

What Infer.NET does well

Infer.NET provides the .NET programmer with:

  • Powerful and Flexible Model Construction
    The Infer.NET modeling API makes it simple to convert a conceptual model into code, and it can be used to implement a wide range of models.
    Standard models such as the Bayes point machine, latent Dirichlet allocation, factor analysis, and principal component analysis can each be implemented in only a few lines of code.
  • Scalable and Composable Models
    The Infer.NET modeling API is composable. You can implement complex conceptual models from building blocks, and you don’t have to implement the entire model at once. You can start with a simplified conceptual model that captures the basic features, then scale up the model and the data set in stages until you have a fully implemented model that can process real data sets. You can also scale up computationally, starting with a small data set and moving to much larger amounts of data, including with parallelized computation.
  • Built-in Inference Engine
    Infer.NET includes an inference engine that computes posteriors using Bayesian inference and numerical analysis. With Infer.NET, your application constructs a model, observes one or more variables, and queries the inference engine for posteriors. The query takes only a single line of code; the inference engine does the heavy lifting.
  • Separation of Model from Inference
    Many machine learning applications draw no clear distinction between the model and the inference algorithm. Infer.NET avoids this problem.

    Infer.NET maintains a clear distinction between model and inference. The model encodes basic prior knowledge. An Infer.NET model is typically confined to a single relatively small block of code. The model is often encapsulated in a separate class, so that you can use the same model for different queries.

    A separate model is straightforward to understand and modify, and is much more resistant to inconsistencies. Inconsistencies that creep in will be caught by the inference engine. The inference engine handles the computations. You can change the model without touching the inference engine, and you can change the inference algorithm without touching the model.

    By contrast, an approach that intertwines the model with the inference algorithm has drawbacks: it is typically limited to relatively simple models, the model can be difficult to change, it is easier to introduce inconsistencies into the model, and you are limited to a particular inference algorithm.