A look at Microsoft Research’s Infer.NET

By Don Burnett

What is Infer.NET and how is it useful?

Computers make decisions for us every day, whether it’s a search engine such as Microsoft’s Bing ranking results or a web site choosing advertising that may be relevant to our interests. Behind what we see are powerful decision-making programs that let the computer be “smart” about the choices being made (in other words: less wrong). They are not perfect, but “artificial intelligence” (also known as A.I.) has been around for a very long time. These programs are usually filed under computer science topics such as “cognitive sciences and machine intelligence”.

My first encounter with such a system was in the 1980s. It was called an “Expert System” at the time, and it was named Magellan. It was built by a local Ann Arbor-based company called Emerald Intelligence, which went on to have much industry success. Over the years the technology has improved as the internet has exploded. At the heart of these systems is something called an “Inference Engine”.

According to Wikipedia:
“an inference engine is a computer program that tries to derive answers from a knowledge base. It is the “brain” that expert systems use to reason about the information in the knowledge base for the ultimate purpose of formulating new conclusions. Inference engines are considered to be a special case of reasoning engines, which can use more general methods of reasoning.”

Machine Intelligence at Microsoft Research

As we all know, there can be many applications of such an engine or framework for problem solving. The folks at Microsoft Research have been working on such an engine, which you can use today for non-commercial projects. It’s called Infer.NET.


“Infer.NET is a framework for running Bayesian inference in graphical models. It can also be used for probabilistic programming.

You can use Infer.NET to solve many different kinds of machine learning problems, from standard problems like classification or clustering through to customized solutions to domain-specific problems. Infer.NET has been used in a wide variety of domains including information retrieval, bioinformatics, epidemiology, vision, and many others.”

Let’s take a look at how Infer.NET works and what it does for you.

The user creates a model definition using the modeling API and specifies a set of inference queries relating to the model. The user then passes the model definition and inference queries to the model compiler, which creates the source code needed to perform those queries on the model using the specified inference algorithm. This source code can be written to a file and used directly if you need to do so.

The source code is then compiled to create a compiled algorithm. You can execute this manually for finer control over how inference is performed, or let the Infer method execute it for you. Given a set of observed values (for example, arrays of data), the inference engine executes the compiled algorithm to produce the marginal distributions requested in your queries. This can be repeated for different settings of the observed values without recompiling.
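As a rough illustration of that workflow, here is a minimal sketch of a two-coin model. It assumes the Infer.NET assemblies are referenced and uses the same namespaces as the ClickModel listing later in this post. The class name, method name, and the coin model itself are made up for this example and are not part of the ClickModel sample. The bias is declared as an observed variable, so changing it between queries should not require a recompile.

```csharp
using System;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Models;
using MicrosoftResearch.Infer.Distributions;

// Hypothetical example class, not part of the Infer.NET samples.
public static class TwoCoinsSketch
{
    // Returns P(both coins are heads) for each supplied bias value.
    public static double[] BothHeadsProbabilities(double[] biases)
    {
        // 1. Model definition: two coins that share a bias supplied at run time.
        Variable<double> bias = Variable.New<double>().Named("bias");
        Variable<bool> firstCoin = Variable.Bernoulli(bias).Named("firstCoin");
        Variable<bool> secondCoin = Variable.Bernoulli(bias).Named("secondCoin");
        Variable<bool> bothHeads = (firstCoin & secondCoin).Named("bothHeads");

        // 2. The engine compiles the model the first time Infer is called.
        InferenceEngine engine = new InferenceEngine();

        double[] results = new double[biases.Length];
        for (int i = 0; i < biases.Length; i++)
        {
            // 3. Change the observed value and query the marginal again.
            bias.ObservedValue = biases[i];
            Bernoulli marginal = engine.Infer<Bernoulli>(bothHeads);
            results[i] = marginal.GetProbTrue();
        }
        return results;
    }
}
```

Calling `TwoCoinsSketch.BothHeadsProbabilities(new double[] { 0.5, 0.8 })` should give probabilities close to the square of each bias, since the two coins are independent.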

From the documentation:

[Figure from the Infer.NET documentation]

 

What can we use this for and how is it useful?


Problems that Infer.NET can solve for you: the “Click Model Example”

One of the samples provided by Microsoft Research, for instance, shows how human relevance judgments can be reconciled with document click counts. The example models calibrate human judgment data against click data, using query/document pairs for which we have both kinds of observation.

This can be used to identify data for which the click data and human judgment data are inconsistent and need cleaning up before a ranking model can be useful. The predicted labels or scores could also be used to supplement the human judgment training data.

The problem: 
A user submits a query to a search engine. The search engine returns a list of document hyperlinks, each with a title and a query-related snippet extracted from the document. The user looks at the list and, based on the title and snippet, decides whether to click on a document or pass over it. These decisions are recorded in a click log. The decision to click or not click on a document gives an indication as to whether the document is relevant.

The relevance of a document to a given query can also be determined by human judgments.
Judgments are usually in the form of a set of labels with associated numeric values.

  1. Not Relevant
  2. Possibly Relevant
  3. Relevant

Building a successful search engine requires collecting many human relevance judgments to create a valid document ranking system. These tend to be much more expensive to collect, and more valuable, than the click logs themselves.
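Click counts are only noisy evidence of relevance, and ten clicks out of twenty examinations should count for less than a thousand out of two thousand. The listing below handles this by treating each clicks/examinations pair as a Beta(1 + clicks, 1 + non-clicks) distribution and passing its mean and variance on as a Gaussian observation. Here is a small sketch of just that arithmetic in plain C#, with no Infer.NET dependency. The class and method names are made up for illustration.

```csharp
using System;

// Hypothetical helper, mirroring the Beta-to-Gaussian step in the listing.
public static class ClickEvidence
{
    // Mean and variance of a Beta(a, b) distribution.
    public static void BetaMeanAndVariance(double a, double b,
                                           out double mean, out double variance)
    {
        double sum = a + b;
        mean = a / sum;
        variance = (a * b) / (sum * sum * (sum + 1.0));
    }

    // Summarise "clicks out of exams" as the mean and variance of
    // Beta(1 + clicks, 1 + nonClicks), as the listing does.
    public static void FromCounts(int clicks, int exams,
                                  out double mean, out double variance)
    {
        if (clicks < 0 || exams < clicks)
            throw new ArgumentOutOfRangeException("clicks");
        int nonClicks = exams - clicks;
        BetaMeanAndVariance(1.0 + clicks, 1.0 + nonClicks, out mean, out variance);
    }
}
```

For 10 clicks in 20 examinations this gives a mean of 0.5 and a variance of 1/92 (about 0.0109). For 1000 clicks in 2000 examinations the mean is still 0.5, but the variance drops to 1/8012 (about 0.000125), so the model is told to trust that observation far more.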

Code for Walk Through:

using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
using MicrosoftResearch.Infer;
using MicrosoftResearch.Infer.Models;
using MicrosoftResearch.Infer.Distributions;

namespace MicrosoftResearch.Infer.Tutorials
{
    public class ClickModel
    {
        public void Run()
        {
            // Number of label classes for this example
            int numLabels = 3;

            // Train the model
            ClickModelMarginals marginals = Model1(numLabels, false);
            if (marginals == null)
                return;

            //-----------------------------------------------------------------------------
            // The prediction model
            //-----------------------------------------------------------------------------

            // The observations will be in the form of an array of distributions
            Variable<int> numberOfObservations = Variable.New<int>().Named("NumObs");
            Range r = new Range(numberOfObservations).Named("N");
            VariableArray<Gaussian> observationDistribs = Variable.Array<Gaussian>(r).Named("Obs");
            // Use the marginals from the trained model
            Variable<double> scoreMean = Variable.Random<double>(marginals.marginalScoreMean).Named("scoreMean");
            Variable<double> scorePrec = Variable.Random<double>(marginals.marginalScorePrec).Named("scorePrec");
            Variable<double> judgePrec = Variable.Random<double>(marginals.marginalJudgePrec).Named("judgePrec");
            Variable<double> clickPrec = Variable.Random<double>(marginals.marginalClickPrec).Named("clickPrec");
            Variable<double>[] thresholds = new Variable<double>[numLabels + 1];

            // Variables for each observation
            VariableArray<double> scores = Variable.Array<double>(r).Named("Scores");
            VariableArray<double> scoresJ = Variable.Array<double>(r).Named("ScoresJ");
            VariableArray<double> scoresC = Variable.Array<double>(r).Named("ScoresC");
            scores[r] = Variable.GaussianFromMeanAndPrecision(scoreMean, scorePrec).ForEach(r);
            scoresJ[r] = Variable.GaussianFromMeanAndPrecision(scores[r], judgePrec);
            scoresC[r] = Variable.GaussianFromMeanAndPrecision(scores[r], clickPrec);
            // Constrain to the click observation
            Variable.ConstrainEqualRandom(scoresC[r], observationDistribs[r]);
            // The threshold variables
            thresholds[0] = Variable.GaussianFromMeanAndVariance(Double.NegativeInfinity, 0.0).Named("thresholds0");
            for (int i = 1; i < thresholds.Length - 1; i++)
                thresholds[i] = Variable.Random(marginals.marginalThresh[i]).Named("thresholds"+i);
            thresholds[thresholds.Length - 1] = Variable.GaussianFromMeanAndVariance(Double.PositiveInfinity, 0.0).Named("thresholds"+(thresholds.Length-1));
            // Boolean label variables
            VariableArray<bool>[] testLabels = new VariableArray<bool>[numLabels];
            for (int j = 0; j < numLabels; j++) {
                testLabels[j] = Variable.Array<bool>(r).Named("TestLabels" + j);
                testLabels[j][r] = Variable.IsBetween(scoresJ[r], thresholds[j], thresholds[j + 1]);
            }

            //--------------------------------------------------------------------
            // Running the prediction model
            //--------------------------------------------------------------------
            int[] clicks = { 10, 100, 1000, 9, 99, 999, 10, 10, 10 };
            int[] exams = { 20, 200, 2000, 10, 100, 1000, 100, 1000, 10000 };
            Gaussian[] obs = new Gaussian[clicks.Length];
            for (int i = 0; i < clicks.Length; i++) {
                int nC = clicks[i];    // Number of clicks 
                int nE = exams[i];     // Number of examinations
                int nNC = nE - nC;     // Number of non-clicks
                Beta b = new Beta(1.0 + nC, 1.0 + nNC);
                double m, v;
                b.GetMeanAndVariance(out m, out v);
                obs[i] = Gaussian.FromMeanAndVariance(m, v);
            }

            numberOfObservations.ObservedValue = obs.Length;
            observationDistribs.ObservedValue = obs;
            InferenceEngine engine = new InferenceEngine();
            Gaussian[] latentScore = engine.Infer<Gaussian[]>(scores);
            Bernoulli[][] predictedLabels = new Bernoulli[numLabels][];
            for (int j = 0; j < numLabels; j++)
                predictedLabels[j] = engine.Infer<Bernoulli[]>(testLabels[j]);

            Console.WriteLine("\n******   Some Predictions  ******\n");
            Console.WriteLine("Clicks\tExams\t\tScore\t\tLabel0\t\tLabel1\t\tLabel2");
            for (int i = 0; i < clicks.Length; i++) {
                Console.WriteLine("{0}\t{1}\t\t{2}\t\t{3}\t\t{4}\t\t{5}",
                        clicks[i], exams[i], latentScore[i].GetMean().ToString("F4"),
                        predictedLabels[0][i].GetProbTrue().ToString("F4"),
                        predictedLabels[1][i].GetProbTrue().ToString("F4"),
                        predictedLabels[2][i].GetProbTrue().ToString("F4"));
            }
        }

        static private ClickModelMarginals Model1(int numLabels, bool allowNoExams)
        {
            // Inference engine must be EP because of the ConstrainBetween constraint
            InferenceEngine engine = new InferenceEngine();
            if (!(engine.Algorithm is ExpectationPropagation))
            {
                Console.WriteLine("This example only runs with Expectation Propagation");
                return null;
            }
            engine.NumberOfIterations = 10;  // Restrict the number of iterations

            // Includes lower and upper bounds
            int numThresholds = numLabels + 1;

            //-------------------------------------------------------------
            // Specify prior distributions
            //-------------------------------------------------------------
            Gaussian priorScoreMean = Gaussian.FromMeanAndVariance(0.5, 1.0);
            Gamma priorScorePrec = Gamma.FromMeanAndVariance(2.0, 0.0);
            Gamma priorJudgePrec = Gamma.FromMeanAndVariance(2.0, 1.0);
            Gamma priorClickPrec = Gamma.FromMeanAndVariance(2.0, 1.0);
            Gaussian[] priorThresholds;
            CreateThresholdPriors(numLabels, out priorThresholds);

            //-------------------------------------------------------------
            // Variables to infer
            //-------------------------------------------------------------
            Variable<double> scoreMean = Variable.Random(priorScoreMean).Named("scoreMean");
            Variable<double> scorePrec = Variable.Random(priorScorePrec).Named("scorePrec");
            Variable<double> judgePrec = Variable.Random(priorJudgePrec).Named("judgePrec");
            Variable<double> clickPrec = Variable.Random(priorClickPrec).Named("clickPrec");
            Variable<double>[] thresholds = new Variable<double>[numLabels + 1];
            for (int i = 0; i < thresholds.Length; i++)
                thresholds[i] = Variable.Random(priorThresholds[i]).Named("thresholds"+i);

            //----------------------------------------------------------------------------------
            // The model
            //----------------------------------------------------------------------------------
            VariableArray<Gaussian>[] observationDistribs = new VariableArray<Gaussian>[numLabels];
            Variable<int>[] numberOfObservations = new Variable<int>[numLabels];
            for (int i = 0; i < numLabels; i++) {
                numberOfObservations[i] = Variable.New<int>().Named("NumObs" + i);
                Range r = new Range(numberOfObservations[i]).Named("N" + i);
                //r.AddAttribute(new Sequential()); // option to get faster convergence
                observationDistribs[i] = Variable.Array<Gaussian>(r).Named("Obs" + i);
                VariableArray<double> scores = Variable.Array<double>(r).Named("Scores" + i);
                VariableArray<double> scoresJ = Variable.Array<double>(r).Named("ScoresJ" + i);
                VariableArray<double> scoresC = Variable.Array<double>(r).Named("ScoresC" + i);
                scores[r] = Variable.GaussianFromMeanAndPrecision(scoreMean, scorePrec).ForEach(r);
                scoresJ[r] = Variable.GaussianFromMeanAndPrecision(scores[r], judgePrec);
                scoresC[r] = Variable.GaussianFromMeanAndPrecision(scores[r], clickPrec);
                Variable.ConstrainBetween(scoresJ[r], thresholds[i], thresholds[i + 1]);
                Variable.ConstrainEqualRandom(scoresC[r], observationDistribs[i][r]);
            }


            // Get the arrays of human judgement labels, clicks, and examinations
            int[] labels;
            int[] clicks;
            int[] exams;
            LoadData(@"data/ClickModel.txt", allowNoExams, out labels, out clicks, out exams);
            // Convert the raw click data into uncertain Gaussian observations chunk-by-chunk
            Gaussian[][] allObs = getClickObservations(numLabels, labels, clicks, exams);
            // (a) Set the observation and observation count parameters in the model
            for (int i = 0; i < numLabels; i++) {
                numberOfObservations[i].ObservedValue = allObs[i].Length;
                observationDistribs[i].ObservedValue = allObs[i];
            }
            // (b) Request the marginals
            ClickModelMarginals marginals = new ClickModelMarginals(numLabels);
            marginals.marginalScoreMean = engine.Infer<Gaussian>(scoreMean);
            marginals.marginalScorePrec = engine.Infer<Gamma>(scorePrec);
            marginals.marginalJudgePrec = engine.Infer<Gamma>(judgePrec);
            marginals.marginalClickPrec = engine.Infer<Gamma>(clickPrec);
            for (int i = 0; i < numThresholds; i++)
                marginals.marginalThresh[i] = engine.Infer<Gaussian>(thresholds[i]);

            Console.WriteLine("Training: sample size: " + labels.Length + "\n");
            Console.WriteLine("scoreMean = {0}", marginals.marginalScoreMean);
            Console.WriteLine("scorePrec = {0}", marginals.marginalScorePrec);
            Console.WriteLine("judgePrec = {0}", marginals.marginalJudgePrec);
            Console.WriteLine("clickPrec = {0}", marginals.marginalClickPrec);
            for (int t = 0; t < numThresholds; t++)
                Console.WriteLine("threshMean {0} = {1}", t, marginals.marginalThresh[t]);

            return marginals;
        }

        static private ClickModelMarginals Model2(int numLabels, bool allowNoExams)
        {
            // Inference engine must be EP because of the ConstrainBetween constraint
            InferenceEngine engine = new InferenceEngine();
            if (!(engine.Algorithm is ExpectationPropagation))
            {
                Console.WriteLine("This example only runs with Expectation Propagation");
                return null;
            }
            engine.NumberOfIterations = 10;

            // Includes lower and upper bounds
            int numThresholds = numLabels + 1;
            // Partition the data into chunks to improve the schedule
            int chunkSize = 200;
            // Maximum number of passes through the data
            int maxPasses = 5;
            // The marginals at any given stage.
            ClickModelMarginals marginals = new ClickModelMarginals(numLabels);
            // Compare the marginals with the previous marginals to create
            // a convergence criterion
            Gaussian prevMargScoreMean;
            Gamma prevMargJudgePrec;
            Gamma prevMargClickPrec;
            double convergenceThresh = 0.01;

            // Get the arrays of human judgement labels, clicks, and examinations
            int[] labels;
            int[] clicks;
            int[] exams;
            LoadData(@"data/ClickModel.txt", allowNoExams, out labels, out clicks, out exams);
            // Convert the raw click data into uncertain Gaussian observations chunk-by-chunk
            Gaussian[][][] allObs = getClickObservations(numLabels, chunkSize, labels, clicks, exams);
            int numChunks = allObs.Length;

            //-------------------------------------------------------------
            // Specify prior distributions
            //-------------------------------------------------------------
            Gaussian priorScoreMean = Gaussian.FromMeanAndVariance(0.5, 1.0);
            Gamma priorScorePrec = Gamma.FromMeanAndVariance(2.0, 0.0);
            Gamma priorJudgePrec = Gamma.FromMeanAndVariance(2.0, 1.0);
            Gamma priorClickPrec = Gamma.FromMeanAndVariance(2.0, 1.0);
            Gaussian[] priorThresholds;
            CreateThresholdPriors(numLabels, out priorThresholds);
            //-----------------------------------------------------
            // Create shared variables - these are the variables
            // which are shared between all chunks
            //-----------------------------------------------------
            Model model = new Model(numChunks);
            SharedVariable<double> scoreMean = SharedVariable<double>.Random(priorScoreMean).Named("scoreMean");
            SharedVariable<double> scorePrec = SharedVariable<double>.Random(priorScorePrec).Named("scorePrec");
            SharedVariable<double> judgePrec = SharedVariable<double>.Random(priorJudgePrec).Named("judgePrec");
            SharedVariable<double> clickPrec = SharedVariable<double>.Random(priorClickPrec).Named("clickPrec");
            SharedVariable<double>[] thresholds = new SharedVariable<double>[numThresholds];
            for (int t = 0; t < numThresholds; t++) {
                thresholds[t] = SharedVariable<double>.Random(priorThresholds[t]).Named("threshold" + t);
            }

            //----------------------------------------------------------------------------------
            // The model
            //----------------------------------------------------------------------------------

            // Gaussian click observations are given to the model - one set of observations
            // per label class. Also the number of observations per label class is given to the model
            VariableArray<Gaussian>[] observationDistribs = new VariableArray<Gaussian>[numLabels];
            Variable<int>[] numberOfObservations = new Variable<int>[numLabels];
            // For each label, and each observation (consisting of a human judgement and
            // a Gaussian click observation), there is a latent score variable, a judgement
            // score variable, and a click score variable
            for (int i = 0; i < numLabels; i++) {
                numberOfObservations[i] = Variable.New<int>().Named("NumObs" + i);
                Range r = new Range(numberOfObservations[i]).Named("N" + i);
                observationDistribs[i] = Variable.Array<Gaussian>(r).Named("Obs" + i);
                VariableArray<double> scores = Variable.Array<double>(r).Named("Scores" + i);
                VariableArray<double> scoresJ = Variable.Array<double>(r).Named("ScoresJ" + i);
                VariableArray<double> scoresC = Variable.Array<double>(r).Named("ScoresC" + i);
                scores[r] = Variable.GaussianFromMeanAndPrecision(scoreMean.GetCopyFor(model), scorePrec.GetCopyFor(model)).ForEach(r);
                scoresJ[r] = Variable.GaussianFromMeanAndPrecision(scores[r], judgePrec.GetCopyFor(model));
                scoresC[r] = Variable.GaussianFromMeanAndPrecision(scores[r], clickPrec.GetCopyFor(model));
                Variable.ConstrainEqualRandom(scoresC[r], observationDistribs[i][r]);
                Variable.ConstrainBetween(scoresJ[r], thresholds[i].GetCopyFor(model), thresholds[i + 1].GetCopyFor(model));
            }

            //----------------------------------------------------------
            // Outer loop iterates over a number of passes
            // Inner loop iterates over the unique labels
            //----------------------------------------------------------
            Console.WriteLine("Training: sample size: " + labels.Length + "\n");
            for (int pass = 0; pass < maxPasses; pass++) {
                prevMargScoreMean = marginals.marginalScoreMean;
                prevMargJudgePrec = marginals.marginalJudgePrec;
                prevMargClickPrec = marginals.marginalClickPrec;
                for (int c = 0; c < numChunks; c++) {
                    for (int i = 0; i < numLabels; i++) {
                        numberOfObservations[i].ObservedValue = allObs[c][i].Length;
                        observationDistribs[i].ObservedValue = allObs[c][i];
                    }

                    model.InferShared(engine,c);

                    // Retrieve marginals
                    marginals.marginalScoreMean = scoreMean.Marginal<Gaussian>();
                    marginals.marginalScorePrec = scorePrec.Marginal<Gamma>();
                    marginals.marginalJudgePrec = judgePrec.Marginal<Gamma>();
                    marginals.marginalClickPrec = clickPrec.Marginal<Gamma>();
                    for (int i = 0; i < numThresholds; i++)
                        marginals.marginalThresh[i] = thresholds[i].Marginal<Gaussian>();

                    Console.WriteLine("\n****** Pass {0}, chunk {1} ******", pass, c);
                    Console.WriteLine("----- Marginals -----");
                    Console.WriteLine("scoreMean = {0}", marginals.marginalScoreMean);
                    Console.WriteLine("scorePrec = {0}", marginals.marginalScorePrec);
                    Console.WriteLine("judgePrec = {0}", marginals.marginalJudgePrec);
                    Console.WriteLine("clickPrec = {0}", marginals.marginalClickPrec);
                    for (int t = 0; t < numThresholds; t++)
                        Console.WriteLine("threshMean {0} = {1}", t, marginals.marginalThresh[t]);
                }
                // Test for convergence
                if (marginals.marginalScoreMean.MaxDiff(prevMargScoreMean) < convergenceThresh &&
                        marginals.marginalJudgePrec.MaxDiff(prevMargJudgePrec) < convergenceThresh &&
                        marginals.marginalClickPrec.MaxDiff(prevMargClickPrec) < convergenceThresh) {
                    Console.WriteLine("\n****** Inference converged ******\n");
                    break;
                }
            }
            return marginals;
        }

        // Method to read click data. This assumes a header row
        // followed by data rows with tab or comma separated text
        static private void LoadData(
                string ifn,         // The file name
                bool allowNoExams,  // Allow records with no examinations
                out int[] labels,   // Labels
                out int[] clicks,   // Clicks
                out int[] exams)    // Examinations
        {
            // File is assumed to have a header row, followed by
            // tab or comma separated label, clicks, exams
            labels = null;
            clicks = null;
            exams = null;
            int totalDocs = 0;
            string myStr;
            StreamReader mySR;
            char[] sep = { '\t', ',' };

            for (int pass = 0; pass < 2; pass++) {
                if (1 == pass) {
                    labels = new int[totalDocs];
                    clicks = new int[totalDocs];
                    exams = new int[totalDocs];
                    totalDocs = 0;
                }
                mySR = new StreamReader(ifn);
                mySR.ReadLine(); // Skip over header line
                while ((myStr = mySR.ReadLine()) != null) {
                    string[] mySplitStr = myStr.Split(sep);
                    int exm = int.Parse(mySplitStr[2]);
                    // Only include data with non-zero examinations
                    if (0 != exm || allowNoExams) {
                        if (1 == pass) {
                            int lab = int.Parse(mySplitStr[0]);
                            int clk = int.Parse(mySplitStr[1]);
                            labels[totalDocs] = lab;
                            clicks[totalDocs] = clk;
                            exams[totalDocs] = exm;
                        }
                        totalDocs++;
                    }
                }
                mySR.Close();
            }
        }

        // Count the number of documents for each label
        static private int[] getLabelCounts(int numLabs, int[] labels)
        {
            return getLabelCounts(numLabs, labels, 0, labels.Length);
        }

        // Count the number of documents for each label for a given chunk
        static private int[] getLabelCounts(int numLabs, int[] labels, int startX, int endX)
        {
            int[] cnt = new int[numLabs];
            for (int l = 0; l < numLabs; l++)
                cnt[l] = 0;

            if (startX < 0)
                startX = 0;
            if (startX >= labels.Length)
                startX = labels.Length - 1;
            if (endX < 0)
                endX = 0;
            if (endX > labels.Length)
                endX = labels.Length;

            for (int d = startX; d < endX; d++) {
                cnt[labels[d]]++;
            }
            return cnt;
        }

        // Get click observations for each label class
        static private Gaussian[][] getClickObservations(int numLabs, int[] labels, int[] clicks, int[] exams)
        {
            Gaussian[][][] obs = getClickObservations(numLabs, labels.Length, labels, clicks, exams);
            return obs[0];
        }

        // Get click observations for each chunk and label class
        static private Gaussian[][][] getClickObservations(int numLabs, int chunkSize, int[] labels, int[] clicks, int[] exams)
        {

            int nData = labels.Length;
            int numChunks = (nData + chunkSize - 1) / chunkSize;
            Gaussian[][][] chunks = new Gaussian[numChunks][][];
            int[] obsX = new int[numLabs];

            int startChunk = 0;
            int endChunk = 0;
            for (int c = 0; c < numChunks; c++) {
                startChunk = endChunk;
                endChunk = startChunk + chunkSize;
                if (endChunk > nData)
                    endChunk = nData;

                int[] labCnts = getLabelCounts(numLabs, labels, startChunk, endChunk);
                chunks[c] = new Gaussian[numLabs][];
                Gaussian[][] currChunk = chunks[c];
                for (int l = 0; l < numLabs; l++) {
                    currChunk[l] = new Gaussian[labCnts[l]];
                    obsX[l] = 0;
                }

                for (int d = startChunk; d < endChunk; d++) {
                    int lab = labels[d];
                    int nC = clicks[d];
                    int nE = exams[d];
                    int nNC = nE - nC;
                    double b0 = 1.0 + nC;  // Observations of clicks
                    double b1 = 1.0 + nNC;   // Observations of no clicks
                    Beta b = new Beta(b0, b1);
                    double m, v;
                    b.GetMeanAndVariance(out m, out v);
                    Gaussian g = new Gaussian();
                    g.SetMeanAndVariance(m, v);
                    currChunk[lab][obsX[lab]++] = g;
                }
            }
            return chunks;
        }

        // This method creates threshold priors - they are
        // set at regular intervals between 0 and 1 with overlapping
        // distributions. The lower and upper bounds are fixed to
        // 0 and 1 respectively
        static private void CreateThresholdPriors(
                int numLabels,
                out Gaussian[] priorThresholds)
        {
            double invNumLabs = 1.0 / ((double)numLabels);
            double prec = (double)(numLabels * numLabels);
            double mean = invNumLabs;
            int numThresholds = numLabels + 1;
            priorThresholds = new Gaussian[numThresholds];
            priorThresholds[0] = Gaussian.PointMass(0);
            for (int t = 1; t < numThresholds - 1; t++) {
                priorThresholds[t] = new Gaussian();
                priorThresholds[t].SetMeanAndPrecision(mean, prec);
                mean += invNumLabs;
            }
            priorThresholds[numThresholds - 1] = Gaussian.PointMass(1);
        }
    }
    public class ClickModelMarginals
    {
        public Gaussian marginalScoreMean;
        public Gamma marginalScorePrec;
        public Gamma marginalJudgePrec;
        public Gamma marginalClickPrec;
        public Gaussian[] marginalThresh;

        public ClickModelMarginals(int numLabels)
        {
            marginalScoreMean = new Gaussian();
            marginalScorePrec = new Gamma();
            marginalJudgePrec = new Gamma();
            marginalClickPrec = new Gamma();
            marginalThresh = new Gaussian[numLabels + 1];
        }
    }
}

Output from Infer.NET

====== Output from ClickModel ======
Compiling model...done.
Iterating:
.........| 10
Training: sample size: 522

scoreMean = Gaussian(0.4612, 0.001029)
scorePrec = Gamma.PointMass(2)
judgePrec = Gamma(89.34, 0.2212)[mean=19.76]
clickPrec = Gamma(88.32, 0.2263)[mean=19.99]
threshMean 0 = Gaussian.PointMass(0)
threshMean 1 = Gaussian(0.2013, 0.0002222)
threshMean 2 = Gaussian(0.8018, 0.0002837)
threshMean 3 = Gaussian.PointMass(1)
Compiling model...done.
Iterating:
.........|.........|.........|.........|.........| 50

****** Some Predictions ******

Clicks  Exams  Score   Label0  Label1  Label2
10      20     0.4958  0.1828  0.6435  0.1737
100     200    0.4964  0.1731  0.6619  0.1650
1000    2000   0.4965  0.1720  0.6641  0.1639
9       10     0.7929  0.0345  0.4764  0.4891
99      100    0.9328  0.0095  0.3278  0.6626
999     1000   0.9489  0.0082  0.3103  0.6815
10      100    0.1408  0.5767  0.4059  0.0174
10      1000   0.0522  0.6837  0.3081  0.0081
10      10000  0.0433  0.6939  0.2986  0.0075

 

How to solve this problem?

In this example two models are built to solve the problem. These models are the same except that the second one uses shared variables. The two models should give identical results provided the inference converges.

Building the First Model
In the first model, each click or non-click provides evidence about the relevance of the query/document pair. The more examinations that are performed, the more believable the evidence is.

As suggested by the example author:

“We could think of the set of click/non-click events as the outcome of a binomial experiment – the probability of observing m clicks given N examinations is given by the binomial distribution Bin(m|N, μ) where μ is a parameter that we need to infer.”

Infer.NET does not provide built-in support for binomial distributions.

“We could add binomials in ourselves, but instead we consider each click/non-click event as outcomes of individual Bernoulli experiments, and include each click or non-click as an individual observed variable. However, this would create a large number of variables for each query/document pair, and might be impractical in a very large scale application...”

“Instead, we adopt a practical approach where the posterior for μ is calculated outside the model. This posterior can be analytically and simply calculated as a beta distribution. We then use moment-matching to project this distribution onto a Gaussian distribution (the reason for this is that we will later be introducing a Gaussian score variable corresponding to this observation). All of this can be very simply done using the Infer.NET class libraries. For simplicity, we just assume for now that the observation distributions are in a single array, though this will change later.”
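
The calculation described in that last quote can be sketched without Infer.NET at all. The listings in this article build the observation from Beta(1 + clicks, 1 + non-clicks), which is the posterior for the click probability under a uniform prior; moment-matching then just means taking the mean and variance of that beta distribution and using them for the Gaussian. The names ClickObservation and BetaPosteriorMoments below are invented for this sketch. In the sample itself the Infer.NET Beta and Gaussian classes do this work.

```csharp
using System;

public static class ClickObservation
{
    // Mean and variance of Beta(1 + clicks, 1 + nonClicks).
    // A Gaussian with the same mean and variance is the moment-matched
    // observation that the article's listings feed into the model.
    public static (double Mean, double Variance) BetaPosteriorMoments(int clicks, int exams)
    {
        if (clicks < 0 || exams < clicks)
            throw new ArgumentOutOfRangeException(nameof(clicks), "Need 0 <= clicks <= exams.");

        double a = 1.0 + clicks;            // pseudo-count of clicks
        double b = 1.0 + (exams - clicks);  // pseudo-count of non-clicks
        double sum = a + b;

        double mean = a / sum;
        double variance = (a * b) / (sum * sum * (sum + 1.0));
        return (mean, variance);
    }
}
```

For 10 clicks in 20 examinations this gives a mean of 0.5 and a variance of 1/92 (about 0.0109). The variance shrinks as the number of examinations grows, which is how the model comes to trust a large sample more than a small one.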

Understanding the Second Model and How It Differs from the First

The second model takes care of the plumbing needed to share information between models. The SharedVariable class is a convenient wrapper used to specify the variables that are shared between the models. Let’s skip ahead and look at the differences between model one and model two.

“Here we implement the same model as in click model 1 but with shared variables. There are a number of reasons why one might want to use shared variables including memory problems, parallelization, and more control over the schedule which might be necessary if there are convergence problems. Infer.NET provides a SharedVariable class and a Model class which ensure that the correct messages get marshaled between the different models. This model is available as Model2 in the example code. It mirrors the Model1 code except for the following:

  • SharedVariable objects are created in place of Variable objects for all variables that we want to infer; these are initialised with the priors.
  • Model code must be changed to refer to the instance of the SharedVariable for the current chunk.
  • The data is divided into identically sized chunks.
  • We explicitly loop over chunks, and do inference on each chunk. We need to loop over all chunks several times, checking marginals between each pass to test for convergence.
  • For each chunk, we use SharedVariable and Model class methods to obtain the variables for each sub model, and to perform inference on these variables, respectively.”
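
The chunking step in that list might look something like the following. This is a minimal sketch and not the sample's own code; Chunking, ChunkRanges and the parameter names are made up for illustration. It mirrors the clamp in the training listing, where endChunk is capped at nData, so the last chunk can be smaller than the others when the data does not divide evenly.

```csharp
using System;
using System.Collections.Generic;

public static class Chunking
{
    // Splits the index range [0, nData) into consecutive half-open ranges
    // [Start, End) of at most chunkSize items each. The final range is
    // clamped to nData, so it may be shorter than the rest.
    public static List<(int Start, int End)> ChunkRanges(int nData, int chunkSize)
    {
        if (nData < 0)
            throw new ArgumentOutOfRangeException(nameof(nData));
        if (chunkSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var ranges = new List<(int Start, int End)>();
        for (int start = 0; start < nData; start += chunkSize)
        {
            int end = Math.Min(start + chunkSize, nData);
            ranges.Add((start, end));
        }
        return ranges;
    }
}
```

With the 522 training samples reported in the output above and a hypothetical chunk size of 200, this yields three ranges: 0 to 200, 200 to 400, and 400 to 522. Inference would then visit each range in turn and repeat the whole pass until the shared marginals stop changing.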

Using the models in Prediction

//-----------------------------------------------------------------------------
// The prediction model
//-----------------------------------------------------------------------------

// The observations will be in the form of an array of distributions
Variable<int> numberOfObservations = Variable.New<int>().Named("NumObs");
Range r = new Range(numberOfObservations).Named("N");
VariableArray<Gaussian> observationDistribs = Variable.Array<Gaussian>(r).Named("Obs");
// Use the marginals from the trained model
Variable<double> scoreMean = Variable.Random<double>(marginals.marginalScoreMean).Named("scoreMean");
Variable<double> scorePrec = Variable.Random<double>(marginals.marginalScorePrec).Named("scorePrec");
Variable<double> judgePrec = Variable.Random<double>(marginals.marginalJudgePrec).Named("judgePrec");
Variable<double> clickPrec = Variable.Random<double>(marginals.marginalClickPrec).Named("clickPrec");
Variable<double>[] thresholds = new Variable<double>[numLabels + 1];

// Variables for each observation
VariableArray<double> scores = Variable.Array<double>(r).Named("Scores");
VariableArray<double> scoresJ = Variable.Array<double>(r).Named("ScoresJ");
VariableArray<double> scoresC = Variable.Array<double>(r).Named("ScoresC");
scores[r] = Variable.GaussianFromMeanAndPrecision(scoreMean, scorePrec).ForEach(r);
scoresJ[r] = Variable.GaussianFromMeanAndPrecision(scores[r], judgePrec);
scoresC[r] = Variable.GaussianFromMeanAndPrecision(scores[r], clickPrec);
// Constrain to the click observation
Variable.ConstrainEqualRandom(scoresC[r], observationDistribs[r]);
// The threshold variables
thresholds[0] = Variable.GaussianFromMeanAndVariance(Double.NegativeInfinity, 0.0).Named("thresholds0");
for (int i = 1; i < thresholds.Length - 1; i++)
    thresholds[i] = Variable.Random(marginals.marginalThresh[i]).Named("thresholds"+i);
thresholds[thresholds.Length - 1] = Variable.GaussianFromMeanAndVariance(Double.PositiveInfinity, 0.0).Named("thresholds"+(thresholds.Length-1));
// Boolean label variables
VariableArray<bool>[] testLabels = new VariableArray<bool>[numLabels];
for (int j = 0; j < numLabels; j++) {
    testLabels[j] = Variable.Array<bool>(r).Named("TestLabels" + j);
    testLabels[j][r] = Variable.IsBetween(scoresJ[r], thresholds[j], thresholds[j + 1]);
}

The training models used above were structured according to label class.  For prediction there is usually no label information. Many of the components of the model are the same as for training.

The differences

  • Inferred variables from the trained model are used as the priors for the prediction model.
  • Data is not partitioned according to label, because there are no labels, so there is no loop over labels.
  • The lower and upper bound thresholds are set to negative infinity and positive infinity rather than 0.0 and 1.0, so that the label probabilities output by the model sum to 1.0.
  • An array of bool variables is set up for each label. The marginal distributions of these, as Bernoulli distributions, give the probability of each label.
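
To see why the infinite outer thresholds make the label probabilities sum to 1.0, here is a simplified sketch that does not use Infer.NET. It assumes the score is a single Gaussian and treats the inner thresholds as fixed points, whereas the real model treats them as Gaussian-distributed variables and runs inference over them. The class name, method names and the erf approximation (Abramowitz and Stegun 7.1.26, accurate to roughly 1e-7) are all choices made for this illustration.

```csharp
using System;

public static class LabelProbabilities
{
    // Standard normal CDF. Infinite arguments are handled explicitly;
    // finite ones use a polynomial approximation of erf.
    public static double NormalCdf(double z)
    {
        if (double.IsNegativeInfinity(z)) return 0.0;
        if (double.IsPositiveInfinity(z)) return 1.0;

        double x = Math.Abs(z) / Math.Sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                    + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.Exp(-x * x);
        return z >= 0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    // Probability that a Gaussian score with the given mean and standard
    // deviation falls between thresholds[j] and thresholds[j + 1], for each j.
    public static double[] Compute(double mean, double stdDev, double[] thresholds)
    {
        var probs = new double[thresholds.Length - 1];
        for (int j = 0; j < probs.Length; j++)
        {
            double upper = NormalCdf((thresholds[j + 1] - mean) / stdDev);
            double lower = NormalCdf((thresholds[j] - mean) / stdDev);
            probs[j] = upper - lower;
        }
        return probs;
    }
}
```

Calling Compute with thresholds of negative infinity, 0.2, 0.8 and positive infinity (the inner values are close to the trained threshold means in the output above) returns three probabilities whose sum telescopes to CDF(+∞) minus CDF(−∞), which is exactly 1. With outer thresholds of 0.0 and 1.0 instead, any probability mass below 0 or above 1 would be assigned to no label.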

Running the Prediction from the Models

            //--------------------------------------------------------------------
            // Running the prediction model
            //--------------------------------------------------------------------
            int[] clicks = { 10, 100, 1000, 9, 99, 999, 10, 10, 10 };
            int[] exams = { 20, 200, 2000, 10, 100, 1000, 100, 1000, 10000 };
            Gaussian[] obs = new Gaussian[clicks.Length];
            for (int i = 0; i < clicks.Length; i++) {
                int nC = clicks[i];    // Number of clicks
                int nE = exams[i];     // Number of examinations
                int nNC = nE - nC;     // Number of non-clicks
                Beta b = new Beta(1.0 + nC, 1.0 + nNC);
                double m, v;
                b.GetMeanAndVariance(out m, out v);
                obs[i] = Gaussian.FromMeanAndVariance(m, v);
            }

            numberOfObservations.ObservedValue = obs.Length;
            observationDistribs.ObservedValue = obs;
            InferenceEngine engine = new InferenceEngine();
            Gaussian[] latentScore = engine.Infer<Gaussian[]>(scores);
            Bernoulli[][] predictedLabels = new Bernoulli[numLabels][];
            for (int j = 0; j < numLabels; j++)
                predictedLabels[j] = engine.Infer<Bernoulli[]>(testLabels[j]);

            Console.WriteLine("\n******   Some Predictions  ******\n");
            Console.WriteLine("Clicks\tExams\t\tScore\t\tLabel0\t\tLabel1\t\tLabel2");
            for (int i = 0; i < clicks.Length; i++) {
                Console.WriteLine("{0}\t{1}\t\t{2}\t\t{3}\t\t{4}\t\t{5}",
                        clicks[i], exams[i], latentScore[i].GetMean().ToString("F4"),
                        predictedLabels[0][i].GetProbTrue().ToString("F4"),
                        predictedLabels[1][i].GetProbTrue().ToString("F4"),
                        predictedLabels[2][i].GetProbTrue().ToString("F4"));
            }
        }

 

You will notice that the click data is provided as two arrays, one of click counts and one of examination counts. The click data is converted into Gaussian observations in the same way as for the training model (though not partitioned by label). This array of distributions is set as the observed value of observationDistribs. The marginals are then requested from the inference engine. The results show that the more examinations there are, the more confident the model is of its labeling: 999 clicks in 1000 examinations gives Label2 a probability of 0.6815, against 0.4891 for 9 clicks in 10.

What Infer.NET Does Well

Infer.NET provides the .NET programmer with:

  • Powerful and Flexible Model Construction
    The Infer.NET modeling API makes it simple to convert a conceptual model into code, and it can be used to implement a wide range of models.
    Models such as the Bayes point machine, latent Dirichlet allocation, factor analysis, and principal component analysis can be implemented in only a few lines of code.
  • Scalable and Composable Models
    The Infer.NET modeling API is composable. You can build complex conceptual models from simple building blocks, and you don’t have to implement the entire model at once. You can start with a simplified conceptual model that captures the basic features, then scale up the model and the data set in stages until you have a fully implemented model that can process real data sets. You can also scale up computationally, starting with a small data set and working up to much larger amounts of data, including parallelized computation.
  • Built-in Inference Engine
    Infer.NET includes an inference engine that computes posteriors using Bayesian inference and numerical analysis. With Infer.NET, your application constructs a model, observes one or more variables, and queries the inference engine for posteriors. The query takes only a single line of code; the inference engine does the heavy lifting.
  • Separation of Model from Inference
    Infer.NET gets around a common problem: code in which there is no clear distinction between the model and the inference algorithm.

    Infer.NET maintains a clear distinction between model and inference. The model encodes basic prior knowledge. An Infer.NET model is typically confined to a single relatively small block of code. The model is often encapsulated in a separate class, so that you can use the same model for different queries.

    A separate model is straightforward to understand and modify, and is much more resistant to inconsistencies. Inconsistencies that do creep in will be caught by the inference engine. The inference engine handles the computations. You can change the model without touching the inference engine, and you can change the inference algorithm without touching the model.

  • Drawbacks of Mixing Model and Inference
    Without that separation, an application tends to be limited to relatively simple models. The model is difficult to change, it is easier to introduce inconsistencies into it, and you are tied to a particular inference algorithm.


About Don Burnett

Changing how people interact with software

Posted on November 20, 2011, in Machine Learning.
