A look at Microsoft Research’s Infer.NET
By Don Burnett
What is Infer.NET and how is it useful?
Computers make decisions for us every day, whether we are using a search engine such as Microsoft’s Bing or being shown advertising on a web site that may be relevant to our interests. Behind what we see are powerful decision-making programs that allow the computer to be “smart” about the choices being made (in other words, less wrong). They are not perfect, even though “artificial intelligence” (also known as A.I.) has been around for a very long time. These programs are usually hidden away under computer science topics such as “cognitive sciences and machine intelligence”.
My first encounter with such a system was in the 1980s. It was called an “Expert System” at the time, and it was named Magellan. It was developed by a local Ann Arbor-based company called Emerald Intelligence, which went on to have much industry success. Over the years the technology has improved as the internet has exploded. At the heart of these systems is something called an “Inference Engine”.
According to Wikipedia:
“an inference engine is a computer program that tries to derive answers from a knowledge base. It is the “brain” that expert systems use to reason about the information in the knowledge base for the ultimate purpose of formulating new conclusions. Inference engines are considered to be a special case of reasoning engines, which can use more general methods of reasoning.”
Machine Intelligence at Microsoft Research
As we all know, there can be many applications of such an engine or framework for problem solving. The folks at Microsoft Research have been working on such an engine, which you can use today for non-commercial projects. It’s called Infer.NET.
“Infer.NET is a framework for running Bayesian inference in graphical models. It can also be used for probabilistic programming.
You can use Infer.NET to solve many different kinds of machine learning problems, from standard problems like classification or clustering through to customized solutions to domain-specific problems. Infer.NET has been used in a wide variety of domains including information retrieval, bioinformatics, epidemiology, vision, and many others.”
Let’s take a look at how Infer.NET works and what it does for you.
The user creates a model definition using the modeling API and specifies a set of inference queries relating to the model. The user then passes the model definition and inference queries to the model compiler, which creates the source code needed to perform those queries on the model using the specified inference algorithm. This source code can be written to a file and used directly if needed.
The source code is compiled to create a compiled algorithm. This can be executed manually for finer control over how inference is performed, or executed by the Infer method. Given a set of observed values (arrays of data), the inference engine executes the compiled algorithm to produce the marginal distributions requested in the queries. This can be repeated for different settings of the observed values without recompiling the algorithm.
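To make the "compile once, run many times" idea concrete, here is a minimal sketch in Python. It is not the Infer.NET API: the two-coin model, the `compile_model` function and the variable names are all my own illustrative assumptions. It simply builds an inference routine once and then runs it with different observed values.

```python
from itertools import product

def compile_model(prior_first=0.5, prior_second=0.5):
    """Build the inference routine once (hypothetical stand-in for a model compiler)."""
    def infer(observed=None):
        observed = observed or {}
        weights = {"first": 0.0, "second": 0.0, "both": 0.0}
        total = 0.0
        # Enumerate every joint outcome of the two coins.
        for first, second in product([True, False], repeat=2):
            state = {"first": first, "second": second, "both": first and second}
            # Skip outcomes that contradict the observed values.
            if any(state[name] != value for name, value in observed.items()):
                continue
            p = (prior_first if first else 1 - prior_first) * \
                (prior_second if second else 1 - prior_second)
            total += p
            for name in weights:
                if state[name]:
                    weights[name] += p
        # Normalise to get the marginal probability that each variable is True.
        return {name: w / total for name, w in weights.items()}
    return infer

infer = compile_model()
print(infer())                  # marginals with nothing observed
print(infer({"both": False}))   # same routine, new observed value, no rebuild
```

The point of the design is that the expensive step (building the routine) happens once, while the cheap step (plugging in observed values) can be repeated as often as needed.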
From the documentation:
What can we use this for and how is it useful?
Problems that Infer.NET can solve for you: the “Click Model Example”
One of the samples provided by Microsoft Research, for instance, shows how human relevance judgments can be reconciled with document click counts. The example models let us calibrate human judgment data against click data, using query/document pairs for which we have both kinds of observation.
This can be used to identify cases where click data and human judgment data are inconsistent and need to be cleaned up before a ranking model can be useful. The predicted labels or scores could also be used to supplement the human judgment training data.
A user submits a query to a search engine, and the search engine returns a list of document hyperlinks to the user, each with a title and a query-related snippet extracted from the document. The user looks at the list and, based on title and snippet, decides whether to click on a document in the list or to pass over it. These decisions are recorded in a click log. The decision of a user to click or not click on a document in the list gives an indication as to whether the document is relevant or not.
The relevance of a document to a given query can also be determined by human judgments.
Judgments are usually in the form of a set of labels with associated numeric values.
- Not Relevant
- Possibly Relevant
Building a successful search engine requires the collection of many human relevance judgments to create a valid document ranking system. These judgments tend to be much more expensive to collect than click logs, and more valuable in the grand scheme of things.
Code for Walk Through:
Output from Infer.NET
How to solve this problem?
In this example two models are built to solve the problem. These models are the same except that the second one uses shared variables. The two models should give identical results provided the inference converges.
Building the First Model
In the first model of the example provided, each click or non-click provides evidence about the relevance of the query/document pair. The more examinations are performed, the more believable the evidence is.
“We could think of the set of click/non-click events as the outcome of a binomial experiment – the probability of observing m clicks given N examinations is given by the binomial distribution Bin(m|N, μ) where μ is a parameter that we need to infer.”
Infer.NET does not provide built-in support for binomial distributions.
“We could add binomials in ourselves, but instead we consider each click/non-click event as outcomes of individual Bernoulli experiments, and include each click or non-click as an individual observed variable. However, this would create a large number of variables for each query/document pair, and might be impractical in a very large scale application...”
“Instead, we adopt a practical approach where the posterior for μ is calculated outside the model. This posterior can be analytically and simply calculated as a beta distribution. We then use moment-matching to project this distribution onto a Gaussian distribution (the reason for this is that we will later be introducing a Gaussian score variable corresponding to this observation). All of this can be very simply done using the Infer.NET class libraries. For simplicity, we just assume for now that the observation distributions are in a single array, though this will change later.”
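The calculation described in that last excerpt is short enough to sketch by hand. The Python below is only an illustration of the math, not the Infer.NET class library. It assumes a uniform Beta(1, 1) prior on the click probability (my assumption; the excerpt does not state the prior), and the function names and counts are hypothetical.

```python
def beta_posterior(clicks, examinations, prior_a=1.0, prior_b=1.0):
    """Beta posterior for the click probability after observing the counts.

    Assumes a Beta(prior_a, prior_b) prior; the default is uniform.
    """
    a = prior_a + clicks
    b = prior_b + (examinations - clicks)
    return a, b

def moment_match_to_gaussian(a, b):
    """Return the mean and variance of Beta(a, b), i.e. the matching Gaussian."""
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, variance

# Hypothetical query/document pair: 3 clicks in 10 examinations.
a, b = beta_posterior(clicks=3, examinations=10)
mean, variance = moment_match_to_gaussian(a, b)
print(f"Beta({a}, {b}) -> Gaussian(mean={mean:.4f}, variance={variance:.4f})")
```

Note how more examinations make `a + b` larger and therefore shrink the variance, which matches the intuition that more examinations make the evidence more believable.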
Understanding the Second Model and How It Differs from the First
The second model takes care of the plumbing needed for sharing information between models. The SharedVariable class is a convenient wrapper class used to specify the variables that are shared between the models. Let’s now skip ahead to look at the differences between model one and model two.
“Here we implement the same model as in click model 1 but with shared variables. There are a number of reasons why one might want to use shared variables including memory problems, parallelization, and more control over the schedule which might be necessary if there are convergence problems. Infer.NET provides a SharedVariable class and a Model class which ensure that the correct messages get marshaled between the different models. This model is available as Model2 in the example code. It mirrors the Model1 code except for the following:
- SharedVariable objects are created in place of Variable objects for all variables that we want to infer; these are initialised with the priors.
- Model code must be changed to refer to the instance of the SharedVariable for the current chunk.
- The data is divided into identically sized chunks.
- We explicitly loop over chunks, and do inference on each chunk. We need to loop over all chunks several times, checking marginals between each pass to test for convergence.
- For each chunk, we use SharedVariable and Model class methods to obtain the variables for each sub model, and to perform inference on these variables, respectively.”
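The chunking steps in the list above can be sketched on a much simpler model. The Python below is not the SharedVariable or Model class from Infer.NET. It is a toy Beta-Bernoulli version, under my own assumptions, in which each chunk stores the evidence it contributes as a "message", and the prior seen by a chunk is the overall prior combined with the messages from all the other chunks.

```python
def chunked_inference(data, chunk_size, prior=(1.0, 1.0), max_passes=5):
    """Infer a Beta posterior over a click probability, one chunk at a time."""
    # Divide the data into identically sized chunks.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # One stored message per chunk: the (clicks, non-clicks) it contributes.
    messages = [(0.0, 0.0)] * len(chunks)
    marginal = prior
    for _ in range(max_passes):
        previous = marginal
        for i, chunk in enumerate(chunks):
            # Prior for this chunk: overall prior plus every other chunk's message.
            cavity_a = prior[0] + sum(m[0] for j, m in enumerate(messages) if j != i)
            cavity_b = prior[1] + sum(m[1] for j, m in enumerate(messages) if j != i)
            # Exact Beta-Bernoulli update using this chunk's data only.
            post_a = cavity_a + sum(chunk)
            post_b = cavity_b + len(chunk) - sum(chunk)
            # Store what this chunk added, so other chunks can use it.
            messages[i] = (post_a - cavity_a, post_b - cavity_b)
            marginal = (post_a, post_b)
        # Check the marginal between passes to test for convergence.
        if all(abs(x - y) < 1e-9 for x, y in zip(marginal, previous)):
            break
    return marginal

clicks = [1, 0, 0, 1, 0, 0, 0, 1]   # hypothetical click (1) / non-click (0) events
print(chunked_inference(clicks, chunk_size=4))
print(chunked_inference(clicks, chunk_size=len(clicks)))  # single chunk, for comparison
```

Because this toy model is conjugate, it converges after one pass and the second pass only confirms it. In a real model with approximate inference, several passes may be needed, which is why the loop and the convergence check are there.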
Using the models in Prediction
The training models used above were structured according to label class. For prediction there is usually no label information. Many of the components of the model are the same as for training.
The prediction model differs from the training models in the following ways:
- Inferred variables from the trained model are used as the priors for the prediction model.
- Data is not partitioned according to label, because there are no labels, so there is no loop over labels.
- The lower and upper bound thresholds are set to negative infinity and positive infinity rather than 0.0 and 1.0, so that the label probabilities output by the model sum to 1.0.
- An array of bool variables is set up. The marginal distributions of these, as Bernoulli distributions, give the probability of each label.
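The effect of the infinite outer thresholds on the label probabilities can be checked with a few lines of Python. This is a simplified sketch, not the example's actual model: it assumes that a label is assigned when a Gaussian score falls between two adjacent thresholds, and the score parameters and inner threshold values are hypothetical.

```python
import math

def gaussian_cdf(x, mean, variance):
    """Cumulative distribution function of a Gaussian; handles infinite x."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * variance)))

def label_probabilities(mean, variance, inner_thresholds,
                        lower=-math.inf, upper=math.inf):
    """Probability that the score falls in each interval between thresholds."""
    bounds = [lower, *inner_thresholds, upper]
    return [gaussian_cdf(hi, mean, variance) - gaussian_cdf(lo, mean, variance)
            for lo, hi in zip(bounds, bounds[1:])]

inner = [0.25, 0.5, 0.75]   # hypothetical thresholds separating four labels
unbounded = label_probabilities(0.4, 0.05, inner)
bounded = label_probabilities(0.4, 0.05, inner, lower=0.0, upper=1.0)
print(sum(unbounded))   # covers the whole real line
print(sum(bounded))     # misses the probability mass outside [0, 1]
```

With outer bounds of 0.0 and 1.0, any part of the score distribution that falls below 0.0 or above 1.0 belongs to no label, so the label probabilities add up to less than one.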
Running the Prediction from the Models
You will notice that the click data is provided as arrays of click and examination counts. The click data is converted into Gaussian observations in the same way as in the training model (though not by label). This distribution array is set as the value of the observationDistrib parameter. The marginals are then requested from the inference engine. From the results we can determine how confident the model is in each labeling.
What Infer.NET does well
Infer.NET provides the .NET programmer with:
- Powerful and Flexible Model Construction
The Infer.NET modeling API makes it simple to convert a conceptual model into code. The API can be used to implement a wide range of models.
Models such as the Bayes point machine, latent Dirichlet allocation, factor analysis, and principal component analysis can be implemented in only a few lines of code.
- Scalable and Composable Models
The Infer.NET modeling API is composable. You can implement complex conceptual models from building blocks. You don’t have to implement the entire model at once. You can start with a simplified conceptual model, which captures the basic features. You can then scale up the model and the data set in stages until you have a fully implemented model that can process real data sets. You can also scale up these models computationally: start with a small data set, then scale up to handle much larger amounts of data, including by using parallelized computation.
- Built-in Inference Engine
Infer.NET includes an inference engine that computes posteriors using Bayesian inference and numerical analysis. With Infer.NET, your application constructs a model, observes one or more variables, and queries the inference engine for posteriors. The query takes only a single line of code. The inference engine does the heavy lifting.
- Separation of Model from Inference
Infer.NET avoids a common problem in machine learning code: the lack of a clear distinction between the model and the inference algorithm.
Infer.NET maintains a clear distinction between model and inference. The model encodes basic prior knowledge. An Infer.NET model is typically confined to a single relatively small block of code. The model is often encapsulated in a separate class, so that you can use the same model for different queries.
A separate model is straightforward to understand and modify, and is much more resistant to inconsistencies. Inconsistencies that creep in will be caught by the inference engine. The inference engine handles the computations. You can change the model without touching the inference engine, and you can change the inference algorithm without touching the model.
By contrast, code that does not separate the model from the inference algorithm tends to be limited to relatively simple models. The model can be difficult to change, it is easier to introduce inconsistencies into it, and you are limited to a particular inference algorithm.
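The benefit of keeping the model apart from the inference algorithm can be shown with a small Python sketch. It is not Infer.NET code: the rain/wet-grass model, its probabilities and both inference functions are my own hypothetical examples. The model is written once, and two different inference algorithms are run against it without modifying it.

```python
import random

# Model: describes the joint distribution only, with no inference logic.
def joint_probability(rain, wet):
    p_rain = 0.2 if rain else 0.8                 # hypothetical prior on rain
    p_wet_if = 0.9 if rain else 0.1               # hypothetical P(wet | rain)
    return p_rain * (p_wet_if if wet else 1.0 - p_wet_if)

# Inference algorithm 1: exact enumeration over the hidden variable.
def infer_exact(joint, wet):
    weights = {rain: joint(rain, wet) for rain in (True, False)}
    return weights[True] / (weights[True] + weights[False])

# Inference algorithm 2: importance sampling with a uniform proposal.
def infer_by_sampling(joint, wet, samples=20_000, seed=0):
    rng = random.Random(seed)
    weight_rain = weight_total = 0.0
    for _ in range(samples):
        rain = rng.random() < 0.5                 # propose rain uniformly
        w = joint(rain, wet)                      # weight by the model
        weight_total += w
        if rain:
            weight_rain += w
    return weight_rain / weight_total

# Same model, same query (P(rain | wet grass)), two interchangeable algorithms.
print(infer_exact(joint_probability, wet=True))
print(infer_by_sampling(joint_probability, wet=True))
```

Changing the probabilities in `joint_probability` requires no change to either inference function, and swapping one inference function for the other requires no change to the model.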