Use Bing API and MSDN metadata to generate code automagically (Part 2)

Back in Part 1, I introduced you to MSDN’s rich metadata and Bing’s undocumented meta: keyword. At the conclusion of the post, I mentioned creating an add-in for Visual Studio 2010. Unfortunately, after tinkering with Visual Studio for a bit, I couldn’t nail down exactly what I wanted the add-in to do. Procrastination set in and… well, that was the end of that idea. Instead, I’ll introduce you to Bing’s Search API and create a simple command-line utility to retrieve function prototypes (ultimately for code generation purposes).

The Bing API, which has reached version 2.2, supports a number of protocols – JSON, XML, and SOAP (web services). If I were going to write code that runs in the browser, I’d pick JSON, as it’s natively supported by every mainstream browser’s JavaScript implementation. For our little C# client, however, we’ll consume the web service instead. (The XML protocol would be ideal for applications that aren’t SOAP smart, like Adobe’s Flash.)

Whoa there. Before you can access the API, you need an application ID – or key. Don’t worry, it’s free. Just fill out the AppID Request Form and you’ll get a bunch of letters and numbers back.

In a fresh console project, add a reference to http://api.bing.net/search.wsdl?AppID=[appid]&Version=2.2, replacing [appid] with your unique key. (Steps on how to add a web service reference can be found on MSDN.) Visual Studio will then connect to the web service, parse the Web Services Description Language (WSDL) document, and generate a bunch of classes for you.

Using the newly generated classes, and page 14 of the API Basics documentation as a guide, I created a simple search utility. Be careful! The API Basics documentation refers to a LiveSearchService class. This class doesn’t exist anymore. Use the BingPortTypeClient class instead.

Here’s what I have so far (C#):

using (BingPortTypeClient service = new BingPortTypeClient())
{
 SearchRequest request = new SearchRequest();
 request.AppId = "05167YY01EWW5E89E4B0694ZZ9C8807FABFXXXXX";
 request.Sources = new SourceType[] { SourceType.Web };
 request.Query = "Microsoft";

 request.Web = new WebRequest();
 request.Web.Count = 1;
 request.Web.CountSpecified = true;
}

As you can see, there’s no magic here. I’m simply building a web request for the first result that matches my query of “Microsoft”. For our code generation purposes, this isn’t very useful. Let’s make some modifications, using the tips gleaned from Part 1 of the series.

using (BingPortTypeClient service = new BingPortTypeClient())
{
 SearchRequest request = new SearchRequest();
 request.AppId = "05167YY01EWW5E89E4B0694ZZ9C8807FABFXXXXX";
 request.Sources = new SourceType[] { SourceType.Web };
 request.Query = "site:msdn.microsoft.com";

 request.Web = new WebRequest();
 request.Web.Count = 1;
 request.Web.CountSpecified = true;
 request.Web.Options = new WebSearchOption[] { WebSearchOption.DisableHostCollapsing };
 request.Web.SearchTags = new string[] { "Search.MSHAttr.APIName:CreateFile" };

 SearchResponse response = service.Search(request);
}

As you can see, I altered the Query to restrict our search to MSDN. I also added the DisableHostCollapsing flag to work around a host collapsing issue (as indicated in Part 1), and a “search tag” to find only pages containing the meta tag Search.MSHAttr.APIName with a value of CreateFile. If you execute this code, you’ll quickly discover…

The SearchTags property doesn’t work (the way it should).

Here’s what the documentation on SearchTags states:

WebRequest.SearchTags Property (Bing, Version 2.1)

Specifies name and value combinations for which to search within the HTML <meta> tag. Only pages that contain these combinations will be returned. […] The format for the property is WebRequest.SearchTags = “Search.Name:Value”

… and here’s what a Bing engineer stated, upon inquiry …

[…] In the Web.SearchTags case you are asking to have the tag returned in the array of SearchTags if one is found on a result, for different processing in the client. The reason for this is that one of the original usages for the search tags were to display differently certain tagged entries (for example having an extra URL to an image), not to select specific results.

This is clearly a case of “how it should work” vs. “how it really works”. To work around this ridiculous behavior, we’ll just shove all our tags into the Query property, as if we were typing all our terms into the Bing search box. Here’s what my corrected code looks like:

using (BingPortTypeClient service = new BingPortTypeClient())
{
 SearchRequest request = new SearchRequest();
 request.AppId = "05167YY01EWW5E89E4B0694ZZ9C8807FABFXXXXX";
 request.Sources = new SourceType[] { SourceType.Web };
 request.Query = string.Format("meta:Search.MSHAttr.APIName(\"{0}\")", "CreateFile");

 request.Web = new WebRequest();
 request.Web.Count = 1;
 request.Web.CountSpecified = true;
 request.Web.Options = new WebSearchOption[] { WebSearchOption.DisableHostCollapsing };

 SearchResponse response = service.Search(request);
}

It works! The result returned points to CreateFile’s home on MSDN. Now what?
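
A quick aside before moving on: if you plan to look up more than one function, the query string from the corrected code could be wrapped in a small helper. This is only a sketch. BingQuery and ForApiName are names I made up for illustration (they are not part of the Bing API), and the query format is simply the one used in the code above.

```csharp
using System;

static class BingQuery
{
    // Hypothetical helper, not part of the Bing API: builds a query that
    // uses the meta: keyword to match pages whose Search.MSHAttr.APIName
    // meta tag has the given value.
    public static string ForApiName(string apiName)
    {
        if (string.IsNullOrEmpty(apiName))
            throw new ArgumentException("An API name is required.", "apiName");

        // The inner quotes must be escaped inside the C# string literal.
        return string.Format("meta:Search.MSHAttr.APIName(\"{0}\")", apiName);
    }
}
```

With this in place, the assignment in the search code would become request.Query = BingQuery.ForApiName("CreateFile");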

Well, remember – I wanted to pull function prototypes, for the purpose of automatically generating code. We need to add code to pull the prototype information. Take a look at the page’s XHTML code.

[Image: XHTML source of the MSDN page, showing the prototype inside a <pre> element]

See the <pre> element that encases the prototype snippet? Browsing around MSDN, you’ll see this element – for the majority of the Win32 API functions documented – is first on the page. It’s pretty safe to say, then, that the prototype will be contained in the first <pre> element on the page. Let’s add some additional code to grab it.

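// Requires: using System.Xml.Linq; using System.Xml.XPath;

/* ... earlier search code, inside the using block ... */

 // URL of the single result returned by the search.
 string url = response.Web.Results[0].Url;

 // Load the XHTML page and select the first <pre> element on it.
 XDocument xdoc = XDocument.Load(url);
 XElement xe = xdoc.XPathSelectElement("//pre");

 // Fall back to an empty string if the page has no <pre> element.
 string prototype = (xe != null) ? xe.Value : string.Empty;
}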

Again, nothing magic here. We simply load the XHTML page into LINQ to XML’s XDocument class and use the XPathSelectElement extension method to return the first <pre> element found. From here, you can alter the prototype using regular expressions and pump your own code into the body of the function. As text parsing is a problem with many solutions, I’ll stop here. Cliffhanger!
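
For the curious, here is one possible starting point for that regular expression step. It is a rough sketch, not a finished parser. PrototypeParser and ToStub are hypothetical names of my own, and the code assumes the prototype looks like "return-type [calling-convention] Name(params);" with no parentheses inside the parameter list, and that annotations such as __in or __in_opt begin with an underscore. Anything fancier (function pointers, for instance) would need a smarter approach.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class PrototypeParser
{
    // Captures everything before the function name (return type plus any
    // calling convention), the name itself, and the raw parameter list.
    static readonly Regex Pattern = new Regex(
        @"^\s*(?<ret>\w[\w\s\*]*?)\s+(?<name>\w+)\s*\((?<params>[^)]*)\)\s*;",
        RegexOptions.Singleline);

    // Turns a prototype into a function definition with the given body.
    public static string ToStub(string prototype, string body)
    {
        Match m = Pattern.Match(prototype);
        if (!m.Success)
            throw new FormatException("Unrecognized prototype.");

        string[] parameters = m.Groups["params"].Value
            .Split(',')
            // Drop a leading annotation such as __in or __in_opt (assumed style).
            .Select(p => Regex.Replace(p.Trim(), @"^_\w+\s+", ""))
            // Collapse runs of whitespace left over from column alignment.
            .Select(p => Regex.Replace(p, @"\s+", " "))
            .Where(p => p.Length > 0)
            .ToArray();

        return string.Format("{0} {1}({2})\n{{\n    {3}\n}}",
            Regex.Replace(m.Groups["ret"].Value, @"\s+", " "),
            m.Groups["name"].Value,
            string.Join(", ", parameters),
            body);
    }
}
```

You would call it as PrototypeParser.ToStub(prototype, "return NULL;"), passing the string pulled from the <pre> element along with whatever body you want to inject.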