What's with the Rs, Bs, WRs, CCs, DCAs, and PTs in the Windows Store XML?

Like Raymond Chen, I spelunk around Windows during my lunch breaks. Today, I came across raw Windows Store XML -- the XML sent down to the Windows Store app on your Windows machine. I noticed immediately that it wasn't your ordinary semantic XML. Instead, the element names had been shortened into mechanical, almost alien versions, some consisting of only one letter.

Here's a snippet of XML delivered as part of the Where's My Water? 2 page:

<T>Where&#x2019;s My Water? 2</T>
<!-- // Snipped // -->
<!-- // Snipped // -->

Why would anyone name their elements this way?

I suspect this is an optimization to minimize the amount of data streaming from the datacenters hosting the Windows Store. (Or maybe storage on disk?) To compute a rough savings figure here, I copied the XML and replaced all the cryptic element names with saner versions. I then ran both through Visual Studio's document formatting feature (i.e. XML tidy) and gzip'ed them.
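For anyone who wants to try a similar comparison, here is a minimal Python sketch of the rename-then-compress step. The `RENAMES` table and the helper names are my own illustration: only the `T` to `Title` pair appears in this post, and Python's `gzip` module will not produce the same byte counts as 7-Zip's gzip encoder.

```python
import gzip
import re

# Hypothetical mapping from terse to readable element names.
# Only T -> Title is shown in the post; the full mapping is not.
RENAMES = {"T": "Title"}

def rename_elements(xml_text, renames):
    """Rename opening and closing tags whose names appear in `renames`."""
    def swap(match):
        slash, name = match.group(1), match.group(2)
        return "<" + slash + renames.get(name, name)
    # Matches "<Name" or "</Name". Comments ("<!--") are skipped because
    # "!" cannot start an element name.
    return re.sub(r"<(/?)([A-Za-z_][\w.-]*)", swap, xml_text)

def sizes(xml_text):
    """Return (raw byte count, gzip'ed byte count) for the markup."""
    raw = xml_text.encode("utf-8")
    return len(raw), len(gzip.compress(raw, compresslevel=9))

original = "<T>Where&#x2019;s My Water? 2</T>"
renamed = rename_elements(original, RENAMES)
print(renamed)
print(sizes(original), sizes(renamed))
```

This is a regex shortcut rather than a real XML parse, so it would also rewrite anything that looks like a tag inside a CDATA section. For a one-off size comparison that seemed acceptable.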

Here's a snippet of the cleaned up version:

<Title>Where&#x2019;s My Water? 2</Title>
<!-- // Snipped // -->
<!-- // Snipped // -->

Here are the compression results (7-Zip 9.32a 64-bit, gzip format at the ultra level, with a 32KB dictionary and a word size of 128):

Original XML: 6,523 bytes
Renamed XML: 8,843 bytes
Difference: 2,320 bytes

Gzip'ed original XML: 2,429 bytes
Gzip'ed renamed XML: 2,727 bytes
Difference: 298 bytes

As you can see, the difference between the gzip'ed original and modified XML markup is negligible. Even if each of 110 million Windows 8 users fetched this page once, we're talking a relatively small savings of ~32.8GB (298 bytes x 110 million).
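The back-of-the-envelope arithmetic, as a small Python sketch. It assumes one fetch of this page per user and decimal gigabytes (10^9 bytes); the function name is made up for illustration.

```python
def total_savings_gb(bytes_saved_per_fetch, fetches):
    """Total bytes saved across all fetches, in decimal gigabytes."""
    return bytes_saved_per_fetch * fetches / 1e9

USERS = 110_000_000  # Windows 8 user count used in this post

# Gzip'ed renamed XML minus gzip'ed original XML: 2,727 - 2,429 = 298 bytes
gzip_savings = total_savings_gb(2_727 - 2_429, USERS)
print(f"{gzip_savings:.1f} GB")  # 32.8 GB
```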

Is it worth the extra transformation work on each side? (Think CPU, battery)


I took another look at the requests and replies going back and forth, and it turns out the Windows Store app omits the HTTP Accept-Encoding header that would normally indicate support for gzip'ed content. So it doesn't appear the XML is getting gzip'ed at all. That changes our savings figure to ~255.2GB (2,320 bytes x 110 million).
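To illustrate why the missing header matters, here is a rough Python sketch of how a typical server might decide whether to gzip a response. It is a simplified model, not the Windows Store's actual server logic: the header dictionaries are hypothetical, wildcard (`*`) handling is left out, and I'm assuming the server only compresses when the client explicitly lists gzip.

```python
def accepts_gzip(headers):
    """Return True if the request headers explicitly list gzip as acceptable."""
    value = ""
    for name, header_value in headers.items():
        if name.lower() == "accept-encoding":  # header names are case-insensitive
            value = header_value
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        q = 1.0  # default quality when no q= parameter is given
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if coding in ("gzip", "x-gzip") and q > 0:
            return True
    return False

def bytes_on_wire(headers, raw_size, gzip_size):
    """Pick the payload size this simplified server would send."""
    return gzip_size if accepts_gzip(headers) else raw_size

store_request = {"Accept": "*/*"}                        # hypothetical: no Accept-Encoding
browser_request = {"Accept-Encoding": "gzip, deflate"}   # hypothetical browser-style request

print(bytes_on_wire(store_request, 6523, 2429))    # 6523
print(bytes_on_wire(browser_request, 6523, 2429))  # 2429
```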

Weird stuff.