Tuesday Jul 08, 2008

A Standard for Measuring and Reporting Downloads


Mozilla and Firefox have garnered a lot of well-deserved publicity for their new Guinness record for most downloads in a day. This raises the question (again) of what constitutes a "download" and how it is measured. (I first touched on this subject a couple of years ago, writing "When is a download a download?".) I was glad to see this in the above-referenced BBC article: "The official figure was confirmed after logs from download servers were audited and checked to ensure duplicate and unfinished downloads were not counted." That implies "attempted" downloads weren't counted, only completed ones. That's good, as I'm sure some stats we see reported don't make such distinctions.

So with Guinness in the mix now, and with all the focus on downloads in the free & open source software world, I propose it's time for an agreed-upon standard for how the number of downloads is reported. (To be honest, I'm uncertain if this could ever be a "formal" standard and who would govern it -- please let me know if you have any suggestions.) For starters, here's what I propose:

  • Simple case: One Product = one file
    • This case means that the entire product is contained in a single downloadable file. If it is downloaded successfully, the customer can install and use the product.
    • The suggested measurement is the one generally used -- the last byte of the file is delivered to the customer.
      • Note this may not be 100% accurate. Threaded download managers can conceivably download the last byte of a file and then fail an earlier byte range before the entire download completes. But this is an edge case. It would also be extremely resource intensive to have to match up all the byte range requests in a threaded download scenario to "prove" that one user completed the download. Thus, I think the "de facto" standard of last byte delivered is acceptable.

  • Complex case: One Product = multiple files
    • First, make the distinction between required and optional files:
      • Required files are required to install the product. There may be one or more required files as part of the download. 
        • For example, on Sun Download Center, there is an option to download Solaris OS as 5 CDs. If you do not download all five, you cannot install the product. Thus, all five are required files, and each one must be downloaded to install Solaris.
      • Optional files are not required to install or run the product. Optional files may consist of "plug-ins,"  "add-ons," or "modules" that add functionality to the base program. Also included would be documentation, checksum files, additional language packages, and the like.
    • The measurement of a completed download in this case is that all required files are downloaded by the same customer. (This can be a challenge to measure, but it can be done in most cases with a good degree of accuracy.)
    • Optional files do not count and should never be reported as "product" downloads.

  • In either case, "Attempted Downloads" should never be reported as a product download. 
    • For a single file product, an attempted download is when the user starts the download but does not receive the last byte of the file.
    • For complex products, an attempt is when any individual required file is not completed and/or the customer does not successfully download all required files.
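
The counting rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the LogEntry record, its field names, and the idea that the server log already identifies a "customer" are all hypothetical, and real logs would need byte-range and session handling that is left out here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """One file transfer from a hypothetical download-server log."""
    customer: str        # however the server identifies a customer (assumed)
    filename: str
    file_size: int       # total size of the file in bytes
    last_byte_sent: int  # highest byte offset delivered, 0-based


def file_completed(entry: LogEntry) -> bool:
    """'Last byte delivered' rule: the final byte reached the customer."""
    return entry.last_byte_sent == entry.file_size - 1


def count_product_downloads(entries, required_files):
    """Return (completed, attempted) customer counts for one product.

    A customer counts as completed only when every required file was
    completed. Optional files are ignored and duplicates count once.
    """
    required = set(required_files)
    done = defaultdict(set)  # customer -> required files completed
    touched = set()          # customers who started any required file
    for entry in entries:
        if entry.filename not in required:
            continue  # optional files never count as product downloads
        touched.add(entry.customer)
        if file_completed(entry):
            done[entry.customer].add(entry.filename)
    completed = {c for c in touched if done[c] == required}
    return len(completed), len(touched - completed)
```

A single-file product is just the special case where required_files has one entry. A customer who fetches only optional files shows up in neither count.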

Monday Jun 02, 2008

Download Best Practices, Part 2


I'm going to list three more of the download best practices that we encourage all our product teams to use as they develop products for electronic software distribution (ESD). This is the second part of a continuing series I'll publish from time to time. (You can see Part 1 here.) Some of these suggestions may seem like "no-brainers" (at least once you understand the basics of ESD), but they need to be stated and documented nonetheless and are certainly helpful to teams that are just starting out to prepare downloadable software.

1) Compress the files as much as possible. Obviously smaller files are cheaper/faster for everyone to download, and faster delivery means less time for things to fail and higher completion rates. Generally, we recommend using zip compression, as it's pretty much a de facto standard at this point (and is supported on Solaris OS as well). There are many free and open source distributions product teams can use, such as Info-ZIP, bzip2, and 7-Zip. You should experiment to see which one provides the best compression (in one test here, 7-Zip outperformed the others by about 20%, but your results may vary). There are also proprietary compression programs ("NOSSO," for example) that can greatly outperform "vanilla" zip compression, but they come with a price tag. If your download size really really matters, they may be worth investigating.

2) Use standard MIME types for file name extensions, for example "filename.zip" and "filename.exe." Web servers, browsers, download managers, and many other pieces of the Internet infrastructure rely on standard MIME types to ensure download transactions work correctly and consistently. So, don't release files without a MIME type extension (for ex., use "README.txt," not "README"), and don't make up file extensions or use non-standard ones, such as "filename.aa," "filename.bb," etc.

3) For large files, offer alternatives. Not everyone can successfully download a large file, especially ones larger than 2 GB. Other choices should be provided, such as a segmented ("chunked") version of the large file, CD images instead of a single DVD, and/or the option to order free or low cost CDs or DVDs. See my earlier post on "Breaking the Large File Barrier" for more details.
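
For the compression advice in item 1, a quick experiment along these lines can show which method wins on your own files. This is a rough sketch that uses Python's standard-library codecs as stand-ins: zlib implements the DEFLATE method commonly used inside zip archives, bz2 corresponds to bzip2, and lzma is the method 7-Zip's own format uses by default. It does not run the actual Info-ZIP or 7-Zip tools, so treat the numbers as indicative only.

```python
import bz2
import lzma
import zlib


def compare_compressors(data: bytes) -> dict:
    """Return {codec name: compressed size in bytes} after a round-trip check."""
    codecs = {
        "zlib": (lambda d: zlib.compress(d, 9), zlib.decompress),
        "bz2": (lambda d: bz2.compress(d, 9), bz2.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    sizes = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(data)
        # Only report a size if the codec restores the input exactly.
        if decompress(packed) != data:
            raise ValueError(f"{name} failed to round-trip the data")
        sizes[name] = len(packed)
    return sizes


if __name__ == "__main__":
    # Placeholder payload; substitute the bytes of a real release file.
    sample = b"Sun Download Center sample payload\n" * 2000
    results = compare_compressors(sample)
    for name, size in sorted(results.items(), key=lambda kv: kv[1]):
        print(f"{name:5s} {size:8d} bytes ({size / len(sample):.1%} of original)")
```

Results depend heavily on the content being compressed, so run the comparison on the real product files rather than on a synthetic sample.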


Thursday May 15, 2008

Using wget With Our New Download System


Soon after releasing our new download system, we received comments that the command line download client wget (and others) no longer worked. This is a result of a new session-based security model used to protect download links from being copied and re-used (which could circumvent numerous Legal requirements necessary to authorize a download). 

We have implemented a work-around now, so I just wanted to put this quick note out there to increase awareness of its availability. If you'd like to use wget or similar tools with Sun Download Center, please review the work-around, posted on our new Downloads FAQ Wiki.

Sorry for any inconvenience this has caused -- it was one of those unintended consequences. We believe we have a solution for it and will implement it as soon as we can. In the meantime, I'm pleased we could at least offer a temporary fix.

Get the work-around script here: Using wget with the Sun Download Center


Friday Apr 18, 2008

Video Demo of Sun's New Download System


OK, so things took a bit longer than we expected, and we hit a few bumps in the road, but our new download system is now fully operational and humming along! Just this week we finished moving all sun.com downloads onto it, and we're extremely pleased to reach this milestone.

To help introduce the new system (and also get newcomers and experienced users alike downloading as quickly, easily, and successfully as possible), please check out our new demo video. It features a walk through of the Solaris OS download experience, as well as helpful tips on using Sun Download Manager. The video takes less than five minutes.




I hope you enjoy the video, and more importantly, that you benefit from the enhancements we've made to downloading from Sun. As always, your suggestions for further improvement are welcome. You can leave me a comment, or if you encounter any questions or issues, please let our customer support team know. Thanks!


Friday Mar 21, 2008

Download Best Practices, Part 1


We've been working on a new internal wiki to capture download best practices gleaned over the past 11 years of designing and managing our high capacity download systems. I think this is good and valuable information so wanted to blog about some of our findings. This will be the first in a series of posts on the subject.

First, let's consider our goals in defining and publishing these practices. At a high level, I'd sum them up as follows:

  • Ensure customers are able to download successfully, as fast as possible and with minimal hassle.
  • Provide a consistent and efficient download experience.
  • Increase completion rates and reduce abandonment.
  • Ensure our systems and processes optimize the conversion of downloaders into (paying) customers.
  • Save time and bandwidth (cost) for Sun and our customers.
  • Within Sun, document the processes to release software for download, and ensure they are followed.

Our first best practice is seemingly simple yet frankly it's not been followed consistently over the years: use a unique file name for every file in every product and version release. There are several reasons why this is so important, the first of which is unique to our systems:

Sun Download Manager uses special "verification property files" (VPFs) to checksum files as they are downloaded from Sun Download Center. The VPFs are mapped to unique file names. So, here's where we get into really deep water if file names aren't unique from release to release! If we release a product with file name "product.zip", a VPF will be created called "product.zip.sdm.zip" that SDM grabs automatically and uses to checksum the file and ensure a successful download. Now if the product team releases the next version of the product but does not update the file name, a new VPF will be created with the same name as the prior one. This works fine for downloading the new product. But when users try to download the prior version, there will be a checksum mismatch between the old file and the new VPF -- the download will fail. And that's why this is so critical on our system.
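
The failure described above can be reproduced with a toy model. Everything here is a hypothetical stand-in: the VpfStore class, its methods, and the use of SHA-256 are assumptions for illustration, since the actual VPF format and checksum algorithm are not described in this post.

```python
import hashlib


def checksum(content: bytes) -> str:
    # SHA-256 is a stand-in; the real VPF checksum algorithm is not stated.
    return hashlib.sha256(content).hexdigest()


class VpfStore:
    """Toy model of a verification store keyed by file name only."""

    def __init__(self):
        self._by_name = {}

    def publish(self, filename: str, content: bytes) -> None:
        # Publishing under an existing name silently replaces the old record.
        self._by_name[filename] = checksum(content)

    def verify(self, filename: str, content: bytes) -> bool:
        return self._by_name.get(filename) == checksum(content)


if __name__ == "__main__":
    v1, v2 = b"release 1.0 contents", b"release 2.0 contents"

    reused = VpfStore()
    reused.publish("product.zip", v1)
    reused.publish("product.zip", v2)        # same name, new release
    print(reused.verify("product.zip", v2))  # new release still verifies
    print(reused.verify("product.zip", v1))  # old release now fails

    unique = VpfStore()
    unique.publish("product-1.0.zip", v1)
    unique.publish("product-2.0.zip", v2)
    print(unique.verify("product-1.0.zip", v1))  # both releases verify
```

With unique names, each release keeps its own verification record, so older versions remain downloadable.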

Here are a couple of other good reasons (not unique to Sun's download systems):

  • Downloads often flow through content distribution networks which cache files around the world for fastest delivery. Unique file names ensure caching works most efficiently and eliminates the potential for delivering incorrect content from the caches.
  • Unique file names reduce potential errors when locating, copying, moving, updating, reporting on, and archiving content.

It's really not hard to do, so we hope you'll follow this practice!

If you work for Sun, you can view the full Download Best Practices Wiki.

Sorry, this is for Sun employees only. The link will not work unless you are inside Sun's firewall.


Thursday Feb 14, 2008

New Wiki for Sun Download FAQs


I've been blogging for a while now, and I guess it was inevitable that wikis were in my near future as well! Having spent a lot of time and energy learning HTML and web publishing, I was hesitant to dive in and learn "wiki language" and the new publishing method. But a little encouragement from our VP goes a long way -- Curtis suggested we take a stab at a wiki to involve more of our community in the areas of download support and information.

So, we've taken the plunge and have posted a new Sun Downloads FAQ on wikis.sun.com. We already have a comprehensive FAQ devoted to Sun Download Center, and there's no point in repeating all the content in a wiki format. Rather, we want to broaden the scope to include all Sun downloads (open source products, for example) and related areas. I think we should be open to non-Sun content as well, if it relates to improving the download experience and helping our customers with any download questions or issues.

Kudos to Lori Holmes who got the site up and running and seeded the content with a few of the top questions from the SDLC FAQ. Next, I dutifully scanned the wiki manual and jumped in with my first contribution (including a shameless plug for my blog). I would love to add links to more expert resources around the web, so please let us know any suggestions.

I believe anyone who's a Sun employee has read/write privileges automatically, and your contributions are welcome. Those outside Sun will need to request edit permissions to contribute, but I'm sure we can figure out how to enable that if you'll let us know of your interest.

Time will tell if we can get critical mass around this effort, but at the least it allows us to quickly put out new information ourselves (without bugging our web publishing team to update the SDLC FAQ). Hopefully it will evolve into a valuable resource, ideally one with contributors far and wide!


Tuesday Feb 05, 2008

Breaking the Large File Barrier


"Large file" is actually a technical term for files larger than 2 GB (and to be really precise, that's 2^31 or 2,147,483,648 bytes). In the past, we've not been able to distribute files over 2 GB on Sun Download Center (SDLC). Our engineers tell me that's because 32 bit systems cannot handle signed integers greater than 2 GB. Way back when we built our "old" download system, it didn't really seem to matter, as we would never try to offer files that large for downloading (kind of like not worrying about Y2K back in the 1980s!). But times have changed, and with the proliferation of large, single file DVD ISO images that many Linux distros use, it's no longer uncommon. (Of course, this goes hand-in-hand with the proliferation of broadband access.)

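To make the 2 GB limit concrete, here is a small sketch of what happens when a file size is stored in a signed 32-bit field. The helper names are made up for illustration; ctypes.c_int32 is used only because it wraps silently the way a C integer would.

```python
import ctypes

INT32_MAX = 2**31 - 1  # 2,147,483,647: the largest signed 32-bit value


def fits_in_int32(size_in_bytes: int) -> bool:
    """True if a file size can be stored in a signed 32-bit integer."""
    return 0 <= size_in_bytes <= INT32_MAX


def as_int32(size_in_bytes: int) -> int:
    """Value a C-style signed 32-bit field would hold (wraps, raises no error)."""
    return ctypes.c_int32(size_in_bytes).value


if __name__ == "__main__":
    for size in (INT32_MAX, 2**31, 3_700_000_000):
        print(size, fits_in_int32(size), as_int32(size))
```

Any size of 2^31 bytes or more comes back as a negative number, which is why code that trusts such a field breaks on large files.
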
As we built our new download application, large file support was a requirement. But, we were still stuck with a large file limit in some of the older code in Sun Download Manager (SDM). As using a download manager is really, really helpful for files this large, we appeared to be unable to proceed. We were very aware of this limit and had a high priority mandate to fix it, but we haven't had enough engineering resources to take that on yet. The pressure was building, however, as the Solaris OS team really wanted to be able to release single large file DVD images (and I don't blame them).

That's when things got interesting. Internally, we were testing our new download system, and we put a few large files on it. One of the testers sent in some test results saying, "Successfully downloaded the 3.2 GB test file using SDM." Impossible, I thought, there must be a mistake. But no, the tester insisted it worked. So I tried it myself -- it worked! They say "ignorance is bliss," and thankfully this tester was unaware of what we all knew "would not work" and simply went for it. It was quite a surprise.

Now these types of bugs really don't have a habit of fixing themselves, but we figured out what's going on.

For files on SDLC, we generate what we call "Verification Property Files" (VPFs) that contain the checksums SDM uses to checksum downloads real-time as they are received. Another piece of data in VPFs is the file size, and that's how we get around this limit in SDM. It turns out that as long as there is a VPF for a large file (and we create them automatically for all files released on SDLC), SDM can get the file size from the VPF and it all works! When there is no VPF, the file size is part of the header info sent from the web server, and this is when things break. (Some older web servers can't handle the large numbers either.)

So, bottom line, after a bunch more testing, we've just released the first ever single large file on SDLC -- the latest version of Solaris Express Developer Edition (~ 3.7 GB DVD ISO image). This is a small but significant milestone after years of butting up against the 2 GB limit.

Now the larger the file, the more that can go wrong, so if you give this a try, please do use SDM. And here are some notes and "best practices" we gleaned from rolling this out:

  • The "32 bit" limit isn't unique to our systems but can affect servers, routers, operating systems, and clients throughout the network. For example, if a Windows XP system uses a FAT32 file system rather than NTFS, there's no way it's going to work -- the OS simply can't handle the file. (Thanks to openSUSE.org where I found that tip.)
    • As a result, we highly recommend to our product teams that they not rely solely on a large file for distribution, as it's not going to work for all customers. Offer options such as a "chunked" version of the DVD that users concatenate after downloading in smaller pieces. Or offer multiple CD images instead of the DVD (as we do for Solaris). And finally, offer a hard media version (DVD) that users can order inexpensively (or better yet, free) and is then shipped to them.
  • Use an up-to-date browser and fully patched, modern operating system to be sure large files are adequately supported on the client end.
  • Absolutely do not attempt this with a slow line, like a dial-up modem. You can expect it to take about 40 hours per GB on a 56K modem.
  • Use a download manager so you can resume where you left off in case anything goes wrong (you do not want to have to start over from the beginning). You can also pause and resume, if you're running out of time.
  • Make sure you have at least twice the size of the file in free disk space -- with these large files, that's actually quite a bit of disk space. Operating systems typically make a temporary copy of the file while downloading, then copy it to its final location, so you must have the extra space.
  • And a couple of notes specific to SDM:
    • As noted previously, SDM will not support large files except on SDLC, so don't try it on other sites (until we can get this fixed).
    • When SDM finishes downloading the large file, there is internal processing that must take place before the download is actually complete. Due to the huge file size, this processing can take several minutes. As a result, the SDM progress bar will say "100%" while the Status still says "Downloading data..." Be patient and do not close SDM. After a few minutes, the Status changes to "Downloaded", and the download is complete.
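
The dial-up estimate and the disk space rule of thumb above are easy to check with a little arithmetic. This sketch assumes 1 GB means 10^9 bytes and a 56K modem delivers a full 56,000 bits per second, which is a best case; real-world throughput is lower. The function names are invented for this example.

```python
def download_hours(size_bytes: float, line_bits_per_second: float) -> float:
    """Best-case transfer time in hours, ignoring overhead and stalls."""
    if line_bits_per_second <= 0:
        raise ValueError("line speed must be positive")
    seconds = size_bytes * 8 / line_bits_per_second
    return seconds / 3600


def required_free_space(file_size_bytes: int) -> int:
    """Rule of thumb from the list above: twice the file size."""
    return 2 * file_size_bytes


if __name__ == "__main__":
    one_gb = 1_000_000_000
    dvd_image = 3_700_000_000  # approximate size of the DVD ISO mentioned above
    print(f"1 GB at 56K:   {download_hours(one_gb, 56_000):.1f} hours")
    print(f"3.7 GB at 56K: {download_hours(dvd_image, 56_000):.1f} hours")
    print(f"Free space for the DVD image: {required_free_space(dvd_image):,} bytes")
```

Under these assumptions, 1 GB works out to just under 40 hours, and the 3.7 GB image to about 147 hours, or roughly six days of uninterrupted dial-up.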

Hopefully this first large file release goes well and is the first of many. If you give it a try and have any problems or questions, please let us know -- the feedback is very helpful as we learn the ins and outs of large file distribution over the Internet.


Tuesday Dec 18, 2007

Sun's New Download System is rolling out now


After a few delays due to the complexity of the undertaking, I'm really happy to say we're back on track with the roll-out of the new download system that powers Sun Download Center. We are following a "phased product migration" plan which means we will not transfer all the products in one "big bang" from the old system to the new -- too much risk in that approach.

Back in October we put the first few products out but then had to hold pending some more back-end fixes. In early December, we released the fixes and then released 2000 of our lower volume downloads onto the new download system. A few days after that we released three high volume download products to slowly add load to the system. Among those is Sun Download Manager, which gets close to 30,000 downloads a month. So far so good -- all is working smoothly, and we have a ton of capacity remaining on our new servers.

Most of Sun closes for a week between Christmas and New Year's, and so we decided to wait until after the break to move the remainder of the downloads onto the new system, including our "crown jewels" such as Solaris Operating System, Sun's Software Portfolio, Java software and developer tools, etc. This ensures that we'll have the appropriate support and engineering staff available during the transition.

As you download from Sun for the next few weeks, you'll find some products on the old system and some on the new. Other than a few UI changes and other mostly-subtle improvements, this should be transparent and not an issue for any of our customers. (You can tell you're on the new system because the URLs start with https://cds.sun.com/...)

I'll be off shortly for a couple of weeks of much needed rest and relaxation so wanted to send my best wishes to you all for the holidays and a great 2008!


Wednesday Oct 31, 2007

"1-Click" downloading debuts at Sun


I realize this isn't exactly a ground breaking development, but it's still a first here at Sun -- we've just released our first product to use 1-click download. Basically, this simply means you click the "Download Now" button, and we use JavaScript to start the download automatically -- no further clicks needed! And we did include an enhancement that makes our implementation unique by integrating the Sun Download Manager (SDM) directly into the 1-click experience. If you check "Use Sun Download Manager" before clicking the Download Now button, SDM installs as part of the download process and starts up with your product already loaded into its file list.

Granted our first product is a relatively small file which doesn't take great advantage of SDM's ability to pause, resume, and restart downloads, but once we roll out larger files, this will be a very effective solution.

As I wrote previously, there are still a number of caveats around using this feature, and it will take time to enhance its functionality to support many more products. But it's definitely the direction we're going -- to streamline the download experience and remove as many steps as possible. 

If you'd like to give it a try, start on the page Downloads for Java Web Services Developer Pack 2.0, scroll down the page until you see "Java Web Services 2.0 Tutorial," and click the orange Download Now button.


Monday Oct 22, 2007

First Products Released on Sun's New Download System


I am pleased to announce that last week we released the first few products on our new download system -- an important and exciting milestone! We will continue migrating products to the new system in a phased manner in order to gradually add load and reduce risks associated with such a large product migration. This process will culminate later this year with the release of our top product downloads such as Solaris Operating System and Java software.

If you'd like to be among the first to try the new system, here are a couple of products that are live now.

Solaris Operating System for x86 Installation Check Tool 1.2
(Note this product requires a Sun Online Account to download.)

J2SE(TM) Runtime Environment 5.0 Update 2
Click on the "Download JRE" link (not the "Download JDK" link).

These are older product versions, again to reduce risk and start out conservatively. Yet our stats show we've already had quite a few downloads on the new system, and so far it's going well! If you give it a try, please feel free to leave me a comment about your experience. If you hit any questions or issues, please check out our updated FAQ, or try the new download customer support form to reach download customer service.

I'm going to claim the distinction of having done the first live download on the new system until someone proves otherwise -- it went live at 8:00 am PDT on October 17, and I completed my download at 8:02!


Friday Oct 05, 2007

Sun's new Download System -- So what's going on?


I first mentioned we're building a new download system back in March (wow, was it really that long ago already?), and frankly, we had hoped to have it out the door by now. Alas, it's been a very complex project, and when you're dealing with the kinds of download volumes we are, we simply needed more time to ensure the highest quality system.

One of the complexities (and benefits) of the project has been our decision to use much more of a service oriented architecture (SOA). When we built the first system starting way back in 1997, the term SOA wasn't even coined, and we built all the functionality ourselves. Since then, we have worked hard to standardize web services and systems that all of Sun's web properties can share via SOA. We call this set of systems the "Common Web Platform," and it includes ID management, My Sun Connection Portal, eCommerce, downloads, and more.

Here are some of Sun's common services that our new download system (internally, we call it "CDS" -- Common Download Service) will use, along with their benefits (similar functions were built into the old system, making it even harder to manage):

  • ID Management: By using common Sun Online Accounts, users don't have to create or remember multiple credentials on different Sun properties and can move between them seamlessly using single sign-on and session transfer. CDS doesn't have to build its own customer registration system or store the data for millions of downloaders.
  • Portal: When users download many of our most popular products, we'll automatically signal the My Sun Connection Portal about the transaction. Customers can then login there, go to their "My Products" tab, and it'll list their recent downloads. Using this info, the Portal presents really useful content. For instance, if you download the Solaris Operating System, you'll find informative links to articles, blogs, training, support resources, and forum postings. If you've never visited our portal, I think you'll find it very worthwhile -- check it out!
  • Outbound Email: Some products are set up to send instructive emails to customers after they download. By using Sun's common email service, we gain efficiency while better respecting customers' Sun-wide privacy preferences. (Trying to track opt-in/opt-out data separately on our many web sites just doesn't work!)

So what's this have to do with the project schedule? With all the benefits of SOA, we're learning about the added complexity as well and some new pitfalls:

  • Number 1, and probably obvious, but we don't control 100% of our fate anymore. Our team has to work with each service's business and engineering teams. Sometimes their priorities are different than ours, and a delay in any external system we rely on affects our entire schedule. (One of the systems lost a key engineer in the middle of our project, for example, and that hurt.)
  • Environmental complexities: We can't build and test everything in our shared production environment, so we work in development and test environments. But the non-production versions of the different systems aren't necessarily in the same place, and so your testing can come to a dead halt just because someone hadn't opened the right firewall ports for the systems to interconnect. 
  • Debugging can be more difficult, and quite honestly it introduces a whole new world of "finger pointing" (as in, "My service works perfectly, so it's obviously something wrong on your end!")

These complexities are not the only reason the project has taken longer than expected, but they certainly contributed. It's a good learning experience, and when we plan our next SOA integrations, we'll know to add some extra time and be better prepared for this new world of interconnectedness.


Wednesday Jul 18, 2007

Heads Up! Sun Download Links are Changing


We've been working very hard on preparing our new download system for release. We hit a snag with data migration from the old system to the new and lost a number of weeks on our schedule, but we're back on track now and preparing to release the new system within the next couple of months (hopefully!).  

I wanted to post this "heads up" to alert any sites that direct link to downloads on Sun that those links will change. We are taking care of this with an automated, systematic approach for web pages we publish. However, we know there are external sites, not owned by Sun, that link directly to downloads. If you own or manage such a site, this is your notice that your links won't work correctly when we release the new system. (We will put redirects in to handle requests to the old system as gracefully as possible, but clearly the best solution is to change the links to point to the new system.)

What kind of links are we talking about? Primarily, this concerns direct linking to the current SDLC application. All such links start with:

http[s]://javashoplm.sun.com/...

(For example, look at the "Get it" page for Sun Java System Web Server 7.0 Update 1 and you'll see a "Download" link at the bottom that goes to:

http://javashoplm.sun.com/ECom/docs/Welcome.jsp?StoreId=8&PartDetailId=SJWS-7.0U1-OTH-G-F&TransactionId=Try.)

Do you manage or know of a site with any URLs on it that start this way? If yes, we will be happy to work with the site team to let them know how to "translate" these URLs to our new download system. It's not difficult and can be done programmatically if there are a lot of them.

(By the way, if you're using these links for Java software downloads, unless your customers require a very specific version or it's primarily a developer audience, we recommend linking to java.com instead. Java.com is much more of a consumer oriented download experience, and you can change your links now and remove any dependency on the roll-out of our new system. Java.com will not change as far as external facing downloads are concerned.)

Finally, there is one other class of affected pages: "intermediary" pages on the way to the download system. All such pages start with:

http://www.sun.com/download/products.xml?id=

(The above example of the "Get it" page for Sun Java System Web Server 7.0 Update 1 is this type of page, located at http://www.sun.com/download/products.xml?id=467713d6 )
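
If you have many pages to audit, a small script can flag the two kinds of links described above. This is a minimal sketch: the function name and the "direct"/"intermediary" labels are made up for illustration, and it only detects old-style links. It does not translate them, because the new URL format is not given here.

```python
from typing import Optional
from urllib.parse import urlsplit


def classify_old_sun_link(url: str) -> Optional[str]:
    """Return 'direct', 'intermediary', or None for a link found on a page."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme not in ("http", "https"):
        return None
    # Direct links into the current SDLC application.
    if host == "javashoplm.sun.com":
        return "direct"
    # "Get it" style pages on the way to the download system.
    if (host == "www.sun.com"
            and parts.path == "/download/products.xml"
            and parts.query.startswith("id=")):
        return "intermediary"
    return None


if __name__ == "__main__":
    samples = [
        "http://javashoplm.sun.com/ECom/docs/Welcome.jsp?StoreId=8"
        "&PartDetailId=SJWS-7.0U1-OTH-G-F&TransactionId=Try",
        "http://www.sun.com/download/products.xml?id=467713d6",
        "http://java.com/",
    ]
    for link in samples:
        print(classify_old_sun_link(link), link)
```

Any link the function labels "direct" or "intermediary" is one that will need updating once the new system is released.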

So, in summary, the new download system will replace these links and many of these "Get it" pages. If you link to them, you must update your site. To do so, please contact me at once by email at gary dot zellerbach at sun dot com. I will send you instructions on how to make the updates and ensure you are kept informed regularly about the changes and schedule. If you have general questions about what's going on, please leave me a blog comment. Thanks!


About

I helped design, build, and manage download systems at Sun for many years. Recently I've focused on web eMarketing systems. Occasionally, I write about other interests, such as holography and jazz guitar. Follow me on Twitter: http://twitter.com/garyzel
