It’s been 7 months since the last post in my YSlow series. So why did I stop blogging about performance optimization? Well, there are actually two answers to that question. The first one is my passion for unit testing, which I started blogging about at pretty much that time. Secondly, I have to admit that I didn’t know what to write about YSlow rule #13 – Remove Duplicate Scripts.

So what’s the problem with including the same JavaScript twice (or more)? When the browser reaches a script import, it makes an HTTP request for that script, and not all browsers are smart enough to catch a repeated import. Even with modern and fast internet connections, making a lot of requests slows down the experience of your website. If you have your caching strategy under control, the duplicate request shouldn’t be much of a problem, but if not, the remote request is actually made multiple times. I typically see two different patterns when people accidentally include the same script twice.

Pattern 1: Including scripts in both layout/master page and partial views

Partial views in ASP.NET MVC are a great way to extract common reusable controls into a single file. A partial view can load external JavaScript files as well. A common mistake by web developers is to include a reference to the same script in both the layout/master page file and a partial view. I haven’t really found a great solution to catch this mistake, other than running YSlow on your page. If you read this and know a solution to this problem, please let me know in the comments.
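
A crude check is at least possible by scanning the rendered HTML (layout plus partial views) for script URLs that appear more than once. The JavaScript below is only a sketch: the function name findDuplicateScripts is my own invention, and the regular expression assumes quoted src attributes rather than doing real HTML parsing.

```javascript
// Hypothetical helper (not part of YSlow or ASP.NET MVC): returns every
// script URL that is referenced more than once in a chunk of rendered HTML.
// Simplification: only <script ... src="..."> with quoted values is understood.
function findDuplicateScripts(html) {
  const srcPattern = /<script\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/gi;
  const counts = new Map(); // script URL -> number of references
  let match;
  while ((match = srcPattern.exec(html)) !== null) {
    const src = match[1];
    counts.set(src, (counts.get(src) || 0) + 1);
  }
  // Keep only the URLs that were seen at least twice.
  return [...counts].filter(([, n]) => n > 1).map(([src]) => src);
}
```

Feeding it the final markup of a page, for example in a test that renders a view, would flag a script pulled in by both the layout and a partial view.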

Pattern 2: Including both minified and unminified version of a script

I’ve seen multiple websites doing something like this:


<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.js"></script>
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>

You typically see this pattern from webdev cowboys who need to track down a bug in the JavaScript and add the unminified version of jQuery to be able to debug it. When the bug is fixed, they forget to remove the newly added reference. The mistake won’t show itself in the browser, because the last imported script overrides the first. I was a bit disappointed by the new bundling support in ASP.NET MVC 4 when I found out that the bundler always adds the minified version if available. Making it choose the full version in Debug configuration and the minified version in Release configuration would have been the optimal choice for me.
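
This pattern can be spotted by treating “.min.js” and “.js” as the same file when comparing script URLs. The following JavaScript is a minimal sketch under that assumption; findMinifiedPairs is a made-up name, and URLs with query strings or other naming conventions (like “-min.js”) are not handled.

```javascript
// Hypothetical helper: given the script URLs of a page in document order,
// report pairs that differ only by the ".min" part of the file name.
function findMinifiedPairs(scriptUrls) {
  const seen = new Map(); // normalized URL -> first URL seen for it
  const pairs = [];
  for (const url of scriptUrls) {
    // Treat "jquery.min.js" and "jquery.js" as the same script.
    const normalized = url.replace(/\.min\.js$/i, ".js");
    if (!seen.has(normalized)) {
      seen.set(normalized, url);
    } else if (seen.get(normalized) !== url) {
      pairs.push([seen.get(normalized), url]);
    }
  }
  return pairs;
}
```

Run against the two jQuery references above, it would report them as one pair.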

I’m working to catch this bug in my upcoming website validator HippoValidator.com.

A former colleague of mine told me that he had founded “The association opposed to the abandonment of PerformanceDude.com”. In order to make him happy, I thought that I would leave the wonderful world of unit testing for a moment, in favor of writing a performance-related post :) We’ve reached YSlow rule number 12 – Avoid redirects. I think the Yahoo documentation about the subject is a bit vague, so here’s my opinion on the subject.

Redirects are used to tell the browser that a requested URL has been moved either permanently (HTTP status code 301) or temporarily (HTTP status code 302). When a request returns a 301/302, the browser automatically makes a new HTTP request to the URL returned by the 301/302. Being able to redirect URLs is really important if you want to change your URLs and your site is already indexed by search engines. If you change your URL scheme without returning a redirect, you basically lose your ranking in search engines like Google, and all those hours spent getting link juice are wasted. In ASP.NET MVC 3, doing a redirect is easy:

public ActionResult OldUrl()
{
    return RedirectPermanent("/the-new-url");
}

Notice that the above example returns a 301 (permanent redirect). If you for some reason need to return a 302, use the Redirect method instead.

So what is my opinion on this YSlow rule? Well, unless you use some sort of obscure web framework which does a lot of magic behind the scenes, there’s only one pitfall which should make you want to do something about this rule. When doing permanent redirects, ALWAYS make sure that all your internal links point to the new URL rather than to the old URL, which would leave it to the browser to follow the redirect. Implementing this anti-pattern will absolutely slow down your site!
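
Checking for this pitfall boils down to comparing your internal links against the list of URLs you redirect. Here is a small JavaScript sketch of that comparison; the function name findStaleLinks and the shape of the redirect table (old URL as key, new URL as value) are assumptions made for the example.

```javascript
// Hypothetical helper: list internal links that still point at a URL
// which answers with a redirect, together with the URL they should use.
function findStaleLinks(links, redirects) {
  return links
    .filter((href) => Object.prototype.hasOwnProperty.call(redirects, href))
    .map((href) => ({ from: href, to: redirects[href] }));
}
```

Every entry in the result is a link that costs the visitor an extra round trip until it is updated.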

Way down the list of YSlow rules, we’ve reached rule number 11, which is important and a no-brainer to implement. The rule is titled “Minify JavaScript and CSS.” So what is minification, and why do we need it?

Both JavaScript and CSS files are packed with unnecessary characters like spaces, comments, tabs, etc. Minification is the process of identifying which characters can be cut out. One part of minification can also be done by obfuscating the code, as we know it from obfuscators for Java and C#. In short, obfuscation is the process of shortening everything that can be shortened: method names, variable names, etc. These different methods combined lead to a more compressed output file, which leads us to question number 2: why do we need it? Fetching not only JavaScript but also CSS is a time-consuming task in the browser. The render thread blocks every time the browser loads a script, which can cause the user experience to degrade. Speeding up the load time for JavaScripts and style sheets causes the browser to render the page quicker.
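
To show what “cutting out characters” means in practice, here is a deliberately naive CSS minifier in JavaScript. It is a sketch only: naiveMinifyCss is a made-up name, and the regular expressions ignore string literals and url(...) values, so real tools do considerably more (and do it more safely).

```javascript
// Naive illustration of minification, not a production minifier.
// Assumption: the style sheet has no string literals or url(...) values
// containing comment markers, braces, colons or semicolons.
function naiveMinifyCss(css) {
  return css
    .replace(/\/\*[\s\S]*?\*\//g, "")  // drop /* comments */
    .replace(/\s+/g, " ")              // collapse runs of whitespace
    .replace(/\s*([{};,])\s*/g, "$1")  // no spaces around braces, ; and ,
    .replace(/:\s+/g, ":")             // no spaces after the colon
    .replace(/;}/g, "}")               // last semicolon in a block is optional
    .trim();
}
```

Even this simple approach removes the comments, line breaks, and indentation that make up a large part of a hand-written style sheet.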

Minifying your JavaScripts and style sheets is easy. You can use one of the various minification tools available online. If you want to use this approach, Google for “JavaScript minify” and use the tool you think is the best.

There’s a downside to minification as well. When you want to debug your JavaScript, the minified source code looks pretty messy and is hard to read. Luckily there’s a better way than hardcoding your minified source: minifying on the fly. I’ve previously talked about Mads Kristensen’s WebOptimizer, which contains nice HTTP modules for both JavaScripts and style sheets. I usually go with SquishIt, which I wrote about in my blog post “Combining your scripts and style sheets with SquishIt.” SquishIt combines your scripts and style sheets into a single file, which reduces the number of HTTP requests made by the browser. In addition, SquishIt also minifies the generated file. What’s really genius about the minification in SquishIt is its ability to skip merging and minification when you work on localhost. There you can work with individual, unminified files, which helps you debug.

(Proofread by inWrite)

Most of us (me included) don’t think much about the way DNS works. Browsers, on the other hand, use DNS pretty much every day. For the uninformed, I will start by explaining how DNS works. If you already know this, feel free to skip the following paragraph.

As you know, all computers with a network card installed are assigned an IP address, which in the classic IPv4 form is four numbers separated by dots. When you make a request to a web server, you are actually making an HTTP request to an IP address. Inputting IP addresses in the browser would make browsing the web just plain horrible. The DNS was invented to avoid having to remember IP addresses. When requesting a domain name, like www.hippovalidator.com, your browser asks the DNS to resolve the inputted domain name to an IP address. The request is then sent to the IP address returned by the DNS server, making it easy and painless for you to request a webpage.

Now then, why do we as performance-optimizing wizards need to worry about DNS? Well, in my opinion, we don’t. According to the YSlow documentation, a single DNS request typically takes 20-120 milliseconds. This may not seem much, but imagine the following HTML header example code:

<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
<script type="text/javascript" src="http://s7.addthis.com/js/250/addthis_widget.js"></script>
<script type="text/javascript" src="http://static.myrating.dk/rating.js"></script>
<script type="text/javascript" src="http://connect.facebook.net/en_US/all.js"></script>
<script src="http://cdn.uservoice.com/javascripts/widgets/tab.js"></script>

The browser needs to make 5 requests to the DNS server in order to fetch the referenced scripts. Worst case scenario, this will (according to YSlow) take as much as 600 milliseconds. This is a substantial amount of time! How do you avoid this?

Option 1: Reference less stuff from other domains.

Option 2: Copy the referenced stuff from the other domains to your own domain (in the example above, the scripts could all reside on the local domain).
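
For reference, the worst-case figure of 600 milliseconds is simple arithmetic: one lookup per distinct host name times the 120 millisecond upper bound. The JavaScript sketch below does that calculation for a list of script URLs; the function name and the default of 120 ms per lookup are assumptions for illustration, and it ignores caching entirely.

```javascript
// Hypothetical helper: count distinct host names in a list of absolute
// script URLs and estimate the worst-case time spent on DNS lookups.
// Assumption: every host needs its own uncached lookup.
function worstCaseDnsMillis(scriptUrls, millisPerLookup = 120) {
  const hosts = new Set(scriptUrls.map((url) => new URL(url).hostname));
  return {
    lookups: hosts.size,
    worstCaseMillis: hosts.size * millisPerLookup,
  };
}
```

Moving scripts to your own domain shrinks the set of hosts, which is all that the two options above really do.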

So let’s head back to my point about DNS not being important. In fact, I don’t think you should implement any of the solutions mentioned above. Why?

1. Because of DNS caching. When your browser asks a DNS server for an IP address, the result is cached in the browser’s local DNS cache. Chances are, your ISP already has the result cached as well.

2. Because of prefetching in all modern browsers. Every single browser that I can think of does some sort of prefetching. In Chrome, a background thread parses all domain names in a requested HTML document and resolves the domain names while you look at the page. IE does something similar.

3. Because reducing DNS lookups also reduces the number of simultaneous downloads in the browser. As I mentioned in a previous post, the browser is not able to make unlimited simultaneous requests to the same domain. That’s why my recommendation is to load jQuery (and similar libraries) from Google’s or Microsoft’s CDN and place all of your static files like images, scripts, and style sheets on a subdomain. This would result in more DNS lookups, but it would improve the browser’s ability to fetch the needed resources in parallel.

The conclusion is this… Don’t be alarmed when YSlow warns you about reducing DNS lookups!

We’ve briefly touched on this subject a number of times already. Making your JavaScript and CSS external simply means moving them from your HTML header to their own files, referenced as external files in the header. So why does Yahoo think that this is a good idea? We’ve already discussed the possibility of caching resources in the browser’s client-side cache. Caching scripts and styles is definitely easier than caching HTML. JavaScript and CSS are typically only changed when changing something on the site like new styling, new features, etc. Changing HTML happens all the time when users write new content, new search results are found, etc. When inlined, the scripts and styles are typically not cached or at least not for a very long time. But wait, here’s the real bonus: scripts and styles are typically shared between multiple views. Moving all that static stuff to external files will make the browser cache them, serving the cached versions when loading page number 2, 3, and so on.

The YSlow documentation argues that you can sometimes benefit from inlining your script and styles directly in the HTML. This only goes for pages with a single page view per session like the Yahoo front page. I can see the point and noticed that pretty much all search engines do this. I still think that you should get a lot of visitors in order to do this kind of thing. So unless you’re Google or your front page uses an entirely different styling than the rest of your site, you should be covered by moving the static stuff to external files.

Another benefit is when debugging through Firebug. When I need to debug some inlined JavaScript, I typically spend some time wondering and then realizing that I can’t debug JavaScript from the HTML tab in Firebug. This usually causes a bit of confusion and sometimes even some pulling out of precious hair strands. The solution is to navigate to the scripts tab and select the HTML file. I can understand why the UI works that way, but I keep forgetting. Moving all JavaScript to an external file forces me to navigate to the scripts tab right away.

Time for another YSlow rule. This week we will be looking at rule number 8—avoid CSS expressions. My knowledge about CSS is very limited, and I’ve always seen CSS as a format for specifying style information rather than an actual programming language. It turns out that you can actually write JavaScript inside your CSS files. First of all, how stupid is that? Sounds a bit like writing JavaScript in your HTML code. I really like each language separated into its own set of files. The idea with CSS expressions should be to do dynamic styling depending on JavaScript code. I really can’t see the need for this, with the powers in JavaScript, jQuery, and even on the server-side when generating the markup.

So what’s the problem with doing CSS expressions? Let’s look at an example from the YSlow documentation:

background-color: expression( (new Date()).getHours()%2 ? "#B8D4FF" : "#F08A00" );

The example above sets the background color to either of two hex values, depending on the current time. The problem here is that the browser executes the code over and over again, each time the user interacts with the page (scroll, click, and even move the mouse). Even though this only affects the performance inside each individual browser, it can result in a slow-performing browser.

I’ve never used CSS expressions myself and wouldn’t recommend that anyone do so. Also, CSS expressions are only supported by Internet Explorer 5 to 7. Microsoft has really done some cool stuff over the years, but this must be one of their greatest screwups.
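
If you really need this kind of time-based styling, plain JavaScript can do the same thing once at page load instead of on every mouse move. This is a minimal sketch; backgroundForHour is a name I made up, and the commented-out last line assumes the code runs in a browser.

```javascript
// Same decision as the CSS expression above, written as a plain function:
// odd hours give one color, even hours the other.
function backgroundForHour(hour) {
  return hour % 2 ? "#B8D4FF" : "#F08A00";
}

// In a browser you would apply it once, for example when the page has loaded:
// document.body.style.backgroundColor = backgroundForHour(new Date().getHours());
```

The function runs when you call it and not a single time more, which is exactly the control the CSS expression takes away from you.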

Last week I blogged about Google making Page Speed available for Chrome users. This week Google surprises me again by launching Page Speed Online. Page Speed Online is pretty much just an online edition of Page Speed for Firefox and Chrome. The funny thing is that just a couple of posts ago I voted against Google in the YSlow – Page Speed battle. Now I think that Yahoo actually needs to deliver something new to keep up with Google.

Update: I just realized that there’s a Chrome YSlow extension as well. I’m really looking forward to seeing which new tools this competition between Yahoo and Google will bring us.

We all know it: zipping is a great way to compress the size of a file. This rule also applies on the web. Zipping responses from a server is natively implemented in the HTTP version 1.1 standard, which is why you don’t have any excuse left not to zip all your responses. I won’t go into a lot of detail on how zipping over HTTP works, but here is a short brushup.

When doing a request on a web server, the client can set the Accept-Encoding request header. If set to one of the valid values like “gzip,” the client tells the server that it is capable of handling a zipped response. If the server chooses to zip the response, it must set the Content-Encoding response header to a valid value like “gzip” or “deflate.” The browser checks for this header and unzips the response if necessary. This sounds a bit technical, but actually all of the above are handled by the browser. This means that all you have to do is tell your server to gzip encode your content. Pretty neat, right?
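
The server side of that handshake can be sketched in a few lines of JavaScript. The function below only decides which Content-Encoding to answer with, based on the Accept-Encoding request header. The name chooseContentEncoding is hypothetical, and the sketch ignores quality values such as “gzip;q=0”, which a real server has to respect.

```javascript
// Hypothetical helper: pick a Content-Encoding for the response based on
// the client's Accept-Encoding header. Returns null when the response
// should be sent uncompressed.
// Simplification: ";q=..." weights are stripped and otherwise ignored.
function chooseContentEncoding(acceptEncoding) {
  const offered = (acceptEncoding || "")
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase());
  if (offered.includes("gzip")) return "gzip";
  if (offered.includes("deflate")) return "deflate";
  return null;
}
```

Whatever the function returns is what the server would put in the Content-Encoding response header after compressing the body accordingly.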

There are different ways to gzip encode web server responses. I typically use ASP.NET MVC for my projects, which does not have anything built in. I’ve seen different implementations in HTTP handlers and modules as well as custom action filters. I typically use WebOptimizer.NET by a fellow Danish developer, Mads Kristensen. It has a pretty simple HTTP module, which produces standard gzip/deflate. You enable it by adding the WebOptimizer dll to your dependencies and pasting the following XML into your web.config:

<system.webServer>
  <modules>
    ...
    <add name="CompressionModule" type="WebOptimizer.Modules.CompressionModule, WebOptimizer"/>
    ...
  </modules>
</system.webServer>

It doesn’t get any simpler than this! The downside is that WebOptimizer doesn’t really evolve (last commit in 2009). If you have any proposals for a better and more recent implementation, please provide me with a link.

If you have access to configure the IIS, compression is natively supported. Scott Hanselman just wrote a great blog post on the subject.

We have reached rule number 4 in our journey into the exciting world of the YSlow rule set. Rule number 4 is all about caching. We have tackled the caching area a bit in some of the previous posts, but in this post I will dig more into this important area.

Showing webpages is all about getting resources like HTML, JavaScript, and images from the server and combining them in a showable way in the browser. I’ve previously talked about how to minimize the number of requests to the server, but what do we do when we have removed all the unnecessary requests with SquishIt and similar tools? One of the answers is caching, which is applicable for pretty much all pieces of a webpage. I will not explain any basics about HTTP caching. If you are not familiar with the subject, you should check out Mark Nottingham’s great Caching Tutorial.

So what’s the buzz about the Expires and Cache-Control headers? Both are response headers, which are used to tell clients whether and for how long the response can be cached. Rather than explaining all the theory, I would rather show you how I’ve implemented caching in one of my Danish ASP.NET MVC-based websites named myrating.

The problem

On my “product” pages I show similar items in a column to the right of the product info. Similar content is pretty performance heavy to calculate because of the algorithm, which is based on tags, ratings, and even more stuff. As a result, loading the product page was very slow, caused by the small similar items box.

The solution

I decided to do two things. The first solution was to wait for the rest of the page to show and load the info box async. I will show you how to do this in an upcoming post, but right now we will focus on the second solution: caching the response. I could have implemented the caching in an HTTP module but decided to implement a new action filter for ASP.NET MVC. This way I can annotate all the methods in my ASP.NET MVC project, the result of which can be cached. The action filter looks like this:


public class CacheableAttribute : ActionFilterAttribute
{
    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        filterContext.HttpContext.Response.Cache.SetExpires(DateTime.Now.AddDays(7));
        filterContext.HttpContext.Response.Cache.SetCacheability(HttpCacheability.Public);
        filterContext.HttpContext.Response.Cache.SetValidUntilExpires(true);
    }
}

In line 5 I set the Expires header to a week from now. I’ve hardcoded this time span but could as well have implemented a property for making this configurable.

In line 6 I set the cacheability to public. This basically means that all parties can cache this, including both browsers and proxies.

In line 7 I set valid until expires to true to avoid cache invalidation sent from the browser. This way I control how long the response is cached, and not the browser.

Making responses cacheable for one week is now a piece of cake. All I need to do is add the Cacheable attribute to the action methods which can be cached, like this:


[Cacheable]
public ActionResult FindSimilarItems(Guid id)
{
    ...
}
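
To make the effect on the wire a bit more concrete, here is a JavaScript sketch of the two response headers this kind of filter is meant to produce: an Expires date one week ahead and a public Cache-Control. The function name weekLongCacheHeaders is made up, and the exact headers ASP.NET emits may differ in detail, so treat this as an illustration of the idea and not as a dump of the real response.

```javascript
// Hypothetical illustration of the headers behind a one-week public cache.
// "now" is a parameter so the result is predictable in tests.
function weekLongCacheHeaders(now = new Date()) {
  const sevenDaysInMillis = 7 * 24 * 60 * 60 * 1000;
  const expires = new Date(now.getTime() + sevenDaysInMillis);
  return {
    // HTTP dates are always expressed in GMT.
    "Expires": expires.toUTCString(),
    // "public" allows both browsers and shared proxies to cache the response.
    "Cache-Control": "public",
  };
}
```

A browser or proxy that receives these headers can reuse the response for up to a week without asking the server again.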

In this post I will try to explain to you the idea behind a content delivery network or CDN. I think that the Yahoo documentation on this rule is a bit confusing. So here’s my attempt to explain YSlow rule #2: use a content delivery network.

To explain the rule, we need to start by looking at the problem. As discussed in the two previous blog posts, the browser typically needs to do a lot of requests to render a page. The YSlow documentation mentions that around 80-90 percent of the time of showing a page is used up by requesting resources from the server. That’s why we need to optimize this part before looking into other stuff like your SQL or other server-side stuff. We can do this in two ways: (1) by limiting the number of requests, as explained in the previous two posts, and (2) by speeding up the existing requests. The use of a CDN focuses on speeding up your existing requests or even avoids making them.

So what’s a CDN? According to the YSlow documentation, a CDN is a number of web servers distributed across multiple locations. The idea behind this is to be able to serve different users as fast as possible, making the response times optimized for each particular client’s location. Even though this may be the official definition of a CDN, I typically don’t use a CDN for location-based stuff. As opposed to Yahoo, most of us develop smaller websites, targeting single countries or even regions. These sites typically don’t have the need for a CDN, because the web server(s) can be placed near the users.

Why do we need to focus on this rule then? Well, a few years back we did not need to. In the meantime Google, Microsoft, and other large companies have opened up their internal CDNs for us to use. This is best illustrated by an example. You most likely use some sort of JavaScript framework like jQuery or Google Maps. On the needed pages you add a script import like this:


<script src="/scripts/jquery.js"></script>

As mentioned in the two previous posts, we could combine this script with other required scripts on our server. I usually use this approach a lot, but never when referencing scripts like jQuery. As I mentioned, Google (as well as others) made a CDN for us to use for common scripts like jQuery. Google’s solution is called Google Libraries API and serves a lot of frequently used scripts. Instead of referencing jQuery like in the example above, you should reference it like this:

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.5.1/jquery.min.js"></script>

Why is this a good idea? There are several reasons:

1. The browser is more likely to have the response of this URL cached, because a lot of sites now use this approach.

2. Response times on Google’s servers pretty much kick everybody else’s asses.

3. You don’t need to worry about minifying JavaScript yourself.

4. You don’t need to use disk space for common scripts.

5. Updating to a new version is as simple as changing a URL. No new download from the jQuery website required.

Start by using everybody else’s content delivery networks and build your own only if the gig requires you to.
