[HTML5 Boilerplate homepage](http://html5boilerplate.com) | [Documentation
table of contents](README.md)

# .htaccess

In Apache HTTP Server, `.htaccess` (hypertext access) is a per-directory
configuration file that lets you adjust the web server's configuration.
HTML5 Boilerplate includes a number of best-practice server rules for making
web pages fast and secure. You apply these rules through the `.htaccess` file.

**You'll want to have these modules enabled for optimum performance:**

* `mod_setenvif.c` (setenvif_module)
* `mod_headers.c` (headers_module)
* `mod_deflate.c` (deflate_module)
* `mod_filter.c` (filter_module)
* `mod_expires.c` (expires_module)
* `mod_rewrite.c` (rewrite_module)

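The `mod_*.c` names in this list are the names you can pass to `<IfModule>`.
As a minimal sketch (the header used here is only an illustration, not
necessarily a rule from the H5BP file), a guarded block is skipped when its
module is missing instead of causing a server error:

```apache
# Applied only if mod_headers is loaded; otherwise Apache skips the block.
<IfModule mod_headers.c>
    # Illustrative header, chosen as an example only.
    Header set X-Content-Type-Options "nosniff"
</IfModule>
```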

## On Windows

You've got a couple of options that depend on how you installed Apache.

1. **WampServer**. This is by far the simplest option. If you have installed
   WampServer, just click on the icon in the task bar, hover over the Apache
   section in the menu that comes up, and then hover over the modules section.
   You will be presented with a list of modules. Simply click on a module name
   to enable it (or disable it if it is already enabled). A check mark next to
   a module indicates that it is enabled. WampServer will automatically restart
   the Apache service after you enable a module.

2. **Manually editing `httpd.conf`**. This assumes that you have manually
   installed Apache. You will need to locate the `httpd.conf` file, which is
   normally in the `conf` folder inside the folder where you installed Apache
   (for example `C:\apache\conf\httpd.conf`). Open this file in a text editor.
   Near the top (after a block of comments) you will see a long list of
   modules. Check that the modules listed above are not commented out. If they
   are, uncomment them and restart Apache.

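For illustration, the relevant part of `httpd.conf` might look like the sketch
below once the modules are enabled. The `modules/` paths assume a default
layout and may differ in your installation.

```apache
# Uncommented lines are active; a leading "#" disables a module.
LoadModule deflate_module modules/mod_deflate.so
LoadModule expires_module modules/mod_expires.so
LoadModule filter_module modules/mod_filter.so
LoadModule headers_module modules/mod_headers.so
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule setenvif_module modules/mod_setenvif.so
```
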
That's it, you're done!


## On Linux

These instructions should work on any distribution where `apt-get` has been
used to install Apache.

1. Open up a terminal and type the following command. Enter your password when
   prompted.

   `sudo a2enmod setenvif headers deflate filter expires rewrite include`

2. Restart Apache by using the following command so the new configuration takes
   effect.

   `sudo /etc/init.d/apache2 restart`

That's it, you're done!


## On Mac

Coming soon...


## Security

Do not hide your server signature (i.e., the `Server:` HTTP header, which
Apache controls with `ServerTokens`, and the page footer controlled by
`ServerSignature`). Serious attackers can use other kinds of fingerprinting
methods to figure out the actual server and components running behind a port.
Instead, as a site owner, you should keep track of what's listening on ports on
hosts that you control. Run a periodic scanner to make sure nothing suspicious
is running on a host you control, and use the server signature to determine
whether this is the web server and version that you expect.


## Performance

### Configure ETags

```apache
FileETag None
```

Entity tags (ETags) are a mechanism that web servers and browsers use to
determine whether the component in the browser's cache matches the one on the
origin server. (An "entity" is another word for a "component": images, scripts,
stylesheets, etc.) ETags were added to provide a mechanism for validating
entities that is more flexible than the last-modified date. An `ETag` is a
string that uniquely identifies a specific version of a component. The only
format constraint is that the string be quoted. The origin server specifies
the component's `ETag` using the `ETag` response header.

```http
HTTP/1.1 200 OK
Last-Modified: Tue, 12 Dec 2006 03:03:59 GMT
ETag: "10c24bc-4ab-457e1c1f"
Content-Length: 12195
```

Later, if the browser has to validate a component, it uses the `If-None-Match`
header to pass the `ETag` back to the origin server. If the ETags match, a 304
status code is returned, reducing the response by 12195 bytes in this
example.

```http
GET /i/yahoo.gif HTTP/1.1
Host: us.yimg.com
If-Modified-Since: Tue, 12 Dec 2006 03:03:59 GMT
If-None-Match: "10c24bc-4ab-457e1c1f"

HTTP/1.1 304 Not Modified
```

The problem with ETags is that they typically are constructed using attributes
that make them unique to a specific server hosting a site. ETags won't match
when a browser gets the original component from one server and later tries to
validate that component on a different server, a situation that is all too
common on web sites that use a cluster of servers to handle requests. By
default, both Apache and IIS embed data in the ETag that dramatically reduces
the odds of the validity test succeeding on web sites with multiple servers.

The default ETag format for Apache 1.3 and for 2.x releases before 2.4 is
`inode-size-timestamp` (Apache 2.4 leaves out the inode by default). Although a
given file may reside in the same directory across multiple servers, and have
the same file size, permissions, timestamp, etc., its inode is different from
one server to the next.

IIS 5.0 and 6.0 have a similar issue with ETags. The format for ETags on IIS is
`Filetimestamp:ChangeNumber`. A `ChangeNumber` is a counter used to track
configuration changes to IIS. It's unlikely that the `ChangeNumber` is the same
across all IIS servers behind a web site.

The end result is that ETags generated by Apache and IIS for the exact same
component won't match from one server to another. If the ETags don't match, the
user doesn't receive the small, fast 304 response that ETags were designed for;
instead, they'll get a normal 200 response along with all the data for the
component. If you host your web site on just one server, this isn't a problem.
But if you have multiple servers hosting your web site, and you're using Apache
or IIS with the default ETag configuration, your users are getting slower
pages, your servers have a higher load, you're consuming greater bandwidth, and
proxies aren't caching your content efficiently. Even if your components have a
far-future `Expires` header, a conditional GET request is still made whenever
the user hits Reload or Refresh.

If you're not taking advantage of the flexible validation model that ETags
provide, it's better to just remove the ETag altogether. The `Last-Modified`
header validates based on the component's timestamp, and removing the ETag
reduces the size of the HTTP headers in both the response and subsequent
requests. A Microsoft Support article describes how to remove ETags in IIS. In
Apache, this is done by simply adding the `FileETag None` line shown above to
your Apache configuration file.

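As a sketch of the two options discussed in this section: the first removes
ETags entirely, and the commented-out alternative keeps them but leaves the
inode out so that servers in a cluster can produce matching values. The
`Header unset ETag` line is an extra precaution that assumes `mod_headers` is
available; the text above does not require it.

```apache
# Option 1: stop generating ETags for static files.
FileETag None

# Also drop any ETag header set elsewhere (requires mod_headers).
<IfModule mod_headers.c>
    Header unset ETag
</IfModule>

# Option 2 (use instead of option 1): keep ETags but build them from the
# modification time and size only, so the value does not depend on the inode.
# FileETag MTime Size
```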

### Gzip Components

Compression reduces response times by reducing the size of the HTTP response.

Starting with HTTP/1.1, web clients indicate support for compression with the
`Accept-Encoding` header in the HTTP request.

```http
Accept-Encoding: gzip, deflate
```

If the web server sees this header in the request, it may compress the response
using one of the methods listed by the client. The web server notifies the web
client of this via the `Content-Encoding` header in the response.

```http
Content-Encoding: gzip
```

Gzip is the most popular and effective compression method at this time. It was
developed by the GNU project and standardized by RFC 1952. The only other
compression format you're likely to see is deflate, but it's less effective and
less popular.

Gzipping generally reduces the response size by about 70%. Approximately 90% of
today's Internet traffic travels through browsers that claim to support gzip.
If you use Apache, the module that handles gzip depends on your version: Apache
1.3 uses `mod_gzip` while Apache 2.x uses `mod_deflate`.

There are known issues with browsers and proxies that may cause a mismatch
between what the browser expects and what it receives with regard to compressed
content. Fortunately, these edge cases are dwindling as the use of older
browsers drops off. The Apache modules help out by adding appropriate `Vary`
response headers automatically.

Servers choose what to gzip based on file type, but are typically too limited
in what they decide to compress. Most web sites gzip their HTML documents. It's
also worthwhile to gzip your scripts and stylesheets, but many web sites miss
this opportunity. In fact, it's worthwhile to compress any text response
including XML and JSON. Image and PDF files should not be gzipped because they
are already compressed. Trying to gzip them not only wastes CPU but can
potentially increase file sizes.

Gzipping as many appropriate file types as possible is an easy way to reduce
page weight and accelerate the user experience.

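A minimal sketch of compressing text responses with `mod_deflate`, written for
Apache 2.4, follows. The MIME type list is an assumption chosen to match the
types named above (HTML, scripts, stylesheets, XML, JSON); adjust it to what
your site actually serves. On Apache 2.4, `AddOutputFilterByType` is provided
by `mod_filter`, which is why both modules are checked.

```apache
<IfModule mod_deflate.c>
    <IfModule mod_filter.c>
        # Compress text-based responses; images and PDFs are left alone
        # because they are already compressed.
        AddOutputFilterByType DEFLATE text/html text/css text/plain
        AddOutputFilterByType DEFLATE application/javascript application/json
        AddOutputFilterByType DEFLATE application/xml text/xml
    </IfModule>
</IfModule>
```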

### Cache busting

A first-time visitor to your page may have to make several HTTP requests, but
by using the `Expires` header you make those components cacheable. This avoids
unnecessary HTTP requests on subsequent page views. `Expires` headers are most
often used with images, but they should be used on all components including
scripts, stylesheets, etc.

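As an illustration, far-future expiry can be set with `mod_expires` roughly as
follows. The lifetimes and MIME types are assumptions for this sketch, not the
values shipped in the H5BP file.

```apache
<IfModule mod_expires.c>
    ExpiresActive on
    # Fallback for types not listed below.
    ExpiresDefault "access plus 1 month"
    # Long lifetimes suit files whose names change when their content changes.
    ExpiresByType text/css "access plus 1 year"
    ExpiresByType application/javascript "access plus 1 year"
    ExpiresByType image/png "access plus 1 month"
</IfModule>
```
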
Traditionally, if you use a far-future `Expires` header you have to change the
component's filename whenever the component changes.

The H5BP `.htaccess` has built-in filename cache busting. To use it, uncomment
the relevant lines in the `.htaccess` file.

Doing so will route all requests for `/path/filename.20120101.ext` to
`/path/filename.ext`. To use this, just add a timestamp number (or your own
numbered versioning system) into your resource filenames in your HTML source
whenever you update those resources.

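The rewrite behind this feature can be sketched as below. This is an
approximation for illustration; the extension list and the exact pattern are
assumptions, so check the rule in your copy of the `.htaccess` file before
relying on it.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Only rewrite when no file with the versioned name exists on disk.
    RewriteCond %{REQUEST_FILENAME} !-f
    # /path/filename.20120101.ext -> /path/filename.ext
    RewriteRule ^(.+)\.(\d+)\.(js|css|png|jpe?g|gif)$ $1.$3 [L]
</IfModule>
```
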
#### Example:
