We use the term "filename masking" rather than "URL masking" because the SEO significance is at the filename level, but we recognize that we're really addressing the URL with mod_rewrite masking techniques (or ISAPI rewrite filters on Windows boxes).
Although many sites use filename masking for SEO purposes, most don't realize that masking was originally developed as a security feature. By showing .html instead of .php or .asp, a site could appear to be static, causing would-be hackers looking for a dynamic site to hack (via SQL injection) to move on. But as more and more people gravitate to masking for SEO purposes, some of the brilliant and valuable uses of filename masking have been devalued by oversight.
One very interesting discovery was only made possible because of our work on Google penalty unwinds. We get a lot of requests for help, so we know the issues that sites are facing when things go bad. Among the repeat penalties is one that involves rank intermittency - a site loses most ranks for weeks at a time, then recovers for a week or two, then cycles back down, and so forth.
We found that sites suffering this intermittent rank behavior would recover their ranks after remasking to include extensions. We believe that Google expects to see unique content associated with unique filenames. For some reason, when sites scale very large, Google sometimes has problems recognizing the filename/content relationship. The resultant confusion either triggers the intermittent ranks, or Google associates the content from many pages with the index.html or default.asp file that is really being served. When you mask all the way down to the filename extensions, there can be absolutely no confusion about what the filename is.
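To make the idea concrete, here is a minimal Python sketch of what masking down to the extension amounts to: one unique, static-looking filename maps to exactly one dynamic request. It is an illustration only, not an actual mod_rewrite rule set. The URL pattern, the product.php script, and the category and slug parameters are all hypothetical and stand in for whatever your own rewrite rules map.

```python
import re

# Hypothetical masking rule (an assumption for illustration):
#   /<category>/<slug>.html  ->  /product.php?category=<category>&slug=<slug>
MASK_RULE = re.compile(r"^/(?P<category>[a-z0-9-]+)/(?P<slug>[a-z0-9-]+)\.html$")


def unmask(path):
    """Translate a masked, static-looking path into the internal dynamic URL.

    Returns None when the path does not match the rule, for example when
    the .html extension is missing.
    """
    match = MASK_RULE.match(path)
    if match is None:
        return None
    return "/product.php?category={category}&slug={slug}".format(**match.groupdict())


print(unmask("/widgets/blue-widget.html"))  # /product.php?category=widgets&slug=blue-widget
print(unmask("/widgets/blue-widget"))       # None
```

Because the rule is anchored on the .html extension, every masked URL names a distinct file, which is the filename/content relationship described above.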
But there are even more important reasons to follow the legacy purposes of masking. One of the basic tenets of SEO is that static is better than dynamic. If you're not showing the extension, you're revealing a dynamic infrastructure, not a static one, and unnecessarily exposing a security vulnerability.
So there are 3 reasons to always mask all the way down to the filename extensions:
-1- Security - hide the dynamic infrastructure from hackers
-2- SEO - show a static site to the search engines by masking to the .html extension
-3- SEO - enable matching nomenclature precisely to target terms
UPDATE 23 September 2008:
Dynamic URLs vs. static URLs
Written by Juliane Stiller and Kaspar Szymanski,
Google Search Quality Team
Monday, September 22, 2008
This post claims that it is better to NOT mask dynamic filenames, and that Google is able to index complex URLs that contain parameters. But the post is misleading in a big way.
This post, written by Google employees, verifies our observation about masking problems, but muddies the waters as to real solutions. The claim that the dynamic URLs are BETTER than the supposedly static-looking masked filenames for spidering purposes proves that our observations concerning these URLs have been right on the mark. We have a big problem with the claim that these look static:
"The following are some examples of static-looking URLs which may cause more crawling problems than serving the dynamic URL without rewriting:
What is NOT said is the most important piece. Notice that filenames masked all the way to their extensions are not on this list, because they are NOT a problem for Google. In fact, we believe this is the ONLY way masking should be used, because it is the ONLY protocol that appears truly static. None of the above examples, provided by Google, are static-looking in our opinion - all are more likely the result of masking a dynamic URL. And while pointing out that Google has problems with those URLs, they never mention what actually WORKS!
What works is to mask all the way down to filename extensions.
The takeaway: Masking protocols that do not reveal the filename extension can create problems for the Googlebot.
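If you want to audit your own URLs against this takeaway, a rough Python sketch follows. The function name and the list of acceptable extensions are our assumptions for illustration; adjust them to whatever your masking rules actually emit.

```python
import posixpath
from urllib.parse import urlsplit


def shows_static_extension(url, allowed=(".html", ".htm")):
    """Return True when the URL path ends in one of the allowed extensions.

    The query string is ignored; only the path portion is inspected.
    The `allowed` tuple is an assumption, not a list from any search engine.
    """
    path = urlsplit(url).path
    _, ext = posixpath.splitext(path)
    return ext.lower() in allowed


for url in (
    "http://www.example.com/widgets/blue-widget.html",
    "http://www.example.com/widgets/blue-widget/",
    "http://www.example.com/product.php?id=7",
):
    print(url, shows_static_extension(url))
```

Only the first URL in the sample list passes the check; the other two either hide the filename entirely or expose the dynamic extension.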