
Search Forensics
Discovery Through Search

One of the common problems enterprise SEOs face is achieving situational awareness from the outside. That is, how do you discover the structure of an implementation if you don't have server access?

This is especially important when you work with very large sites. Often no one person even has all the details about all the subdomains. In fact, subdomains may be hosted in different server environments in different locations around the world.

If you're trying to understand how the enterprise is managing it all, there are some very simple tricks that can help you assemble the pieces. Here's a really smart technique using the search operators site: and inurl:. The cool thing is that these use Google's own search results to reveal a site's hidden information. (Using these searches may trigger the Google warning on the right.)

Here's a simple iterative search routine that will uncover all the indexed subdomains of any given domain.

Start with this search:

site:domain.com -inurl:www

This shows you all the URLs without www: usually https pages and subdomains (www is itself just a subdomain of the domain). You can find all the subdomains by iterating through them. Pick a subdomain revealed by this search, then search like this:

site:domain.com -inurl:www -inurl:subdomain1

This filters out both www and subdomain1, revealing other subdomains. By iterating through each one you find, you eventually reach a search that returns no results. That final search string lists every indexed subdomain:

site:domain.com -inurl:www -inurl:subdomain1 -inurl:subdomain2 -inurl:subdomain3 ...

The limitation is Google's current 32-word limit on a search.
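The routine above can be sketched in Python. This is a minimal illustration, not a finished tool: `run_search` is a hypothetical function you would supply yourself (for example, one that returns URLs you copied from a results page), and the names `build_query`, `subdomain_label`, and `discover` are invented for this example. It also assumes the limit is counted over the -inurl: terms, which is one reading of the 32-word limit. Unlike the manual routine, it excludes every new subdomain seen in a result set at once rather than one at a time.

```python
from urllib.parse import urlparse

# Assumption: the limit applies to the number of -inurl: exclusions.
MAX_EXCLUSIONS = 32


def build_query(domain, excluded):
    """Build 'site:domain -inurl:a -inurl:b ...' from the labels found so far."""
    if len(excluded) > MAX_EXCLUSIONS:
        raise ValueError("more than %d exclusions" % MAX_EXCLUSIONS)
    terms = ["site:" + domain] + ["-inurl:" + label for label in excluded]
    return " ".join(terms)


def subdomain_label(url, domain):
    """Return the host label directly in front of domain, or None for the bare domain."""
    host = urlparse(url).hostname or ""
    suffix = "." + domain
    if not host.endswith(suffix):
        return None
    return host[: -len(suffix)].split(".")[-1]


def discover(domain, run_search):
    """Repeat the search, excluding each new subdomain, until nothing new appears.

    run_search is a hypothetical caller-supplied function: query string in,
    list of result URLs out.
    """
    excluded = ["www"]
    while True:
        urls = run_search(build_query(domain, excluded))
        new = []
        for url in urls:
            label = subdomain_label(url, domain)
            if label and label not in excluded and label not in new:
                new.append(label)
        if not new:
            return excluded
        excluded.extend(new)
```

The list returned by `discover` is the set of exclusions in the final query, so it doubles as the subdomain inventory. Once the exclusion limit is hit, `build_query` raises instead of silently sending a query that would be truncated.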


Try It With Google.com

When we tried to discover all the subdomains of Google, we had to stop at 32 (plus we got the warning and could do no further searches for a while):

site:google.com -inurl:www -inurl:adwords -inurl:knol -inurl:ditu -inurl:maps -inurl:local -inurl:translate -inurl:books -inurl:picasa -inurl:video -inurl:code -inurl:picasaweb -inurl:mail -inurl:chrome -inurl:ejabat -inurl:investor -inurl:wifi -inurl:labs -inurl:checkout -inurl:images -inurl:docs -inurl:photos -inurl:gears -inurl:pack -inurl:sites -inurl:documents -inurl:wave -inurl:afp -inurl:canadianpress -inurl:blogsearch -inurl:earth -inurl:answers

(limit was reached but we could still see:)

-inurl:research -inurl:trends -inurl:sitescontent -inurl:scholar -inurl:toolbar -inurl:services -inurl:sketchup
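Since the finished query is itself the inventory, it can be turned back into a list of hostnames. The sketch below assumes each -inurl: term names a subdomain label, which is not guaranteed because inurl: matches anywhere in a URL; `subdomains_from_query` is a name invented for this example.

```python
def subdomains_from_query(query):
    """Turn 'site:domain -inurl:a -inurl:b' into ['a.domain', 'b.domain']."""
    terms = query.split()
    domain = next(t[len("site:"):] for t in terms if t.startswith("site:"))
    hosts = []
    for term in terms:
        if term.startswith("-inurl:"):
            host = term[len("-inurl:"):] + "." + domain
            if host not in hosts:  # skip labels that were typed twice
                hosts.append(host)
    return hosts


print(subdomains_from_query("site:google.com -inurl:www -inurl:maps -inurl:books"))
```

Each resulting hostname still needs to be checked by hand, since a label may have matched a path segment rather than a subdomain.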

Remember the "supplemental results", and all the issues surrounding their revelation and then their disappearance from Google's search results? They're still around, and you can still find them with a secret handshake.

It's a hack we uncovered long ago, when the supplemental results markers were removed from the search results. In a sense it is even more important now, because it lets you know whether your URLs are in the main index or in the now-hidden supplemental results.

We discovered a long time ago that simply adding /* or /# to a site: search drastically changed the result set. Back when the supplemental results were flagged, we found that this search:

site:domain.com/*

revealed all the URLs in the main, or primary, index. If you subtracted them from the results you get when you search:

site:domain.com

the remainder were the URLs in the supplemental results.
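The subtraction is a plain set difference. Here is a minimal Python sketch, assuming you have already collected the URLs from both searches into two lists; the function name and the sample URLs are hypothetical.

```python
def supplemental_urls(all_results, main_index_results):
    """Return URLs found by site:domain.com but not by site:domain.com/*."""
    main = set(main_index_results)
    # Keep the original order of the full result list.
    return [url for url in all_results if url not in main]


all_results = [
    "http://domain.com/",
    "http://domain.com/products",
    "http://domain.com/old-page",
]
main_index_results = [
    "http://domain.com/",
    "http://domain.com/products",
]
print(supplemental_urls(all_results, main_index_results))
```

Whatever is left over is the candidate list for the supplemental results, under the assumption that both result sets were collected completely.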

The image on this page is one we see a lot when we do these kinds of searches. Check our blog post on this. We strongly suspect we're being profiled because we see this warning often. Do you?
