While many SharePoint'ers are over in Vegas eagerly awaiting the
start of the 2009 SharePoint Conference, the rest of us are back in
reality enduring the hardships of plain old SharePoint 2007...
Over the past couple of years, the number of SharePoint built
websites has grown significantly. But how do all of these sites
stack up from a technical SEO perspective? Let's have a look using
the new IIS 7.0 SEO Toolkit to analyze our site: www.trinkit.co.nz.
Here is the output after running the tool over our website:

OK cool, lots of errors! Now what do they mean and what can we
do about it?
1. The page contains multiple canonical formats
This means that there are multiple addresses that can be
used to access the pages of our website. For example, take the home
page; we could browse to this page in the following ways:
http://www.trinkit.co.nz/pages/home.aspx
http://www.trinkit.co.nz/Pages/home.aspx
(capitalization)
http://trinkit.co.nz/pages/home.aspx
(no www)
The effect of this is that search engines will potentially spread
the ranking over the different URLs rather than aggregating it for
the one page. Search engines are fairly clever and should work
out that there is only one page to rank, but if you would rather
not take any chances, here is a method you can use to fix this
(IIS 7 only).
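To make the idea concrete, here is a minimal Python sketch of the normalisation such a fix applies. It is only an illustration, not the IIS method itself: the function name and the choice of the www host as the preferred form are my assumptions, and lowercasing the path is only safe because IIS treats paths as case insensitive.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www form is the one we want search engines to index.
CANONICAL_HOST = "www.trinkit.co.nz"
BARE_HOST = "trinkit.co.nz"


def canonical_url(url: str) -> str:
    """Return the single preferred form of a URL on our site (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Treat the bare domain and the www domain as the same site.
    if host == BARE_HOST:
        host = CANONICAL_HOST
    # Lowercase the path so /Pages/ and /pages/ collapse together.
    return urlunsplit((parts.scheme, host, parts.path.lower(), parts.query, parts.fragment))


variants = [
    "http://www.trinkit.co.nz/pages/home.aspx",
    "http://www.trinkit.co.nz/Pages/home.aspx",
    "http://trinkit.co.nz/pages/home.aspx",
]
# All three variants map to the same canonical address.
print({canonical_url(u) for u in variants})
```

On the server, any request whose URL differs from its canonical form would then be answered with a 301 to that canonical form.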
2. The page contains unnecessary redirects
This is because of the infamous 302 redirect issue. When
you type a URL like www.trinkit.co.nz, SharePoint
will perform a 302 (temporary) redirect to www.trinkit.co.nz/pages/default.aspx.
This is not ideal, as search engines are less keen on following
302 redirects; they prefer 301s (permanent). There is no ideal way
of fixing this, but here are a couple of options:
- Use IIS7 redirect rules
- Use an HTTPModule
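Whichever option you pick, it is worth checking what the server actually returns afterwards. The following is a rough Python sketch for doing that; the function names are my own, and it assumes the site answers plain HTTP HEAD requests.

```python
import http.client
from typing import Optional, Tuple


def fetch_redirect(host: str, path: str = "/") -> Tuple[int, Optional[str]]:
    """Request a path without following redirects; return (status, Location header)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe_redirect(status: int) -> str:
    """Classify a status code from an SEO point of view."""
    if status in (301, 308):
        return "permanent redirect"
    if status in (302, 303, 307):
        return "temporary redirect"
    return "not a redirect"
```

Calling fetch_redirect("www.trinkit.co.nz") and passing the status to describe_redirect should report a temporary redirect before the fix and a permanent one after it.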
3. The description is missing (Not SharePoint specific)
This is really obvious: we are missing the meta
description tag. The meta description tag is normally used for your
search engine result page (SERP) listing and is a key factor in
determining relevancy. While we are on the topic, don't bother with
the meta keywords tag. The big search engines have been ignoring
this since about 2002.
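If you want to check a page for this yourself, a small Python sketch along these lines would do. The class and function names are hypothetical, and it only looks at the raw HTML you hand it.

```python
from html.parser import HTMLParser
from typing import Optional


class MetaDescriptionFinder(HTMLParser):
    """Remembers the content of <meta name="description"> if the page has one."""

    def __init__(self) -> None:
        super().__init__()
        self.description: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # The name attribute is compared case-insensitively.
        if (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content")


def find_meta_description(html: str) -> Optional[str]:
    """Return the meta description text, or None when the tag is missing."""
    finder = MetaDescriptionFinder()
    finder.feed(html)
    return finder.description
```

A result of None for a page is the same finding the toolkit reports as "The description is missing".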
4. The page contains broken hyperlinks (Not SharePoint specific)
Another obvious content issue. Broken hyperlinks are said by some
to affect page rankings. In theory, search engines will favour sites
and pages that have relevant, up-to-date content, and broken links
are a sign of a poorly maintained page. This is tough to keep a handle
on with blogs that have large amounts of outgoing links, but there
are tools available that can help.
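For a feel of what such tools do under the hood, here is a simplified Python sketch, not any particular tool. All the names are made up for illustration, and it assumes the linked servers respond sensibly to HEAD requests, which some do not.

```python
from html.parser import HTMLParser
from typing import Callable, List
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects absolute http(s) link targets from <a href> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            absolute = urljoin(self.base_url, href)
            # Skip mailto:, javascript: and similar non-web links.
            if absolute.startswith(("http://", "https://")):
                self.links.append(absolute)


def http_status(url: str) -> int:
    """Return the final HTTP status for a URL, or 0 if it could not be reached."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as response:
            return response.status
    except HTTPError as error:
        return error.code
    except OSError:
        return 0


def broken_links(html: str, base_url: str,
                 get_status: Callable[[str], int] = http_status) -> List[str]:
    """List the links on a page that do not resolve to a 2xx or 3xx response."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [url for url in collector.links if not 200 <= get_status(url) < 400]
```

The get_status parameter is there so the check can be swapped for a cached or rate-limited lookup on pages with lots of outgoing links.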
5. through to 7. are not SharePoint specific
issues and there are heaps of great resources around that address
these so I won't cover them here.
8. The URL is linked using different casing
As mentioned in item 1, search engines are case sensitive. In an
ideal world all of your URLs and all the links to them would be
lowercase, with dashes used to separate words. The navigation
controls in SharePoint always redirect to a first letter
capitalized 'Pages', and what is worse is the tendency for URLs to
occasionally be loaded in upper case. A technique to address this
issue is discussed in this blog post.
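To find out where your own pages are linked with mixed casing, you can group the links you have collected by their lowercase form. A small Python sketch follows; the function name is hypothetical and the input is simply whatever list of link URLs you have gathered.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def mixed_case_links(urls: Iterable[str]) -> Dict[str, Set[str]]:
    """Group URLs that differ only by casing; keep groups with more than one form."""
    variants: Dict[str, Set[str]] = defaultdict(set)
    for url in urls:
        variants[url.lower()].add(url)
    return {key: forms for key, forms in variants.items() if len(forms) > 1}


links = [
    "http://www.trinkit.co.nz/Pages/home.aspx",
    "http://www.trinkit.co.nz/pages/home.aspx",
    "http://www.trinkit.co.nz/PAGES/HOME.ASPX",
    "http://www.trinkit.co.nz/pages/default.aspx",
]
for lowered, forms in mixed_case_links(links).items():
    print(lowered, "is linked as", sorted(forms))
```

Each group that comes back is one page the toolkit would flag under this item.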
9. & 10. are not SharePoint specific
11. The page contains a large amount of script code
SharePoint does have a habit of including an awful lot of
additional JavaScript. However, I do think it's a little bit unfair
for it to be reported in this case, as I have removed most of it.
Plenty of the JavaScript that gets loaded is only needed for
authenticated authors and the associated rich editing controls.
There are a few simple techniques to remove this, and doing so can
give you a great performance boost.
12. This page contains invalid markup
It's pretty commonly known that SharePoint isn't exactly standards
friendly. Search engines will have an easier time processing the
contents of your page if it is easily parsable. Now this doesn't
mean that it has to be XHTML 1.1 Strict compliant. It just means
that all the tags are closed and are not mismatched, which is a lot
easier to achieve than full XHTML compliance. As WCAG 2.0 has the
same requirements, you can use a WCAG 2.0 validator to test this.
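The "closed and not mismatched" check is simple enough to sketch in a few lines of Python. This is a deliberately strict illustration, not a replacement for a real validator: the names are my own, and it will also complain about end tags that HTML allows you to leave out, such as </p> and </li>.

```python
from html.parser import HTMLParser
from typing import List

# Elements that never take an end tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "param", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Tracks open tags on a stack and records end tags that do not match."""

    def __init__(self) -> None:
        super().__init__()
        self.open_tags: List[str] = []
        self.problems: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closed tags such as <br /> open and close in one step.
        pass

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def markup_problems(html: str) -> List[str]:
    """Return a list of mismatched or unclosed tags; empty means well balanced."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return checker.problems + [f"unclosed <{tag}>" for tag in checker.open_tags]
```

An empty list from markup_problems means every tag that was opened was closed in the right order, which is the bar described above.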
One other thing that does not seem to be covered by the IIS SEO
Toolkit:
13. There is no XML sitemap defined
An XML sitemap tells the search engine where all the pages you
want crawled are; it is not made to be human readable. For a quick
and easy way to get this set up, check Waldek's sitemap generator.
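In case you have not seen one, the file itself is very simple. Here is a minimal Python sketch that builds one from a list of page URLs. It is not how Waldek's generator works, the function name is mine, and it only emits the required loc element.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: Iterable[str]) -> str:
    """Return sitemap XML with one <url><loc> entry per page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    # The declaration says UTF-8, so save the file with that encoding.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([
    "http://www.trinkit.co.nz/pages/home.aspx",
    "http://www.trinkit.co.nz/pages/default.aspx",
]))
```

The result is usually saved as sitemap.xml at the root of the site and submitted to the search engines' webmaster tools.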
Note that this was done on a slightly older version of the site,
and a few of these issues have already been fixed.
The SEO tool is still in Beta and seems to be a little
overzealous in the number of issues it reports, but it is already
providing some really useful results.
Of course, nothing beats having really great original
content that naturally generates healthy back links. Fixing these
technical issues is really just a way of maximising that hard work
and there is certainly nothing wrong with that!