Generating a SiteMap with .NET

Jambr is still a baby, and as such its content and structure are changing.
It originally existed on two URLs (www and non-www), and Google was indexing both of them. To add to that, not long ago I changed the URL structure for Articles to be more SEO friendly.

All of these changes confuse search engine indexers, and one way to help them out is to provide a Sitemap. My rough list of requirements was:

  • To comply fully with the Sitemap specification
  • To generate automatically when /sitemap.xml is requested
  • To be able to decorate fixed controller actions with an attribute that includes them in the map
  • To provide a simple way of adding the dynamic content
  • To cache the output for a period of time

Implementation

First things first: we need to create an XML document which matches the Sitemap protocol. So we create a new XmlDocument, add the XML declaration, and then add the root element "urlset" in the Sitemap protocol's namespace:

''' <summary>
''' The Sitemap protocol namespace we add to the document
''' </summary>
Private Const SiteMapSchemaURL As String = "http://www.sitemaps.org/schemas/sitemap/0.9"

''' <summary>
''' The full URL to your website, for example http://www.jambr.co.uk
''' </summary>
Private Property FullyQualifiedUrl As String

Private _document As XmlDocument
''' <summary>
''' Returns the XML document
''' </summary>
Private ReadOnly Property Document As XmlDocument
Get
   Return _document
End Get
End Property

''' <summary>
''' Create a new instance of the SiteMapGenerator, initialise the XML document
''' and add the required namespaces
''' </summary>
''' <param name="FullyQualifiedUrl">The full URL to your website, for example http://www.jambr.co.uk</param>
Public Sub New(ByVal FullyQualifiedUrl As String)

    Me.FullyQualifiedUrl = FullyQualifiedUrl.Replace("\", "/")

    'Start the document with an explicit XML declaration
    _document = New XmlDocument
    Document.AppendChild(Document.CreateXmlDeclaration("1.0", "UTF-8", Nothing))

    'Create the root element and add the sitemap namespace
    Dim rootelement = Document.CreateElement("urlset", SiteMapSchemaURL)
    Document.AppendChild(rootelement)

End Sub

Next I wanted a flexible method for adding new URLs: one that accepts all the valid child elements of a url entry as optional parameters, and only writes the ones that are passed:

''' <summary>
''' Adds a URL to the site map
''' </summary>
''' <param name="Location">The URL to the page, will check for your domain and add if required.</param>
''' <param name="ChangeFrequency">Optional: The expected change frequency of the URL</param>
''' <param name="Priority">Optional: The priority of the page, ranging from 0.0 to 1.0, default is 0.5</param>
''' <param name="LastModified">Optional: The date the URL was last modified</param>
Public Sub AddUrl(ByVal Location As String,
                  Optional ByVal ChangeFrequency As ChangeFrequency = Nothing,
                  Optional ByVal Priority As Decimal = Nothing,
                  Optional ByVal LastModified As DateTime = Nothing)

    'sanitise the url and prefix the domain if it is missing
    Location = Location.Replace("\", "/")
    If Not Location.StartsWith(FullyQualifiedUrl, StringComparison.OrdinalIgnoreCase) Then
        Location = FullyQualifiedUrl.TrimEnd("/"c) & "/" & Location.TrimStart("/"c)
    End If

    'check we haven't added it already in a stored list of urls we've added
    If AddedUrls.Contains(Location) Then Exit Sub
    AddedUrls.Add(Location)

    'Required elements
    Dim newUrl = Document.CreateElement("url", SiteMapSchemaURL)
    newUrl.AppendChild(CreateTextElement("loc", Location))

    'Optional elements. For value types Nothing means the default value
    '(0 or DateTime.MinValue), so those values are treated as "not supplied"
    If LastModified <> Nothing Then
        newUrl.AppendChild(CreateTextElement("lastmod", LastModified.ToW3C))
    End If

    If ChangeFrequency <> Nothing Then
        newUrl.AppendChild(CreateTextElement("changefreq", ChangeFrequency.ToString))
    End If

    If Priority <> Nothing Then
        'Format with the invariant culture so the decimal separator is always a dot
        newUrl.AppendChild(CreateTextElement("priority", Priority.ToString(CultureInfo.InvariantCulture)))
    End If

    Document.DocumentElement.AppendChild(newUrl)

End Sub
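
AddUrl leans on a few pieces that aren't shown in this article: the ChangeFrequency enum, the AddedUrls collection, the CreateTextElement helper and the ToW3C extension method. The real versions are in the source download; what follows is only a rough sketch of how they could look, wrapped in a console module so that it runs on its own. The "none" enum member, the use of a HashSet and the exact date format are my assumptions here, not taken from the original code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports System.Runtime.CompilerServices
Imports System.Xml

' Assumed enum: member names follow the <changefreq> values in the Sitemap
' protocol. "none" is a hypothetical placeholder so that the default value (0)
' can mean "not specified" when compared against Nothing.
Public Enum ChangeFrequency
    none = 0
    always
    hourly
    daily
    weekly
    monthly
    yearly
    never
End Enum

Module SiteMapHelpersSketch

    Private Const SiteMapSchemaURL As String = "http://www.sitemaps.org/schemas/sitemap/0.9"
    Private ReadOnly Document As New XmlDocument()

    ' Assumed backing store for the duplicate check; a HashSet keeps lookups cheap
    Private ReadOnly AddedUrls As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)

    ' Assumed shape of ToW3C: formats a date as a W3C Datetime in UTC
    <Extension()>
    Public Function ToW3C(ByVal value As DateTime) As String
        Return value.ToUniversalTime().ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture)
    End Function

    ' Assumed shape of CreateTextElement: an element in the sitemap namespace
    ' whose only content is the supplied text
    Private Function CreateTextElement(ByVal name As String, ByVal value As String) As XmlElement
        Dim element = Document.CreateElement(name, SiteMapSchemaURL)
        element.InnerText = value
        Return element
    End Function

    Sub Main()
        Document.AppendChild(Document.CreateElement("urlset", SiteMapSchemaURL))

        Dim location = "http://www.jambr.co.uk/"
        Dim modified As New DateTime(2013, 5, 1, 9, 30, 0, DateTimeKind.Utc) ' sample date only

        ' Same duplicate check as AddUrl: Add returns False if the URL is already present
        If AddedUrls.Add(location) Then
            Dim entry = Document.CreateElement("url", SiteMapSchemaURL)
            entry.AppendChild(CreateTextElement("loc", location))
            entry.AppendChild(CreateTextElement("lastmod", modified.ToW3C()))
            entry.AppendChild(CreateTextElement("changefreq", ChangeFrequency.daily.ToString()))
            Document.DocumentElement.AppendChild(entry)
        End If

        Console.WriteLine(Document.OuterXml)
    End Sub

End Module
```

Because every element is created with the sitemap namespace, the child elements don't pick up an empty xmlns attribute when the document is serialised.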

Reflection

I mentioned previously that I wanted an easy way to add URLs; I didn't want a class that needed me to call AddUrl() over and over for all my pages. I decided to create a custom attribute, SiteMapAttribute, that I could just stick at the top of the controller actions I wanted to map, like this:

<SiteMap(ChangeFrequency:=ChangeFrequency.daily, Priority:=0.7)>
Function Index() As ActionResult
Return View(New HomeViewModel)
End Function
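
The attribute class itself isn't shown here. As a rough sketch of how such an attribute might be declared, assuming property names that match how it is used above and how it is read back later in the article: Priority is a Double and LastModified a String in this sketch, because .NET only allows a limited set of types for attribute arguments and neither Decimal nor DateTime is one of them. The ChangeFrequency enum and DemoController are stand-ins so the example compiles on its own.

```vbnet
Imports System
Imports System.Globalization
Imports System.Linq

' Stand-in enum (assumed); "none" is a hypothetical "not specified" value
Public Enum ChangeFrequency
    none = 0
    always
    hourly
    daily
    weekly
    monthly
    yearly
    never
End Enum

' Hypothetical sketch of the attribute, restricted to methods
<AttributeUsage(AttributeTargets.Method, AllowMultiple:=False)>
Public Class SiteMapAttribute
    Inherits Attribute

    Public Property ChangeFrequency As ChangeFrequency

    ' Double, not Decimal: Decimal is not a valid attribute argument type
    Public Property Priority As Double

    ' A date string to be parsed later, as DateTime is not valid here either
    Public Property LastModified As String
End Class

' Hypothetical class standing in for an MVC controller
Public Class DemoController
    <SiteMap(ChangeFrequency:=ChangeFrequency.daily, Priority:=0.7)>
    Public Function Index() As String
        Return "home"
    End Function
End Class

Module AttributeDemo
    Sub Main()
        ' Read the attribute back the same way the generator does
        Dim method = GetType(DemoController).GetMethod("Index")
        Dim attr = method.GetCustomAttributes(True).OfType(Of SiteMapAttribute)().First()

        Console.WriteLine(attr.ChangeFrequency.ToString())
        Console.WriteLine(attr.Priority.ToString(CultureInfo.InvariantCulture))
        Console.WriteLine(attr.LastModified Is Nothing)
    End Sub
End Module
```

Leaving LastModified unset gives Nothing, which is what the generator checks for before trying to parse a date.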

Now you've probably realised that this only works for static URLs; dynamic actions that require parameters wouldn't work. In the context of Jambr I have two controllers which serve dynamic content, Articles and News. For these I decided to create an interface, ISiteMap, with a single subroutine that the site map generator can call, like this:

''' <summary>
''' Populate the site map with the dynamic data
''' </summary>
''' <param name="generator">The generator object that gets passed in</param>
Public Sub PopulateSiteMap(ByRef generator As SiteMapGenerator) Implements ISiteMap.PopulateSiteMap

    'We need to initialise the UrlHelper because of the way we've invoked this method
    Url = New UrlHelper(System.Web.HttpContext.Current.Request.RequestContext)

    Using db As New JambrDBContext

        'Let's add dynamic data, starting with my articles
        Dim articles = db.ArticlePosts.
            Where(Function(w) w.IsLive = True).
            OrderByDescending(Function(o) o.LastUpdated).
            Select(Function(s) New With {.SEOUrl = s.SEOUrl,
                                         .LastUpdated = s.LastUpdated}).
            ToList()

        'The newest article dates the index and the feed.
        'Guard against an empty list, where First would throw
        Dim latest As DateTime = Nothing
        If articles.Any() Then latest = articles.First.LastUpdated

        'Add my root element, with a last modified date of the latest article
        generator.AddUrl(Url.Action("Index", "Article"), ChangeFrequency.daily, Nothing, latest)
        'Add the RSS feed, as it has the same last updated date
        generator.AddUrl(Url.Action("RSS", "Article"), ChangeFrequency.daily, Nothing, latest)

        'Add my other elements
        For Each post In articles
            generator.AddUrl(Url.Action("View", "Article", New With {.SEOUrl = post.SEOUrl}), Nothing, Nothing, post.LastUpdated)
        Next

    End Using

End Sub
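
The ISiteMap interface isn't listed in this article, so here is a minimal sketch of what it might look like, together with the reflection lookup that finds its implementers. The SiteMapGenerator class below is only a stand-in that collects URLs in a list, and NewsSource is a made-up implementer; both exist purely so the sketch compiles and runs by itself. The ByRef parameter simply mirrors the method signature shown above.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Reflection

' Stand-in for the real generator (assumption): just records what it is given
Public Class SiteMapGenerator
    Public ReadOnly Urls As New List(Of String)

    Public Sub AddUrl(ByVal location As String)
        Urls.Add(location)
    End Sub
End Class

' Assumed shape of the interface: one method that receives the generator
Public Interface ISiteMap
    Sub PopulateSiteMap(ByRef generator As SiteMapGenerator)
End Interface

' Hypothetical implementer standing in for a controller with dynamic content
Public Class NewsSource
    Implements ISiteMap

    Public Sub PopulateSiteMap(ByRef generator As SiteMapGenerator) Implements ISiteMap.PopulateSiteMap
        generator.AddUrl("/News/example-story")
    End Sub
End Class

Module InterfaceDemo
    Sub Main()
        Dim generator As New SiteMapGenerator()

        ' Find every concrete type in this assembly that implements ISiteMap
        Dim implementers = Assembly.GetExecutingAssembly().GetTypes().
            Where(Function(candidate) GetType(ISiteMap).IsAssignableFrom(candidate) AndAlso
                                      Not candidate.IsInterface AndAlso
                                      Not candidate.IsAbstract)

        For Each implementer In implementers
            ' Requires a parameterless constructor on the implementing type
            Dim source = DirectCast(Activator.CreateInstance(implementer), ISiteMap)
            source.PopulateSiteMap(generator)
        Next

        Console.WriteLine(String.Join(Environment.NewLine, generator.Urls))
    End Sub
End Module
```

Filtering out interfaces and abstract types matters because IsAssignableFrom also returns True for ISiteMap itself, and neither can be instantiated.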

The generator then uses reflection to find either the SiteMapAttribute or an implementation of ISiteMap, and reads the associated details like so:

''' <summary>
''' When called, the site map generator will attempt to load any action methods
''' that are decorated with the SiteMapAttribute from your controller classes and
''' add a url for them based on it
''' </summary>
''' <remarks></remarks>
Public Sub LoadFromAttribute()

    'Get all the concrete controllers in the project
    Dim controllers = Assembly.
        GetExecutingAssembly.
        GetTypes().
        Where(Function(t) GetType(System.Web.Mvc.ControllerBase).IsAssignableFrom(t) AndAlso Not t.IsAbstract).
        ToList()

    'First we want to get all controllers that implement the ISiteMap interface and fire the method
    For Each c In controllers.Where(Function(t) GetType(ISiteMap).IsAssignableFrom(t))

        'Create an instance (this needs a parameterless constructor)
        Dim obj = DirectCast(Activator.CreateInstance(c, True), ISiteMap)
        obj.PopulateSiteMap(Me)

    Next

    'Now get all the methods which are decorated with the SiteMapAttribute
    Dim objs = (From c In controllers
                From act In c.GetMethods()
                Where act.GetCustomAttributes(True).OfType(Of SiteMapAttribute)().Any()
                Select New With {.controller = c,
                                 .action = act,
                                 .actionnameattribute = act.GetCustomAttributes(True).OfType(Of ActionNameAttribute)().FirstOrDefault,
                                 .sitemapattribute = act.GetCustomAttributes(True).OfType(Of SiteMapAttribute)().FirstOrDefault}).ToList

    'We need a url helper to help us generate the url path
    Dim helper = New UrlHelper(HttpContext.Current.Request.RequestContext)

    For Each p In objs
        'Now we have the objects, we need to build the url. We need to look out for the ActionNameAttribute in case people are using it
        'to name their action methods, we also need to remove Controller from the name of the controller
        Dim url As String = helper.Action(If(p.actionnameattribute Is Nothing, p.action.Name, p.actionnameattribute.Name),
                                          p.controller.Name.Replace("Controller", ""))

        'Add the object, parsing the last modified date only if one was set on the attribute
        AddUrl(url,
               p.sitemapattribute.ChangeFrequency,
               CDec(p.sitemapattribute.Priority),
               If(p.sitemapattribute.LastModified Is Nothing,
                  Nothing,
                  DateTime.Parse(p.sitemapattribute.LastModified, New CultureInfo("en-US"))))
    Next

End Sub

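Now add a route for sitemap.xml in your RouteConfig.vb (remember, this article is based around ASP.NET MVC). Register it ahead of any more general route, such as the default one, so that the general route does not match the request first: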

'This is to overwrite the sitemap request
routes.MapRoute( _
name:="SiteMap", _
url:="sitemap.xml", _
defaults:=New With {.controller = "SiteMap", .action = "Index"})

Set the controller and action to wherever you're going to put your method; I decided to put mine in a new controller. Finally, create your action method. I've decorated mine with the OutputCache attribute and set it to 6 hours (21600 seconds). Because the cache varies by the ClearCache query string parameter, requesting the sitemap with a new ClearCache value forces a fresh copy to be generated:

''' <summary>
''' Returns the site map
''' </summary>
<OutputCache(Duration:=21600, VaryByParam:="ClearCache", Location:=OutputCacheLocation.Server)>
Function Index() As ActionResult

    'Create our site map
    Dim p As New SiteMapGenerator("http://www.jambr.co.uk")

    'Load any methods which are tagged with the attribute
    p.LoadFromAttribute()

    'Return the content, note that the MIME type uses a forward slash
    Return Content(p.ToString, "text/xml")

End Function

Something to note here is that I have created a ToString method which takes the XmlDocument and outputs it as a UTF-8 string. UTF-8 is important: a plain StringWriter reports its encoding as UTF-16, and that is what ends up in the XML declaration, so there is another class in the source code which creates a UTF-8 based string writer.
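
For anyone who doesn't want to dig through the download, a UTF-8 string writer can be as small as the sketch below. This is my guess at the general shape rather than a copy of the class in the source code: the names Utf8StringWriter and DocumentToUtf8String are hypothetical, and the module around them is only there to make the example runnable.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

' Hypothetical helper: a StringWriter that reports UTF-8 instead of UTF-16.
' XmlDocument.Save takes the encoding for the XML declaration from the writer.
Public Class Utf8StringWriter
    Inherits StringWriter

    Public Overrides ReadOnly Property Encoding As Encoding
        Get
            ' Fully qualified to avoid any confusion with the property name
            Return System.Text.Encoding.UTF8
        End Get
    End Property
End Class

Module SerialisationDemo

    ' Assumed equivalent of the generator's ToString: save the document
    ' through the UTF-8 writer and return the resulting text
    Function DocumentToUtf8String(ByVal document As XmlDocument) As String
        Using writer As New Utf8StringWriter()
            document.Save(writer)
            Return writer.ToString()
        End Using
    End Function

    Sub Main()
        Dim document As New XmlDocument()
        document.AppendChild(document.CreateXmlDeclaration("1.0", Nothing, Nothing))
        document.AppendChild(document.CreateElement("urlset", "http://www.sitemaps.org/schemas/sitemap/0.9"))

        ' The declaration in the printed XML should name utf-8 as the encoding
        Console.WriteLine(DocumentToUtf8String(document))
    End Sub

End Module
```

The string held in memory is still a normal .NET string; the override only changes which encoding name is written into the declaration, so it matches what the response actually sends.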

Conclusion

I hope this article has shown you a clean way to implement a dynamic site map in .NET MVC using flexible attributes. Full source code can be downloaded from here, and if you want to see my sitemap, check it here. As usual, if you have any questions, please drop me a comment!