SEO on dynamic page types

We have a site that uses a page type with a custom page type controller. It produces a product detail page rendered from a passed-in sku parameter. Only one page exists in the sitemap, e.g. example.com/product_detail, but we pass in the sku to display the product we want, e.g. example.com/product_detail/123-mysku.

The pages do not exist until viewed, so they are not being crawled for SEO. The page type (product_detail) is set to allow full page caching, which I presumed would pre-render a static page when a product is viewed and allow the pages to be crawled. This doesn’t seem to be happening.

I am also seeing a cache status saying the page is not available in the full page cache, even though we have it set to cache. Am I missing something?

[Screenshot attached (2023-08-14) showing the cache status]

What would typically be the case is that you have links on other pages to each of your product detail pages. Crawlers would then pick up each individual product detail page.

If that is not the case, you can create a product index page that contains nothing but links to your product detail pages. For the product index page, enable the page attribute “Exclude from Nav”. For Google, be sure to generate your sitemap (sitemap.xml) and verify that your product index page is included in sitemap.xml. The crawler should pick up your product index page, then all of your product detail pages.

I was thinking of using this approach, but I read that it could be considered link farming and could hurt SEO. I’m not sure whether search engines would treat an index of links within the same site as legitimate, or whether the “link farming” issue only applies when you are linking to external sites. I think if I categorize the links it should be OK.

Link farming is not an issue inside a domain. It can only become an issue between domains.

I wasn’t thinking about link farming, but thanks for the clarification JohntheFish.

Thanks, it’s hard to keep up with what Google will penalize you for with regard to SEO. I’ll have several thousand links, so I’ll split them across a few smaller index pages, include those in the sitemap.xml build, and hide them from search and nav.
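The splitting itself is easy to script. A minimal sketch (plain Python, nothing Concrete CMS specific; the URL list, chunk size, and counts are just illustrative):

```python
def chunk(urls, size):
    """Split a flat list of product URLs into groups of at most `size`,
    one group per index page."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

# Hypothetical example: 2,500 products split into pages of 1,000 links each.
product_urls = [f"https://www.example.com/product_detail/{n}" for n in range(2500)]
pages = chunk(product_urls, 1000)
# Three index pages: 1,000 + 1,000 + 500 links.
```

Each resulting group becomes the link list for one index page.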

If you’re expecting that page to output different content based on the URL argument or query string, then full page caching should probably be turned off. Having it on may cause you problems with the same content appearing for different products.

With regard to listing all the products, I’ve done something similar on another project. Essentially I have a dashboard job/task that generates an XML sitemap just for the products, using custom logic. I also generate the traditional XML sitemap through the standard dashboard job. Here’s the clever bit: the XML sitemap spec allows you to specify a sitemap index, which includes references to both your standard sitemap and the custom one with the products. The final piece of the puzzle is to reference the sitemap index from your robots.txt file so that Google will use it (and register it in webmaster tools, of course).

e.g.

sitemap_index.xml:

<?xml version="1.0"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.example.com/sitemap_products.xml</loc><lastmod>2023-08-16T23:05:09+12:00</lastmod></sitemap>
  <sitemap><loc>https://www.example.com/sitemap.xml</loc><lastmod>2023-08-16T23:05:06+12:00</lastmod></sitemap>
</sitemapindex>

robots.txt:

Sitemap: https://www.example.com/sitemap_index.xml
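If you want to script generating the two files above, here’s a minimal sketch. It’s plain Python rather than Concrete CMS dashboard-job code, and the file names and URL list are placeholders for whatever your custom logic produces:

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def write_products_sitemap(product_urls, path="sitemap_products.xml"):
    # One <url><loc>...</loc></url> entry per product detail page.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in product_urls:
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url
    ET.ElementTree(urlset).write(path, encoding="UTF-8", xml_declaration=True)

def write_sitemap_index(sitemap_urls, path="sitemap_index.xml"):
    # References each child sitemap, with a <lastmod> timestamp.
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    for url in sitemap_urls:
        sm = ET.SubElement(index, "sitemap")
        ET.SubElement(sm, "loc").text = url
        ET.SubElement(sm, "lastmod").text = now
    ET.ElementTree(index).write(path, encoding="UTF-8", xml_declaration=True)

# Hypothetical usage:
write_products_sitemap(["https://www.example.com/product_detail/123-mysku"])
write_sitemap_index([
    "https://www.example.com/sitemap_products.xml",
    "https://www.example.com/sitemap.xml",
])
```

The same structure also lets you split the products across several child sitemaps if you ever hit the sitemap spec’s 50,000-URLs-per-file limit.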

Thanks, that’s a nice solution.