Hi there. Google seems to have a problem indexing with Dynamic Websites. My site, Greatweekend.com, is an event discovery platform. It has over 500,000 events (all in the US). There is no static page per event. When someone comes to Greatweekend.com, we geolocate the IP and dynamically create a page (list of upcoming events) for that location.
The problem we see is that Google keeps indexing the same pages (depending on where its crawl servers are located). I've tried adding a sitemap.xml, but Google still crawls the same pages over and over, depending on its server's IP location.
I was hoping someone else here has some experience with this problem and may even be willing to share the solution.
Thank you in advance,
The problem is that you're trying to serve geo-specific content at the same URLs, which breaks URL semantics. The "R" in URL stands for Resource, and in your case the city is the resource; at least when I searched, it was city-specific content that I got. If the city changes based on the user's location, then the resource is changing, and that is what breaks the semantics.
There are two ways to deal with this: subdirectories and subdomains.
You can redirect the user to a subdirectory (note that this directory does not need to exist on a filesystem anywhere; it's purely dynamic in nature), e.g.
http://greatweekend.com/us/ca/[city name here or some other resource type]
Note, you'll hit some gotchas with cities that share a name, etc., but this way Google's index will see each URL as unique content, which it is. Yelp is an example of subdirectories: if I visit Yelp at their main domain I get redirected from
Craigslist uses subdomains: if I go to their main website I get redirected to
The key thing to remember is that a web resource should remain fixed at a single URL. The method you choose has some ramifications, but as you can see from Yelp and Craigslist, both methods are proven to work well. I've used both, and the subdomain route is a bit more work to get right.
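To make the subdirectory idea concrete, here's a minimal sketch of the flow described above: geolocate the visitor once on the bare domain, then 301-redirect to a fixed, crawlable URL per city. All names here (slugify, canonical_path, handle_root_request, the lookup_city resolver) are hypothetical illustrations, not Greatweekend's actual code.

```python
import re

def slugify(name: str) -> str:
    """Lowercase a city name and replace runs of non-alphanumerics with dashes."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

def canonical_path(state: str, city: str) -> str:
    """Build the fixed URL path for a city, e.g. /us/ca/san-francisco."""
    return f"/us/{state.lower()}/{slugify(city)}"

def handle_root_request(client_ip: str, lookup_city) -> tuple:
    """Handle a request for the bare domain: geolocate, then 301-redirect
    to the city's fixed URL. `lookup_city` is a hypothetical IP ->
    (state, city) resolver supplied by whatever geo database you use."""
    state, city = lookup_city(client_ip)
    return 301, canonical_path(state, city)

# Example with a stubbed resolver that always returns San Francisco:
status, location = handle_root_request("203.0.113.7",
                                       lambda ip: ("CA", "San Francisco"))
```

The point is that geolocation only decides *where to redirect*; every city page itself lives at one stable URL that the crawler can index independently of its own IP.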
See the following links:
How about adding a parameter to your URLs for geolocation instead of relying on the IP?
You could then add those URLs to your sitemap so the Google crawler can work through them.
Thank you everyone for your answers.
A number of answers suggested parameterizing the URL or creating subdirectories or subdomains.
Subdomains are not realistic for us because we search by zip code, whereas most websites using subdomains (Craigslist was mentioned) search by markets.
Currently we use a directory-based approach. Our URLs have the following format:
Is this not the same as the directory approach that's been suggested?
What happens now (the problem) is that Google's servers are, say, in SomeCity, Utah. When that server crawls our site, the request is always geolocated to SomeCity, Utah, and we only serve back events for that location.
What I would like to happen is for Google to crawl all of our (dynamically generated) pages instead of only the home page. I tried putting them in a sitemap.xml file, but Google keeps coming back and crawling the home page over and over again instead of hitting the pages listed in the sitemap.xml.
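One thing worth double-checking is that the sitemap actually enumerates every city page as a full URL, since the crawler can only follow what's listed. Here's a small sketch (using only the Python standard library) of generating such a sitemap; the base URL and paths are illustrative placeholders, not Greatweekend's real URL scheme.

```python
from xml.etree import ElementTree as ET

def build_sitemap(base_url: str, paths: list) -> str:
    """Build a sitemaps.org-format urlset listing one <loc> per page path."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for path in paths:
        url = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url, "loc")
        loc.text = base_url + path
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical city pages; in practice you'd iterate over every location
# your site can serve.
xml = build_sitemap("http://greatweekend.com",
                    ["/us/ca/san-francisco", "/us/ut/some-city"])
```

If the home page geolocates and the sitemap URLs also geolocate on fetch, the crawler will still see the same content everywhere; the listed URLs need to return their own fixed content regardless of the requester's IP.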
And THANK YOU again, everyone.
You might find some ideas and recommendations here https://thecontentworks.uk/dynamic-pages-seo-friendly/
Use this: https://prerender.io/
I looked at all the options, including writing my own code, but this was dead simple and it works.
Google's spiders are not designed to look for 100% dynamically generated content. An index, by definition, refers to a fixed point. If you want your content indexed, you will need a static version of it, even if it's all on one page, that Google can refer to.
Make sure your content is listed in your sitemap, that your sitemap is submitted to Google Search Console, and that you are using canonical tags for that content, since it's typically treated as similar/duplicate content. Here's an article about that: https://moz.com/learn/seo/canonicalization
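For concreteness, a canonical tag is just a one-line element in each page's head that tells the crawler which URL is the "real" one for near-duplicate listings. A tiny illustrative helper (the URL shown is a hypothetical example page, not an actual Greatweekend URL):

```python
from html import escape

def canonical_tag(url: str) -> str:
    """Return the <link rel="canonical"> element to place in the page <head>,
    pointing all duplicate/geolocated variants at one indexable URL."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'

tag = canonical_tag("http://greatweekend.com/us/ca/san-francisco")
```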