
How and Why Search Engines Render Pages
Web Content: Rendering vs. Indexing
Websites and the technology used to create them are constantly evolving. In some cases, new tools let you build more efficient, higher-quality pages, usually at the cost of slightly higher resource requirements. Javascript is no exception: front-end and back-end scripting has become the norm for the everyday tasks of most modern websites, from analytics and retargeting to dynamically loaded content.
This progress makes a much wider range of tools available, with scripts executed both on the server and in the browser's rendering engine. On the other hand, where a website's content could once be understood from its hypertext markup and source code alone, the ubiquitous use of scripts makes that task much harder.
Even now, only a few search engines are able to correctly process Javascript and see web pages the way users do. All the rest are limited to the content of the HTML code, which creates additional inconsistency when comparing how a particular query is handled. To understand this a little better, let's look at a simple example: searching for text on a page.
What does render mean? Let's say you want to maintain a small database of invoices issued to your customers. Since the scope and structure of the task are quite simple, it is enough to use a solution like Ivory Search and load new data into the database through the application. The applet will make sure that the data is available to the users of your site. When a client enters their invoice number or selects the desired value from the drop-down list, they will see the corresponding page. But what is actually contained in the code of your page? Depending on the platform, the most common options are the following (a short sketch follows the list):
- An iframe
- An external script (<script>)
- An object (<object> or <embed>)
- A shortcode, for WordPress users
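To make this concrete, here is a rough sketch of what the delivered HTML for such an invoice lookup section might contain; the URLs, the widget ID, and the shortcode parameters are placeholders, and the exact shortcode syntax depends on the plugin and its version:

<!-- Option 1: the whole search widget is served from another origin -->
<iframe src="https://search.example.com/widget?site=yourshop.com" title="Invoice search"></iframe>

<!-- Option 2: an external script injects the search form and results at runtime -->
<script src="https://search.example.com/widget.js" async></script>

<!-- Option 3: an embedded object pointing at the widget -->
<object data="https://search.example.com/widget.html" type="text/html"></object>

<!-- Option 4 (WordPress): a shortcode that the plugin expands into markup like the above -->
[ivory-search id="123" title="Invoice search"]

In every case, the page itself carries almost none of the invoice data; it only references the place the data will come from.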
However, if you turn off support for embedded elements or Javascript, it is almost impossible to figure out what your page is about. Another good example is a store locator.
Instead of entering information about each location into your page, you use the service's interface to create a small database. All you have to do is add the shortcode to the desired section of the rendered pages. Take the Starbucks store locator as an example.
After that, when the script is executed, a search field, a map, and a list of suitable stores will appear on the page.
However, from a search bot's point of view, this content is not on the page until the script code is executed, which makes it extremely difficult to determine what the page is about.
In this case, if Javascript were disabled, the page would not be able to display its content in full. This problem is also common in e-commerce, because key elements of online stores, such as product pages or the ordering process, are generated dynamically, which inevitably affects the ranking results.
Quite often, external scripts are used for analytics or in an attempt to expand the possibilities for search engine optimization of the content. For example, when a part of a page or a product is displayed, reviews or other information are pulled in from Google or other sources. While such a decision can have a positive effect on the conversion rate of the page itself, there will most likely be no significant positive change from an SEO point of view.
There are several reasons for this. Although indirect signals such as purchases, bounce rate, and time on page may improve, the search bot cannot relate them to the content that users see, because from the bot's point of view the code does not contain the corresponding sections or elements.
Therefore, taking into account how modern web pages are processed, there are two main states between which rendering takes place. The first is the original HTML, and the second is the final form of the page.
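To make the two states concrete, here is a minimal sketch built on the invoice example above; the element IDs, file name, and invoice text are made up:

<!-- State 1: the initial HTML the crawler downloads -->
<body>
  <div id="app"></div>
  <script src="app.js"></script>
</body>

<!-- State 2: the rendered DOM after app.js has run; this is what indexing needs to see -->
<body>
  <div id="app">
    <h1>Invoice #1042</h1>
    <p>Issued on 2021-03-15, total: $120.00</p>
  </div>
  <script src="app.js"></script>
</body>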
When rendering a page, Google does not waste resources and time recreating the visual display. As a reminder, the main task is to figure out what the page is about, and for that it is enough to analyze its content and obtain the final structure of the web document for further processing. Martin Splitt from Google spoke about this in detail in his talk at TechSEO Boost in 2019.
The reason Google is so careful about computing resources is very simple: the constant growth in volume. According to Martin Splitt, about 160 trillion documents are processed every day, so resources must be allocated as deliberately as possible so that throughput does not suffer and as many requests as possible can be processed continuously and sequentially.
For indexing in Google, web pages are divided into two categories: those that require content to be rendered before indexing and those that do not. Martin Splitt described this process in more detail during the traditional Google Webmaster Central office hours on August 23, 2019.
This step evaluates how much of the content is generated by Javascript. A little more information on how the process works can be found in the talk by Tom Greenaway and John Mueller at the Google I/O conference in 2018.
If the presence of scripts does not affect the nature and logic of the content, the page is indexed in the normal queue. However, if external or internal scripts generate a significant part of the document, it is sent for two-stage indexing involving dedicated rendering engines. The two stages reflect the fact that the final structure of the page has to be obtained before indexing: the page is rendered first, and the resulting structure, not the original HTML, is then sent for further processing and indexing. Accordingly, pages in the two-stage queue are processed only when sufficient resources are available to continuously perform all the necessary computations. John Mueller has also touched on the queuing and prioritization of two-stage indexing on his Twitter.
Since the time between entering the indexing queue and exiting the pre-rendering queue can be anywhere from a few minutes to a week, John Mueller has repeatedly warned against overusing Javascript on key pages that are critical for effective search. While the site sits in the pre-render queue, changes or updates will not be taken into account, and you have no control over how quickly the updated information actually appears on the search results pages. This can be especially important for news outlets, online stores, and other e-commerce resources.
Best Javascript Optimization Practices for Search Engines
While dynamic content is more a matter of technical optimization, there are a few simple steps that can help your site avoid common indexing issues.
Allow indexing of external resources
Back in 2014, one of the posts on the Google Webmaster Central Blog mentioned the need to allow search bots access to all the resources necessary to display the page.
John Mueller also reminded us of this in his posts on Google+. The easiest way to achieve this is to add the following lines to your robots.txt file:
User-Agent: Googlebot
Allow: /*.js
Allow: /*.css
Use dynamic links correctly
Typically, Google follows the "a link is a link" principle and treats all relevant elements in a document according to their attributes.
However, the exception is links that the search bot finds inside script code. In this case, it will try to index the discovered URL but will automatically treat it as nofollow.
Don't forget to set the key attributes for dynamic content. For example, for an <a> link, the href value is required. Accordingly, a link with a relative page address, <a href="/site">, or one that also runs an event script, <a href="/site" onclick="goTo('site')">, will be recognized correctly. However, variations without a proper href attribute, such as <a onclick="goTo('site')"> or <a href="javascript:goTo('site')">, will not be recognized as links.
Place important content in the user’s field of vision
Gary Illyes and John Mueller have argued that the significance of a page element that is loaded but hidden from view using CSS or Javascript is difficult to assess.
This, in turn, means that hidden items may not be indexed and, therefore, are not counted in the ranking.
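For instance, a sketch like the following shows content that is present in the DOM but invisible until the user clicks a tab; the class name and copy are arbitrary:

<!-- The full specification is loaded with the page but hidden by CSS;
     it only becomes visible after the user clicks "Details" -->
<button onclick="document.querySelector('.specs').style.display = 'block'">Details</button>
<div class="specs" style="display: none;">
  <p>Weight: 1.2 kg, battery life: 10 hours, warranty: 2 years.</p>
</div>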
Use Only the Scripts You Need
Search engine optimization includes many elements, and one of the main tasks is to achieve optimal page load time. If analytics services, add-ons, and plugins are overused, the site may start to load slowly in the rendering engine despite all other efforts. Use Chrome Developer Tools, for example the Coverage panel, to identify any unused scripts.
Don’t forget about forwarding
Since Javascript is executed on the client-side, a script cannot return a server response code when a page is requested, for example, a 404 error if the page does not exist. However, you can set up a client-side redirect with the appropriate instructions: the script sends the user to a dedicated error page for which the server returns the correct error code. Alternatively, create a soft 404 with a noindex tag and an appropriate error message.
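A rough sketch of both options, assuming a hypothetical client-side product lookup; the API route, page URLs, element IDs, and function behavior are placeholders:

<script>
  // Sketch only: the API route and URLs below are placeholders.
  var productId = new URLSearchParams(window.location.search).get('id');

  fetch('/api/products/' + productId)
    .then(function (response) {
      if (!response.ok) {
        // Option 1: send the user to a URL that the server answers with a real 404 status
        window.location.href = '/not-found';
        return null;
      }
      return response.json();
    })
    .then(function (product) {
      if (product) {
        document.querySelector('#app').textContent = product.name;
      }
    });

  // Option 2: a "soft 404": keep the URL, but add a noindex tag and an error message.
  // Call this when the lookup fails and no redirect is wanted.
  function markAsSoft404() {
    var meta = document.createElement('meta');
    meta.name = 'robots';
    meta.content = 'noindex';
    document.head.appendChild(meta);
    document.body.innerHTML = '<p>Sorry, this page could not be found.</p>';
  }
</script>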
Laziness is the engine of progress
A technique known as lazy loading allows images or other content to be loaded only when the user needs it, which saves resources and time. One important rule is not to apply lazy loading to the main text content, to avoid problems with indexing the page. To activate it, it is enough to add the loading="lazy" attribute to the element. Also remember to load the scripts on your pages asynchronously; the easiest way to do this is to add the async or defer attribute to the script tag.
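For example (the file names are arbitrary):

<!-- A below-the-fold image: the browser fetches it only as it approaches the viewport -->
<img src="store-map.jpg" loading="lazy" alt="Map of store locations" width="800" height="450">

<!-- async: download in parallel and execute as soon as the script arrives (order is not guaranteed) -->
<script src="analytics.js" async></script>

<!-- defer: download in parallel, execute in document order after parsing has finished -->
<script src="app.js" defer></script>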
Delegate and Share
Time to First Byte (TTFB) and Time to Interactive (TTI) are among the main performance metrics affected by on-page scripts. If your script starts to exceed 50 or even 100 kilobytes in size, it may be worth splitting it into several smaller pieces. Many small scripts can often be downloaded and executed faster than one large one, especially when they can be loaded asynchronously.
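One common way to do this is to keep rarely used functionality in a separate module and pull it in with a dynamic import() only when it is needed; the file, element, and function names here are illustrative:

<script type="module">
  // The heavy charting code lives in its own file and is fetched only on demand.
  document.querySelector('#show-report').addEventListener('click', async () => {
    const { drawReport } = await import('./report-chart.js'); // separate, lazily loaded chunk
    drawReport(document.querySelector('#report'));
  });
</script>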
Consider the Features of Different Search Services
Despite the fact that Google confidently maintains its status as the industry leader, a number of other popular search services have their own peculiarities when processing and indexing pages that contain Javascript. For example, the Bing search bot is capable of recognizing and partially processing such scripts; however, for pages that use large amounts of Javascript, dynamic rendering is recommended for more accurate indexing. The Yahoo bot, on the other hand, does not support Javascript at all, and any content generated by scripts will remain hidden from it. In this situation, the best recommendation is to use the <noscript> tag.
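A minimal sketch of such a fallback, reusing the store locator example from earlier; the URLs and file name are made up:

<!-- The interactive widget renders here for visitors with Javascript enabled -->
<div id="store-locator"></div>
<script src="/js/store-locator.js" defer></script>

<!-- Static fallback for crawlers and browsers that do not execute Javascript -->
<noscript>
  <ul>
    <li><a href="/stores/downtown">Downtown store</a></li>
    <li><a href="/stores/airport">Airport store</a></li>
  </ul>
</noscript>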