Getting indexed is essential: if Google doesn't index your site, it is pretty much invisible in search. Google finds a web page by crawling the web, then adds the crawled page to its index. It accomplishes this using a web spider called Googlebot.
You might be wondering what crawling and indexing actually are. Crawling is following hyperlinks on the web to discover content, and indexing is the process of storing those web pages in a vast database. To carry out crawling at scale, a piece of software called a web spider is used, and Google's web spider is Googlebot.
When you search for something, Google displays the relevant pages from its index. There are billions of pages, so Google's ranking algorithm does its best to sort them and show the best, most relevant pages first. Note that ranking and indexing are completely different things: ranking is winning the race, while indexing is just showing up for the race. To win the race, you must first show up.
Check if you're indexed in Google
To check whether or not your site is indexed, go to Google and search for site:yourwebsite.com
The number of results shown is roughly the number of pages Google has indexed. You can also check a specific URL by searching site:website.com/web-page-slug.
If the page is indexed, one result is shown as above; if it is not indexed, no results appear. If you use Google Search Console, the Coverage report provides more accurate insight into the index status of your website.
Go to Google Search Console > Index > Coverage
There you can see the number of valid and invalid pages. If no pages of your site are valid, you have a serious problem. Google Search Console also lets you check whether an individual page is indexed: go to URL Inspection and paste the URL. If the page is indexed, it says "URL is on Google."
Get indexed by Google
If your website or web page is not indexed in Google, try the steps below.
• Go to Google Search Console and navigate to the URL Inspection tool.
• Now, paste the URL you'd like indexed into the search bar and wait for Google's response.
• Finally, click the "Request indexing" button.
Requesting indexing after publishing something new tells Google that you've added fresh content to your website. However, this is not the only solution, as there can be various problems on your site that need diagnosis. Here is a list of common problems that need immediate attention.
Remove crawl blocks
If Google is unable to index your site, there could be a crawl block in your robots.txt file. You can check for this issue by going to yourdomain.com/robots.txt. If you see a directive telling Googlebot (or all crawlers) that it is not allowed to crawl any pages on your site, simply remove that directive to fix the issue.
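For reference, a full-site crawl block in robots.txt looks like one of the following. These are illustrative directives, not taken from any particular site:

```
# Blocks every crawler from the whole site:
User-agent: *
Disallow: /

# Blocks only Googlebot:
User-agent: Googlebot
Disallow: /
```

Deleting the relevant Disallow line (or the whole block) re-allows crawling.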
If Google isn't indexing a single page, the robots.txt file may again be the culprit. You can check using the URL Inspection tool in Google Search Console: click the Coverage section and look for "Crawl allowed?" If it says No, the page is blocked by robots.txt.
Remove unwanted noindex tags
You can keep some web pages private and tell Google not to index them. There are two ways to do this.
First method: meta tag
Pages with either of these meta tags won't be indexed by Google: <meta name="robots" content="noindex"> or <meta name="googlebot" content="noindex">.
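In context, these tags sit in the page's head. A minimal illustrative example:

```html
<head>
  <!-- Either tag tells Google not to index this page: -->
  <meta name="robots" content="noindex">
  <meta name="googlebot" content="noindex">
</head>
```

The "robots" name applies to all crawlers; "googlebot" targets Google specifically.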
You can find all the pages on your site with a noindex meta tag by running a crawl with Ahrefs' Site Audit and checking the Internal Pages report. Review the affected pages and remove the noindex meta tag from any page that should be indexed.
Second method: X-Robots-Tag
Crawlers also consider the X-Robots-Tag HTTP response header, which can be set using a server-side scripting language like PHP or in your server configuration. You can detect whether Google is blocked from indexing a page because of this header.
Go to Google Search Console, navigate to URL Inspection, and enter the URL. Check "Indexing allowed?" If it says No, you should look for this header on your site. Simply run a crawl in Ahrefs' Site Audit tool and use the "Robots information in HTTP header" filter in the Data Explorer.
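Outside of Search Console, you can also spot this header directly from the command line with curl. A minimal sketch, where the URL is a placeholder and the second command simulates a server response so the example is self-contained:

```shell
# Inspect a live page's response headers (example.com is a placeholder URL):
#   curl -sI https://example.com/page.html | grep -i '^x-robots-tag'

# Simulated response headers, illustrating what a blocked page returns:
printf 'HTTP/1.1 200 OK\nX-Robots-Tag: noindex, nofollow\nContent-Type: text/html\n' |
  grep -i '^x-robots-tag'
# prints: X-Robots-Tag: noindex, nofollow
```

If the grep prints a line containing noindex, that page cannot be indexed until the header is removed.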
Remove rogue canonical tags
A canonical tag looks something like <link rel="canonical" href="/page.html"> and tells Google which version of a page is preferred. A page with no canonical tag, or with a self-referencing canonical tag, tells Google that the page itself is the version you want indexed.
But if your page has a rogue canonical tag, it may be telling Google about a preferred version of the page that doesn't exist.
You can check the canonical using the URL Inspection tool. If the canonical points to another page, you will see an "Alternate page with canonical tag" warning.
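A correct self-referencing canonical, placed in the page's head, looks like this. The URL is a hypothetical example:

```html
<!-- This page declares itself as the preferred version to index: -->
<link rel="canonical" href="https://example.com/page.html">
```

Using an absolute URL here avoids ambiguity about which exact address Google should treat as canonical.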
Check for orphaned pages
Orphaned pages are pages with no internal links pointing to them. Google cannot discover an orphan page by crawling, since there is no link path leading to it, and it is not discoverable to website visitors either.
You can check for orphaned pages by crawling your site with Ahrefs' Site Audit and looking at each page's incoming links. A page with no incoming internal links is an orphan. Ahrefs' Site Audit will show all the pages that have no internal links pointing to them.
You can fix an orphan page either by deleting it, if it is unwanted, or by linking to it from your internal link structure, if it is important.
Add powerful internal links
Google's crawler may not be able to find your web pages if your site has poor internal linking. One obvious solution is to add internal links to those pages from any page Google can already crawl and index.
Google can index your pages more quickly if you create internal links from one of your more powerful pages. You can view your strongest pages with the "Best by links" report in Ahrefs' Site Explorer, which sorts pages by URL Rating, showing the most powerful pages first.
Choose a relevant page and add an internal link from it.
Fix nofollow internal links
A link with the rel="nofollow" attribute is called a nofollow link. It prevents the transfer of PageRank to the destination URL. All internal links to indexable pages should be followed.
You can check this with Ahrefs' Site Audit tool: look at the Incoming Links report for pages with nofollow incoming internal links.
You can either remove the nofollow attribute from the linking pages or, if the target page isn't worth keeping, delete it.
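For illustration, the fix is usually as simple as dropping the rel attribute. The link target below is hypothetical:

```html
<!-- Before: PageRank does not flow through this internal link -->
<a href="/important-page.html" rel="nofollow">Important page</a>

<!-- After: a normal, followed internal link -->
<a href="/important-page.html">Important page</a>
```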
Build quality backlinks
Backlinks provide valuable information to Google. If links come from authoritative pages, Google concludes that the linked pages hold some value and wants to index them.
This does not mean Google only indexes sites with backlinks; there are plenty of indexed pages without any. But Google treats pages with quality backlinks as more important and is more likely to crawl them.
Consider removing low-quality pages
Too many low-quality pages on your website only waste crawl budget. Google itself has stated that most publishers don't need to worry about crawl budget, because a site with relatively few URLs will usually be crawled efficiently.
It is never a bad thing to remove unwanted pages from your site if doing so has a positive impact on your crawl budget.