This step-by-step guide shows you how to find out which pages of your website are indexed on Google.
Requirements:
- Screaming Frog (Download Link)
- Access to the website's Google Search Console (GSC) account
Notes:
- You can check the index status of only 2,000 pages per day because of the GSC API limit.
- The 2,000-page API quota renews every day.
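To plan around the quota, it helps to estimate how many days a full check will take. The sketch below is only an illustration of that arithmetic; the function name `days_needed` and the constant `DAILY_QUOTA` are hypothetical names, and the 2,000 figure is taken from the notes above.

```python
import math

# Daily URL inspection limit, as stated in the notes above.
DAILY_QUOTA = 2000


def days_needed(total_urls: int, daily_quota: int = DAILY_QUOTA) -> int:
    """Return how many daily quota windows are needed to inspect every URL."""
    if total_urls <= 0:
        return 0
    # Round up: a partly used day still counts as a full day.
    return math.ceil(total_urls / daily_quota)


print(days_needed(1500))  # fits in one day
print(days_needed(4500))  # needs three days
```

For example, a site with 4,500 URLs would need three daily runs of 2,000, 2,000, and 500 pages.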
Steps to Connect GSC API with Screaming Frog
- Open Screaming Frog
- Click on Configuration > API Access > Google Search Console
- A new Google Search Console window will pop up.
- Click on Connect to New Account and sign in with the email account that has access to the website's GSC account.
- After you choose the account, a confirmation message, “Received verification code. You may now close this window.”, will appear in the browser.
- Now go back to Screaming Frog. You will find that the account is successfully connected.
- From the “Available Properties” list, choose the website whose pages you want to check.
- In the same window, click on the “URL Inspection” tab, select the “Enable URL Inspection” option, and click on the OK button at the bottom.
- The configuration of the GSC API with Screaming Frog is complete.
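For context, Screaming Frog handles the API calls for you once the steps above are done, so no code is required. The sketch below only illustrates the shape of a request to Google's URL Inspection API, which takes the page URL and the property URL. How Screaming Frog builds its requests internally is not covered here; the helper name `build_inspection_request` and the example.com URLs are hypothetical, and nothing is sent over the network.

```python
import json

# Public endpoint of Google's URL Inspection API (requests must be authorized).
INSPECT_ENDPOINT = "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect"


def build_inspection_request(page_url: str, property_url: str) -> dict:
    """Build the JSON body for one URL inspection request (not sent here)."""
    return {
        "inspectionUrl": page_url,  # the page whose index status is checked
        "siteUrl": property_url,    # the GSC property the page belongs to
    }


payload = build_inspection_request("https://example.com/blog/", "https://example.com/")
print(INSPECT_ENDPOINT)
print(json.dumps(payload, indent=2))
```

Each inspected page is one such request, which is why the daily quota is counted in pages.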
Screaming Frog Crawling Methods
After the setup, we have two methods to crawl the website:
- Full Site Crawl
- List Mode Crawl
1. Full Site Crawl
If your website has 2,000 URLs or fewer, you can get the index status data for all of its pages in a single crawl. Otherwise, the crawl will stop partway through with an error message saying that the 2,000-page API quota is exhausted.
2. List Mode Crawl
You can crawl a list of up to 2,000 pages of a particular website and get each page's index status data.
To set up the list mode crawl, you just have to make the following changes on Screaming Frog:
- On the menu bar, click on Mode > List
- The Spider mode at the top will change to Upload Mode (List Mode), where you can upload the list of URLs using one of four methods.
- In my case, I selected the “Enter Manually” option.
- Now paste the URL list in the new “URL List” window and click Next. Once Screaming Frog has read the URLs, click on “OK”.
- Screaming Frog will start crawling the pages.
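If your list is longer than the daily quota, you can split it into batches before pasting it into the “URL List” window. The following is a minimal sketch, assuming the URLs are already available as a Python list; `batch_urls` is a hypothetical helper and the example.com addresses are placeholders.

```python
def batch_urls(urls, batch_size=2000):
    """Deduplicate URLs (keeping their order) and split them into batches."""
    seen = set()
    unique = []
    for url in urls:
        url = url.strip()
        # Skip blank lines and URLs that were already listed.
        if url and url not in seen:
            seen.add(url)
            unique.append(url)
    return [unique[i:i + batch_size] for i in range(0, len(unique), batch_size)]


# Placeholder list of 4,500 URLs for illustration.
sample = [f"https://example.com/page-{n}" for n in range(4500)]
batches = batch_urls(sample)
print([len(batch) for batch in batches])  # [2000, 2000, 500]
```

Each batch can then be pasted into List Mode on a separate day so that no single crawl exceeds the quota.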
Pages Index Data (Search Console Tab)
- Click on the drop-down icon and select “Search Console” to review the crawl data.
- You will be taken to the Search Console tab in Screaming Frog, where you will find all the data related to the crawled pages.
Useful Insights to Review (Search Console Tab)
Review the following data points on the Search Console tab to understand how Google treats the website's URLs and how Google's crawlers interact with its pages.
Most important data points to review on the “All” filter:
- Summary
- Coverage
- Last Crawl
- Crawled As – Unknown user agent
- Crawl Allowed
- Indexing Allowed
- User-Declared Canonical
- Google-Selected Canonical
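If you export the Search Console tab to a CSV file, the data points above can also be summarized with a short script. This is a minimal sketch under assumptions: the column names `Address`, `Coverage`, `User-Declared Canonical`, and `Google-Selected Canonical` are assumed to match your export, and the sample rows are made up for illustration. Check the header row of your own file before relying on it.

```python
import csv
import io
from collections import Counter

# Made-up sample rows; column names are assumed to match the export.
SAMPLE_EXPORT = """Address,Coverage,User-Declared Canonical,Google-Selected Canonical
https://example.com/,Submitted and indexed,https://example.com/,https://example.com/
https://example.com/a,Crawled - currently not indexed,https://example.com/a,
https://example.com/b,Submitted and indexed,https://example.com/b,https://example.com/c
"""


def summarize(csv_text):
    """Count pages per coverage status and list canonical mismatches."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    coverage_counts = Counter(row["Coverage"] for row in rows)
    # A mismatch means Google chose a different canonical than the page declares.
    mismatches = [
        row["Address"]
        for row in rows
        if row["User-Declared Canonical"]
        and row["Google-Selected Canonical"]
        and row["User-Declared Canonical"] != row["Google-Selected Canonical"]
    ]
    return coverage_counts, mismatches


counts, mismatches = summarize(SAMPLE_EXPORT)
for status, total in counts.items():
    print(f"{status}: {total}")
print("Canonical mismatches:", mismatches)
```

To use it on a real export, read the file contents into a string and pass that to `summarize` instead of `SAMPLE_EXPORT`.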
That’s all, folks!
If you run into any difficulties connecting the GSC API with Screaming Frog, please feel free to reach out to me.