What to Do When Spammers Use Your Internal Site Search for Evil

20241009 -- What to Do When Spammers Use Your Internal Site Search for Evil -- Rexly

Nearly every site has an internal site search engine that searches only that site, and it’s rarely on anyone’s list of security risks. Internal site searches are also typically non-entities for search engine optimization. After all, search engine crawlers generally don’t use search bars. There is, however, a way for spammers to hijack your internal site search to create spam.

A Real-Life Example of Internal Site Search Spam

Here’s how internal site search spam works: A spammer uses your internal site search URL string to “create” low-value pages that contain keywords and URLs. Then, they create a third-party page that links to those low-value internal site search URLs on your site and use Google Search Console to request indexation of the page. That prompts Google to crawl from the third-party page to your low-value internal site URLs, getting them discovered and potentially indexed.

Too confusing? Here’s an example.

One of our clients, we’ll call them Client-A, had an internal site search URL string that looked like this: https://www.client-a.com/search?q=search-term. The spam was generated using this scenario:

  1. Spammer-B identifies the internal site search results page URL on https://www.client-a.com.
  2. Spammer-B creates a page on its own spam site, for example, https://www.spammer-b.com/some-crummy-page.
  3. That page (https://www.spammer-b.com/some-crummy-page) links to a bunch of Client-A’s internal site search URLs. For example, perhaps they do a search for the query “free viagra www.spam-site-c.com” in the internal site search, and it generates a zero-results internal search results page at this URL: https://www.client-a.com/search?q=free-viagra-www.spam-site-c.com. The spammers grab that URL and create a link to it on https://www.spammer-b.com/some-crummy-page.
  4. Spammer-B requests that Google index https://www.spammer-b.com/some-crummy-page in Google Search Console.
  5. Google crawls https://www.spammer-b.com/some-crummy-page and discovers the links to Client-A’s internal site search pages, like https://www.client-a.com/search?q=free-viagra-www.spam-site-c.com.

Why would spammers do this? The theory is that they are creating mentions of a URL and a keyword on a domain that has authority, some of which would then transfer to Spammer-B’s domain.

It’s incredibly unlikely that this would actually result in a transfer of value to the spammer’s site, but that doesn’t stop them from trying. What it does do is create a whole mess of low-value, zero-result internal site search URLs for Google to crawl, wasting your crawl budget.

To see if you have this problem today, look for your internal site search URLs in your Google Search Console Pages “Discovered – currently not indexed” report.
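If you have a list of URLs to review, a short script can help pick out the internal search URLs that look like spam. The sketch below is only a heuristic under stated assumptions: it assumes the search path is /search and the query parameter is q (as in the Client-A example), and it flags any search URL whose query contains a domain-like token. The function names and the pattern are illustrative, not part of any Google tool.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumptions taken from the Client-A example URL; adjust for your own site.
SEARCH_PATH = "/search"
QUERY_PARAM = "q"

# Rough pattern for a domain-like token such as "www.spam-site-c.com".
DOMAIN_LIKE = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)


def is_suspicious_search_url(url: str) -> bool:
    """Return True for an internal search URL whose query embeds a domain."""
    parts = urlsplit(url)
    if parts.path.rstrip("/") != SEARCH_PATH:
        return False  # not an internal site search URL
    terms = parse_qs(parts.query).get(QUERY_PARAM, [])
    return any(DOMAIN_LIKE.search(term) for term in terms)


def find_suspicious(urls):
    """Filter a list of URLs down to the suspicious internal search URLs."""
    return [url for url in urls if is_suspicious_search_url(url)]
```

A heuristic like this will miss spam that doesn’t embed a domain and may flag the occasional legitimate query, so treat its output as a starting point for manual review.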

How to Prevent Internal Site Search Spam

Better yet, before internal site search spam becomes an issue, take evasive action.

There are two ways to prevent internal site search spam from taking hold on your site. You can choose to block internal site search results from being indexed or from being crawled. 

Block Indexing

Of the two choices, blocking indexation is the better option because it ensures that these internal site search results pages won’t be indexed by Google and won’t appear in search results. Simply use a meta robots tag in the head of the page with noindex as the value of its content attribute to effectively prevent indexation of the page. The line of code looks like this:

<meta name="robots" content="noindex">
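To confirm that the tag actually made it onto your search results template, you can check the rendered HTML. Here is a minimal sketch using Python’s standard html.parser module; the class and function names are made up for illustration, and it only inspects the meta robots tag in whatever HTML string you pass in.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # The content value can hold several comma-separated directives.
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a meta robots noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

For example, has_noindex('<head><meta name="robots" content="noindex"></head>') evaluates to True, while a page without the tag evaluates to False.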

Block Crawling

Unlike using a meta robots noindex tag, choosing to block crawling using a disallow command in the robots.txt file doesn’t prevent Google from indexing internal site search results pages. It only requests that bots not crawl the pages indicated.
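As an illustration, here is what a disallow rule for the Client-A example might look like, along with a quick way to test it. This is a sketch under assumptions: the robots.txt content is hypothetical, the /search path comes from the example URL, and the check uses Python’s standard urllib.robotparser, which does simple prefix matching and may not mirror every detail of how Googlebot interprets rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for Client-A; /search is the path from the example URL.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_crawl_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the sample robots.txt lets user_agent crawl the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

Because the rule is a prefix match, Disallow: /search would also cover any other path that begins with /search, so check that you aren’t blocking pages you want crawled.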

It’s important to note, however, that if Google has already associated internal site search spam with your site, just blocking crawling won’t stop it from being indexed. You need to first noindex the affected pages. After they have been deindexed, then you can disallow them with the robots.txt file to save crawl budget.

Preventing internal site search spam is a simple yet crucial step in SEO and site security. It prevents bad actors from manipulating your internal site search results for nefarious purposes and protects your crawl budget from being wasted in the process. Take the time to check your Google Search Console reporting today for evidence of this type of spam and, if you find any, take steps to remove it.
