AI Bots Are Crawling Your Site: How to Control Search Indexing in 2025

Control Which AI Crawlers Index Your Website from cPanel

Author: Kaine
Date: 22 May, 2025

Learn how to manage GPT and AI bot access via robots.txt

With the rise of AI models like ChatGPT, Google Gemini, and Perplexity, more crawlers are accessing public websites to collect content for training datasets. If your site is hosted with Hosting Australia on cPanel, this guide explains how to control which bots can crawl your site, using tools built right into your hosting panel.

What Are AI Crawlers?

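AI crawlers work like traditional search engine bots, but instead of indexing pages for search results, they collect content to train language models or to answer questions inside AI products. The names below are the user-agent tokens you can target in robots.txt:

  • GPTBot (OpenAI; collects content for model training)
  • ChatGPT-User (OpenAI; fetches pages when a ChatGPT user asks for them)
  • Google-Extended (Google; a control token for Gemini training rather than a separate crawler)
  • CCBot (used by Common Crawl)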

These crawlers may pull large amounts of text from your public pages. If you prefer to control how your content is used, you can restrict access directly from your cPanel account.

Step 1: Access Your robots.txt File via cPanel

  1. Log in to cPanel.
  2. Scroll to the Files section and open File Manager.
  3. Navigate to the root folder of your domain (usually public_html).
  4. Check whether a file named robots.txt exists. If not, right-click anywhere, choose Create New File, and name it robots.txt.

Step 2: Block AI Crawlers in robots.txt

Edit the file and add the following:

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

This tells those specific bots not to crawl your website. Reputable bots will obey these instructions, though some third-party scrapers may not.
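
If the list of bots you want to block grows, typing each group by hand gets tedious. The following is a minimal Python sketch that generates the same kind of rules from a list of names. The function name build_block_rules is hypothetical, and the sketch assumes you want to block the whole site (Disallow: /) for every listed bot.

```python
def build_block_rules(agents):
    """Return robots.txt text that disallows the whole site for each agent.

    `agents` is a list of user-agent tokens, for example ["CCBot"].
    Each token gets its own group, separated by a blank line.
    """
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    # Paste the printed text into robots.txt in File Manager.
    print(build_block_rules(["ChatGPT-User", "CCBot", "Google-Extended"]))
```

The output can be pasted into robots.txt alongside any rules you already have for regular search engines.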

Once done, save the file and confirm it’s publicly accessible at:

https://yourdomain.com/robots.txt
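
To double-check that the rules mean what you expect, you can run them through Python's standard urllib.robotparser module. The sketch below is only an illustration: the sample rules are a stand-in for your own file's contents, the function name blocked_agents is hypothetical, and Python's parser may not match every crawler's own interpretation of robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Sample rules used for this check (an assumption: replace with the
# contents of your own robots.txt).
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="https://yourdomain.com/"):
    """Return the agents from `agents` that the rules forbid from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    agents = ["ChatGPT-User", "CCBot", "Google-Extended", "Googlebot"]
    # Googlebot has no matching group, so it should not be in the result.
    print(blocked_agents(ROBOTS_TXT, agents))
```

With the sample rules, the three named bots are reported as blocked while Googlebot is still allowed, which is the goal: regular search indexing is left alone.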

Step 3: Use Meta Tags for Page-Level Control (Optional)

For more granular control, you can add meta tags inside the <head> section of specific pages. For example:

<meta name="robots" content="noindex, nofollow">

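This prevents indexing and link following for a given page. Keep in mind that a robots meta tag applies to every crawler that honors it, including regular search engines, so the page will also drop out of search results. If you’re using WordPress, SEO plugins like Yoast or Rank Math let you control this per page or post.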
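
If you want to confirm which pages actually carry the tag, a small script can read the HTML and report the directives it finds. This is a rough sketch using Python's built-in html.parser; the class and function names are hypothetical, and it only looks at meta tags named "robots".

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute values can be None when an attribute has no value.
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.append(directive)


def robots_directives(html):
    """Return the list of robots directives found in an HTML string."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(robots_directives(page))
```

An empty list means the page has no robots meta tag, so compliant crawlers will treat it as indexable.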

Step 4: Monitor Bot Activity (Optional)

If you notice strange behavior, you can check visitor logs from cPanel:

  1. Go to Metrics > Raw Access or Visitors.
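  2. Look for user agents such as ChatGPT-User or CCBot, or anything else that looks suspicious. Google-Extended will not appear here, because it is a robots.txt token rather than a separate user agent.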
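
For a large log, counting matches is quicker than reading line by line. The sketch below assumes the downloaded Raw Access log uses the common Apache "combined" format, where the user agent is the last quoted field on each line. The token list and the function name count_ai_hits are illustrative assumptions; adjust them to the bots you care about.

```python
import re
from collections import Counter

# Substrings to look for in the user-agent field (an assumed starter list).
AI_AGENT_TOKENS = ("ChatGPT-User", "GPTBot", "CCBot")

# In the combined log format, the user agent is the last double-quoted field.
USER_AGENT_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_ai_hits(log_lines, tokens=AI_AGENT_TOKENS):
    """Count requests per token, matching case-insensitively on the user agent."""
    hits = Counter()
    for line in log_lines:
        match = USER_AGENT_FIELD.search(line)
        if not match:
            continue  # Line is not in the expected format; skip it.
        user_agent = match.group(1).lower()
        for token in tokens:
            if token.lower() in user_agent:
                hits[token] += 1
    return hits


if __name__ == "__main__":
    # "access.log" is a placeholder for the file downloaded from Raw Access.
    with open("access.log", encoding="utf-8", errors="replace") as log_file:
        for token, count in count_ai_hits(log_file).most_common():
            print(token, count)
```

A high count for a bot you have already disallowed suggests it is ignoring robots.txt, or that the rules were added after those requests were logged.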

Summary

AI crawlers are now a common part of web traffic. If you want to protect your content or reduce unwanted load, you can manage access through cPanel by editing your robots.txt file. This gives you control over which well-behaved bots can scan your site; as noted above, scrapers that ignore robots.txt will not be affected.

Need Help?

If you’re unsure how to implement this or need assistance with anything in cPanel, reach out to Hosting Australia’s support team. We’re happy to walk you through the process or make the changes for you.
