
FabrizioCafolla/openai-crawlers-ip-ranges


OpenAI crawlers IP ranges


This repository collects complete, up-to-date lists of the IP addresses used by OpenAI's crawlers (Official doc).

Why this list? For some time I have been receiving many requests from user agents that identify themselves as OpenAI, and after some research on the web I realized that I am not the only one. One way to block a crawler is to disallow it in the robots.txt file, but apparently this is not enough to stop pages from being scanned.

There are two possible explanations:

  1. OpenAI ignores robots.txt and still crawls the pages (see discussion)
  2. Malicious bots impersonate the OpenAI user agent to scrape pages (see discussion)

Read the deep dive here to learn how to:

  • Block User-Agent

  • Block IPs

  • Block spam user-agents

    Configuration that blocks every request carrying an OpenAI user agent but coming from an unofficial IP

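    # 1 = request comes from an unofficial IP, 0 = request comes from an official OpenAI IP
    geo $allowedipaddr {
        default             1;
        20.42.10.176        0; # <-- Official OpenAI IPs
        172.203.190.128     0; # <-- Official OpenAI IPs
        51.8.102.0          0; # <-- Official OpenAI IPs
        ...
    }
    
    # OpenAI user agents take the value computed by geo; all other user agents are never blocked
    map $http_user_agent $block_spam_user_agent {
        '~*GPTBot'                 $allowedipaddr;
        '~*ChatGPT-User'           $allowedipaddr;
        '~*OAI-SearchBot'          $allowedipaddr;
        default                    0;
    }
    
    server {
        ...
        location / {
            ...
            # Reject requests that claim an OpenAI user agent but come from an unofficial IP
            if ($block_spam_user_agent) { return 403; }
        }
    }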

    To test the configuration in a local environment (with Docker):

    docker network create test-network
    docker run --name test-nginx --rm -d -p 80:80 --network test-network nginx:alpine
    docker run --name test-app --rm -d -it --network test-network python:alpine ash
    
    docker inspect test-nginx | grep IPAddress # Get IP Address: <nginx_ip_address>
    docker inspect test-app | grep IPAddress # Get IP Address: <app_ip_address>
    
    docker exec -it test-nginx ash
    # Now you are inside the nginx container
    $ vi /etc/nginx/conf.d/default.conf
    $ # ...Paste nginx configuration and add <app_ip_address> to "geo $allowedipaddr"
    $ nginx -s reload -c /etc/nginx/nginx.conf
    $ exit
    
    docker exec -it test-app ash
    $ apk add curl
    $ curl --user-agent "ChatGPT-User" <nginx_ip_address>
    $ # Response: nginx welcome page
    $ exit
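
The `geo` block in the configuration above lists each official address by hand. As a minimal sketch, assuming the official ranges have already been collected into a plain list of strings, a small Python helper could render that block instead. The function name `render_geo_block` and the `192.0.2.0/24` entry (a documentation-only range) are illustrative placeholders and are not part of this repository.

```python
import ipaddress


def render_geo_block(official_ranges, variable="allowedipaddr"):
    """Render an nginx `geo` block: 1 by default, 0 for each official range.

    `official_ranges` is a hypothetical input: a list of IP addresses or
    CIDR prefixes given as strings.
    """
    lines = [f"geo ${variable} {{", "    default 1;"]
    for entry in official_ranges:
        # ip_network accepts both single addresses and CIDR prefixes
        network = ipaddress.ip_network(entry, strict=False)
        if network.num_addresses == 1:
            # Write single hosts without the /32 (or /128) suffix
            value = str(network.network_address)
        else:
            value = str(network)
        lines.append(f"    {value} 0;")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # 192.0.2.0/24 is a placeholder documentation range, not an OpenAI range
    print(render_geo_block(["20.42.10.176", "192.0.2.0/24"]))
```

The output can be pasted into the nginx configuration in place of the hand-written `geo` block. Invalid entries raise `ValueError`, so a malformed address is caught before nginx is reloaded.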

All OpenAI IPs

ChatGPT User IP

  • IPs List
  • User Agent: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot

GPTBot IP

  • IPs List
  • User Agent: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot

SearchBot IP

  • IPs List
  • User Agent: OAI-SearchBot/1.0; +https://openai.com/searchbot
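
The user-agent tokens and IP lists above can also be checked outside nginx, for example when reviewing access logs. The following is a minimal sketch that mirrors the blocking rule of the nginx configuration: a request is flagged only when its user agent contains one of the OpenAI tokens and its address is outside the official ranges. The function name `should_block` and the addresses in the usage example are hypothetical placeholders (documentation-only ranges), not real OpenAI addresses.

```python
import ipaddress
import re

# Same tokens as the nginx `map` block, matched case-insensitively
OPENAI_UA_PATTERN = re.compile(r"GPTBot|ChatGPT-User|OAI-SearchBot", re.IGNORECASE)


def should_block(user_agent, remote_ip, official_ranges):
    """Return True when an OpenAI user agent comes from an unofficial IP."""
    if not OPENAI_UA_PATTERN.search(user_agent):
        # Other user agents are never blocked by this rule
        return False
    address = ipaddress.ip_address(remote_ip)
    networks = (ipaddress.ip_network(r, strict=False) for r in official_ranges)
    # Block only if the address is in none of the official ranges
    return not any(address in network for network in networks)


if __name__ == "__main__":
    ranges = ["192.0.2.0/24"]  # placeholder range for illustration only
    print(should_block("GPTBot/1.1", "198.51.100.7", ranges))
    print(should_block("GPTBot/1.1", "192.0.2.10", ranges))
```

Matching on the user agent first keeps the check cheap for ordinary traffic, since the IP comparison only runs for requests that claim to be an OpenAI crawler.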
