For the time being… I’ve just blocked all of OpenAI(s) Bots. They ( _thankfully_) publish a JSON endpoint that you can use to block all OpenAI crawlers from reaching your server ( _in my case, blocking it at the edge_). Example:
proxy-1:~# curl -qs https://openai.com/gptbot.json | jq -r '.prefixes[].ipv4Prefix' | xargs -I{} ./block-ip.sh {}
Where ... ⌘ Read more