
TIL - Block OpenAI and Meta's LLM web crawlers

Tags: TIL, LLM, Development, Web
Daniel Andrlik
Author
Daniel Andrlik lives in the suburbs of Philadelphia. By day he manages product teams. The rest of the time he is a podcast host and producer, writer of speculative fiction, a rabid reader, and a programmer.

Thanks to this post from Adam Johnson, I’ve now updated my configuration to block OpenAI and Meta from crawling this website to feed their LLMs.

If you would like to do the same, you only need to add these entries to your robots.txt:

User-agent: GPTBot
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: GoogleOther
Disallow: /

User-agent: Google-Extended
Disallow: /
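
If you want to sanity-check the rules before deploying them, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The `is_allowed` helper and the `example.com` URL are hypothetical names used only for illustration, and the parser's matching logic is Python's own; real crawlers may interpret robots.txt slightly differently, and compliance is voluntary on their side.

```python
from urllib.robotparser import RobotFileParser

# The same entries as in the robots.txt above.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: GoogleOther
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def is_allowed(user_agent: str, url: str = "https://example.com/some-post/") -> bool:
    """Return True if these rules let `user_agent` fetch `url`.

    Hypothetical helper: the default URL is a placeholder, not a real page.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "FacebookBot", "GoogleOther", "Google-Extended", "Googlebot"):
        status = "allowed" if is_allowed(agent) else "blocked"
        print(f"{agent}: {status}")
```

With these rules the four listed agents should come back as blocked, while an agent that is not listed (such as the regular Googlebot search crawler) stays allowed, since there is no catch-all `User-agent: *` entry.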

Updated: Added GoogleOther, Google’s bot for experiments, to the block list. It may or may not be used for training Google Bard.

Updated: Added Google-Extended, the robots.txt token Google explicitly provides for opting out of AI training.
