As someone who did a lot of work on early spam fighting only to see it replaced by things like DKIM, I wonder if we are going to start having the "taxi medallion" style approach but for people connecting to your site.
e.g. IA will send out HTTPS requests signed with their key so that you, as the site owner, can confirm that a request is indeed from them and not from AI.
Feels like that would be very anti-open-internet, but I'm not sure how else you would prove who is a good actor vs. not (from your perspective, that is).
I'll tell you what I expect to see from crawlers and agents, and what I'm enforcing on everybody who doesn't look distinctly human:
* Reverse DNS which points to a web site which has a discoverable / well-known page which clearly describes their behavior.
* Some sort of reverse-IP-based, RBL- and SPF-inspired TXT records which describe who, what, when, why, how, and how often, so that I can make automated decisions based on them.
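For illustration only, here is a rough Python sketch of what an automated decision along those lines might look like. It is not anybody's actual implementation. The TXT record format (`v=crawl1; who=...; why=...; rate=...`), the function names, and the allow/challenge/block outcomes are all made up for the example, since no such standard record exists as far as the thread says. The reverse DNS part is the usual forward-confirmed check: look up the PTR name, then make sure that name resolves back to the same IP.

```python
import socket
from typing import Callable, Dict, List, Optional

# Resolver hooks are injectable so the logic can be exercised without network access.
Reverse = Callable[[str], str]
Forward = Callable[[str], List[str]]


def forward_confirmed_rdns(ip: str,
                           reverse: Optional[Reverse] = None,
                           forward: Optional[Forward] = None) -> Optional[str]:
    """Return the PTR hostname if it resolves back to `ip`, else None."""
    # Defaults use the stdlib resolver; gethostbyname_ex is IPv4-only.
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip)
        addrs = forward(host)
    except OSError:  # socket.herror and socket.gaierror are both OSError subclasses
        return None
    return host if ip in addrs else None


def parse_crawler_txt(record: str) -> Dict[str, str]:
    """Parse a HYPOTHETICAL SPF-like record: 'v=crawl1; who=...; why=...; rate=...'."""
    fields: Dict[str, str] = {}
    for part in record.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key:
            fields[key.strip().lower()] = value.strip()
    return fields


def decide(ip: str, txt_record: Optional[str],
           reverse: Optional[Reverse] = None,
           forward: Optional[Forward] = None) -> str:
    """Return 'allow', 'challenge', or 'block' (an assumed three-way policy)."""
    if forward_confirmed_rdns(ip, reverse, forward) is None:
        return "block"  # no verifiable identity at all
    fields = parse_crawler_txt(txt_record or "")
    required = {"who", "why", "rate"}  # assumed minimum self-description
    if fields.get("v") != "crawl1" or not required <= fields.keys():
        return "challenge"  # identified, but does not describe its behavior
    return "allow"


if __name__ == "__main__":
    # Stub resolvers standing in for real DNS; 192.0.2.0/24 is a documentation range.
    fake_reverse = lambda addr: "bot.example.org"
    fake_forward = lambda host: ["192.0.2.10"]
    record = "v=crawl1; who=example-bot; why=search-index; rate=1/10s"
    print(decide("192.0.2.10", record, fake_reverse, fake_forward))
    print(decide("192.0.2.10", None, fake_reverse, fake_forward))
    print(decide("192.0.2.99", record, fake_reverse, fake_forward))
```

In a real deployment the TXT lookup itself would need a DNS library, since the Python standard library has no TXT query call; here the record is simply passed in as a string.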
Yah, I don't have a lot of crawlers that I welcome... but I'm building a pretty good database of the worst offenders. At scale... there are advantages to scale which work in my favor, actually.
I documented this at the end of a blog post when I made blocking Amazon incoming requests a default policy several years ago.