We have paused all crawling as of Feb 6th, 2025 until we implement robots.txt support. Stats will not update during this period.
lol FediDB isn’t a crawler, though. It makes API calls.
They do have a dedicated “Crawler” page.
And on that page they do mention using a website crawler for their Developer Tools and Network features.
Maybe the definition of the term “crawler” has changed, but crawling used to mean downloading a web page, parsing its links, then downloading all those linked pages, parsing those, and so on until the whole site has been downloaded. If links to other sites were found in that corpus, the same process repeats for those sites. Obviously this could cause heavy load, hence robots.txt.
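For what it's worth, here's a minimal sketch of that traditional sense of crawling, with a robots.txt check bolted on. The seed URL and user-agent string are placeholders, not anything FediDB actually uses:

```python
# Fetch a page, extract its links, fetch those, and so on -- checking
# robots.txt before each request. Stdlib only; same-site links only.
from collections import deque
from html.parser import HTMLParser
from urllib import request, robotparser
from urllib.parse import urljoin, urlparse

USER_AGENT = "example-crawler"  # hypothetical user agent


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, limit=50):
    # Fetch and parse the site's robots.txt once up front.
    robots = robotparser.RobotFileParser()
    robots.set_url(urljoin(seed, "/robots.txt"))
    robots.read()

    queue, seen = deque([seed]), {seed}
    while queue and len(seen) <= limit:
        url = queue.popleft()
        if not robots.can_fetch(USER_AGENT, url):
            continue  # robots.txt asks us not to fetch this path
        req = request.Request(url, headers={"User-Agent": USER_AGENT})
        try:
            with request.urlopen(req, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            nxt = urljoin(url, href)
            # Stay on the same site; following off-site links would just
            # repeat the same process on other hosts.
            if urlparse(nxt).netloc == urlparse(seed).netloc and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)


if __name__ == "__main__":
    crawl("https://example.com/")
```

Hitting a handful of well-defined API endpoints per instance, the way FediDB does, just isn't the same workload.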
FediDB isn’t doing anything like that, so I’m a bit bemused by this whole thing.