Admin on the slrpnk.net Lemmy instance.

He/Him or whatever you feel like.

XMPP: povoq@slrpnk.net

Avatar is an image of a baby octopus.

  • 14 Posts
  • 687 Comments
Joined 2 years ago
Cake day: September 19th, 2022


  • Matrix servers have the problem of highly variable resource use.

    Basically, if you only use it for some light chatting with friends and family and a few niche public rooms, it isn’t very heavy.

    But if any user of your homeserver joins any busy rooms or uses the bridges to join busy public Telegram channels or such, it will quickly outgrow the resources of a reasonably priced VPS.

    Personally, I would rather recommend setting up an XMPP server, which can include a gateway to Matrix and other services, but is architecturally much more lightweight and has better mobile clients.





  • Yeah, Forgejo and Gitea. I think it is partially a problem of insufficient caching on the side of these git forges that makes it especially bad, but in the end that is victim blaming 🫠

    Mlmym seems to be the target because it is mostly JavaScript-free and therefore easier to scrape, I think. But the other Lemmy frontends are also not well protected. Lemmy-ui doesn’t even make it easy to add a custom robots.txt; you have to manually override it in the reverse proxy.


  • It seems any somewhat easy-to-implement solution gets circumvented by them quickly. Some of the bots do respect robots.txt, though, if you explicitly add their self-reported user-agent (but they change it from time to time). This repo has a regularly updated list: https://github.com/ai-robots-txt/ai.robots.txt/

    In my experience, git forges are hit especially hard, and the only real solution I found is to put a login wall in front, which kinda sucks, especially for open-source projects you want to self-host.

    Oh and recently the mlmym (old reddit) frontend for Lemmy seems to have started attracting AI scraping as well. We had to turn it off on our instance because of that.
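
To illustrate the robots.txt approach mentioned above, here is a minimal sketch in Python that builds a robots.txt blocking a list of self-reported crawler user-agents and checks the result with the standard library parser. The two user-agent names are only a small illustrative sample (the linked repo keeps the maintained list), and the helper names `build_robots_txt` and `is_allowed` are made up for this example. It only helps against bots that actually honor robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Illustrative sample only; the maintained list lives in the ai.robots.txt repo.
AI_USER_AGENTS = ["GPTBot", "CCBot"]


def build_robots_txt(user_agents):
    """Return robots.txt text that disallows the whole site for each agent."""
    lines = [f"User-agent: {ua}" for ua in user_agents]
    # One Disallow rule applies to every User-agent line in the group above it.
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, user_agent, url="/"):
    """Check whether a compliant crawler with this user-agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots_txt = build_robots_txt(AI_USER_AGENTS)
print(robots_txt)
```

The generated text can then be served in place of the frontend's default robots.txt, for example from the reverse proxy. Since the bots change their user-agent from time to time, the list would need to be regenerated regularly.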