ISMETA@lemmy.zip to Technology@beehaw.org • Sarah Silverman and other authors are suing OpenAI and Meta for copyright infringement, alleging that they're training their LLMs on books via Library Genesis and Z-Library · 2 years ago
GPT-3 is around 800GB, while the entirety of the English Wikipedia is around 10GB compressed. So yeah, it doesn't store every detail of everything, but LLMs do memorize a lot of things verbatim. Also see https://bair.berkeley.edu/blog/2020/12/20/lmmem/
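
For a rough sanity check on those sizes, here is a back-of-the-envelope sketch. It assumes GPT-3's published count of 175 billion parameters and takes the 10GB Wikipedia figure above at face value. The helper name `model_size_gb` and the choice of fp32/fp16 storage are my own assumptions for illustration, not how the weights are necessarily stored, so the result only needs to land in the same ballpark as the 800GB figure.

```python
# Back-of-the-envelope size check; the inputs are assumptions, not measurements.
GPT3_PARAMS = 175e9   # published GPT-3 parameter count
WIKIPEDIA_GB = 10     # compressed English Wikipedia size quoted in the comment


def model_size_gb(params: float, bytes_per_param: int) -> float:
    """Raw weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


# Compare two common storage widths (hypothetical choices for illustration).
for label, width in (("fp32", 4), ("fp16", 2)):
    size = model_size_gb(GPT3_PARAMS, width)
    print(f"{label}: {size:.0f} GB, about {size / WIKIPEDIA_GB:.0f}x Wikipedia")
```

Either way the weights come out at tens of times the size of compressed Wikipedia, which is the point: there is plenty of room to memorize some text verbatim, even if not all of it.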