We explored an introduction to how large language models (LLMs) work, focusing on tokenization, vector representations, and the question of whether these models truly “compress” text as part of their process. We will continue our discussion of Anthropic next week.
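As a rough illustration of the tokenization point from last session, the sketch below shows greedy longest-match tokenization over a toy vocabulary. The vocabulary and text are invented for illustration; real LLM tokenizers (e.g. byte-pair encoding) learn their vocabularies from large corpora.

```python
# A minimal sketch of greedy longest-match tokenization. The vocabulary
# below is invented for illustration only; production tokenizers learn
# merges from data rather than using a hand-written table.
VOCAB = {"the": 0, " ": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5,
         "t": 6, "h": 7, "e": 8, "c": 9, "a": 10, "s": 11,
         "o": 12, "n": 13, "m": 14}

def tokenize(text, vocab):
    """Greedily match the longest vocabulary entry at each position."""
    ids = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest match first
            if text[i:j] in vocab:
                ids.append(vocab[text[i:j]])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids

text = "the cat sat on the mat"
ids = tokenize(text, VOCAB)
print(len(text), "characters ->", len(ids), "token ids")
# Each id identifies an exact substring, so the mapping is reversible
# (lossless): tokenization is an encoding, not lossy compression.
```

The point of the sketch is that a 22-character string becomes 11 token ids, a denser but fully reversible representation, which is part of why “compression” is a contested description of what these models do to text.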
Discussion Points for the Session:
- How copyright operates as a form of property right, with control over copying, distribution, and derivative use.
- The distinction between lawful acquisition (title) and mere possession of digital content.
- Whether internal uses such as format-shifting or machine learning training amount to a misappropriation of property.
- The relevance of Autodesk v Dyason in Australian law as a comparator.
- Broader implications for how tax, research and development incentives, and digital asset rules might engage with intellectual property issues arising from AI training.
Please see the link below to the case materials, which are assumed reading for participation in the discussion:
Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson v Anthropic PBC
Discussion led by Adrian Cartland.