-
Harvard open-sources AI training dataset 'Institutional Books 1.0', covering 983,000 books in its collection
With the support of Microsoft and OpenAI, the Harvard Law School Library released its first open dataset for AI training, "Institutional Books 1.0," last week. The dataset is said to contain 983,000 books from Harvard's collection, covering 245 languages and totaling 242 billion tokens. 1AI has attached the project address (https://huggingface.co/datasets/institutional/institutional-). ...
-
Harvard, Google release 1 million public domain books to provide legitimate data for AI training
December 13, 2024 - Harvard University and Google announced the joint release of 1 million public domain books as an AI training dataset, TechCrunch reported on December 12. The data required for AI training is costly to obtain, which favors well-funded tech companies. In response, Harvard plans to release a dataset of about 1 million public domain books covering a wide range of genres, languages, and authors, including classic authors such as Dickens, Dante, and Shakespeare, whose works are no longer under copyright, due to the fact that the copyrights on these works...

