
Harvard University

Harvard open-sources AI training dataset 'Institutional Books 1.0', covering 983,000 books in its collection

With support from Microsoft and OpenAI, the Harvard Law School Library officially released its first open dataset for AI training, "Institutional Books 1.0," last week. The dataset reportedly contains 983,000 books from Harvard's collection, covering 245 languages and totaling 242 billion tokens. 1AI has attached the project address (https://huggingface.co/datasets/institutional/institutional-). ...
Information
- 4.3k
25/6/17
Harvard, Google release 1 million public domain books to provide legitimate data for AI training

December 13, 2024 - Harvard University and Google announced the joint release of 1 million public domain books as an AI training dataset, TechCrunch reported on December 12. The data required for AI training is costly, which makes it more accessible to well-funded tech companies. In response, Harvard plans to release a dataset of about 1 million public domain books covering a wide range of genres, languages, and authors, including classics by Dickens, Dante, and Shakespeare that are no longer under copyright, because the copyrights on these works...
Information
- 7.8k
24/12/13
