
HUGE NEWS for the Grass Ecosystem
News July 4, 2024
The Grass network has just been used to open-source more than 600 million Reddit posts and comments! This content is now publicly available for AI training, leveling the playing field for developers everywhere.
It is a giant step towards eliminating barriers to entry and putting AI back in the hands of the people.
This is why Grass exists
We are thrilled to announce the release of the latest dataset from the Grass Foundation: UpvoteWeb-24-600M.
UpvoteWeb-24-600M contains 600 million of the top posts and comments from Reddit in 2024, along with media links and reply lineage. The data has been fully anonymized to preserve user privacy, and each record includes a detected language and a token count. All content is safe for work, filtered using Reddit's moderation metadata.
Available now on https://huggingface.co/datasets/OpenCo7/UpVoteWeb
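For developers curious how the language labels, token counts, and reply lineage described above might be used together, here is a minimal sketch. The column names (`id`, `parent_id`, `language`, `token_count`, `text`) and the sample rows are assumptions made up for illustration, not the dataset's published schema, so check the dataset card before relying on them.

```python
from collections import defaultdict

# Hypothetical sample rows. Field names and values are assumptions for
# illustration only; they are not taken from the real dataset.
SAMPLE = [
    {"id": "p1", "parent_id": None, "language": "en", "token_count": 12, "text": "A top-level post"},
    {"id": "c1", "parent_id": "p1", "language": "en", "token_count": 8, "text": "A reply"},
    {"id": "c2", "parent_id": "c1", "language": "de", "token_count": 5, "text": "Eine Antwort"},
]


def total_tokens_by_language(records):
    """Sum the (assumed) token_count field per detected language."""
    totals = defaultdict(int)
    for record in records:
        totals[record["language"]] += record["token_count"]
    return dict(totals)


def reply_chain(records, record_id):
    """Follow (assumed) parent_id links from a record up to its root post.

    Returns the ids ordered from the root post down to the given record.
    """
    by_id = {record["id"]: record for record in records}
    chain = []
    seen = set()
    current = by_id.get(record_id)
    # Stop at the root (no parent), at a missing parent, or on a cycle.
    while current is not None and current["id"] not in seen:
        seen.add(current["id"])
        chain.append(current["id"])
        current = by_id.get(current["parent_id"])
    return list(reversed(chain))


if __name__ == "__main__":
    print(total_tokens_by_language(SAMPLE))
    print(reply_chain(SAMPLE, "c2"))
```

The same functions should work on rows loaded from the published dataset, for example with the Hugging Face `datasets` library, once the real column names are substituted for the assumed ones.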