Text Data Filtering
Search through the ROOTS corpus using queries
Use AI to translate text between languages
Analyze and visualize dataset characteristics and statistics
Search code snippets in the StarCoder dataset for matches
Explore the OBELICS dataset with an interactive map
Search large text corpora for information
Read a detailed overview of the FineWeb web-scale text dataset