A classic paper on visualizing high-dimensional data with t-SNE.
A TechCrunch report on Nightshade, a data-poisoning tool that helps artists resist unauthorized AI training.
A Stack Diary report alleging Brave sold copyrighted data for AI training.
A Towards Data Science argument that large language models have not yet contributed meaningfully to linguistics.
A contemporary Brazilian Portuguese corpus with source and genre metadata.
An open network for large-scale AI datasets and models.
A specialized jobs platform for AI, machine learning, data science, and big data roles.
An explainer on why topological data analysis can be preferable to t-SNE or UMAP.
An enterprise RAG data platform for privately deployed generative AI Q&A solutions.