Massive internet datasets, embedded, open-sourced, and free
Vector embeddings are incredibly powerful: they compress text into a low-dimensional ‘meaning space’ in which related passages sit close together. They are also shockingly cheap to produce. Alexandria is an open-source project that embeds large datasets in research, law, medicine, and more, and releases them for free.
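To make the ‘meaning space’ idea concrete, here is a minimal sketch of similarity search over embeddings. It is not Alexandria's code or data format: the three-dimensional vectors, the document titles, and the `search` helper are all made up for illustration, and real embeddings have hundreds or thousands of dimensions and come from an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, invented for this example only.
corpus = {
    "contract law ruling": [0.9, 0.1, 0.0],
    "clinical trial results": [0.1, 0.9, 0.1],
    "court appeal decision": [0.8, 0.2, 0.1],
}

def search(query_vec, corpus, top_k=2):
    """Return the top_k titles whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), title)
              for title, vec in corpus.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:top_k]]

# A hypothetical query vector pointing in the "legal" direction.
query = [1.0, 0.0, 0.0]
results = search(query, corpus)
print(results)
```

With these toy vectors, the two legal documents rank above the medical one because their vectors point in nearly the same direction as the query. Searching a released embedded dataset would follow the same pattern, with the query embedded by the same model that produced the dataset's vectors.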