In the liminal space between sense and nonsense.
What a vision model packs into 384 numbers: generating images from its embeddings, finding interpretable features with a sparse autoencoder, and playing with the geometry of that space.