Probes and Cons: Multi-Trigger Classification Reveals Mixed Functional Mappings in Language Model Latent Space

Published in Preprint, 2024

This preprint introduces multi-trigger classification, a methodology for probing language model representations, and uses it to show that functional mappings in transformer latent space are mixed and distributed.

Recommended citation: Vedant Gaur et al. 2024. Probes and Cons: Multi-Trigger Classification Reveals Mixed Functional Mappings in Language Model Latent Space. arXiv preprint.