Probes and Cons: Multi-Trigger Classification Reveals Mixed Functional Mappings in Language Model Latent Space
Published in Preprint, 2024
This preprint introduces a multi-trigger classification methodology for probing language model representations and reports that functional mappings in the latent space of transformer models are mixed and distributed.
Recommended citation: Vedant Gaur et al. 2024. Probes and Cons: Multi-Trigger Classification Reveals Mixed Functional Mappings in Language Model Latent Space. arXiv preprint.