Enhancing Transformer Programs with Structured Biases

A modular, performance-enhancing extension of Transformer Programs (Transformers designed for mechanistic interpretability).