
zuzun on May 14, 2023, on: Attention with Linear Biases (ALiBi)

If I understand it correctly, you are only attending to preceding tokens in your paper. Can the constant bias matrix be made symmetric for unmasked tasks?
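To make the question concrete, here is a rough NumPy sketch of one possible reading. The causal version penalizes each query-key pair by slope times (i - j) and masks future positions; the "symmetric" version simply swaps in |i - j| and drops the mask. The function names are made up for illustration, and the symmetric variant is an assumption about what is being asked, not something taken from the paper. The slope schedule assumes a power-of-two head count.

```python
import numpy as np

def alibi_slopes(n_heads):
    # Geometric sequence 2^(-8/n), 2^(-16/n), ...; assumes n_heads is a power of two.
    return np.array([2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)])

def causal_alibi_bias(seq_len, slopes):
    # Causal form: bias[h, i, j] = -slope_h * (i - j) for j <= i, -inf for j > i.
    pos = np.arange(seq_len)
    dist = (pos[:, None] - pos[None, :]).astype(float)  # i - j
    bias = -slopes[:, None, None] * dist[None]
    future = dist < 0  # keys that come after the query
    return np.where(future[None], -np.inf, bias)

def symmetric_alibi_bias(seq_len, slopes):
    # Hypothetical unmasked variant: bias[h, i, j] = -slope_h * |i - j|.
    pos = np.arange(seq_len)
    dist = np.abs(pos[:, None] - pos[None, :]).astype(float)
    return -slopes[:, None, None] * dist[None]

slopes = alibi_slopes(8)
causal = causal_alibi_bias(4, slopes)
symmetric = symmetric_alibi_bias(4, slopes)
```

Either array has shape (heads, seq_len, seq_len) and would be added to the attention scores before the softmax. Note that with a fully symmetric bias a head cannot tell left from right, so whether this works well for unmasked tasks is exactly the open question.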