New study from Stanford, MIT, Harvard, and Anthropic reveals that larger neural networks learn rare tasks better by reducing gradient interference.