Dealing with catastrophic forgetting when continually fine-tuning language models