Using the Amharic IA3 PEFT Models

Daniel Mekuriaw
2 min read · Jan 1, 2024

This article demonstrates how to use the mT5-small models fine-tuned in this project. It covers loading these models for additional training, evaluation, or inference.

Loading a Model

To use the PEFT models, they must first be loaded. The code snippet below demonstrates how to load an IA3 fine-tuned model from a specified file location, move it to the designated device, and display the count of trainable parameters.

# Required imports
import torch
from transformers import AutoModelForSeq2SeqLM
from peft import PeftModel

# Selecting the device (GPU if available)
device = "cuda" if torch.cuda.is_available() else "cpu"

# Loading the base model to be further fine-tuned
model = AutoModelForSeq2SeqLM.from_pretrained("PATH_TO_MODEL")

# Loading the PEFT model
# is_trainable can be True or False, depending on whether the model is to be trained further
model = PeftModel.from_pretrained(model, "PATH_TO_MODEL", is_trainable=True)

# Moving the model to the device
model.to(device)

# Printing the number of trainable parameters
model.print_trainable_parameters()

The ‘is_trainable’ parameter indicates whether the model is intended for further training or for inference. Setting it to ‘True’ means the model will undergo additional training, during which only the trainable parameters defined by the IA3 approach are fine-tuned. Conversely, setting it to ‘False’ means the model will be used solely for evaluation or inference; in that case, the number of trainable parameters reported is zero.

What’s next?

Once a model is loaded, it can be assessed further. Additional evaluation and research could measure the models’ performance and provide further understanding of this project as well as the Amharic language. The models could also be fine-tuned further with more data, enabling more comprehensive quantitative and qualitative comparisons and offering valuable insights.

Although this project uses only mT5-small as its base model, the same approach can be used to load other models fine-tuned with IA3 or other PEFT methods.

Resources

Link to Project Article: https://daniel-mekuriaw16.medium.com/parameter-efficient-amharic-text-summarization-5ce1bac73a01

Link to Final Report: https://drive.google.com/file/d/1LivrnwkGAJfmCo6kDLTUJNIoy8RlpRF7/view?usp=sharing


Daniel Mekuriaw

An undergraduate student at Yale University ('24) majoring in Computer Science and Statistics & Data Science