I've been wanting to translate various subtitles and stumbled upon Opus-MT. It seems to come from a university group (Helsinki-NLP) that is building models for translating between many languages, and they seem to be doing a really good job.
Helsinki-NLP publishes the models along with example code that makes them very easy to use. This is actually how I did my translations, as I didn't see the GitHub repository until after I had already finished. From a quick glance, the translations look pretty great. Looking forward to actually vetting them.
It was quite easy to get everything running, but some notes will be helpful if I ever want to do this again.
Install the packages:
pip install torch
pip install transformers
pip install sentencepiece
pip install sacremoses
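sentencepiece and sacremoses both handle text preprocessing. As far as I can tell, sentencepiece does the subword tokenization the models rely on, which is also what lets them cope with non-Latin scripts, while sacremoses handles Moses-style punctuation normalization. I was originally only missing sentencepiece but got a warning recommending that I install sacremoses as well.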
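Before running anything, it can be handy to confirm that all four packages are actually visible to the Python interpreter in use. The following is a minimal sketch using only the standard library; `missing_packages` and `REQUIRED` are names made up for this note, not part of any of the packages.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if find_spec(name) is None]


# The four packages installed above; for each, the import name matches the pip name.
REQUIRED = ["torch", "transformers", "sentencepiece", "sacremoses"]

missing = missing_packages(REQUIRED)
print("Missing:", ", ".join(missing) if missing else "none")
```

This only checks that the modules can be located. It does not import them, so it will not catch a broken install.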
Now to the magic:
from transformers import pipeline
pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-de-en")
print(pipe("Einen Moment, bitte."))
Outputs:
[{'translation_text': 'One moment, please.'}]
On the first run this downloads the Opus-MT model and caches it locally. Later runs reuse the cached copy to translate the sentence.
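Since the goal was subtitles, here is a rough sketch of how the same pipeline could be pointed at an SRT file. It assumes the usual SRT layout (a cue number, a timing line, then one or more text lines) and translates only the text lines. `translate_srt` and `make_opus_translator` are hypothetical helper names invented for this note, and this is not necessarily how I ran my own translations.

```python
import re

# Matches SRT timing lines such as "00:00:01,000 --> 00:00:02,500".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def translate_srt(srt_text, translate_batch):
    """Translate only the dialogue lines of an SRT string.

    translate_batch is any callable that maps a list of strings to a list
    of translated strings of the same length.
    """
    lines = srt_text.splitlines()
    # Dialogue lines: not blank, not a bare cue number, not a timing line.
    text_idx = [
        i for i, line in enumerate(lines)
        if line.strip()
        and not line.strip().isdigit()
        and not TIMING.match(line.strip())
    ]
    translated = translate_batch([lines[i] for i in text_idx])
    for i, new_text in zip(text_idx, translated):
        lines[i] = new_text
    return "\n".join(lines)


def make_opus_translator(model_name="Helsinki-NLP/opus-mt-de-en"):
    """Wrap the translation pipeline as a list-in, list-out callable."""
    # Imported lazily so the SRT handling above works without transformers.
    from transformers import pipeline

    pipe = pipeline("translation", model=model_name)

    def translate_batch(texts):
        if not texts:
            return []
        # Given a list of strings, the pipeline returns one dict per input.
        return [item["translation_text"] for item in pipe(texts)]

    return translate_batch
```

Usage would be along the lines of `translate_srt(open("movie.srt", encoding="utf-8").read(), make_opus_translator())`, where `movie.srt` is a placeholder file name. Two known limitations: a dialogue line made up only of digits is mistaken for a cue number and left untranslated, and each line is translated on its own, so a sentence split across two subtitle lines loses its context.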