- Based on mBART, I fine-tuned models for English, Chinese, and Korean.
- Each model has a different number of layers. (For example, a model denoted 12-3 is composed of 12 encoder layers and 3 decoder layers; see the sketch after this list.)
- We plan to release an open-source project called OpenSFT soon. Please look forward to it!
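
As a rough illustration of the layer notation, here is a minimal sketch of how a "12-3" model might be derived from a pretrained mBART checkpoint by keeping only the first 3 decoder layers. The base checkpoint name and the trimming approach are assumptions, not necessarily the exact recipe used for these models.

```python
from transformers import MBartForConditionalGeneration

# Load a pretrained mBART (mbart-large-cc25 has 12 encoder / 12 decoder layers).
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

# Keep all 12 encoder layers, but only the first 3 decoder layers ("12-3").
model.model.decoder.layers = model.model.decoder.layers[:3]
model.config.decoder_layers = 3

# The trimmed model is then fine-tuned and saved as usual.
model.save_pretrained("mbart-12-3")
```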
BLEU \ Model | 12-3 English | 12-3 Korean | 12-3 Chinese |
---|---|---|---|
1st epoch | 53 | 35 | 27 |
2nd epoch | 52 | 35 | 25 |
3rd epoch | 51 | 33 | 23 |
Inference Time: 0.5 s
Parameter Size: 262M (3.15G)
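
The BLEU scores in these tables could be reproduced along the following lines. This is a minimal sketch assuming sacreBLEU and the Hugging Face mBART API; the checkpoint path, language codes, and test sentences are illustrative placeholders, not the actual evaluation set.

```python
import sacrebleu
import torch
from transformers import MBartForConditionalGeneration, MBartTokenizer

# Tokenizer from the base checkpoint; model from the fine-tuned "12-3" directory.
tokenizer = MBartTokenizer.from_pretrained(
    "facebook/mbart-large-cc25", src_lang="en_XX", tgt_lang="ko_KR"
)
model = MBartForConditionalGeneration.from_pretrained("mbart-12-3").eval()

sources = ["How are you today?"]          # hypothetical test set
references = [["오늘 기분이 어떠세요?"]]     # one reference stream, aligned with sources

hypotheses = []
with torch.no_grad():
    for src in sources:
        batch = tokenizer(src, return_tensors="pt")
        generated = model.generate(
            **batch, decoder_start_token_id=tokenizer.lang_code_to_id["ko_KR"]
        )
        hypotheses.append(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])

print(f"BLEU: {sacrebleu.corpus_bleu(hypotheses, references).score:.1f}")
```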
BLEU \ Model | 9-3 English | 9-3 Korean | 9-3 Chinese |
---|---|---|---|
1st epoch | 54 | 36 | 24 |
2nd epoch | 55 | 35 | 25 |
3rd epoch | 54 | 35 | 23 |
Inference Time: 0.2 s
Parameter Size: 224M (2.7G)
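
The inference-time and parameter-size figures could be measured roughly as sketched below. The checkpoint path and sample sentence are illustrative assumptions, and actual timings depend on hardware, batch size, and generation settings.

```python
import time
import torch
from transformers import MBartForConditionalGeneration, MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-cc25", src_lang="en_XX")
model = MBartForConditionalGeneration.from_pretrained("mbart-9-3").eval()

# Parameter count in millions.
print(f"Parameters: {sum(p.numel() for p in model.parameters()) / 1e6:.0f}M")

# Average wall-clock time per generated sentence, after one warm-up run.
batch = tokenizer("How are you today?", return_tensors="pt")
with torch.no_grad():
    model.generate(**batch, decoder_start_token_id=tokenizer.lang_code_to_id["ko_KR"])
    start = time.time()
    for _ in range(10):
        model.generate(**batch, decoder_start_token_id=tokenizer.lang_code_to_id["ko_KR"])
print(f"Avg. inference time: {(time.time() - start) / 10:.2f} s")
```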