
When I compared this translation with Google Translate, I got a different result. #131

Open
thawon opened this issue Jan 31, 2025 · 6 comments

Comments

@thawon

thawon commented Jan 31, 2025

Hi there!

We've noticed some users mentioning a recent dip in translation quality, and I've done some testing myself, comparing your library's results with Google Translate. I've found some discrepancies, with Google Translate sometimes performing better. Any insights you could share on why this might be happening would be greatly appreciated!

Regards,
Thawon Uttamavanit

@vitalets
Owner

Hello!

Could you share the exact discrepancies that you've found?

@thawon
Author

thawon commented Jan 31, 2025

Hi vitalets,

Thank you for the reply.

Here is the text in Chinese:
"m2 正向電磁接觸器下方的過載電驛 r相欠相 已做更換(跟污水廠借了一顆)||m2 逆向電磁接觸器彈簧卡卡 s相沒有吸到很裡面 重複測試恢復正常之後 堪用||目前已投入使用"

The result we have from your library:
"โพสต์ไฟฟ้าของ M2 ที่อยู่ทางด้านขวาของคอนแทคเลนส์แม่เหล็กไฟฟ้าถูกแทนที่ (ยืมมาด้วยโรงงานน้ำเสีย) เข้าใช้งาน"

Google Translate's result:
"รีเลย์โอเวอร์โหลดใต้คอนแทคเตอร์แม่เหล็กไฟฟ้าแบบหน้า M2 ได้รับการเปลี่ยนใหม่แล้ว (ยืมมาจากโรงบำบัดน้ำเสีย) สปริงของคอนแทคเตอร์แม่เหล็กไฟฟ้าแบบกลับ M2 ติดขัด และเฟส S ไม่ถูกดูดเข้าไปลึกมาก หลังจากทดสอบซ้ำแล้วซ้ำเล่า ก็ถือว่าปกติ ใช้||ใช้อยู่ในปัจจุบัน"

You probably don't speak Thai, but the discrepancies show up in many cases.
Here is another example:

Text: "How was work today?"
The result from your library:
"วันนี้ทำงานอย่างไร?"
Google Translate's result:
"วันนี้งานเป็นอย่างไรบ้าง?"

The good results are the ones that come from Google Translate. The incorrect translations are really off, and occasionally they don't make sense. I have also tried the same examples with another Node.js library named "@iamtraction/google-translate". The results are identical to yours. According to our users, and from what I can tell, the discrepancies only started to occur a few days ago.

Regards,
Thawon Uttamavanit

@vitalets
Owner

vitalets commented Jan 31, 2025

Could you show the code snippet you use to perform the request?
And which version of google-translate-api are you using?

@thawon
Author

thawon commented Jan 31, 2025

The code is copied straight from the npm page:
https://www.npmjs.com/package/@vitalets/google-translate-api

The version number is 9.2.1

import { translate } from '@vitalets/google-translate-api';

// Translate the English sample sentence into Thai.
const { text } = await translate('How was work today?', { to: 'th' });
console.log(text); // => 'วันนี้ทำงานอย่างไร?' (the library result reported above)

@vitalets
Owner

I've tested with Russian and can confirm the discrepancy.
This needs investigation, as this library uses undocumented APIs.
Could you post the raw response for your query:

const { text, raw } = await translate(sourceText, { to: 'th' });
console.dir(raw, { depth: null });

@Waste2Time

I have the same problem, and I have raised two issues about it in other repos. I suspect this is because Google uses different models for different APIs.

It seems Google is going to make changes to these undocumented APIs. :(

index.js

import { translate } from '@vitalets/google-translate-api';

async function translateText() {
  const { text, raw } = await translate('やれやれ、またドイツか、と僕は思った。', { to: 'ru' });
  console.log(text);
  console.dir(raw, {depth: null});
}
translateText();

output:

Я думал, что это снова Германия.
{
  sentences: [
    {
      trans: 'Я думал, что это снова Германия.',
      orig: 'やれやれ、またドイツか、と僕は思った。',
      backend: 3,
      model_specification: [ { label: 'offline' }, { label: 'offline' } ],
      translation_engine_debug_info: [
        {
          model_tracking: {
            checkpoint_md5: '07653cda2443db08a8e1f2435c678a44',
            launch_doc: 'efficient_models_2022q2.md'
          }
        },
        {
          model_tracking: {
            checkpoint_md5: 'edbff5b2398eeca464de2caaf36a7a7e',
            launch_doc: 'efficient_models_2022q2.md'
          }
        }
      ]
    },
    {
      translit: 'YA dumal, chto eto snova Germaniya.',
      src_translit: 'Yareyare, mata Doitsu ka, to boku wa omotta.'
    }
  ],
  src: 'ja',
  confidence: 1,
  spell: {},
  ld_result: {
    srclangs: [ 'ja' ],
    srclangs_confidences: [ 1 ],
    extended_srclangs: [ 'ja' ]
  }
}
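
The `model_specification` and `translation_engine_debug_info` fields in the output above are the most direct hint about which model served the request. To compare them across several queries, something like the following could help. It is only a minimal sketch: `summarizeModels` is a hypothetical helper (not part of the library), and it assumes the raw response keeps the shape printed above, which is undocumented and may change at any time.

```javascript
// Hypothetical helper: collect the model metadata from a `raw` response.
// Every field is treated as optional because the endpoint is undocumented.
function summarizeModels(raw) {
  const sentences = Array.isArray(raw?.sentences) ? raw.sentences : [];

  // Only entries with a `trans` field carry model metadata; the trailing
  // transliteration entry (translit / src_translit) does not.
  const translated = sentences.filter((s) => s && typeof s.trans === 'string');

  // Drop missing values and duplicates, keeping first-seen order.
  const unique = (values) => [
    ...new Set(values.filter((v) => v !== undefined && v !== null)),
  ];

  return {
    src: raw?.src ?? null,
    backends: unique(translated.map((s) => s.backend)),
    labels: unique(
      translated.flatMap((s) => (s.model_specification ?? []).map((m) => m?.label))
    ),
    launchDocs: unique(
      translated.flatMap((s) =>
        (s.translation_engine_debug_info ?? []).map((d) => d?.model_tracking?.launch_doc)
      )
    ),
    checkpoints: unique(
      translated.flatMap((s) =>
        (s.translation_engine_debug_info ?? []).map((d) => d?.model_tracking?.checkpoint_md5)
      )
    ),
  };
}

// Sample input copied from the raw output above (unused fields omitted).
const sample = {
  sentences: [
    {
      trans: 'Я думал, что это снова Германия.',
      orig: 'やれやれ、またドイツか、と僕は思った。',
      backend: 3,
      model_specification: [{ label: 'offline' }, { label: 'offline' }],
      translation_engine_debug_info: [
        {
          model_tracking: {
            checkpoint_md5: '07653cda2443db08a8e1f2435c678a44',
            launch_doc: 'efficient_models_2022q2.md',
          },
        },
        {
          model_tracking: {
            checkpoint_md5: 'edbff5b2398eeca464de2caaf36a7a7e',
            launch_doc: 'efficient_models_2022q2.md',
          },
        },
      ],
    },
    {
      translit: 'YA dumal, chto eto snova Germaniya.',
      src_translit: 'Yareyare, mata Doitsu ka, to boku wa omotta.',
    },
  ],
  src: 'ja',
};

console.log(summarizeModels(sample));
```

Passing the `raw` value returned by `translate(...)` for a few different language pairs would show whether they all report the same label and launch doc.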

