Update uighur_arabic.yml #165
base: main
I was able to consult with a Uighur specialist, and now have a better understanding of this script. In this new version of the table, the r2s mapping only uses Arabic characters in the U+06XX block. This has simplified this section, as these mappings are not context-dependent except for initial vowels. The compatibility characters in the U+FEXX range are still included in the s2r section (alongside the U+06XX versions), though in practice they will not be encountered often. I have some test phrases that I can provide as well - please let me know where and in what format I should submit these.
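To illustrate the distinction between the two ranges, here is a minimal Python sketch, not part of the table or of ScriptShifter. It classifies a character as belonging to the base Arabic block (U+06XX) or to the presentation-form compatibility range (U+FEXX), and folds the latter onto the former with standard NFKC normalization. The helper names `block_of` and `fold_presentation_forms` are hypothetical; the table itself handles these characters by listing both forms in the s2r section rather than by normalizing.

```python
import unicodedata


def block_of(ch: str) -> str:
    """Name the range a single character falls in (illustrative labels only)."""
    cp = ord(ch)
    if 0x0600 <= cp <= 0x06FF:
        return "Arabic (U+06XX)"
    if 0xFE70 <= cp <= 0xFEFF:
        return "Arabic Presentation Forms-B (U+FEXX)"
    return "other"


def fold_presentation_forms(text: str) -> str:
    """Map U+FEXX compatibility characters to their U+06XX base letters.

    Only characters in the U+FE70..U+FEFF range are normalized, so the rest
    of the string is left untouched.
    """
    return "".join(
        unicodedata.normalize("NFKC", ch) if 0xFE70 <= ord(ch) <= 0xFEFF else ch
        for ch in text
    )


# U+FE91 is BEH INITIAL FORM, U+FE8E is ALEF FINAL FORM.
sample = "\uFE91\uFE8E"
folded = fold_presentation_forms(sample)
print([f"U+{ord(c):04X}" for c in folded])
```

After folding, each character is context-independent, which is why an r2s mapping restricted to U+06XX can leave the positional shaping to the rendering engine.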
@tventimi thanks for the improvement. You can add test strings to https://github.com/lcnetdev/scriptshifter/blob/main/test/data/script_samples/arabic2.csv or start a new test file for Uighur. You can run the tests via command line:
(or replace `arabic2` with the base name of your test table)
Hi Stefano, I created a file uighur_arabic.csv with some test cases. I tried installing SS locally so I could run these tests myself, but unfortunately I kept running into errors with the installation. I'm sure it is just some quirk of my local environment. I will keep working on it. Apologies in advance if there are any validation errors in these files.
You can ping me privately to work out the install problems. I'm interested in what issues you may be running into. FYI I'm using Python 3.10 due to external library dependencies.
@@ -1,338 +1,308 @@
---
general:
name: Uighur (Arabic)
case_sensitive: false
This is needed to ensure case-insensitive transliteration. Did you remove it for a particular reason?
No, this is my mistake, I did not mean to leave it out.
@@ -0,0 +1,10 @@
uighur_arabic,Abbas Munyar Türkiyqan,"ابباس مۇنيار ، تۈركىيقان",r2s
ok, good to know! Yes, the round trip transliteration should work for these examples.
I updated the two files accordingly.
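For anyone checking the test file by hand before the test suite runs locally, here is a small sketch that parses a row like the one in uighur_arabic.csv using only Python's standard `csv` module. The column names in `FIELDS` are my assumption, inferred from the sample row (table name, input string, expected output, direction); they are not taken from the ScriptShifter test code, and `read_cases` is a hypothetical helper.

```python
import csv
import io

# Assumed column meanings, inferred from the sample row; not an official schema.
FIELDS = ("table", "source", "expected", "direction")

# The sample row from the test file, copied verbatim.
row_text = 'uighur_arabic,Abbas Munyar Türkiyqan,"ابباس مۇنيار ، تۈركىيقان",r2s\n'


def read_cases(stream):
    """Return one dict per non-empty CSV row, keyed by the assumed field names."""
    return [dict(zip(FIELDS, row)) for row in csv.reader(stream) if row]


cases = read_cases(io.StringIO(row_text))
for case in cases:
    # Flag rows that do not have exactly one value per assumed column.
    if len(case) != len(FIELDS):
        print("unexpected column count:", case)
    print(case["table"], case["direction"], repr(case["source"]))
```

A check like this only catches structural problems such as a missing column or an unbalanced quote. It does not run the transliteration, so it is no substitute for the real tests.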