Update uighur_arabic.yml #165

Open; wants to merge 4 commits into base: main

tventimi
Contributor

I was able to consult with a Uighur specialist, and now have a better understanding of this script. In this new version of the table, the r2s mapping only uses Arabic characters in the U+06XX block. This has simplified this section, as these mappings are not context-dependent except for initial vowels. The compatibility characters in the U+FEXX range are still included in the s2r section (alongside the U+06XX versions), though in practice they will not be encountered often.

I have some test phrases that I can provide as well - please let me know where and in what format I should submit these.
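
To illustrate how the two ranges mentioned above relate: the U+FEXX characters are Unicode compatibility presentation forms, and each one carries a compatibility decomposition to its base letter in the U+06XX block. The sketch below shows that folding with Python's standard `unicodedata` module. The function name `fold_presentation_forms` is hypothetical, and this is not a description of how the table handles these characters (it lists them explicitly in the s2r section); it only shows why both versions can map to the same Roman output.

```python
import unicodedata


def fold_presentation_forms(text: str) -> str:
    """Replace Arabic Presentation Forms-B characters (U+FE70..U+FEFC)
    with their compatibility decomposition, leaving other text untouched."""
    out = []
    for ch in text:
        if "\uFE70" <= ch <= "\uFEFC":
            # NFKC applies the compatibility mapping, e.g. U+FE8D -> U+0627.
            out.append(unicodedata.normalize("NFKC", ch))
        else:
            out.append(ch)
    return "".join(out)


# U+FE8D (alef, isolated form) and U+FE91 (beh, initial form)
# fold to the base letters U+0627 and U+0628.
folded = fold_presentation_forms("\uFE8D\uFE91")
print([f"U+{ord(c):04X}" for c in folded])
```

Characters already in the U+06XX block pass through unchanged, so running the function on text produced by the r2s section would be a no-op.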

@scossu
Collaborator

scossu commented Jan 15, 2025

@tventimi thanks for the improvement. You can add test strings to https://github.com/lcnetdev/scriptshifter/blob/main/test/data/script_samples/arabic2.csv or start a new test file for Uighur. You can run the tests via command line:

/path/to/sscli test samples arabic2

(or replace `arabic2` with the base name of your test table)

@tventimi
Contributor Author

Hi Stefano,

I created a file uighur_arabic.csv with some test cases.

I tried installing SS locally so I could run these tests myself, but unfortunately I kept running into errors with the installation. I'm sure it is just some quirk of my local environment. I will keep working on it. Apologies in advance if there are any validation errors in these files.

@scossu
Collaborator

scossu commented Jan 16, 2025

You can ping me privately to work out the install problems. I'm interested in what issues you may be running into. FYI I'm using Python 3.10 due to external library dependencies.

@@ -1,338 +1,308 @@
---
general:
  name: Uighur (Arabic)
  case_sensitive: false
Collaborator

This is needed to ensure case-insensitive transliteration. Did you remove it for a particular reason?

Contributor Author

No, this is my mistake, I did not mean to leave it out.

@@ -0,0 +1,10 @@
uighur_arabic,Abbas Munyar Türkiyqan,"ابباس مۇنيار ، تۈركىيقان",r2s
Collaborator

If you expect the transliteration to round-trip without loss, you can leave out the 4th column value to run both s2r and r2s (in this case, column #2 should be Uighur and #3 Roman). I should have updated my documentation—my bad. Your example should work as it is, though.
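
As a rough illustration of the column convention described above, the sketch below expands test rows into individual (table, direction, input, expected) cases. It is only a reading of this comment, not the project's actual test loader: the function name `expand_cases` is hypothetical, and the sketch assumes that a row with a direction uses column #2 as input and column #3 as expected output, as in the sample row from this PR. The second row reuses the same strings, rearranged purely to show the three-column layout.

```python
import csv
import io

# First row: taken from the test file in this PR (explicit direction).
# Second row: same strings rearranged as script, Roman with no 4th column.
SAMPLE = "\n".join([
    'uighur_arabic,Abbas Munyar Türkiyqan,"ابباس مۇنيار ، تۈركىيقان",r2s',
    'uighur_arabic,"ابباس مۇنيار ، تۈركىيقان",Abbas Munyar Türkiyqan',
])


def expand_cases(csv_text):
    """Return a list of (table, direction, input, expected) tuples."""
    cases = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        table, col2, col3 = row[0], row[1], row[2]
        direction = row[3].strip() if len(row) > 3 else ""
        if direction:
            # Explicit direction: one case, column 2 -> column 3 (assumed).
            cases.append((table, direction, col2, col3))
        else:
            # No direction: column 2 is script, column 3 is Roman;
            # test both ways.
            cases.append((table, "s2r", col2, col3))
            cases.append((table, "r2s", col3, col2))
    return cases


for case in expand_cases(SAMPLE):
    print(case[1], "|", case[2], "->", case[3])
```

Under these assumptions the two sample rows yield three cases: one r2s case from the first row, and one s2r plus one r2s case from the second.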

Contributor Author

ok, good to know! Yes, the round trip transliteration should work for these examples.

@tventimi
Contributor Author

I updated the two files accordingly.
