Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Updated Nov 19, 2024 - Python
This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.
Language Identification with Support for More Than 2000 Labels -- EMNLP 2023
NLP pipelines for Tagalog using spaCy
AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/
Code and datasets for the ACL 2021 paper "OntoED: Low-resource Event Detection with Ontology Embedding"
[SIGIR 2023] Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction
A Scandinavian Benchmark for sentence embeddings
This is a repository for NaijaSenti, a Lacuna-funded project for the development of a sentiment corpus for four Nigerian languages: Igbo, Hausa, Yoruba and Pidgin.
Materials for AACL-IJCNLP-2022 tutorial: Efficient and Robust Knowledge Graph Construction
[ACL'24] MC^2: A Multilingual Corpus of Minority Languages in China (Tibetan, Uyghur, Kazakh, and Mongolian)
[ACL'24 Findings] Teaching Large Language Models an Unseen Language on the Fly
A curated list of awesome sentiment analysis studies in which attitude is the position that a subject conveys in the text towards another object mentioned in it, such as an entity or an event.
Code for paper "ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models"
This repository contains the code, data, and associated models of the paper titled "BanglaParaphrase: A High-Quality Bangla Paraphrase Dataset", accepted in Proceedings of the Asia-Pacific Chapter of the Association for Computational Linguistics: AACL 2022.
Pashto Natural Language Processing Toolkit
Chatbot solution for resource-poor languages. Contains code and data for the journal article 'Focused domain contextual AI chatbot framework for resource poor languages'.
Awesome Lao Natural Language Processing
This is an official leaderboard for the RuSentRel-1.1 dataset, originally described in the paper arXiv:1808.08932.
Enhanced awesome-align for low-resource languages and noise simulation: https://arxiv.org/abs/2301.09685