Albanian language is missing in ISO2LANG #1100

Open
gmolledaj opened this issue Feb 22, 2025 · 2 comments

Comments

@gmolledaj

Describe the bug
When a language is missing from ISO2LANG, the Preprocess Text widget fails to display the language list under Filtering - Stopwords.

To Reproduce
Steps to reproduce the behavior:

  1. Insert the Preprocess Text widget.
  2. Add Filtering to the right side, if it is not already there.
  3. View the list of languages for Stopwords.
  4. The list is empty.

Expected behavior
The list of all nltk languages should appear.

NOTE: The nltk stopword lists must be downloaded before they can be used. Please check whether this happens in a clean installation of Orange, because I think it does not. If the lists are missing, open Python in the Orange environment ($ python) and run these two commands:

import nltk
nltk.download('stopwords')

Orange version:
3.37.0, 3.38.1

Text add-on version:
1.16.1 and 1.16.2

Screenshots
If applicable, add screenshots to help explain your problem.

Operating system:
Windows 10 and Linux Mint 22

Additional context
I have added the language so that the error disappears. It would be better if the code did not fail when nltk adds new languages; it should simply load only those that appear in the ISO2LANG variable.
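To illustrate the suggestion above, here is a minimal sketch of a tolerant lookup. It is not the add-on's actual code: the function name `known_languages`, the shortened `ISO2LANG` excerpt, and the sample list of language names are all hypothetical, and the sketch assumes that language names can be matched to ISO2LANG values case-insensitively.

```python
# Shortened stand-in for ISO2LANG; the real mapping lives in
# orangecontrib/text/language.py and has many more entries.
ISO2LANG = {
    "sk": "Slovak",
    "sl": "Slovenian",
    "sr": "Serbian",
    "sv": "Swedish",
}


def known_languages(available, iso2lang=ISO2LANG):
    """Split language names into known ISO codes and skipped names.

    Names that have no entry in `iso2lang` are collected in `skipped`
    instead of raising an error, so an unknown language never empties
    the whole list.
    """
    # Reverse lookup: lowercase language name -> ISO code.
    lang2iso = {name.lower(): code for code, name in iso2lang.items()}
    codes, skipped = [], []
    for name in available:
        code = lang2iso.get(name.lower())
        if code is None:
            skipped.append(name)
        else:
            codes.append(code)
    return sorted(codes), skipped


# Hypothetical input standing in for the names reported by nltk.
codes, skipped = known_languages(["albanian", "slovenian", "swedish"])
print(codes)    # ISO codes that can be shown in the widget
print(skipped)  # names to log or ignore until ISO2LANG is updated
```

With a lookup like this, a language that nltk knows but ISO2LANG does not would only be left out of the list, and the skipped names could be logged as a reminder to extend the mapping.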

@gmolledaj
Author

Solved: I have added the language so that the error disappears. It would be better if the code did not fail when nltk adds new languages; it should simply load only those that appear in the ISO2LANG variable.
.../orange3/lib/python3.10/site-packages/orangecontrib/text/language.py
line 90
+ "sq": "Albanian",

@gmolledaj
Author

gmolledaj commented Feb 22, 2025

diff --git a/orangecontrib/text/language.py b/orangecontrib/text/language.py
index fb64ddd..536ad25 100644
--- a/orangecontrib/text/language.py
+++ b/orangecontrib/text/language.py
@@ -87,6 +87,7 @@ ISO2LANG = {
     "si": "Sinhala",
     "sk": "Slovak",
     "sl": "Slovenian",
+    "sq": "Albanian",
     "sr": "Serbian",
     "sv": "Swedish",
     "ta": "Tamil",
