Sourcery refactored master branch #1

sourcery-ai · 2022-08-21T17:03:41Z

Branch master refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.

See our documentation here.

Run Sourcery locally

Reduce the feedback loop during development by using the Sourcery editor plugin.

Review changes via command line

To manually merge these changes, make sure you're on the master branch, then run:

git fetch origin sourcery/master
git merge --ff-only FETCH_HEAD
git reset HEAD^

Help us improve this pull request!


Due to GitHub API limits, only the first 60 comments can be shown.

travis_pypi_setup.py

sourcery-ai · 2022-08-21T17:03:43Z

apps/col_data.py

-        data = body["_source"]
-        return data
+        return body["_source"]


Function Dictionary.get_word refactored with the following changes:

Inline variable that is immediately returned (inline-immediately-returned-variable)

sourcery-ai · 2022-08-21T17:03:44Z

apps/col_dictionary.py

-    search_text_box = placeholder.text_input('Word', value=st.session_state['current_word'], key='sidebar_text_input')
-    if search_text_box:
+    if search_text_box := placeholder.text_input(
+        'Word',
+        value=st.session_state['current_word'],
+        key='sidebar_text_input',
+    ):


Lines 43-59 refactored with the following changes:

Use named expression to simplify assignment and conditional [×2] (use-named-expression)
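A minimal sketch of the named-expression pattern used here, which requires Python 3.8 or later. `read_input` is a hypothetical stand-in for `placeholder.text_input`, not a Streamlit function, and the two lookup functions exist only to compare the forms.

```python
def read_input(value):
    """Hypothetical stand-in for placeholder.text_input(); returns the typed text."""
    return value.strip()


def lookup_before(raw):
    # Original form: assign first, then test.
    search_text = read_input(raw)
    if search_text:
        return f"searching for {search_text}"
    return "nothing to search"


def lookup_after(raw):
    # Refactored form: the walrus operator binds and tests in one expression.
    if search_text := read_input(raw):
        return f"searching for {search_text}"
    return "nothing to search"
```

Both forms behave the same; the name bound by `:=` stays visible after the `if` statement, just like the ordinary assignment.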

sourcery-ai · 2022-08-21T17:03:44Z

apps/col_dictionary_export_from_elasticsearch.py

-    f = open(DICTIONARY_FILE, "a")
-    i = 0
-    for res in query_doc(es, index_name):
-        docs = res["hits"]["hits"]
-        print(len(docs))
-        doc_content = extract_dictionary_from_elasticsearch(docs)
-        if len(doc_content) > 1:
-            dict_content = yaml.dump(doc_content, allow_unicode=True, sort_keys=True)
-            f.write(dict_content)
-            i += 1
-    f.close()
+    with open(DICTIONARY_FILE, "a") as f:
+        i = 0
+        for res in query_doc(es, index_name):
+            docs = res["hits"]["hits"]
+            print(len(docs))
+            doc_content = extract_dictionary_from_elasticsearch(docs)
+            if len(doc_content) > 1:
+                dict_content = yaml.dump(doc_content, allow_unicode=True, sort_keys=True)
+                f.write(dict_content)
+                i += 1


Lines 40-50 refactored with the following changes:

Use with when opening file to ensure closure (ensure-file-closed)
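A small sketch of why the `with` form is safer: the file is closed even when a write raises part-way through. The file name and the `append_items` helper are hypothetical, and the explicit `encoding="utf-8"` is an addition that the diff above does not contain.

```python
import os
import tempfile


def append_items(path, items):
    """Append items to path and report whether the handle ended up closed."""
    handle = None
    try:
        with open(path, "a", encoding="utf-8") as f:
            handle = f
            for item in items:
                f.write(item)  # raises TypeError if item is not a str
    except TypeError:
        pass  # the with block has already closed the file at this point
    return handle.closed


# Hypothetical output file in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "dictionary.yaml")
closed_after_success = append_items(demo_path, ["a: 1\n", "b: 2\n"])
closed_after_failure = append_items(demo_path, ["c: 3\n", None])

with open(demo_path, encoding="utf-8") as f:
    written = f.read()
```

With the original `open()` / `close()` pair, an exception inside the loop would skip `f.close()` entirely.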

sourcery-ai · 2022-08-21T17:03:45Z

datasets/DI_Vietnamese-UVD/scripts/validate_with_corpus.py

-    dict = joblib.load(dict_file)
-    return dict
+    return joblib.load(dict_file)


Function load_dictionary refactored with the following changes:

Inline variable that is immediately returned (inline-immediately-returned-variable)

sourcery-ai · 2022-08-21T17:03:45Z

datasets/DI_Vietnamese-UVD/scripts/validate_with_corpus.py

-            "ADJ", "ADV", "INTJ", "NOUN", "PROPN", "PRON", "SYM", "X", "N:G", "VERB:G", "NY",
-            "N", "NB", "NNPy",
-            "NNP", "NNPy",
-            "V", "VERB",
-            "Num", "NUMx", "NUM", "NUMX"
+            "ADJ",
+            "ADV",
+            "INTJ",
+            "NOUN",
+            "PROPN",
+            "PRON",
+            "SYM",
+            "X",
+            "N:G",
+            "VERB:G",
+            "NY",
+            "N",
+            "NB",
+            "NNP",
+            "NNPy",
+            "V",
+            "VERB",
+            "Num",
+            "NUMx",
+            "NUM",
+            "NUMX",
        }
+


Lines 63-67 refactored with the following changes:

Remove duplicate keys when instantiating sets (remove-duplicate-set-key)
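As a quick check that dropping the repeated `"NNPy"` literal does not change the set, here is a shortened sketch using a subset of the tags from the diff.

```python
# Tags as written before the refactor; "NNPy" appears twice.
tags_before = {"N", "NB", "NNPy", "NNP", "NNPy", "V", "VERB"}

# Tags after the refactor; the duplicate literal is gone.
tags_after = {"N", "NB", "NNP", "NNPy", "V", "VERB"}

# A set keeps one copy of each element, so both literals build the same set.
same_set = tags_before == tags_after
```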

sourcery-ai · 2022-08-21T17:03:46Z

datasets/UD_Vietnamese-COL/scripts/make_corpus.py

-    new_sentence = "\n".join(result)
-    return new_sentence
+    return "\n".join(result)


Function add_lemma_column refactored with the following changes:

Inline variable that is immediately returned (inline-immediately-returned-variable)

sourcery-ai · 2022-08-21T17:03:46Z

examples/chatbot/baggage_claim/run_tests.py

-        if r is None:
-            return None
-        return r.json()
+        return None if r is None else r.json()


Function ChatUser.send refactored with the following changes:

Lift code into else after jump in control flow (reintroduce-else)

Replace if statement with if expression (assign-if-exp)

sourcery-ai · 2022-08-21T17:03:46Z

examples/ner/preprocess_data.py

+max_len = 0
 for file in ["train.txt", "test.txt", "dev.txt"]:
    print(file)
-    max_len = 0


Lines 31-31 refactored with the following changes:

Hoist statements out of for/while loops (hoist-statement-from-loop)
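Hoisting an initialisation out of a loop only preserves behaviour when nothing in the loop body reassigns the variable. The loop body of `preprocess_data.py` is not shown here, so the sketch below is an assumption-labelled illustration of the difference, with hypothetical function names and sample data.

```python
def per_file_max(files):
    """Reset the maximum for each file (assignment inside the loop)."""
    result = {}
    for name, lengths in files.items():
        max_len = 0
        for n in lengths:
            max_len = max(max_len, n)
        result[name] = max_len
    return result


def running_max(files):
    """Initialise once (assignment hoisted); the maximum carries over between files."""
    result = {}
    max_len = 0
    for name, lengths in files.items():
        for n in lengths:
            max_len = max(max_len, n)
        result[name] = max_len
    return result


# Hypothetical sentence lengths per file.
sample = {"train.txt": [5, 9, 3], "test.txt": [4, 2], "dev.txt": [7]}
```

If `max_len` is updated inside the loop in the real script, the hoisted version reports a running maximum across files and is worth reviewing before merging.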

sourcery-ai · 2022-08-21T17:03:51Z

examples/ner/run_ner_pl.py

-        if label not in self.label2index:
-            index = self.vocab_size
-            self.label2index[label] = index
-            self.index2label[index] = label
-            self.vocab_size += 1
-            return index
-        else:
+        if label in self.label2index:
            return self.label2index[label]
+        index = self.vocab_size
+        self.label2index[label] = index
+        self.index2label[index] = label
+        self.vocab_size += 1
+        return index


Function LabelEncoder.encode refactored with the following changes:

Swap if/else branches (swap-if-else-branches)

Remove unnecessary else after guard condition (remove-unnecessary-else)
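A self-contained sketch of the refactored `encode` method. Only the method body follows the diff; the constructor is an assumption made so the example can run.

```python
class LabelEncoder:
    """Minimal sketch; the constructor is assumed, only encode() follows the diff."""

    def __init__(self):
        self.label2index = {}
        self.index2label = {}
        self.vocab_size = 0

    def encode(self, label):
        # Guard clause: known labels return immediately.
        if label in self.label2index:
            return self.label2index[label]
        # New label: assign the next free index.
        index = self.vocab_size
        self.label2index[label] = index
        self.index2label[index] = label
        self.vocab_size += 1
        return index


encoder = LabelEncoder()
ids = [encoder.encode(tag) for tag in ["O", "B-PER", "O", "I-PER"]]
```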

sourcery-ai · 2022-08-21T17:03:52Z

examples/ner/run_ner_pl.py

-        optimizer = AdamW(self.parameters(), lr=2e-5)
-        return optimizer
+        return AdamW(self.parameters(), lr=2e-5)


Function BertForTokenClassification.configure_optimizers refactored with the following changes:

Inline variable that is immediately returned (inline-immediately-returned-variable)

sourcery-ai · 2022-08-21T17:03:52Z

examples/ner/tasks.py

-                output_line = line.split()[0] + " " + preds_list[example_id].pop(0) + "\n"
+                output_line = f"{line.split()[0]} {preds_list[example_id].pop(0)}" + "\n"


Function NER.write_predictions_to_file refactored with the following changes:

Use f-string instead of string concatenation [×2] (use-fstring-for-concatenation)
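The refactored line still concatenates `"\n"` after the f-string. A possible follow-up, sketched here with hypothetical helper names and sample data, is to move the newline inside the f-string.

```python
def format_prediction(line, preds):
    """Return '<first token> <next prediction>' followed by a newline."""
    token = line.split()[0]
    label = preds.pop(0)  # consumes the next prediction, as in the original code
    return f"{token} {label}\n"


def format_prediction_concat(line, preds):
    """Original concatenation form, kept for comparison."""
    return line.split()[0] + " " + preds.pop(0) + "\n"
```

Both helpers mutate `preds` through `pop(0)`, so each call needs its own list when comparing them.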

sourcery-ai · 2022-08-21T17:03:52Z

examples/ner/tasks.py

-        if path:
-            with open(path, "r") as f:
-                labels = f.read().splitlines()
-            if "O" not in labels:
-                labels = ["O"] + labels
-            return labels
-        else:
+        if not path:
            return ["O", "B-MISC", "I-MISC", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]
+        with open(path, "r") as f:
+            labels = f.read().splitlines()
+        if "O" not in labels:
+            labels = ["O"] + labels
+        return labels


Function NER.get_labels refactored with the following changes:

Swap if/else branches (swap-if-else-branches)

Remove unnecessary else after guard condition (remove-unnecessary-else)

sourcery-ai · 2022-08-21T17:03:52Z

examples/ner/tasks.py

-        if path:
-            with open(path, "r") as f:
-                labels = f.read().splitlines()
-            if "O" not in labels:
-                labels = ["O"] + labels
-            return labels
-        else:
+        if not path:


Function Chunk.get_labels refactored with the following changes:

Swap if/else branches (swap-if-else-branches)

Remove unnecessary else after guard condition (remove-unnecessary-else)

sourcery-ai · 2022-08-21T17:03:53Z

examples/sentiment/train_bert.py

-        optimizer = AdamW(self.parameters(), lr=2e-5)
-        return optimizer
+        return AdamW(self.parameters(), lr=2e-5)


Function BertForMultilabelClassification.configure_optimizers refactored with the following changes:

Inline variable that is immediately returned (inline-immediately-returned-variable)

sourcery-ai · 2022-08-21T17:03:54Z

examples/sentiment/train_gpt2.py

-        loss = 0
        gpt2_outputs = self.gpt2(input_ids)
        hidden_states = gpt2_outputs[0].squeeze()
        logits = self.logit(self.linear(hidden_states))
        batch_size, sequence_length = input_ids.shape[:2]
        logits = logits[range(batch_size), sequence_length]
-        if labels is not None:
-            loss = self.criterion(logits, labels)
+        loss = self.criterion(logits, labels) if labels is not None else 0


Function GPT2TextClassification.forward refactored with the following changes:

Replace if statement with if expression (assign-if-exp)

Move assignment closer to its usage within a block (move-assign-in-block)

Move setting of default value for variable into else branch (introduce-default-else)

sourcery-ai · 2022-08-21T17:03:54Z

examples/sentiment/train_gpt2.py

-        optimizer = SGD(self.parameters(), lr=1e-6)
-        return optimizer
+        return SGD(self.parameters(), lr=1e-6)


Function GPT2TextClassification.configure_optimizers refactored with the following changes:

Inline variable that is immediately returned (inline-immediately-returned-variable)

sourcery-ai · 2022-08-21T17:03:54Z

examples/text_normalize/compare_tool.py

-        for i, line in enumerate(f):
+        for line in f:
            word, freq = line.split("\t\t")
            other_words = Normalizer.normalize(word)
            uts_words = text_normalize(word)
            if word != "nghiêng" and len(word) > 6:
                continue
-            if other_words != word and other_words != uts_words:
+            if other_words not in [word, uts_words]:


Function compare_two_tools refactored with the following changes:

Remove unnecessary calls to enumerate when the index is not used (remove-unused-enumerate)

Replace multiple comparisons of same variable with in operator (merge-comparisons)
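To check that the membership test matches the chained comparison for string values, here is a small sketch with hypothetical function names.

```python
def differs_chained(other, word, uts):
    # Original form: two explicit inequality checks.
    return other != word and other != uts


def differs_membership(other, word, uts):
    # Membership on a list compares elements by identity, then equality,
    # which gives the same answer as the chained form for strings.
    return other not in [word, uts]


cases = [("a", "a", "b"), ("b", "a", "b"), ("c", "a", "b"), ("a", "a", "a")]
agree = all(differs_chained(*c) == differs_membership(*c) for c in cases)
```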

sourcery-ai · 2022-08-21T17:03:54Z

examples/text_normalize/evaluation.py

-    new_s = "\n".join(result)
-    return new_s
+    return "\n".join(result)


Function predict_sentence refactored with the following changes:

Inline variable that is immediately returned (inline-immediately-returned-variable)

sourcery-ai · 2022-08-21T17:03:57Z

examples/text_normalize/normalize.py

-for key in syllable_map_r:
-    items = syllable_map_r[key]
+for key, items in syllable_map_r.items():
    for item in items:
        syllable_map[item] = key
-NONE_DIACRITIC_SINGLE_VOWELS = set(["a", "e", "i", "o", "u", "y"])
-NONE_DIACRITIC_DOUBLE_VOWELS = set([
-    "ai", "ao", "au", "ay",
-    "eo", "eu", "ia", "ie", "iu", "oa", "oe", "oi", "oo",
-    "ua", "ue", "ui", "uo", "uu", "uy", "ye"
-])
-NONE_DIACRITIC_TRIPLE_VOWELS = set([
-    "iai", "ieu", "iua", "oai", "oao", "oay", "oeo",
-    "uao", "uai", "uay", "uoi", "uou", "uya", "uye", "uyu",
-    "yeu"
-])
+NONE_DIACRITIC_SINGLE_VOWELS = {"a", "e", "i", "o", "u", "y"}
+NONE_DIACRITIC_DOUBLE_VOWELS = {
+    "ai",
+    "ao",
+    "au",
+    "ay",
+    "eo",
+    "eu",
+    "ia",
+    "ie",
+    "iu",
+    "oa",
+    "oe",
+    "oi",
+    "oo",
+    "ua",
+    "ue",
+    "ui",
+    "uo",
+    "uu",
+    "uy",
+    "ye",
+}
+
+NONE_DIACRITIC_TRIPLE_VOWELS = {
+    "iai",
+    "ieu",
+    "iua",
+    "oai",
+    "oao",
+    "oay",
+    "oeo",
+    "uao",
+    "uai",
+    "uay",
+    "uoi",
+    "uou",
+    "uya",
+    "uye",
+    "uyu",
+    "yeu",
+}
+


Lines 81-95 refactored with the following changes:

Use items() to directly unpack dictionary values (use-dict-items)

Unwrap a constant iterable constructor [×3] (unwrap-iterable-construction)
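A compact sketch of the two changes, using a hypothetical reverse map rather than the real `syllable_map_r` data.

```python
# Hypothetical reverse map: canonical form -> list of variants.
syllable_map_r = {"hoa": ["hao", "hoaa"], "thuy": ["thyu"]}

# items() yields each key together with its value, so no second lookup is needed.
syllable_map = {}
for key, items in syllable_map_r.items():
    for item in items:
        syllable_map[item] = key

# A set literal builds the same set as set([...]) without the intermediate list.
single_vowels = {"a", "e", "i", "o", "u", "y"}
```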

sourcery-ai · 2022-08-21T17:03:58Z

examples/text_normalize/normalize.py

-        if group in NONE_DIACRITIC_VOWELS:
-            miss_spell = False
-        else:
-            miss_spell = True
+        miss_spell = group not in NONE_DIACRITIC_VOWELS


Function AnalysableWord.__init__ refactored with the following changes:

Simplify boolean if expression (boolean-if-exp-identity)

Remove unnecessary casts to int, str, float or bool (remove-unnecessary-cast)

Replace if statement with if expression (assign-if-exp)

sourcery-ai · 2022-08-21T17:03:58Z

examples/text_normalize/tools/nvh.py

-    dic = {}
    char1252 = 'à|á|ả|ã|ạ|ầ|ấ|ẩ|ẫ|ậ|ằ|ắ|ẳ|ẵ|ặ|è|é|ẻ|ẽ|ẹ|ề|ế|ể|ễ|ệ|ì|í|ỉ|ĩ|ị|ò|ó|ỏ|õ|ọ|ồ|ố|ổ|ỗ|ộ|ờ|ớ|ở|ỡ|ợ|ù|ú|ủ|ũ|ụ|ừ|ứ|ử|ữ|ự|ỳ|ý|ỷ|ỹ|ỵ|À|Á|Ả|Ã|Ạ|Ầ|Ấ|Ẩ|Ẫ|Ậ|Ằ|Ắ|Ẳ|Ẵ|Ặ|È|É|Ẻ|Ẽ|Ẹ|Ề|Ế|Ể|Ễ|Ệ|Ì|Í|Ỉ|Ĩ|Ị|Ò|Ó|Ỏ|Õ|Ọ|Ồ|Ố|Ổ|Ỗ|Ộ|Ờ|Ớ|Ở|Ỡ|Ợ|Ù|Ú|Ủ|Ũ|Ụ|Ừ|Ứ|Ử|Ữ|Ự|Ỳ|Ý|Ỷ|Ỹ|Ỵ'.split(
        '|')
    charutf8 = "à|á|ả|ã|ạ|ầ|ấ|ẩ|ẫ|ậ|ằ|ắ|ẳ|ẵ|ặ|è|é|ẻ|ẽ|ẹ|ề|ế|ể|ễ|ệ|ì|í|ỉ|ĩ|ị|ò|ó|ỏ|õ|ọ|ồ|ố|ổ|ỗ|ộ|ờ|ớ|ở|ỡ|ợ|ù|ú|ủ|ũ|ụ|ừ|ứ|ử|ữ|ự|ỳ|ý|ỷ|ỹ|ỵ|À|Á|Ả|Ã|Ạ|Ầ|Ấ|Ẩ|Ẫ|Ậ|Ằ|Ắ|Ẳ|Ẵ|Ặ|È|É|Ẻ|Ẽ|Ẹ|Ề|Ế|Ể|Ễ|Ệ|Ì|Í|Ỉ|Ĩ|Ị|Ò|Ó|Ỏ|Õ|Ọ|Ồ|Ố|Ổ|Ỗ|Ộ|Ờ|Ớ|Ở|Ỡ|Ợ|Ù|Ú|Ủ|Ũ|Ụ|Ừ|Ứ|Ử|Ữ|Ự|Ỳ|Ý|Ỷ|Ỹ|Ỵ".split(
        '|')
-    for i in range(len(char1252)):
-        dic[char1252[i]] = charutf8[i]
-    return dic
+    return {char1252[i]: charutf8[i] for i in range(len(char1252))}


Function loaddicchar refactored with the following changes:

Move assignment closer to its usage within a block (move-assign-in-block)

Inline variable that is immediately returned (inline-immediately-returned-variable)

Convert for loop into dictionary comprehension (dict-comprehension)
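An alternative to indexing both lists, sketched here as a suggestion rather than as part of the Sourcery change, is to pair the elements with `zip`. `build_char_map` is a hypothetical helper name.

```python
def build_char_map(sources, targets):
    """Map each source string to the target at the same position."""
    # Matches the index-based comprehension when both lists have equal length.
    # zip stops at the shorter list, whereas indexing raises IndexError
    # if targets is shorter than sources.
    return dict(zip(sources, targets))


sources = "a|b|c".split("|")
targets = "x|y|z".split("|")
by_index = {sources[i]: targets[i] for i in range(len(sources))}
by_zip = build_char_map(sources, targets)
```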

sourcery-ai · 2022-08-21T17:03:58Z

examples/text_normalize/tools/nvh.py

-        if x == 4 or x == 8:  # ê, ơ
+        if x in [4, 8]:  # ê, ơ


Function chuan_hoa_dau_tu_tieng_viet refactored with the following changes:

Replace multiple comparisons of same variable with in operator (merge-comparisons)

sourcery-ai · 2022-08-21T17:03:58Z

examples/text_normalize/tools/nvh.py

-            if nguyen_am_index == -1:
+            if nguyen_am_index == -1 or index - nguyen_am_index == 1:
                nguyen_am_index = index
            else:
-                if index - nguyen_am_index != 1:
-                    return False
-                nguyen_am_index = index
+                return False


Function is_valid_vietnam_word refactored with the following changes:

Merge nested if conditions (merge-nested-ifs)

Lift code into else after jump in control flow (reintroduce-else)

Hoist nested repeated code outside conditional statements [×2] (hoist-similar-statement-from-if)

Swap positions of nested conditionals [×2] (swap-nested-ifs)

Swap if/else to remove empty if body (remove-pass-body)

Hoist repeated code outside conditional statement (hoist-statement-from-if)

Swap if/else branches (swap-if-else-branches)
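Because this change rewrites control flow, it is worth checking that both forms agree. The sketch below wraps each version of the branch in an assumed loop over characters; the surrounding loop, the simplified vowel set, and the function names are hypothetical and not taken from `nvh.py`.

```python
VOWELS = set("aeiouy")  # simplified, hypothetical vowel set


def contiguous_vowels_before(word):
    nguyen_am_index = -1
    for index, char in enumerate(word):
        if char in VOWELS:
            if nguyen_am_index == -1:
                nguyen_am_index = index
            else:
                if index - nguyen_am_index != 1:
                    return False
                nguyen_am_index = index
    return True


def contiguous_vowels_after(word):
    nguyen_am_index = -1
    for index, char in enumerate(word):
        if char in VOWELS:
            # Merged condition: first vowel seen, or directly after the previous vowel.
            if nguyen_am_index == -1 or index - nguyen_am_index == 1:
                nguyen_am_index = index
            else:
                return False
    return True


words = ["toan", "nguyen", "banana", "xyz", "a", ""]
results = [(contiguous_vowels_before(w), contiguous_vowels_after(w)) for w in words]
```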

sourcery-ai · 2022-08-21T17:04:01Z

underthesea/file_utils.py

-    if response.status_code not in [200, 302]:
-        if "www.dropbox.com" in url:
-            # dropbox return code 301, so we ignore this error
-            pass
-        else:
-            raise IOError("HEAD request failed for url {}".format(url))
+    if response.status_code not in [200, 302] and "www.dropbox.com" not in url:
+        raise IOError(f"HEAD request failed for url {url}")


Function get_from_cache refactored with the following changes:

Merge nested if conditions (merge-nested-ifs)

Swap if/else to remove empty if body (remove-pass-body)

Replace call to format with f-string (use-fstring-for-formatting)

This removes the following comment:

# dropbox return code 301, so we ignore this error
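The merged condition can be checked against the nested form with a few cases. In this sketch the function names and URLs are hypothetical, and the removed comment is kept next to the condition, which may be worth doing in the real code as well.

```python
def should_raise_nested(status_code, url):
    # Original shape of the check.
    if status_code not in [200, 302]:
        if "www.dropbox.com" in url:
            # Dropbox returns code 301, so this error is ignored.
            pass
        else:
            return True
    return False


def should_raise_merged(status_code, url):
    # Dropbox returns code 301, so failures for Dropbox URLs are ignored.
    return status_code not in [200, 302] and "www.dropbox.com" not in url


cases = [
    (200, "https://example.com/f"),
    (301, "https://www.dropbox.com/f"),
    (301, "https://example.com/f"),
    (404, "https://www.dropbox.com/f"),
]
agree = all(should_raise_nested(*c) == should_raise_merged(*c) for c in cases)
```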

sourcery-ai · 2022-08-21T17:04:01Z

underthesea/file_utils.py

-        if use_slower_interval:
-            Tqdm.default_mininterval = 10.0
-        else:
-            Tqdm.default_mininterval = 0.1
+        Tqdm.default_mininterval = 10.0 if use_slower_interval else 0.1


Function Tqdm.set_slower_interval refactored with the following changes:

Replace if statement with if expression (assign-if-exp)

sourcery-ai · 2022-08-21T17:04:01Z

underthesea/model_fetcher.py

-            if not all:
-                if license == "Close":
-                    continue
+            if not all and license == "Close":
+                continue


Function ModelFetcher.list refactored with the following changes:

Merge nested if conditions (merge-nested-ifs)

sourcery-ai · 2022-08-21T17:04:02Z

underthesea/corpus/data.py

-        if 0.0 <= score <= 1.0:
-            self._score = score
-        else:
-            self._score = 1.0
+        self._score = score if 0.0 <= score <= 1.0 else 1.0


Function Label.score refactored with the following changes:

Replace if statement with if expression (assign-if-exp)

sourcery-ai · 2022-08-21T17:04:02Z

underthesea/corpus/data.py

-        return "{} ({})".format(self._value, self._score)
+        return f"{self._value} ({self._score})"


Function Label.__str__ refactored with the following changes:

Replace call to format with f-string (use-fstring-for-formatting)
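Putting the two `Label` changes together gives roughly the class below. The constructor and property layout are assumptions; only the setter body and `__str__` follow the diffs.

```python
class Label:
    """Minimal sketch; constructor and property layout are assumptions."""

    def __init__(self, value, score=1.0):
        self._value = value
        self.score = score  # goes through the setter below

    @property
    def score(self):
        return self._score

    @score.setter
    def score(self, score):
        # Out-of-range scores are replaced by 1.0, not clamped to the nearest bound.
        self._score = score if 0.0 <= score <= 1.0 else 1.0

    def __str__(self):
        return f"{self._value} ({self._score})"
```

Note that a negative score becomes 1.0 rather than 0.0; the refactor keeps that behaviour unchanged.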

sourcery-ai · 2022-08-21T17:04:35Z

Sourcery Code Quality Report

✅ Merging this PR will increase code quality in the affected files by 0.13%.

Quality metrics	Before	After	Change
Complexity	4.96 ⭐	4.66 ⭐	-0.30 👍
Method Length	52.44 ⭐	51.92 ⭐	-0.52 👍
Working memory	7.15 🙂	7.21 🙂	0.06 👎
Quality	72.66% 🙂	72.79% 🙂	0.13% 👍

Other metrics	Before	After	Change
Lines	6961	6972	11

Changed files	Quality Before	Quality After	Quality Change
travis_pypi_setup.py	90.30% ⭐	89.72% ⭐	-0.58% 👎
apps/col_data.py	87.31% ⭐	87.29% ⭐	-0.02% 👎
apps/col_dictionary.py	71.13% 🙂	70.08% 🙂	-1.05% 👎
apps/col_dictionary_export_from_elasticsearch.py	75.83% ⭐	76.34% ⭐	0.51% 👍
apps/col_dictionary_import_to_elasticsearch.py	76.90% ⭐	76.75% ⭐	-0.15% 👎
apps/col_streamlit.py	76.45% ⭐	75.48% ⭐	-0.97% 👎
datasets/DI_Vietnamese-UVD/scripts/correct_v1.0.alpha.py	76.93% ⭐	78.31% ⭐	1.38% 👍
datasets/DI_Vietnamese-UVD/scripts/data.py	88.72% ⭐	90.53% ⭐	1.81% 👍
datasets/DI_Vietnamese-UVD/scripts/underthesea_v170_dictionary.py	46.35% 😞	46.47% 😞	0.12% 👍
datasets/DI_Vietnamese-UVD/scripts/validate.py	83.49% ⭐	83.38% ⭐	-0.11% 👎
datasets/DI_Vietnamese-UVD/scripts/validate_with_corpus.py	73.48% 🙂	72.77% 🙂	-0.71% 👎
datasets/UD_Vietnamese-COL/scripts/make_corpus.py	75.70% ⭐	76.28% ⭐	0.58% 👍
examples/chatbot/baggage_claim/run_tests.py	67.78% 🙂	66.93% 🙂	-0.85% 👎
examples/ner/preprocess_data.py	78.05% ⭐	78.05% ⭐	0.00%
examples/ner/run_ner_pl.py	84.80% ⭐	84.78% ⭐	-0.02% 👎
examples/ner/tasks.py	64.96% 🙂	65.56% 🙂	0.60% 👍
examples/ner/utils_ner.py	39.26% 😞	39.28% 😞	0.02% 👍
examples/sentiment/data.py	81.11% ⭐	81.99% ⭐	0.88% 👍
examples/sentiment/train_bert.py	78.89% ⭐	78.31% ⭐	-0.58% 👎
examples/sentiment/train_gpt2.py	80.38% ⭐	80.72% ⭐	0.34% 👍
examples/text_normalize/compare_tool.py	61.47% 🙂	62.66% 🙂	1.19% 👍
examples/text_normalize/evaluation.py	63.10% 🙂	63.44% 🙂	0.34% 👍
examples/text_normalize/normalize.py	73.98% 🙂	74.04% 🙂	0.06% 👍
examples/text_normalize/tools/nvh.py	52.37% 🙂	51.58% 🙂	-0.79% 👎
examples/text_normalize/tools/vtm.py	96.34% ⭐	99.50% ⭐	3.16% 👍
examples/text_normalize/tools/vtt.py	96.34% ⭐	99.50% ⭐	3.16% 👍
extensions/underthesea_core/lab_underthesea_core.py	83.50% ⭐	83.80% ⭐	0.30% 👍
tests/featurizers/benchmark_featurizers.py	49.77% 😞	51.35% 🙂	1.58% 👍
underthesea/__init__.py	73.88% 🙂	73.71% 🙂	-0.17% 👎
underthesea/data_fetcher.py	70.00% 🙂	70.32% 🙂	0.32% 👍
underthesea/file_utils.py	64.65% 🙂	65.19% 🙂	0.54% 👍
underthesea/model_fetcher.py	51.35% 🙂	51.53% 🙂	0.18% 👍
underthesea/corpus/data.py	93.81% ⭐	93.68% ⭐	-0.13% 👎
underthesea/corpus/tagged_corpus.py	86.97% ⭐	87.08% ⭐	0.11% 👍
underthesea/corpus/util.py	70.59% 🙂	69.83% 🙂	-0.76% 👎
underthesea/corpus/validate_corpus.py	63.49% 🙂	63.51% 🙂	0.02% 👍
underthesea/corpus/word_tokenize_corpus.py	84.56% ⭐	84.55% ⭐	-0.01% 👎
underthesea/corpus/ws_corpus.py	80.61% ⭐	80.36% ⭐	-0.25% 👎
underthesea/datasets/uit_absa_hotel/uit_absa_hotel.py	82.75% ⭐	82.85% ⭐	0.10% 👍
underthesea/datasets/uit_absa_restaurant/uit_absa_restaurant.py	82.75% ⭐	82.85% ⭐	0.10% 👍
underthesea/datasets/vlsp2013_wtk/revise_1.py	65.56% 🙂	68.63% 🙂	3.07% 👍
underthesea/datasets/vlsp2013_wtk/revise_2.py	68.07% 🙂	69.44% 🙂	1.37% 👍
underthesea/datasets/vlsp2013_wtk/revise_corpus.py	92.62% ⭐	92.66% ⭐	0.04% 👍
underthesea/dictionary/__init__.py	93.95% ⭐	92.15% ⭐	-1.80% 👎
underthesea/feature_engineering/feature.py	71.02% 🙂	74.33% 🙂	3.31% 👍
underthesea/models/crf_sequence_tagger.py	89.58% ⭐	89.66% ⭐	0.08% 👍
underthesea/models/text_classifier.py	58.04% 🙂	59.64% 🙂	1.60% 👍
underthesea/modules/base.py	71.24% 🙂	71.26% 🙂	0.02% 👍
underthesea/modules/embeddings.py	96.17% ⭐	96.63% ⭐	0.46% 👍
underthesea/pipeline/chunking/__init__.py	90.86% ⭐	91.22% ⭐	0.36% 👍
underthesea/pipeline/chunking/model_crf.py	88.06% ⭐	89.09% ⭐	1.03% 👍
underthesea/pipeline/chunking/tagged_feature.py	69.49% 🙂	72.65% 🙂	3.16% 👍
underthesea/pipeline/classification/text_features.py	89.33% ⭐	90.42% ⭐	1.09% 👍
underthesea/pipeline/ner/__init__.py	90.86% ⭐	91.22% ⭐	0.36% 👍
underthesea/pipeline/ner/model_crf.py	87.74% ⭐	88.80% ⭐	1.06% 👍
underthesea/pipeline/ner/tagged_feature.py	69.49% 🙂	72.65% 🙂	3.16% 👍
underthesea/pipeline/pos_tag/__init__.py	91.06% ⭐	91.55% ⭐	0.49% 👍
underthesea/pipeline/pos_tag/tagged_feature.py	69.49% 🙂	72.65% 🙂	3.16% 👍
underthesea/pipeline/sent_tokenize/__init__.py	79.91% ⭐	79.76% ⭐	-0.15% 👎
underthesea/pipeline/sentiment/bank/text_features.py	89.33% ⭐	90.42% ⭐	1.09% 👍
underthesea/pipeline/sentiment/general/__init__.py	73.70% 🙂	74.19% 🙂	0.49% 👍
underthesea/pipeline/sentiment/general/text_features.py	88.77% ⭐	89.52% ⭐	0.75% 👍
underthesea/pipeline/text_normalize/__init__.py	86.94% ⭐	86.19% ⭐	-0.75% 👎
underthesea/pipeline/text_normalize/token_normalize.py	90.04% ⭐	90.20% ⭐	0.16% 👍
underthesea/pipeline/word_tokenize/__init__.py	62.76% 🙂	62.88% 🙂	0.12% 👍
underthesea/pipeline/word_tokenize/model.py	86.68% ⭐	86.63% ⭐	-0.05% 👎
underthesea/pipeline/word_tokenize/nightly.py	79.78% ⭐	79.47% ⭐	-0.31% 👎
underthesea/pipeline/word_tokenize/regex_tokenize.py	50.40% 🙂	49.74% 😞	-0.66% 👎
underthesea/transformer/number.py	87.07% ⭐	90.55% ⭐	3.48% 👍
underthesea/transformer/tagged.py	51.12% 🙂	52.42% 🙂	1.30% 👍
underthesea/transformer/tagged_feature.py	91.02% ⭐	93.21% ⭐	2.19% 👍
underthesea/transformer/word_vector.py	81.69% ⭐	81.91% ⭐	0.22% 👍
underthesea/transforms/conll.py	68.75% 🙂	68.97% 🙂	0.22% 👍
underthesea/util/__init__.py	73.07% 🙂	70.05% 🙂	-3.02% 👎
underthesea/utils/__init__.py	51.71% 🙂	56.14% 🙂	4.43% 👍
underthesea/utils/col_analyzer.py	83.73% ⭐	83.92% ⭐	0.19% 👍
underthesea/utils/col_dictionary.py	81.89% ⭐	81.80% ⭐	-0.09% 👎
underthesea/utils/col_external_dictionary.py	79.30% ⭐	78.89% ⭐	-0.41% 👎
underthesea/utils/col_lyrics.py	66.17% 🙂	65.82% 🙂	-0.35% 👎
underthesea/utils/col_script.py	85.25% ⭐	85.17% ⭐	-0.08% 👎
underthesea/utils/col_sketchengine.py	75.53% ⭐	75.47% ⭐	-0.06% 👎
underthesea/utils/col_stopwords.py	82.73% ⭐	82.78% ⭐	0.05% 👍
underthesea/utils/col_wiki_clean.py	83.81% ⭐	82.35% ⭐	-1.46% 👎
underthesea/utils/col_wiki_ud.py	82.60% ⭐	81.22% ⭐	-1.38% 👎
underthesea/utils/sp_config.py	86.98% ⭐	86.74% ⭐	-0.24% 👎
underthesea/utils/sp_data.py	75.83% ⭐	76.08% ⭐	0.25% 👍
underthesea/utils/sp_embedding.py	88.11% ⭐	88.95% ⭐	0.84% 👍
underthesea/utils/sp_field.py	72.77% 🙂	73.23% 🙂	0.46% 👍
underthesea/utils/sp_vocab.py	87.61% ⭐	87.26% ⭐	-0.35% 👎

Here are some functions in these files that still need a tune-up:

File	Function	Complexity	Length	Working Memory	Quality	Recommendation
examples/ner/utils_ner.py	TokenClassificationTask.convert_examples_to_features	30 😞	477 ⛔	24 ⛔	12.31% ⛔	Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions
examples/text_normalize/tools/nvh.py	chuan_hoa_dau_tu_tieng_viet	44 ⛔	339 ⛔	16 ⛔	14.24% ⛔	Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions
underthesea/corpus/validate_corpus.py	validate_token	17 🙂	280 ⛔	14 😞	31.65% 😞	Try splitting into smaller methods. Extract out complex expressions
underthesea/transformer/tagged.py	TaggedTransformer.word2features	24 😞	179 😞	15 😞	31.68% 😞	Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions
underthesea/modules/base.py	BiLSTM.forward	5 ⭐	302 ⛔	17 ⛔	38.20% 😞	Try splitting into smaller methods. Extract out complex expressions

Legend and Explanation

The emojis denote the absolute quality of the code:

⭐ excellent
🙂 good
😞 poor
⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.

Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!

Help us improve this quality report!

* Open file with encoding='utf-8'

undertheseanlpGH-560: Add encoding='utf-8' to fix "UnicodeDecodeError"

'Refactored by Sourcery'

262d2c2

sourcery-ai bot requested a review from hungpham3112 August 21, 2022 17:03

sourcery-ai bot commented Aug 21, 2022


hungpham3112 added 2 commits August 22, 2022 17:05

undertheseanlpGH-560: Add encoding='utf-8' to fix "UnicodeDecodeError"

9ab07c3

* Open file with encoding='utf-8'

Merge pull request #2 from undertheseanlp/master

8684227

undertheseanlpGH-560: Add encoding='utf-8' to fix "UnicodeDecodeError"
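The commits above pass `encoding='utf-8'` when opening files. A minimal sketch of why this matters for Vietnamese text follows; the file path and sample string are hypothetical, and the ASCII decode only stands in for a mismatched platform default.

```python
import os
import tempfile

sample = "Xin chào Việt Nam\n"
path = os.path.join(tempfile.mkdtemp(), "sample.txt")  # hypothetical file

# Write and read with an explicit encoding so the result does not depend
# on the platform's default text encoding.
with open(path, "w", encoding="utf-8") as f:
    f.write(sample)

with open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()

# Decoding the same bytes with a codec that cannot represent them fails.
with open(path, "rb") as f:
    raw = f.read()
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```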
