Moved files to language-specific folders

sebastianruder · sebastianruder · commit ae4990bc079a · 2018-10-24T21:35:35.000+01:00
diff --git a/README.md b/README.md
@@ -4,59 +4,56 @@
 
 ### English
 
-- [ASR](asr.md)
-- [CCG supertagging](ccg_supertagging.md)
-- [Chunking](chunking.md)
-- [Constituency parsing](constituency_parsing.md)
-- [Coreference resolution](coreference_resolution.md)
-- [Dependency parsing](dependency_parsing.md)
-- [Dialog](dialog.md)
-- [Domain adaptation](domain_adaptation.md)
-- [Entity Linking](entity_linking.md)
-- [Grammatical Error Correction](grammatical_error_correction.md)
-- [Information Extraction](information_extraction.md)
-- [Language modeling](language_modeling.md)
-- [Lexical Normalization](lexical_normalization.md)
-- [Machine translation](machine_translation.md)
-- [Multi-task learning](multi-task_learning.md)
-- [Multimodal](multimodal.md)
-- [Named entity recognition](named_entity_recognition.md)
-- [Natural language inference](natural_language_inference.md)
-- [Part-of-speech tagging](part-of-speech_tagging.md)
-- [Question answering](question_answering.md)
-- [Relation Prediction](relation_prediction.md)
-- [Relationship extraction](relationship_extraction.md)
-- [Semantic textual similarity](semantic_textual_similarity.md)
-- [Sentiment analysis](sentiment_analysis.md)
-- [Semantic parsing](semantic_parsing.md)
-- [Semantic role labeling](semantic_role_labeling.md)
-- [Stance detection](stance_detection.md)
-- [Summarization](summarization.md)
-- [Taxonomy learning](taxonomy_learning.md)
-- [Temporal Processing](temporal_processing.md)
-- [Text classification](text_classification.md)
-- [Word Sense Disambiguation](word_sense_disambiguation.md)
+- [ASR](english/asr.md)
+- [CCG supertagging](english/ccg_supertagging.md)
+- [Chunking](english/chunking.md)
+- [Constituency parsing](english/constituency_parsing.md)
+- [Coreference resolution](english/coreference_resolution.md)
+- [Dependency parsing](english/dependency_parsing.md)
+- [Dialog](english/dialog.md)
+- [Domain adaptation](english/domain_adaptation.md)
+- [Entity Linking](english/entity_linking.md)
+- [Grammatical Error Correction](english/grammatical_error_correction.md)
+- [Information Extraction](english/information_extraction.md)
+- [Language modeling](english/language_modeling.md)
+- [Lexical Normalization](english/lexical_normalization.md)
+- [Machine translation](english/machine_translation.md)
+- [Multi-task learning](english/multi-task_learning.md)
+- [Multimodal](english/multimodal.md)
+- [Named entity recognition](english/named_entity_recognition.md)
+- [Natural language inference](english/natural_language_inference.md)
+- [Part-of-speech tagging](english/part-of-speech_tagging.md)
+- [Question answering](english/question_answering.md)
+- [Relation Prediction](english/relation_prediction.md)
+- [Relationship extraction](english/relationship_extraction.md)
+- [Semantic textual similarity](english/semantic_textual_similarity.md)
+- [Sentiment analysis](english/sentiment_analysis.md)
+- [Semantic parsing](english/semantic_parsing.md)
+- [Semantic role labeling](english/semantic_role_labeling.md)
+- [Stance detection](english/stance_detection.md)
+- [Summarization](english/summarization.md)
+- [Taxonomy learning](english/taxonomy_learning.md)
+- [Temporal Processing](english/temporal_processing.md)
+- [Text classification](english/text_classification.md)
+- [Word Sense Disambiguation](english/word_sense_disambiguation.md)
 
 ### Korean
 
-- [Chunking](korean.md)
-- [Part-of-speech tagging](korean.md)
+- [Chunking](korean/korean.md)
+- [Part-of-speech tagging](korean/korean.md)
 
 ### Hindi
 
-- [Chunking](hindi.md)
-- [Machine Translation](hindi.md)
+- [Chunking](hindi/hindi.md)
+- [Machine Translation](hindi/hindi.md)
 
 ### Vietnamese
 
-- [Word segmentation](vietnamese.md)
-- [Part-of-speech tagging](vietnamese.md)
-- [Named entity recognition](vietnamese.md)
-- [Dependency parsing](vietnamese.md)
-- [Machine translation](vietnamese.md)
-
-
-
+- [Word segmentation](vietnamese/vietnamese.md)
+- [Part-of-speech tagging](vietnamese/vietnamese.md)
+- [Named entity recognition](vietnamese/vietnamese.md)
+- [Dependency parsing](vietnamese/vietnamese.md)
+- [Machine translation](vietnamese/vietnamese.md)
 
 This document aims to track the progress in Natural Language Processing (NLP) and give an overview
 of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets.
diff --git a/english/asr.md b/english/asr.md
diff --git a/english/ccg_supertagging.md b/english/ccg_supertagging.md
@@ -21,4 +21,4 @@ Performance is only calculated on the 425 most frequent labels. Models are evalu
 
 {% include chart.html results=site.data.ccg_supertagging score='accuracy' %}
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/chunking.md b/english/chunking.md
@@ -18,4 +18,4 @@ for testing. Models are evaluated based on F1.
 
 {% include chart.html results=site.data.chunking score='F1 score' %}
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/constituency_parsing.md b/english/constituency_parsing.md
@@ -33,4 +33,4 @@ For a comparison of single models trained only on WSJ, refer to [Kitaev and Klei
 
 {% include chart.html results=site.data.constituency_parsing score='F1 score' %}
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/coreference_resolution.md b/english/coreference_resolution.md
@@ -27,4 +27,4 @@ CoNLL-2012 evaluation scripts. The main evaluation metric is the average F1 of t
 | (Lee et al., 2017)+ELMo (Peters et al., 2018) | 70.4 | [Deep contextualized word representatIions](https://arxiv.org/abs/1802.05365) | |
 | Lee et al. (2017) | 67.2 | [End-to-end Neural Coreference Resolution](https://arxiv.org/abs/1707.07045) | |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/dependency_parsing.md b/english/dependency_parsing.md
@@ -51,4 +51,4 @@ accuracy').
   results=site.data.dependency_parsing.Unsupervised_Penn_Treebank
   scores='UAS' %}
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/dialog.md b/english/dialog.md
@@ -18,4 +18,4 @@ evaluated based on accuracy on both individual and joint slot tracking.
 
 {% include chart.html results=site.data.dialog score='Joint' %}
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/domain_adaptation.md b/english/domain_adaptation.md
@@ -16,4 +16,4 @@ metric is accuracy and scores are averaged across each domain.
    results=site.data.domain_adaptation
    scores='DVD,Books,Electronics,Kitchen,Average' %}
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/entity_linking.md b/english/entity_linking.md
@@ -83,7 +83,7 @@ Nevertheless, GERBIL is an excellent resource for standardising how EL systems a
 
 [Usbeck] Usbeck et al. GERBIL - General Entity Annotator Benchmarking Framework. WWW 2015. http://svn.aksw.org/papers/2015/WWW_GERBIL/public.pdf
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
 
 [Sil]: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16501/16101 "Neural Cross-Lingual Entity Linking"
 [Shen]: http://dbgroup.cs.tsinghua.edu.cn/wangjy/papers/TKDE14-entitylinking.pdf "Entity Linking with a Knowledge Base: Issues, Techniques, and Solutions"
diff --git a/english/grammatical_error_correction.md b/english/grammatical_error_correction.md
diff --git a/english/information_extraction.md b/english/information_extraction.md
@@ -20,4 +20,4 @@ Open Information Extraction approaches leads to creation of large Knowledge base
 | CESI (Vashishth et al., 2018) |     98.2      |     99.8     |  99.9  |     66.2      |       92.4        | 91.9   |     62.7      |    84.4    |  81.9  | [CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information](https://github.com/malllabiisc/cesi) |
 | Galárraga et al., 2014 ( IDF) |     94.8      |     97.9     |  98.3  |     67.9      |       82.9        | 79.3   |     71.6      |    50.8    |  0.5   | [Canonicalizing Open Knowledge Bases](https://suchanek.name/work/publications/cikm2014.pdf) |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/language_modeling.md b/english/language_modeling.md
@@ -66,4 +66,4 @@ The vocabulary of the words in the character-level dataset is limited to 10 000
   results=site.data.language_modeling.Char_Level.Penn_Treebank
   scores='Bits per Character (BPC),Number of params (M)' %}
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/lexical_normalization.md b/english/lexical_normalization.md
@@ -50,5 +50,5 @@ but chooses the wrong normalization, it is penalized twice.
 
 {% include table.html results=site.data.lexical_normalization_lexnorm2015 scores='F1' %}
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
 
diff --git a/english/machine_translation.md b/english/machine_translation.md
@@ -35,4 +35,4 @@ on BLEU.
 | ConvS2S (Gehring et al., 2017) | 40.46 | [Convolutional Sequence to Sequence Learning](https://arxiv.org/abs/1705.03122) | 
 | Transformer Base (Vaswani et al., 2017) | 38.1 | [Attention Is All You Need](https://arxiv.org/abs/1706.03762) |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/multi-task_learning.md b/english/multi-task_learning.md
@@ -12,4 +12,4 @@ average accuracy across all tasks.
 
 The state-of-the-art results can be seen on the public [GLUE leaderboard](https://gluebenchmark.com/leaderboard).
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/multimodal.md b/english/multimodal.md
@@ -45,4 +45,4 @@ The MOSI dataset ([Zadeh et al., 2016](https://arxiv.org/pdf/1606.06259.pdf)) is
 | bc-LSTM (Poria et al., 2017) | 80.3%  | [Context-Dependent Sentiment Analysis in User-Generated Videos](http://sentic.net/context-dependent-sentiment-analysis-in-user-generated-videos.pdf) |
 | MARN (Zadeh et al., 2018) | 77.1%  | [Multi-attention Recurrent Network for Human Communication Comprehension](https://arxiv.org/pdf/1802.00923.pdf) |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/named_entity_recognition.md b/english/named_entity_recognition.md
@@ -59,4 +59,4 @@ The [Ontonotes corpus v5](https://catalog.ldc.upenn.edu/docs/LDC2013T19/OntoNote
 
 
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/natural_language_inference.md b/english/natural_language_inference.md
@@ -49,4 +49,4 @@ corpus were used as premises. Models are evaluated based on accuracy.
 | Hierarchical BiLSTM Max Pooling (Talman et al., 2018) | 86.0 | [Natural Language Inference with Hierarchical BiLSTM Max Pooling](https://arxiv.org/abs/1808.08762)
 | CAFE (Tay et al., 2018) | 83.3 | [A Compare-Propagate Architecture with Alignment Factorization for Natural Language Inference](https://arxiv.org/abs/1801.00102) |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/part-of-speech_tagging.md b/english/part-of-speech_tagging.md
@@ -51,4 +51,4 @@ Models are typically evaluated based on the average test accuracy across 28 lang
 | Bi-LSTM (Plank et al., 2016) | 96.40 | [Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss](https://arxiv.org/abs/1604.05529) | 
 | Joint Bi-LSTM (Nguyen et al., 2017) | 95.55 | [A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing](https://arxiv.org/abs/1705.05952) |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/question_answering.md b/english/question_answering.md
@@ -182,4 +182,4 @@ Answer 0: the trophy. Answer 1: the suitcase
 The public leaderboard is available on the [RecipeQA website](https://hucvl.github.io/recipeqa/).
 
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/relation_prediction.md b/english/relation_prediction.md
@@ -40,4 +40,4 @@ The test set is composed of triplets, each used to create two test instances, on
    results=site.data.relation_prediction.WN18RR
    scores='H@10,H@1,MRR' %}
 
-[Back to README](README.md)
+[Back to README](../README.md)
diff --git a/english/relationship_extraction.md b/english/relationship_extraction.md
@@ -63,4 +63,4 @@ reported here are the highest achieved by the model using any external resources
 | UTD (Rink and Harabagiu, 2010)      | 82.2  | [UTD: Classifying Semantic Relations by Combining Lexical and Semantic Resources](http://www.aclweb.org/anthology/S10-1057) ||
 
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/semantic_parsing.md b/english/semantic_parsing.md
@@ -215,4 +215,4 @@ Example:
 | Iyer et al., (2017) | 10 | 4 | [Learning a neural semantic parser from user feedback](http://www.aclweb.org/anthology/P17-1089) | [System](https://github.com/sriniiyer/nl2sql) |
 | Template Baseline (Finegan-Dollak et al., 2018) | 0 | 0 | [Improving Text-to-SQL Evaluation Methodology](http://arxiv.org/abs/1806.09029) | [Data and System](https://github.com/jkkummerfeld/text2sql-data) |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/semantic_role_labeling.md b/english/semantic_role_labeling.md
@@ -22,4 +22,4 @@ Models are typically evaluated on the [OntoNotes benchmark](http://www.aclweb.or
 | He et al. (2018) | 82.1 | [Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling](http://aclweb.org/anthology/P18-2058) | 
 | He et al. (2017) | 81.7 | [Deep Semantic Role Labeling: What Works and What’s Next](http://aclweb.org/anthology/P17-1044) |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/semantic_textual_similarity.md b/english/semantic_textual_similarity.md
@@ -39,4 +39,4 @@ duplicate of the other. Models are evaluated based on accuracy.
 | BiMPM (Wang et al., 2017) | 88.17 | [Bilateral Multi-Perspective Matching for Natural Language Sentences](https://arxiv.org/abs/1702.03814) | [Official](https://github.com/zhiguowang/BiMPM) |
 | GenSen (Subramanian et al., 2018) | 87.01 | [Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning](https://arxiv.org/abs/1804.00079) | [Official](https://github.com/Maluuba/gensen) |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/sentiment_analysis.md b/english/sentiment_analysis.md
@@ -125,4 +125,4 @@ A related task to sentiment analysis is the subjectivity analysis with the goal
 | USE (Cer et al., 2018) | 93.90 | [Universal Sentence Encoder](https://arxiv.org/pdf/1803.11175.pdf) |
 | Fast Dropout (Wang and Manning, 2013) | 93.60 | [Fast Dropout Training](http://proceedings.mlr.press/v28/wang13a.pdf) |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/stance_detection.md b/english/stance_detection.md
@@ -20,4 +20,4 @@ This dataset subsumes the large [PHEME collection of rumors and stance](http://j
 | Bahuleyan and Vechtomova 2017| 0.780 | [UWaterloo at SemEval-2017 Task 8: Detecting Stance towards Rumours with Topic Independent Features](http://www.aclweb.org/anthology/S/S17/S17-2080.pdf) |
 |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/summarization.md b/english/summarization.md
@@ -48,4 +48,4 @@ Models are evaluated using the following metrics:
 | LSTM (Filippova et al., 2015) | 0.82 | 0.38 | [Sentence Compression by Deletion with LSTMs](https://research.google.com/pubs/archive/43852.pdf) | |
 | BiLSTM (Wang et al., 2017) | 0.8 | 0.43 | [Can Syntax Help? Improving an LSTM-based Sentence Compression Model for New Domains](http://www.aclweb.org/anthology/P17-1127) |  |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/taxonomy_learning.md b/english/taxonomy_learning.md
diff --git a/english/temporal_processing.md b/english/temporal_processing.md
@@ -112,4 +112,4 @@ The [Parsing Time Normalizations corpus](https://github.com/bethard/anafora-anno
 | Chrono | 0.70 | [Chrono at SemEval-2018 task 6: A system for normalizing temporal expressions](http://aclweb.org/anthology/S18-1012) | 
 
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/text_classification.md b/english/text_classification.md
@@ -58,4 +58,4 @@ TREC-50:
 | Rules (Madabushi and Lee, 2016) | 2.8 |[High Accuracy Rule-based Question Classification using Question Syntax and Semantics](http://www.aclweb.org/anthology/C16-1116)| |
 | SVM (Van-Tu and Anh-Cuong, 2016) | 8.4 | [Improving Question Classification by Feature Extraction and Selection](https://www.researchgate.net/publication/303553351_Improving_Question_Classification_by_Feature_Extraction_and_Selection) | |
 
-[Go back to the README](README.md)
+[Go back to the README](../README.md)
diff --git a/english/word_sense_disambiguation.md b/english/word_sense_disambiguation.md
diff --git a/hindi/hindi.md b/hindi/hindi.md
diff --git a/korean/korean.md b/korean/korean.md
diff --git a/vietnamese/vietnamese.md b/vietnamese/vietnamese.md
