JohnSnowLabs
diff --git a/‎docs/_posts/Ahmetemintek/2025-02-14-sent_xlm_roberta_biolord_2023_m_xx.md‎
Lines changed: 102 additions & 0 deletions b/‎docs/_posts/Ahmetemintek/2025-02-14-sent_xlm_roberta_biolord_2023_m_xx.md‎
diff --git a/‎docs/_posts/DevinTDHa/2025-01-18-llava_v1.5_7b_Q4_0_gguf_en.md‎
Lines changed: 172 additions & 0 deletions b/‎docs/_posts/DevinTDHa/2025-01-18-llava_v1.5_7b_Q4_0_gguf_en.md‎
diff --git a/‎docs/_posts/ahmedlone127/2025-01-29-20230818214706_en.md‎
Lines changed: 86 additions & 0 deletions b/‎docs/_posts/ahmedlone127/2025-01-29-20230818214706_en.md‎
@@ -0,0 +1,102 @@
+---
+layout: model
+title: Multilingual BioLORD-2023-M XlmRoBertaSentenceEmbeddings from FremyCompany
+author: John Snow Labs
+name: sent_xlm_roberta_biolord_2023_m
+date: 2025-02-14
+tags: [multilingual, sentence_embeddings, xlm_roberta, open_source, xx, onnx]
+task: Embeddings
+language: xx
+edition: Spark NLP 5.5.2
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: XlmRoBertaSentenceEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained `XlmRoBertaSentenceEmbeddings` model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `sent_xlm_roberta_biolord_2023_m` is a multilingual model originally trained by FremyCompany. It supports English, Spanish, French, German, Dutch, Danish and Swedish.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/sent_xlm_roberta_biolord_2023_m_xx_5.5.2_3.0_1739548358592.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/sent_xlm_roberta_biolord_2023_m_xx_5.5.2_3.0_1739548358592.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
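+import sparknlp
+from pyspark.ml import Pipeline
+from sparknlp.annotator import XlmRoBertaSentenceEmbeddings
+from sparknlp.base import DocumentAssembler
+
+spark = sparknlp.start()
+
+documentAssembler = DocumentAssembler() \
+      .setInputCol("text") \
+      .setOutputCol("document")
+
+# Output column named to match the Results table below.
+embeddings = XlmRoBertaSentenceEmbeddings.pretrained("sent_xlm_roberta_biolord_2023_m","xx") \
+      .setInputCols(["document"]) \
+      .setOutputCol("sentence_embeddings")
+
+pipeline = Pipeline().setStages([documentAssembler, embeddings])
+
+data = spark.createDataFrame([["Disfruto trabajando con Spark-NLP."]]).toDF("text")
+pipelineModel = pipeline.fit(data)
+result = pipelineModel.transform(data)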
+```
+```scala
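+import com.johnsnowlabs.nlp.DocumentAssembler
+import com.johnsnowlabs.nlp.embeddings.XlmRoBertaSentenceEmbeddings
+import org.apache.spark.ml.Pipeline
+import spark.implicits._ // needed for .toDF; assumes an active SparkSession named `spark`
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("text")
+  .setOutputCol("document")
+
+// Output column named to match the Results table below.
+val embeddings = XlmRoBertaSentenceEmbeddings
+  .pretrained("sent_xlm_roberta_biolord_2023_m", "xx")
+  .setInputCols(Array("document"))
+  .setOutputCol("sentence_embeddings")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings))
+
+val data = Seq("Disfruto trabajando con Spark-NLP.").toDF("text")
+
+val pipelineModel = pipeline.fit(data)
+val result = pipelineModel.transform(data)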
+```
+</div>
+
+## Results
+
+```bash
++----------------------------------+----------------------------------------------------------------------+----------------------------------------------------------------------+
+|                              text|                                                              document|                                                   sentence_embeddings|
++----------------------------------+----------------------------------------------------------------------+----------------------------------------------------------------------+
+|Disfruto trabajando con Spark-NLP.|[{document, 0, 33, Disfruto trabajando con Spark-NLP., {sentence ->...|[{sentence_embeddings, 0, 33, Disfruto trabajando con Spark-NLP., {...|
++----------------------------------+----------------------------------------------------------------------+----------------------------------------------------------------------+
+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|sent_xlm_roberta_biolord_2023_m|
+|Compatibility:|Spark NLP 5.5.2+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document]|
+|Output Labels:|[xlm_sentence_embeddings]|
+|Language:|xx|
+|Size:|1.0 GB|
+
+## References
+
+https://huggingface.co/FremyCompany/BioLORD-2023-M
@@ -0,0 +1,172 @@
+---
+layout: model
+title: LLaVA v1.5 7B Q4 GGUF
+author: John Snow Labs
+name: llava_v1.5_7b_Q4_0_gguf
+date: 2025-01-18
+tags: [gguf, llamacpp, llava, en, quantized, open_source]
+task: Image Captioning
+language: en
+edition: Spark NLP 6.0.0
+spark_version: 3.0
+supported: true
+engine: llamacpp
+annotator: AutoGGUFVisionModel
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model, based on the transformer architecture.
+
+Originally from https://huggingface.co/Mozilla/llava-v1.5-7b-llamafile
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/llava_v1.5_7b_Q4_0_gguf_en_6.0.0_3.0_1737207768652.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/llava_v1.5_7b_Q4_0_gguf_en_6.0.0_3.0_1737207768652.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from sparknlp.base import *
+from sparknlp.annotator import *
+from pyspark.ml import Pipeline
+from pyspark.sql.functions import lit
+
+documentAssembler = DocumentAssembler() \
+    .setInputCol("caption") \
+    .setOutputCol("caption_document")
+imageAssembler = ImageAssembler() \
+    .setInputCol("image") \
+    .setOutputCol("image_assembler")
+
+imagesPath = "src/test/resources/image/"
+data = ImageAssembler \
+    .loadImagesAsBytes(spark, imagesPath) \
+    .withColumn("caption", lit("Caption this image.")) # Add a caption to each image.
+
+nPredict = 40
+model = AutoGGUFVisionModel.pretrained("llava_v1.5_7b_Q4_0_gguf", "en") \
+    .setInputCols(["caption_document", "image_assembler"]) \
+    .setOutputCol("completions") \
+    .setBatchSize(4) \
+    .setNGpuLayers(99) \
+    .setNCtx(4096) \
+    .setMinKeep(0) \
+    .setMinP(0.05) \
+    .setNPredict(nPredict) \
+    .setNProbs(0) \
+    .setPenalizeNl(False) \
+    .setRepeatLastN(256) \
+    .setRepeatPenalty(1.18) \
+    .setStopStrings(["</s>", "Llama:", "User:"]) \
+    .setTemperature(0.05) \
+    .setTfsZ(1) \
+    .setTypicalP(1) \
+    .setTopK(40) \
+    .setTopP(0.95)
+
+pipeline = Pipeline().setStages([documentAssembler, imageAssembler, model])
+pipeline.fit(data).transform(data) \
+    .selectExpr("reverse(split(image.origin, '/'))[0] as image_name", "completions.result") \
+    .show(truncate = False)
+
+
+```
+```scala
+import com.johnsnowlabs.nlp.ImageAssembler
+import com.johnsnowlabs.nlp.annotator._
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.util.io.ResourceHelper
+import org.apache.spark.ml.Pipeline
+import org.apache.spark.sql.DataFrame
+import org.apache.spark.sql.functions.lit
+
+val documentAssembler = new DocumentAssembler()
+  .setInputCol("caption")
+  .setOutputCol("caption_document")
+
+val imageAssembler = new ImageAssembler()
+  .setInputCol("image")
+  .setOutputCol("image_assembler")
+
+val imagesPath = "src/test/resources/image/"
+val data: DataFrame = ImageAssembler
+  .loadImagesAsBytes(ResourceHelper.spark, imagesPath)
+  .withColumn("caption", lit("Caption this image.")) // Add a caption to each image.
+
+val nPredict = 40
+val model = AutoGGUFVisionModel.pretrained("llava_v1.5_7b_Q4_0_gguf", "en")
+  .setInputCols("caption_document", "image_assembler")
+  .setOutputCol("completions")
+  .setBatchSize(4)
+  .setNGpuLayers(99)
+  .setNCtx(4096)
+  .setMinKeep(0)
+  .setMinP(0.05f)
+  .setNPredict(nPredict)
+  .setNProbs(0)
+  .setPenalizeNl(false)
+  .setRepeatLastN(256)
+  .setRepeatPenalty(1.18f)
+  .setStopStrings(Array("</s>", "Llama:", "User:"))
+  .setTemperature(0.05f)
+  .setTfsZ(1)
+  .setTypicalP(1)
+  .setTopK(40)
+  .setTopP(0.95f)
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, imageAssembler, model))
+pipeline
+  .fit(data)
+  .transform(data)
+  .selectExpr("reverse(split(image.origin, '/'))[0] as image_name", "completions.result")
+  .show(truncate = false)
+
+```
+</div>
+
+## Results
+
+```bash
++-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+|image_name       |result                                                                                                                                                                                        |
++-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+|palace.JPEG      |[ The image depicts a large, ornate room with high ceilings and beautifully decorated walls. There are several chairs placed throughout the space, some of which have cushions]               |
+|egyptian_cat.jpeg|[ The image features two cats lying on a pink surface, possibly a bed or sofa. One cat is positioned towards the left side of the scene and appears to be sleeping while holding]             |
+|hippopotamus.JPEG|[ A large brown hippo is swimming in a body of water, possibly an aquarium. The hippo appears to be enjoying its time in the water and seems relaxed as it floats]                            |
+|hen.JPEG         |[ The image features a large chicken standing next to several baby chickens. In total, there are five birds in the scene: one adult and four young ones. They appear to be gathered together] |
+|ostrich.JPEG     |[ The image features a large, long-necked bird standing in the grass. It appears to be an ostrich or similar species with its head held high and looking around. In addition to]              |
+|junco.JPEG       |[ A small bird with a black head and white chest is standing on the snow. It appears to be looking at something, possibly food or another animal in its vicinity. The scene takes place out]  |
+|bluetick.jpg     |[ A dog with a red collar is sitting on the floor, looking at something. The dog appears to be staring into the distance or focusing its attention on an object in front of it.]              |
+|chihuahua.jpg    |[ A small brown dog wearing a sweater is sitting on the floor. The dog appears to be looking at something, possibly its owner or another animal in the room. It seems comfortable and relaxed]|
+|tractor.JPEG     |[ A man is sitting in the driver's seat of a green tractor, which has yellow wheels and tires. The tractor appears to be parked on top of an empty field with]                                |
+|ox.JPEG          |[ A large bull with horns is standing in a grassy field.]                                                                                                                                     |
++-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+```
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|llava_v1.5_7b_Q4_0_gguf|
+|Compatibility:|Spark NLP 6.0.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[caption_document, image_assembler]|
+|Output Labels:|[completions]|
+|Language:|en|
+|Size:|4.2 GB|
@@ -0,0 +1,86 @@
+---
+layout: model
+title: English 20230818214706 BertForQuestionAnswering from dkqjrm
+author: John Snow Labs
+name: 20230818214706
+date: 2025-01-29
+tags: [en, open_source, onnx, question_answering, bert]
+task: Question Answering
+language: en
+edition: Spark NLP 5.5.1
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: BertForQuestionAnswering
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained BertForQuestionAnswering model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `20230818214706` is an English model originally trained by dkqjrm.
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/20230818214706_en_5.5.1_3.0_1738185378925.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/20230818214706_en_5.5.1_3.0_1738185378925.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+import sparknlp
+from pyspark.ml import Pipeline
+from sparknlp.annotator import BertForQuestionAnswering
+from sparknlp.base import MultiDocumentAssembler
+
+spark = sparknlp.start()
+
+documentAssembler = MultiDocumentAssembler() \
+     .setInputCols(["question", "context"]) \
+     .setOutputCols(["document_question", "document_context"])
+
+spanClassifier = BertForQuestionAnswering.pretrained("20230818214706","en") \
+     .setInputCols(["document_question","document_context"]) \
+     .setOutputCol("answer")
+
+pipeline = Pipeline().setStages([documentAssembler, spanClassifier])
+# The raw columns must carry the names the assembler reads: "question" and "context".
+data = spark.createDataFrame([["What framework do I use?","I use spark-nlp."]]).toDF("question", "context")
+pipelineModel = pipeline.fit(data)
+pipelineDF = pipelineModel.transform(data)
+
+```
+```scala
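+import com.johnsnowlabs.nlp.MultiDocumentAssembler
+import com.johnsnowlabs.nlp.annotators.classifier.dl.BertForQuestionAnswering
+import org.apache.spark.ml.Pipeline
+import spark.implicits._ // needed for .toDF; assumes an active SparkSession named `spark`
+
+val documentAssembler = new MultiDocumentAssembler()
+    .setInputCols(Array("question", "context"))
+    .setOutputCols(Array("document_question", "document_context"))
+
+val spanClassifier = BertForQuestionAnswering.pretrained("20230818214706", "en")
+    .setInputCols(Array("document_question","document_context"))
+    .setOutputCol("answer")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, spanClassifier))
+// One row with two columns, named to match the assembler's input columns.
+val data = Seq(("What framework do I use?", "I use spark-nlp.")).toDF("question", "context")
+val pipelineModel = pipeline.fit(data)
+val pipelineDF = pipelineModel.transform(data)
+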
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|20230818214706|
+|Compatibility:|Spark NLP 5.5.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document_question, document_context]|
+|Output Labels:|[answer]|
+|Language:|en|
+|Size:|1.2 GB|
+
+## References
+
+https://huggingface.co/dkqjrm/20230818214706