SuperAnnotate is the cornerstone of your data labeling pipeline. It brings you a cutting-edge annotation tool for all types of data including image, video, text, LiDAR, audio, and more.
This Python package provides a set of utilities for working with SuperAnnotate data on Databricks. It includes functionality to process SuperAnnotate data and save it to Delta tables.
Convert SuperAnnotate annotation data to Apache Spark™ DataFrames.
Project types supported:
- Vector
- Text
Copy the notebooks in the demo folder to your Databricks workspace to get started with SuperAnnotate quickly!
pip install superannotate_databricks_connector
Run the tests by building the image defined in Dockerfile.test:
docker build -f Dockerfile.test -t test_package .
If you are running the tests for the first time, you first have to build the base image from Dockerfile.spark, which contains PySpark:
docker build -f Dockerfile.spark -t spark_docker_base .
In the main directory, run the following to generate a .whl file.
python -m build
First, import the required function and the SuperAnnotate client:
from superannotate_databricks_connector.vector import get_vector_dataframe
from superannotate import SAClient
You can then convert your annotations to a Spark DataFrame. In a Databricks notebook, the spark variable is the predefined SparkSession.
sa = SAClient(token="<TOKEN>")
annotations = sa.get_annotations("<PROJECT_NAME>")
df = get_vector_dataframe(annotations, spark)
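The introduction mentions saving processed data to Delta tables, so here is a minimal sketch of that step using the standard PySpark DataFrameWriter API. The helper name save_to_delta and its parameters are hypothetical and not part of the connector package; the package may provide its own way of doing this.

```python
def save_to_delta(df, table_name, mode="overwrite"):
    """Write a Spark DataFrame to a Delta table.

    Hypothetical helper for illustration; not part of
    superannotate_databricks_connector.
    """
    # format(), mode() and saveAsTable() are standard
    # pyspark.sql.DataFrameWriter methods.
    df.write.format("delta").mode(mode).saveAsTable(table_name)
    return table_name
```

In a notebook this could be called as save_to_delta(df, "<TABLE_NAME>"), with df being the DataFrame returned by get_vector_dataframe. Passing mode="append" adds rows to an existing table instead of replacing its contents.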