[BUG-python/deployment] Upgrading server from 1.26.0 → >2.2.0, with no pre-existing postgres #5814

Open
mpjuhasz opened this issue Feb 20, 2025 · 0 comments

Describe the bug

We've been running an Argilla server on v1.26.0 with a number of FeedbackDatasets and are now upgrading to 2.x. According to the migration guide, because our datasets are compatible with v2, the upgrade should only require bumping the image version in the docker-compose file, after which we can carry on working with them.

The issue is that our existing server runs on an older version of the Argilla-suggested docker-compose (see below), which didn't include postgres or redis. If I bump the image version in this docker-compose, it does migrate seamlessly up to 2.2.0, but any version above that causes the argilla container to exit.

If, however, I use the new docker-compose (reproduced below), it corrupts the volume that contains the data. After attempting to bring up the stack, it's no longer possible to roll back to the previous version.
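
Since a failed attempt leaves the volume unusable, taking a snapshot of the Argilla home directory before each attempt would at least make rollback possible. A minimal sketch, assuming the contents of the argilladata volume have first been copied to a host directory (the paths and the snapshot_dir helper are hypothetical, not part of Argilla):

```python
import tarfile
import time
from pathlib import Path


def snapshot_dir(src: str, dest_dir: str) -> Path:
    """Archive the directory `src` into a timestamped .tar.gz under `dest_dir`."""
    src_path = Path(src)
    if not src_path.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")

    out_dir = Path(dest_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = out_dir / f"{src_path.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname stores paths relative to the directory name, so the
        # archive can be unpacked anywhere without absolute paths.
        tar.add(src_path, arcname=src_path.name)
    return archive


# Hypothetical usage, with a host copy of the volume in ./argilladata_copy:
#   snapshot_dir("./argilladata_copy", "./backups")
```

The stack should be stopped while the copy is taken so the files are not modified mid-archive. The elasticdata volume would need the same treatment for a full rollback.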

The question is: how can I migrate the persisted datasets from the legacy version (probably using sqlite) to the new version (using postgres)?
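
To check the guess that the legacy data lives in SQLite, the database file can be inspected read-only before anything is migrated. A minimal sketch using only the Python standard library; the file name argilla.db under ARGILLA_HOME_PATH is an assumption about the server's default, and summarize_sqlite is a hypothetical helper:

```python
import sqlite3
from pathlib import Path


def summarize_sqlite(db_path: str) -> dict[str, int]:
    """Return {table_name: row_count} for every user table in a SQLite file."""
    # mode=ro opens the file read-only, so inspection cannot modify the data.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        return {
            table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            for table in tables
        }
    finally:
        conn.close()


# Hypothetical usage against a host copy of the volume:
#   summarize_sqlite("./argilladata_copy/argilla.db")
```

Non-zero row counts for dataset and record tables would confirm that this file is what needs to be carried over to postgres. Comparing the same counts after a migration attempt gives a quick sanity check.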

For reference, here's the "old" docker-compose we're using:

services:
  argilla:
    image: argilla/argilla-server:v1.26.0
    container_name: argilla_server
    restart: unless-stopped
    ports:
      - 6900:6900
    environment:
      ARGILLA_HOME_PATH: /var/lib/argilla
      ARGILLA_ELASTICSEARCH: http://elasticsearch:9200
      ARGILLA_AUTH_SECRET_KEY: ${ARGILLA_AUTH_SECRET_KEY:? Please generate a 32 character random string with `openssl rand -hex 32`}
      ARGILLA_ENABLE_TELEMETRY: 0
      DEFAULT_USER_ENABLED: false
    volumes:
    - argilladata:/var/lib/argilla
    networks:
      - argilla

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.5.3
    environment:
      - node.name=elasticsearch
      - cluster.name=es-argilla-local
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
      - cluster.routing.allocation.disk.threshold_enabled=false
      - xpack.security.enabled=false
    ulimits:
      memlock:
        soft: -1
        hard: -1
    networks:
      - argilla
    ports:
      - 9200:9200
      - 9300:9300
    volumes:
      - elasticdata:/usr/share/elasticsearch/data/
  kibana:
    image: docker.elastic.co/kibana/kibana:8.5.3
    ports:
      - 5601:5601
    environment:
      ELASTICSEARCH_URL: http://elasticsearch:9200
      ELASTICSEARCH_HOSTS: '["http://elasticsearch:9200"]'
    networks:
      - argilla
networks:
  argilla:
    driver: bridge
volumes:
  elasticdata:
  argilladata:

and here's the new one:

x-common-variables: &common-variables
  ARGILLA_HOME_PATH: /var/lib/argilla
  ARGILLA_ELASTICSEARCH: http://elasticsearch:9200
  ARGILLA_DATABASE_URL: postgresql+asyncpg://postgres:postgres@postgres:5432/argilla
  ARGILLA_REDIS_URL: redis://redis:6379/0

services:
  argilla:
    image: argilla/argilla-server:v2.0.0
    restart: unless-stopped
    ports:
      - "6900:6900"
    environment:
      <<: *common-variables
      USERNAME: argilla
      PASSWORD: 12345678
      API_KEY: argilla.apikey
      WORKSPACE: default
      # Uncomment the following line to reindex Argilla datasets into the search engine when starting up
      # REINDEX_DATASETS: 1
      # Opt-out for telemetry https://huggingface.co/docs/huggingface_hub/main/en/package_reference/utilities#huggingface_hub.utils.send_telemetry
      # HF_HUB_DISABLE_TELEMETRY: 1
      # Opt-out for telemetry https://huggingface.co/docs/huggingface_hub/main/en/package_reference/utilities#huggingface_hub.utils.send_telemetry
      # HF_HUB_OFFLINE: 1
    networks:
      - argilla
    volumes:
      # ARGILLA_HOME_PATH is used to define where Argilla will save its application data.
      # If you change ARGILLA_HOME_PATH value please copy that same value to argilladata volume too.
      - argilladata:/var/lib/argilla
    depends_on:
      - postgres
      - elasticsearch
      - redis

  worker:
    image: argilla/argilla-server:v2.0.0
    environment:
      <<: *common-variables
      BACKGROUND_NUM_WORKERS: 2
    networks:
      - argilla
    depends_on:
      - postgres
      - elasticsearch
      - redis
    command: sh -c 'python -m argilla_server worker --num-workers $${BACKGROUND_NUM_WORKERS}'

  postgres:
    image: postgres:14
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: argilla
    networks:
      - argilla
    volumes:
      - postgresdata:/var/lib/postgresql/data

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.17.0
    environment:
      - ES_JAVA_OPTS=-Xms512m -Xmx512m -XX:UseSVE=0
      - CLI_JAVA_OPTS=-XX:UseSVE=0
      - node.name=elasticsearch
      - cluster.name=es-argilla-local
      - discovery.type=single-node
      - cluster.routing.allocation.disk.threshold_enabled=false
      - xpack.security.enabled=false
    ulimits:
      memlock:
        soft: -1
        hard: -1
    networks:
      - argilla
    volumes:
      - elasticdata:/usr/share/elasticsearch/data/

  redis:
    image: redis
    networks:
      - argilla

networks:
  argilla:
    driver: bridge

volumes:
  argilladata:
  elasticdata:
  postgresdata:

Stacktrace and code to create the bug

The bug is triggered by running docker compose up successively with the different configs above.

Expected behavior

Our data should migrate to the new postgres storage, and we should be able to upgrade to the latest version.

Environment

  • Argilla Version [e.g. 1.0.0]: 2.7.0
  • ElasticSearch Version [e.g. 7.10.2]: 8.5.3 and 8.17.0
  • Docker Image (optional) [e.g. argilla:v1.0.0]: 1.26.0, 2.0.0, 2.2.0, 2.3.0

Additional context

No response
