
AWS S3 storage driver (boto3) not found #39

Open
vladseve7n opened this issue Dec 15, 2022 · 7 comments


@vladseve7n

vladseve7n commented Dec 15, 2022

Hello! I am trying to use clearml-serving to serve my pretrained PyTorch model.
I deployed ClearML Server and use MinIO (S3-compatible) on the local network to store artifacts and pretrained weights.

There is no problem storing and retrieving models using Input/Output Models; everything works correctly.
But clearml-serving (particularly the clearml-serving-triton container) cannot work with MinIO, because the container does not have the boto3 Python module installed.

Following the tutorial, I added the S3 credentials to example.env:

CLEARML_WEB_HOST=http://192.168.3.217:8080
CLEARML_API_HOST=http://192.168.3.217:8008
CLEARML_FILES_HOST=http://192.168.3.217:8081
CLEARML_API_ACCESS_KEY=CLEARML_API_ACCESS_KEY
CLEARML_API_SECRET_KEY=CLEARML_API_SECRET_KEY
CLEARML_SERVING_TASK_ID="ccfed15e442242a19338c20772562df2"
AWS_ACCESS_KEY_ID=AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY=AWS_SECRET_ACCESS_KEY

After that it still doesn't work, because AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are not passed through to docker-compose. To solve this, I added those variables to the clearml-serving-triton container:

  clearml-serving-triton:
    image: allegroai/clearml-serving-triton:latest
    container_name: clearml-serving-triton
    restart: unless-stopped
    # optimize performance
    security_opt:
      - seccomp:unconfined
    # ports:
      # - "8001:8001"
    environment:
      CLEARML_WEB_HOST: ${CLEARML_WEB_HOST:-https://app.clear.ml}
      CLEARML_API_HOST: ${CLEARML_API_HOST:-https://api.clear.ml}
      CLEARML_FILES_HOST: ${CLEARML_FILES_HOST:-https://files.clear.ml}
      CLEARML_API_ACCESS_KEY: ${CLEARML_API_ACCESS_KEY}
      CLEARML_API_SECRET_KEY: ${CLEARML_API_SECRET_KEY}
      CLEARML_SERVING_TASK_ID: ${CLEARML_SERVING_TASK_ID:-}
      CLEARML_TRITON_POLL_FREQ: ${CLEARML_TRITON_POLL_FREQ:-1.0}
      CLEARML_TRITON_METRIC_FREQ: ${CLEARML_TRITON_METRIC_FREQ:-1.0}
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID:-ACCES_KEY}
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY:-SECRET_ACCESS_KEY}

But after that there is an error in this container:

clearml-serving-triton        | 2022-12-15 05:05:45,607 - clearml.storage - ERROR - AWS S3 storage driver (boto3) not found. Please install driver using: pip install "boto3>=1.9"

I guess this can be fixed by adding "boto3>=1.9" to the container's requirements.txt here:
https://github.com/allegroai/clearml-serving/blob/main/clearml_serving/engines/triton/requirements.txt

After doing this and building the Docker image locally, I get the following error:

clearml-serving-triton        | 2022-12-15 05:10:54,624 - clearml.storage - ERROR - Could not download s3://192.168.3.217:9000/models/test/RegNet.b04da49b696a472b94677e26762078d1/models/regnet_y_400MF.pt , err: SSL validation failed for https://192.168.3.217:9000/models/test/RegNet.b04da49b696a472b94677e26762078d1/models/regnet_y_400MF.pt [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1131)

I have no idea how to disable the secure (SSL) connection in this container.
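The `WRONG_VERSION_NUMBER` error above usually means a TLS client is talking to a plain-HTTP port: when the MinIO endpoint is given without a scheme, boto3 defaults to https://. A minimal sketch of the scheme selection (the helper name is hypothetical, not ClearML's actual code):

```python
# Sketch (assumption, not ClearML's actual implementation): how a
# `secure` flag should translate into the boto3 endpoint URL. With
# secure disabled, the endpoint must use plain http://, otherwise the
# client speaks TLS to a plain-HTTP MinIO port and fails with
# [SSL: WRONG_VERSION_NUMBER].

def minio_endpoint(host: str, secure: bool) -> str:
    """Build the endpoint URL to hand to an S3 client for a MinIO host."""
    scheme = "https" if secure else "http"
    return f"{scheme}://{host}"

# With secure disabled, as needed for a plain-HTTP MinIO on port 9000:
print(minio_endpoint("192.168.3.217:9000", False))  # http://192.168.3.217:9000

# The resulting URL would then be passed to boto3, e.g.:
#   boto3.client("s3", endpoint_url=minio_endpoint(host, secure=False), ...)
```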

@qwaxys

qwaxys commented Jan 10, 2023

We're looking into this! I'll update when I have more news.

@ainoam

ainoam commented Jan 26, 2023

Hey @vladseve7n, apologies for the long turnaround...

You were on the right path:

Does this work for you?

@JamesNeumann

Hi, I'm facing the same problem.

While testing the given example with the triton-gpu docker-compose file, everything works. I disabled SSL in the clearml.conf file, and the exported model gets correctly uploaded to the MinIO server. But trying to run inference with the following command:

curl -X POST "http://XXX.XXX.XXX.XXX:8080/serve/test_model_sklearn" -H "accept: application/json" -H "Content-Type: application/json" -d '{"x0": 1, "x1": 2}'

results in the following error:

clearml.storage - WARNING - Failed getting object size: SSLError('SSL validation failed for https://XXX.XXX.XXX.XXX:9000/clearml/serving%20examples/train%20sklearn%20model.a5a017b0c83c49e48d31a56e173030d2/models/sklearn-model.pkl [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:992)')

clearml.storage - ERROR - Could not download s3://XXX.XXX.XXX.XXX:9000/clearml/serving examples/train sklearn model.a5a017b0c83c49e48d31a56e173030d2/models/sklearn-model.pkl , err: SSL validation failed for https://XXX.XXX.XXX.XXX:9000/clearml/serving%20examples/train%20sklearn%20model.a5a017b0c83c49e48d31a56e173030d2/models/sklearn-model.pkl [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:992)
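(Aside: the `%20` escapes in the URLs above are just URL-encoded spaces from the project and task names, not part of the problem; the actual failure is the SSL mismatch. For example:)

```python
from urllib.parse import quote

# The storage layer URL-encodes object keys; spaces in "serving examples"
# and "train sklearn model" become %20, while "/" is left intact.
key = ("clearml/serving examples/"
       "train sklearn model.a5a017b0c83c49e48d31a56e173030d2/"
       "models/sklearn-model.pkl")
print(quote(key))
# clearml/serving%20examples/train%20sklearn%20model.a5a017b0c83c49e48d31a56e173030d2/models/sklearn-model.pkl
```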

My MinIO configuration in the clearml.conf file looks like the following:

aws {
    s3 {
        # S3 credentials, used for read/write access by various SDK elements

        # default, used for any bucket not specified below
        key: ""
        secret: ""
        region: ""
        credentials: [
            {
                host: "XXX.XXX.XXX.XXX:9000"
                bucket: "clearml"
                key: ...
                secret: ...
                multipart: false
                secure: false
            }
        ]
    }
    boto3 {
        pool_connections: 512
        max_multipart_concurrency: 16
    }
}

@jkhenning
Member

@JamesNeumann the clearml.conf file is your local file (on your local machine), right?

@JamesNeumann

@JamesNeumann the clearml.conf file is your local file (on your local machine), right?

Yes, it is the local file on the machine that is running clearml-serving.

@jkhenning
Member

I think you'll need to apply the same setting to the clearml-serving-triton docker-compose service. Currently there's no way to specify this setting for the global AWS credentials section, so you'll need to inject it into the clearml-serving-triton service somehow. I would try adding a file mount, mounting a clearml.conf file containing your MinIO configuration section in the format of:

sdk {
  aws {
    s3 {
      ...
    }
  }
}

into /root/clearml.conf inside the container.
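A sketch of such a mount on the docker-compose side (the host path `./clearml.conf` is an assumption; it should point at a file containing the sdk-wrapped MinIO section above):

```yaml
  clearml-serving-triton:
    volumes:
      # mount a local clearml.conf containing the sdk { aws { s3 { ... } } }
      # section so the SDK inside the container picks up the MinIO settings
      - ./clearml.conf:/root/clearml.conf
```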

@JamesNeumann

Mounting the clearml.conf into /root/clearml.conf on the clearml-serving-inference service resolved the issue.
The clearml-serving-triton service does not seem to need the file.

Thanks for the quick help @jkhenning!
