Conversation
4fe81ac to
ea2bc77
Compare
This reverts commit ea2bc77.
DailyDreaming
left a comment
There was a problem hiding this comment.
Hmmm... I'm starting to think that the Google and Azure portions should be a little more separated.
Ideally, you should be able to operate on Google files if you have valid Google credentials and invalid Azure credentials. And vice versa. The same goes for running tests. Azure tests should print a "no azure credentials found message" and skip azure tests. The same goes for Google imo (I can add the Google side).
What do you think about an "azure" TNU_TESTMODE?
It looks like there may need to be some apt dependencies that should at least be added to the README.MD? (sudo apt install python3-gi python3-gi-cairo gir1.2-secret-1:
test_exists (__main__.TestBlobStore) ... EnvironmentCredential.get_token failed: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.
ImdsCredential.get_token failed: ManagedIdentityCredential authentication unavailable, no managed identity endpoint found.
ManagedIdentityCredential.get_token failed: ManagedIdentityCredential authentication unavailable, no managed identity endpoint found.
Runtime dependency of PyGObject is missing.
Depends on your Linux distro, you could install it system-wide by something like:
sudo apt install python3-gi python3-gi-cairo gir1.2-secret-1
If necessary, please refer to PyGObject's doc:
https://pygobject.readthedocs.io/en/latest/getting_started.html
Traceback (most recent call last):
File "/home/quokka/venv/lib/python3.9/site-packages/msal_extensions/libsecret.py", line 21, in <module>
import gi # https://github.com/AzureAD/microsoft-authentication-extensions-for-python/wiki/Encryption-on-Linux
ModuleNotFoundError: No module named 'gi'
Also, I haven't checked all of the tests, but should the azure blobstore be included in the test_open() test?:
terra-notebook-utils/tests/test_blobstore.py
Line 141 in 843a26b
terra_notebook_utils/drs.py
Outdated
| bucket_name, key = _resolve_bucket_target(dst, src_info) | ||
| dst_blob = GSBlob(bucket_name, key) | ||
| # Azure url looks like https://qijlbdgpc4zqdee.blob.core.windows.net/qi-test-container/subdir/another/qi-blob3 | ||
| if "windows.net" in dst: |
There was a problem hiding this comment.
I feel like, ideally, we'd have a wasb:// scheme to distiguish these, especially if we want plain https:// support in the future.
Would you consider making this stricter? If so, would something like:
if url.startswth('https://') and url[len('https://'):].split('/')[0].endswith('.blob.core.windows.net'):
...
or even a function like:
def is_azure_url(self, azure_url: str) -> bool:
try:
self._resolve_azure_blob_path(azure_url)
return True
except:
return False
if is_azure_url(uri):
...
Work?
There was a problem hiding this comment.
think I'll use your first suggestion..._resolve_azure_blob_path is already being called within the branch, so I think we don't want to call it twice
| client.upload_blob(data) | ||
|
|
||
| # with open("test", "wb") as data: | ||
| # data.write(client.download_blob().readall()) |
There was a problem hiding this comment.
the commented-out code is reading data which can still be interesting to test...it's commented out just becuz I was testing just write
There was a problem hiding this comment.
Would it make sense to make this its own test?
There was a problem hiding this comment.
there is a test for the actual feature in ci, which uses an Azure access key....this manual test is really just for testing out auth/permission with DefaultAzureCredential...we still need to figure out how exactly we do auth still in terra (there's some on-going discussion)
tests/manual_azure_test.py
Outdated
| return service | ||
|
|
||
| client = get_service("qijlbdgpc4zqdee").get_blob_client("qi-test-container", "qi-blob-1") | ||
| # client = get_service("qinonmanagedapp").get_blob_client("qi-test-container", "qi-blob-1") |
There was a problem hiding this comment.
I was testing different storage accounts... I'll remove this for now
| bs.blob(key).put(expected_data) | ||
| self.assertEqual(expected_data, bs.blob(key).get()) | ||
| bs.blob(key).delete() | ||
| client = bs.blob(key) |
There was a problem hiding this comment.
This seems like a blob instead of a client?
There was a problem hiding this comment.
it's the client object for doing stuff with the blob
| _copy_intra_cloud(src_blob, dst_blob, indicator_type) | ||
| elif isinstance(dst_blob, CloudBlob): | ||
| elif isinstance(dst_blob, AzureBlob): | ||
| assert isinstance(src_blob, (URLBlob, GSBlob)) |
There was a problem hiding this comment.
What if the src_blob is a LocalBlob?
There was a problem hiding this comment.
I don't know this very well...but it's copied from https://github.com/DataBiosphere/terra-notebook-utils/blob/master/terra_notebook_utils/blobstore/copy_client.py#L73
There was a problem hiding this comment.
Hmmm... I'll investigate why this is disallowed then. My preference is to replace plain asserts with exceptions and helpful debug messages, but I don't know why this wouldn't be allowed right now.
I didn't test things on local laptop..I did this, which relies on a terra image, and that should have all the system dependencies |
I'll double check that there's a test for reading data...I don't know if |
|
@Qi77Qi The instructions you added were very helpful! Even though most use-cases will be inside of Terra, I'd like to keep instructions for TNU's use outside of Terra, as I think some of the functions, particularly DRS resolution, are quite useful as a stand-alone python API. Right now I think just noting the additional apt dependencies is sufficient and helpful when not in a Terra-like environment. |
|
|
https://broadworkbench.atlassian.net/browse/IA-2932
Tested on a terra VM using
b.adm.firecaccount