Commit 3289c53

Initial docs from existing issue
at #2
1 parent 6594001 commit 3289c53

2 files changed

+175
-0
lines changed

README.rst

+4
@@ -13,6 +13,10 @@ Plugin to swap OMERO filesets with NGFF
Usage
=====

For the full workflow used to update IDR with NGFF data, see
docs.md.

To create sql containing required functions and run it:

::

docs.md

+171
@@ -0,0 +1,171 @@
Once a submission has been processed by BioStudies, it will become available at a URL like:
https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD815.html

We need to harvest the UUIDs from the links in the "Viewable Images" table. We can do this with the following JavaScript code, pasted into the `Console` tab of the browser dev tools:

```
// Relies on jQuery being loaded on the page as `$`.
let csv = "";
$("#viewable tbody tr").each(function() {
    let $this = $(this);
    // Skip rows that have no link
    if ($("a", $this).length == 0) return;
    // The first link's href, minus ".html", is the BIA path
    let uid = $("a:first", $this).attr("href").replace(".html", "");
    // The third column holds the zip name; drop the ".zip" extension
    let zarrname = $("td:nth-child(3)", $this).text().replace(".zip", "");
    csv += `${zarrname},${uid}\n`;
});
console.log(csv);
```
Which will print something like this:
```
idr0051/180712_H2B_22ss_Courtney_p00_c00_reg_preview.klb.ome.zarr,S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97
idr0051/embryo_dmso_2_new_17-00-44_p00_c00_reg_preview.klb.ome.zarr,S-BIAD815/b2633930-86b0-489e-a845-d2a7afe6ff15
idr0051/180712_H2B_22ss_Courtney1_20180712-163837_p00_c00_preview.ome.zarr,S-BIAD815/c49efcfd-e767-4ae5-adbf-299cafd92120
idr0051/2018-06-28_21ss_DMSO_TF_20180628-185945_p00_c00_reg_preview.ome.zarr,S-BIAD815/e12a8e2a-4fce-4579-a78b-b0c4597c3ada
```
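
Before going further it can help to sanity-check the pasted rows. The sketch below prints any row that is not exactly `name,accession/uuid` with a lowercase 8-4-4-4-12 hex UUID. The function name `bad_rows` is hypothetical, and the pattern is inferred from the sample output above rather than from any BIA specification.

```shell
# Hypothetical check: print rows that do not look like "name,accession/uuid".
# Prints nothing (and returns non-zero, as grep -v does) when every row matches.
bad_rows() {
  grep -Ev '^[^,]+,[^,/]+/[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$' "$1"
}
```

For example, `bad_rows idr0051.csv` should print nothing for the four sample rows shown above.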

That CSV is a table of `filesetName.ome.zarr, UUID`. We need to add the Fileset IDs from IDR to that table, using the `idr-utils` scripts from https://github.com/IDR/idr-utils/pull/56
That PR contains a file, `idr_filesets.csv`, which lists `Fileset ID, filesetName.ome.zarr` from IDR.
It also contains a script that takes the csv from above and adds the appropriate Fileset IDs (from `idr_filesets.csv`).

Check out the `idr-utils` branch of that PR. This can be done on a local machine.
Copy the csv generated by the JavaScript above and save it into a file like `idr-utils/scripts/ngff_filesets/idr0051.csv`. You will see some examples included in that PR.
Then run the script, passing in the IDR ID...

```
$ cd idr-utils/scripts/ngff_filesets
$ python parse_bia_uuids.py idr0051
```

This will update the csv file you just created, adding the Fileset IDs in a new 3rd column.
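
The join that this step performs can be pictured with a small shell sketch. This is not the script from the PR: the function name `add_fileset_ids` is hypothetical, and it assumes the lookup file holds `Fileset ID,filesetName.ome.zarr` rows and the harvested file holds `filesetName.ome.zarr,UUID` rows, with names matching exactly and no spaces around the commas. How `parse_bia_uuids.py` itself handles whitespace, headers, or unmatched rows is not covered here.

```shell
# Hypothetical helper, not the PR's script: append the Fileset ID as a 3rd column.
# $1: lookup file with rows "FilesetID,name" (assumed layout of idr_filesets.csv)
# $2: harvested file with rows "name,biapath"
add_fileset_ids() {
  awk -F',' -v OFS=',' '
    NR == FNR { ids[$2] = $1; next }        # first file: remember the ID for each name
    ($1 in ids) { print $1, $2, ids[$1] }   # second file: emit name,biapath,ID
  ' "$1" "$2"
}
```

Rows whose name has no match in the lookup file are silently dropped in this sketch, so comparing line counts before and after is a cheap check.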

Now we want to use that data with `omero-mkngff`.
We need to do everything as the `omero-server` user, since we'll want to be able to create symlinks from the ManagedRepository.

E.g. working on `idr0138-pilot`...

```
$ sudo -u omero-server -s
```

Create a conda environment as the `omero-server` user, e.g. `mkngff`, and install `omero-py`, `bioformats2raw` and `omero-mkngff`:

```
conda create -n mkngff -c conda-forge -c ome omero-py bioformats2raw
conda activate mkngff
pip install 'omero-mkngff @ git+https://github.com/IDR/omero-mkngff@main'
```

Get the database password (and host) needed for psql, and set them as environment variables. Also set `$IDRID` so you can copy and paste the other commands below...

```
export IDRID=idr0012
export OMERODIR=/opt/omero/server/OMERO.server
omero config get | grep omero.db.host
export DBHOST=192.168.10.231
omero config get --show-password | grep omero.db.pass
export PGPASSWORD=[********]
```
Use psql to get `SECRET` (the last session ID). NB: pilot servers have only 1 process (as in this example). For other servers, update the `1` to `3` in this psql command:
```
psql -U omero -d idr -h $DBHOST -c "select uuid from (select * from session where node = 0 and owner = 0 and defaulteventtype = 'Sessions' order by id desc limit 1) x order by x.id asc limit 1;"
uuid
--------------------------------------
8add790d-7855-46f6-8239-c6a72937d572
(1 row)

export SECRET=8add790d-7855-46f6-8239-c6a72937d572
```
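
To avoid copying the UUID by hand, the query result can be captured directly. The sketch below wraps the same query in a hypothetical function, `get_secret`, and uses psql's `-t` (tuples only) and `-A` (unaligned) options so that only the bare value is printed. It assumes `DBHOST` and `PGPASSWORD` are already exported as shown earlier, and it keeps the pilot-server form of the query unchanged.

```shell
# Hypothetical helper: print only the session UUID returned by the query above.
# -t drops the header and row-count footer; -A drops the alignment padding.
get_secret() {
  psql -U omero -d idr -h "$DBHOST" -t -A -c "select uuid from (select * from session where node = 0 and owner = 0 and defaulteventtype = 'Sessions' order by id desc limit 1) x order by x.id asc limit 1;"
}
```

With that in place, `export SECRET=$(get_secret)` would set the variable in one step; echoing `$SECRET` afterwards confirms that a value came back.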

Copy the contents of the `idr0051.csv` table from above (which now contains `Fileset ID` and `UUID`) into a copy of the csv in the `omero-server` user's home dir...

```
$ cd
$ vi $IDRID.csv   # paste in the csv contents from above
```

Now we can read that csv and create an sql file for each Fileset (named `FILESET_ID.sql`).
In the loop below, `biapath` is like `S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97` and `uuid` is like `51afff7c-eed4-44b4-95c7-1437d8807b97`.
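
To make the column layout concrete, the sketch below splits one row of the three-column csv into the values the loop uses. The function name `parse_row` is hypothetical and the Fileset ID `12345` is a made-up placeholder; the body mirrors the `cut` and `tr` calls used in the loop further down rather than replacing them.

```shell
# Hypothetical helper: split "name,biapath,fsid" and print "biapath uuid fsid".
parse_row() {
  local row=$1 biapath uuid fsid
  biapath=$(echo "$row" | cut -d',' -f2)                   # 2nd column, e.g. S-BIAD815/<uuid>
  uuid=$(echo "$biapath" | cut -d'/' -f2)                  # part after the slash
  fsid=$(echo "$row" | cut -d',' -f3 | tr -d '[:space:]')  # 3rd column, whitespace stripped
  echo "$biapath $uuid $fsid"
}

# 12345 is a placeholder Fileset ID, not a real IDR value.
parse_row "idr0051/180712_H2B_22ss_Courtney_p00_c00_reg_preview.klb.ome.zarr,S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97,12345"
```

The `tr -d '[:space:]'` step matters when the csv was pasted with trailing spaces or Windows line endings, which would otherwise end up in the sql file name.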

The BIA s3 repository should be mounted under `/bia-integrator-data`:

```
sudo mkdir /bia-integrator-data && sudo /opt/goofys --endpoint https://uk1s3.embassy.ebi.ac.uk/ -o allow_other bia-integrator-data /bia-integrator-data
```

Check that e.g. `$ ls /bia-integrator-data/S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97/51afff7c-eed4-44b4-95c7-1437d8807b97.zarr` will give you `0 OME`.
The `omero mkngff` command below also creates the symlinks we need, from the ManagedRepository to the s3-mounted data (if they don't already exist).
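
The `ls` check can also be scripted so that it runs for every row before any sql is generated. A minimal sketch, assuming that the presence of both the `0` and `OME` entries is a sufficient sign that the mount is working; the function name `check_zarr_mount` is hypothetical:

```shell
# Hypothetical pre-flight check: succeed only if the mounted Zarr directory
# contains both the "0" and "OME" entries mentioned above.
check_zarr_mount() {
  local zarr=$1
  [ -e "$zarr/0" ] && [ -e "$zarr/OME" ]
}
```

Inside the loop this could be called as `check_zarr_mount "/bia-integrator-data/$biapath/$uuid.zarr" || echo "missing: $biapath" >&2`.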

```
# first output sql functions and login...
omero mkngff setup > setup.sql
omero login

$ mkdir -p $IDRID
$ for r in $(cat $IDRID.csv); do
    biapath=$(echo $r | cut -d',' -f2)
    uuid=$(echo $biapath | cut -d'/' -f2)
    fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
    omero mkngff sql $fsid --clientpath="https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/$biapath/$uuid.zarr" "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
done

# IF YOU WANT TO EXECUTE SQL IMMEDIATELY... include $SECRET and create symlinks...
$ for r in $(cat $IDRID.csv); do
    biapath=$(echo $r | cut -d',' -f2)
    uuid=$(echo $biapath | cut -d'/' -f2)
    fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
    omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid --clientpath="https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/$biapath/$uuid.zarr" "/bia-integrator-data/$biapath/$uuid.zarr" --bfoptions >> "$IDRID/$fsid.sql"
    psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
done

Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/Blitz-0-Ice.ThreadPool.Server-14/2018-11/26 // 10-39-49.639 for fileset 604306
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-14/2018-11/26/10-39-49.639
Creating dir at /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-14/2018-11/26/10-39-49.639_converted/bia-integrator-data/S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-14/2018-11/26/10-39-49.639_converted/bia-integrator-data/S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97/51afff7c-eed4-44b4-95c7-1437d8807b97.zarr -> /bia-integrator-data/S-BIAD815/51afff7c-eed4-44b4-95c7-1437d8807b97/51afff7c-eed4-44b4-95c7-1437d8807b97.zarr
BEGIN
 mkngff_fileset
----------------
 5811532
(1 row)
COMMIT
...
```

**Running sql on a different server (using saved sql)**

Zip and copy the sql files to the other server.
Unzip them and update the SECRET in all sql files, getting the current `$SECRET` as above.
The replace didn't work using `$SECRET` etc. in the regex, so just use the actual values...
`SECRETUUID` is the default placeholder if you didn't use the `--secret` option when creating the sql.

```
$ for i in *.sql; do sed -i 's/SECRETUUID/fc5d3566-eea0-412c-849e-daa6d3c6bfcc/g' "$i"; done
```
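
One likely reason the replacement fails with `$SECRET` is quoting: inside single quotes the shell does not expand variables, so `sed` receives the literal text `$SECRET`. The sketch below uses double quotes instead. It assumes the value contains only UUID characters (hex digits and hyphens), which need no escaping in a `sed` replacement; the function name `fill_secret` is hypothetical.

```shell
# Hypothetical helper: replace the SECRETUUID placeholder in each given file.
# Double quotes let the shell expand $secret before sed sees the expression.
fill_secret() {
  local secret=$1 f
  shift
  for f in "$@"; do
    # -i.bak is accepted by both GNU and BSD sed; the backup is removed on success
    sed -i.bak "s/SECRETUUID/$secret/g" "$f" && rm -f "$f.bak"
  done
}
```

A call such as `fill_secret "$SECRET" "$IDRID"/*.sql` would then update every saved sql file for the study.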
We want to execute all sql, using the csv, and also to use `omero mkngff` to do just the symlink creation...
```
$ for r in $(cat $IDRID.csv); do
    biapath=$(echo $r | cut -d',' -f2)
    uuid=$(echo $biapath | cut -d'/' -f2)
    fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
    psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
    omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/bia-integrator-data/$biapath/$uuid.zarr" --bfoptions
done

Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2017-03/07/16-50-40.721
Creating dir at /data/OMERO/ManagedRepository/demo_2/2017-03/07/16-50-40.721_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2017-03/07/16-50-40.721_mkngff/e45c988b-945e-49d6-8c6a-7284a2b0525e.zarr -> /bia-integrator-data/S-BIAD848/e45c988b-945e-49d6-8c6a-7284a2b0525e/e45c988b-945e-49d6-8c6a-7284a2b0525e.zarr
```

Now we can try viewing the images in webclient.
NB: it can sometimes take a while for the memo file to be regenerated. To check on the timings, grep the log using a unique string from the fileset path:

```
grep -A 2 "saved memo" /opt/omero/server/OMERO.server/var/log/Blitz-0.log | grep -A 2 "46.368_mkngff"

2023-08-29 12:21:51,993 DEBUG [ loci.formats.Memoizer] (l.Server-4) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_mkngff/HT20.ome.zarr/OME/.METADATA.ome.xml.bfmemo (3838714 bytes)
2023-08-29 12:21:51,993 DEBUG [ loci.formats.Memoizer] (l.Server-4) start[1693309192879] time[2519114] tag[loci.formats.Memoizer.setId]
2023-08-29 12:21:51,995 INFO [ ome.io.nio.PixelsService] (l.Server-4) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-2/2023-05/11/22-57-46.368_mkngff/HT20.ome.zarr/OME/METADATA.ome.xml Series: 0
```
E.g. `2519114` ms is about 42 minutes.
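
The conversion is simply the `time[...]` value divided by 60000 (milliseconds per minute). A small sketch that pulls the value out of log lines like the one above and prints minutes; the function name `memo_minutes` is hypothetical:

```shell
# Hypothetical helper: read log lines on stdin, pull the number out of each
# "time[...]" field (milliseconds), and print it as minutes with one decimal.
memo_minutes() {
  sed -n 's/.*time\[\([0-9][0-9]*\)].*/\1/p' |
    awk '{ printf "%.1f\n", $1 / 60000 }'
}

echo 'start[1693309192879] time[2519114] tag[loci.formats.Memoizer.setId]' | memo_minutes
# prints 42.0
```

The output of the `grep` command above can be piped straight into `memo_minutes`; lines without a `time[...]` field are ignored.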
