|
1 |
| -# byoc-ingestor |
| 1 | +# Utility tool for Sentinel Hub BYOC service |
2 | 2 |
|
| 3 | +The Sentinel Hub BYOC Tool is a utility, available as a Docker image (described here) and as a Java jar, that prepares your data for use in Sentinel Hub. |
| 4 | + |
| 5 | +It converts your TIFF and JP2 files to Cloud Optimized GeoTIFFs, uploads them to AWS S3 and registers them in the Sentinel Hub BYOC service. When complete, your data should be visible in Sentinel Hub. The same steps can be done manually and are detailed in our documentation https://docs.sentinel-hub.com/api/latest/#/API/byoc, should you prefer or require more control over the process. |
| 6 | + |
| 7 | +## Prerequisites |
| 8 | + |
| 9 | +- A Sentinel Hub OAuth client -- if you don't have one, create one using our [web application](https://apps.sentinel-hub.com/dashboard). Click [here](https://docs.sentinel-hub.com/api/latest/#/API/authentication) for instructions. |
| 10 | + |
| 11 | +- A BYOC collection -- if you don't have one, create one using our [web application](https://apps.sentinel-hub.com/dashboard/#/byoc) or [API](https://docs.sentinel-hub.com/api/latest/reference/?service=byoc). |
| 12 | + |
| 13 | +- AWS credentials with access to your bucket -- get them from the AWS console. These are only used to upload your data and to read data that is registered in the BYOC service. |
| 14 | + |
| 15 | +- Your bucket configured so that Sentinel Hub can access data from it -- how to do this is documented [here](https://docs.sentinel-hub.com/api/latest/#/API/byoc?id=configuring-the-bucket). This is necessary because this tool and Sentinel Hub are separate. |
| 16 | + |
| 17 | +- Imagery! (Of course) |
| 18 | + |
| 19 | +## Basic setup |
| 20 | + |
| 21 | +Provide the Sentinel Hub OAuth client id and client secret in the environment variables `SH_CLIENT_ID` and `SH_CLIENT_SECRET`. |
| 22 | + |
| 23 | +Provide the AWS client credentials in the environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`, or by mounting the credentials file `~/.aws/credentials`. |
| 24 | + |
| 25 | +You can bundle all variables in a text file, e.g. named env.txt: |
| 26 | + |
| 27 | +``` |
| 28 | +SH_CLIENT_ID=<MySentinelHubClientId> |
| 29 | +SH_CLIENT_SECRET=<MySentinelHubClientSecret> |
| 30 | +AWS_ACCESS_KEY_ID=<MyAwsAccessKeyId> |
| 31 | +AWS_SECRET_ACCESS_KEY=<MyAwsSecretAccessKey> |
| 32 | +``` |
| 33 | + |
| 34 | +and pass it to Docker: `docker run --env-file env.txt sentinelhub/byoc-tool <Arguments...>` |
| 35 | + |
| 36 | +## Basic Commands |
| 37 | + |
| 38 | +For a list of commands run: `docker run sentinelhub/byoc-tool --help` |
| 39 | + |
| 40 | +For a list of ingestion parameters run: `docker run sentinelhub/byoc-tool ingest --help` |
| 41 | + |
| 42 | +To ingest files, you need to mount the imagery folder into the Docker container, for example: `-v <MyFolder>:/folder` mounts the folder `<MyFolder>` to `/folder` inside the container. |
| 43 | + |
| 44 | +Then give the tool the id of the BYOC collection you wish to import to (`<MyCollectionId>`) and the path to the folder inside the container (in this case `/folder`). |
| 45 | + |
| 46 | +The basic import command (see the next chapter for details) is thus: `docker run --env-file env.txt -v <MyFolder>:/folder sentinelhub/byoc-tool ingest <MyCollectionId> /folder` |
| 47 | + |
| 48 | + |
| 49 | +## The Simple Default Case |
| 50 | + |
| 51 | +The tool offers parameters for tuning it to various folder and file structures. The default case, which needs no additional parameters, is as follows: |
| 52 | +The tool takes the input folder and looks inside it for folders that contain TIFF or JP2 images. Each such folder represents a tile and each file in it represents a band. For example, if you have files at the following locations: |
| 53 | + |
| 54 | +- folder/ |
| 55 | +  - tile_1/ |
| 56 | +    - B01.tif |
| 57 | +    - B02.tif |
| 58 | +  - tile_2/ |
| 59 | +    - B01.tif |
| 60 | +    - B02.tif |
| 61 | + |
| 62 | +with `folder/` as the input path, the tool would ingest 2 tiles with names `tile_1` and `tile_2`, and each tile would have two bands named `B01` and `B02`. By default, band names equal the file names without file extensions. |
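The default grouping described above can be sketched in a few lines of Java. This is only an illustration of the rule (parent folder becomes the tile, file name without extension becomes the band), not the tool's actual implementation. The class name `DefaultLayoutDemo`, the method `tiles`, and the exact set of accepted extensions are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the tool: mimics the default tile/band grouping.
class DefaultLayoutDemo {

    // Maps each tile (parent folder) to its band names (file names without extension).
    static Map<String, List<String>> tiles(List<String> relativePaths) {
        Map<String, List<String>> result = new TreeMap<>();
        for (String path : relativePaths) {
            int slash = path.lastIndexOf('/');
            int dot = path.lastIndexOf('.');
            // Skip files that are not inside a folder or have no extension.
            if (slash < 0 || dot < slash) {
                continue;
            }
            // Assumed set of image extensions; the tool's real check may differ.
            String ext = path.substring(dot + 1).toLowerCase(Locale.ROOT);
            if (!ext.equals("tif") && !ext.equals("tiff") && !ext.equals("jp2")) {
                continue;
            }
            String tile = path.substring(0, slash);
            String band = path.substring(slash + 1, dot);
            result.computeIfAbsent(tile, k -> new ArrayList<>()).add(band);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> files = List.of(
            "tile_1/B01.tif", "tile_1/B02.tif",
            "tile_2/B01.tif", "tile_2/B02.tif");
        // Prints the tile -> bands mapping for the example layout.
        System.out.println(tiles(files));
    }
}
```

Paths here are taken relative to the input folder, which is also an assumption of the sketch.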
| 63 | + |
| 64 | +The command will prepare Cloud Optimized GeoTIFFs and upload them to the S3 bucket associated with the BYOC collection. Finally, it will register the tiles in your BYOC collection. The file `/folder/tile_1/B01.tif` will be uploaded to `s3://<MyBucket>/tile_1/B01.tiff`. |
| 65 | + |
| 66 | +For more elaborate folder, tile, band structures, see the help of the `--file-pattern` and `--file-map` parameters. |
| 67 | + |
| 68 | +Note that in this case the tile sensing time will not be set and that the tile coverage will not be traced (see the Tracing Coverage chapter). |
| 69 | + |
| 70 | +## Advanced Example |
| 71 | + |
| 72 | +The tool can be quite powerful with the right parameters. This example will attempt to showcase these without being too complicated. |
| 73 | + |
| 74 | +Suppose in this case that the folder structure is as follows: |
| 75 | + |
| 76 | +- folder/ |
| 77 | +  - tile_1/ |
| 78 | +    - DATA_and_sensing_time_1.tif |
| 79 | +  - tile_2/ |
| 80 | +    - DATA_and_sensing_time_2.tif |
| 81 | + |
| 82 | +In this case, let's assume each DATA tiff contains three bands: R, G and B. |
| 83 | + |
| 84 | +To use the tool effectively in this case, both the `--file-pattern` and `--file-map` parameters are needed. The `--file-pattern` can look something like this: `(?<tile>.*)\/.*(?<year>[0-9]{4})(?<month>[0-9]{2})(?<day>[0-9]{2})T(?<hour>[0-9]{2})(?<minute>[0-9]{2})(?<second>[0-9]{2})`. This finds files whose names contain a sensing time in that format and uses the name of their parent folder as the tile name. The pattern can be modified to support multiple files per folder, or even files in different folders which together represent one tile. |
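To see what the capture groups of such a pattern pick up, here is a small Java sketch that applies it to a hypothetical relative path. It assumes the pattern is applied with partial (find-style) matching to paths relative to the input folder, which is a guess about the tool rather than documented behavior. The class `FilePatternDemo`, its methods, and the sample file name are invented for illustration, and the seconds quantifier is written as `{2}`.

```java
import java.time.LocalDateTime;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the tool.
class FilePatternDemo {

    // The example pattern from the text, split over two string literals for readability.
    static final Pattern FILE_PATTERN = Pattern.compile(
        "(?<tile>.*)\\/.*(?<year>[0-9]{4})(?<month>[0-9]{2})(?<day>[0-9]{2})"
        + "T(?<hour>[0-9]{2})(?<minute>[0-9]{2})(?<second>[0-9]{2})");

    // Returns the value of the "tile" group, or null if the path does not match.
    static String tileName(String relativePath) {
        Matcher m = FILE_PATTERN.matcher(relativePath);
        return m.find() ? m.group("tile") : null;
    }

    // Builds a sensing time from the date and time groups, or null if the path does not match.
    static LocalDateTime sensingTime(String relativePath) {
        Matcher m = FILE_PATTERN.matcher(relativePath);
        if (!m.find()) {
            return null;
        }
        return LocalDateTime.of(
            Integer.parseInt(m.group("year")),
            Integer.parseInt(m.group("month")),
            Integer.parseInt(m.group("day")),
            Integer.parseInt(m.group("hour")),
            Integer.parseInt(m.group("minute")),
            Integer.parseInt(m.group("second")));
    }

    public static void main(String[] args) {
        // Hypothetical file name containing a sensing time.
        String path = "tile_1/DATA_20200102T103045.tif";
        System.out.println("tile: " + tileName(path));
        System.out.println("sensing time: " + sensingTime(path));
    }
}
```

Testing a pattern this way against a few of your own paths before running an ingestion can catch grouping mistakes early.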
| 85 | + |
| 86 | +The `--file-map` parameter selects which bands of a multi-band tiff file are used and how they are named. Since there is only one file per tile in this case, only one `--file-map` is needed, and it can look something like this: `.*tif;1:R;2:G;3:B`. In words: from a .tif file, extract band 1 and name it R, extract band 2 and name it G, extract band 3 and name it B. |
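The structure of a `--file-map` value can be made explicit with a short Java sketch that splits it into a file regular expression and a band index to band name mapping. This is a reading of the format as described above, not the tool's own parser. The class `FileMapDemo` and its members are hypothetical, and the sketch assumes the file expression itself contains no `;`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the tool.
class FileMapDemo {
    final Pattern filePattern;
    // Band index inside the file -> band name to register.
    final Map<Integer, String> bands = new LinkedHashMap<>();

    FileMapDemo(String spec) {
        // First segment is the file regex, the rest are "<index>:<name>" pairs.
        String[] parts = spec.split(";");
        filePattern = Pattern.compile(parts[0]);
        for (int i = 1; i < parts.length; i++) {
            String[] pair = parts[i].split(":", 2);
            bands.put(Integer.parseInt(pair[0]), pair[1]);
        }
    }

    // True if the file regex is found in the given file name (partial matching is assumed).
    boolean appliesTo(String fileName) {
        return filePattern.matcher(fileName).find();
    }

    public static void main(String[] args) {
        FileMapDemo map = new FileMapDemo(".*tif;1:R;2:G;3:B");
        System.out.println("bands: " + map.bands);
        System.out.println("applies to DATA.tif: " + map.appliesTo("DATA.tif"));
    }
}
```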
| 87 | + |
| 88 | +To summarize: `--file-pattern` finds files using a regular expression. Files with an equal `tile` capture group value are grouped into one tile. The `--file-map` pattern is then applied to each file within that tile. You can define as many `--file-map` parameters as there are files in a tile, so that each file can be mapped. |
| 89 | + |
| 90 | + |
| 91 | +## Tracing Coverage |
| 92 | + |
| 93 | +Information about what coverage tracing is and why it is important is available [here](https://docs.sentinel-hub.com/api/latest/#/API/byoc?id=a-note-about-cover-geometries). |
| 94 | + |
| 95 | +To enable geometry tracing, set the flag `--trace-coverage`. See `--distance-tolerance` and `--negative-buffer` for tuning parameters. If the flag is not set, the cover geometry will equal the image bounding box. |
| 96 | + |
| 97 | +To speed up tracing, you can trace coverage from one of the image overviews. For example, to trace coverage from the first overview, set the parameter `--trace-image-idx 1`. |
| 98 | + |
| 99 | +## Building a Docker image |
| 100 | + |
| 101 | +``` |
| 102 | +./gradlew build |
| 103 | +docker build -t byoc-tool . |
| 104 | +``` |
| 105 | + |
| 106 | +## Building an executable |
| 107 | + |
| 108 | +Download OpenJDK 14. |
| 109 | + |
| 110 | +Set `JPACKAGE_HOME` and `JLINK_HOME` to the OpenJDK 14 location. |
| 111 | + |
| 112 | +Run `gradlew jpackage` |
| 113 | + |
| 114 | +The executable will be located in the project root. |