
Commit 5a2fdfa

feat: Add utilities for working with LocalStack & other tidying (#22)
* Added LOG_LEVEL environment variable
* Enable consumer to work with localstack SQS
* Add some utilities for common tasks
* Remove unneeded scaffolding-related items
1 parent 3d055f3 commit 5a2fdfa

File tree

15 files changed (+187 / -263 lines)


Dockerfile.consumer

Lines changed: 1 addition & 1 deletion
```diff
@@ -3,7 +3,7 @@ FROM senzing/senzingsdk-runtime:4.0.0
 USER root
 
 RUN apt-get update \
-  && apt-get -y install --no-install-recommends curl python3 python3-pip python3-boto3 \
+  && apt-get -y install --no-install-recommends python3 python3-pip python3-boto3 \
   && apt-get -y autoremove \
   && apt-get -y clean
```

Dockerfile.exporter

Lines changed: 1 addition & 1 deletion
```diff
@@ -3,7 +3,7 @@ FROM senzing/senzingsdk-runtime:4.0.0
 USER root
 
 RUN apt-get update \
-  && apt-get -y install --no-install-recommends curl python3 python3-pip python3-boto3 \
+  && apt-get -y install --no-install-recommends python3 python3-pip python3-boto3 \
   && apt-get -y autoremove \
   && apt-get -y clean
```

Dockerfile.tools

Lines changed: 7 additions & 1 deletion
```diff
@@ -9,7 +9,7 @@ RUN echo "deb http://apt.postgresql.org/pub/repos/apt bookworm-pgdg main" > /etc
 # Update packages and install additional dependencies.
 RUN apt-get update && \
     apt-get upgrade -y && \
-    apt-get install --no-install-recommends -y awscli pipx postgresql-client senzingsdk-poc && \
+    apt-get install --no-install-recommends -y awscli pipx postgresql-client senzingsdk-poc python3 python3-pip python3-boto3 && \
     apt-get autoremove \
     && apt-get clean
 
@@ -30,9 +30,15 @@ USER senzing
 ENV PATH="$PATH:/home/senzing/.local/bin"
 RUN pipx install awscli-local
 
+ENV PYTHONPATH=$PYTHONPATH:/home/senzing/dev:/usr/lib/python3/dist-packages
+
 # Define volumes necessary to support a read-only root filesystem on ECS
 # Fargate.
 VOLUME ["/home/senzing", "/var/lib/amazon", "/var/log"]
 
 WORKDIR /home/senzing
+
+RUN mkdir -p dev
+COPY dev-scripts/* dev
+
 ENTRYPOINT ["/entrypoint.sh"]
```

README.md

Lines changed: 119 additions & 31 deletions
````diff
@@ -51,39 +51,13 @@ and run the consumer service on our local machine. This setup includes:
 docker compose up -d
 ```
 
-### Consumer
-
-Spinning up a consumer service (intended to be a continually-running process; in
-a production scenarion, multiple instances could be running simultaneously as
-needed):
-
-```bash
-docker compose run --env AWS_PROFILE=some-profile-name --env \
-  Q_URL="http://sqs.us-east-1.localhost.localstack.cloud:4566/000000000000/sqs-senzing-local-ingest" \
-  consumer
-```
-
-### Exporter
-
-Spinning up the exporter middleware (this is intended to be an ephemeral
-container):
-
-```bash
-docker compose run --env AWS_PROFILE=localstack --env S3_BUCKET_NAME=sqs-senzing-local-export exporter
-```
-
-You can view information about files in the Localstack S3 bucket by visiting
-this URL:
-
-http://localhost:4566/sqs-senzing-local-export
+### Using the services (tools container)
 
-### Using the services (Tools container)
+Access the `tools` container to interact with the services:
 
-1. Access the `tools` container to interact with the services:
-
-   ```bash
-   docker compose run tools /bin/bash
-   ```
+```bash
+docker compose run tools /bin/bash
+```
 
 The `tools` container should be configured with the necessary environment
 variables to interact with the SQS and S3 services in LocalStack, as well as the
````
````diff
@@ -116,6 +90,120 @@ sz_command -C add_record \
   PEOPLE 1 '{"NAME_FULL":"Robert Smith", "DATE_OF_BIRTH":"7/4/1976", "PHONE_NUMBER":"555-555-2088"}'
 ```
 
+#### Loading sample data
+
+From inside the tools container:
+
+1. Download the sample data sets; see:
+   https://senzing.com/docs/quickstart/quickstart_docker/#download-the-files
+2. Register the data source names using `sz_configtool`; see:
+   https://senzing.com/docs/quickstart/quickstart_docker/#add-the-data-source
+3. Actually load each of the data files into the Senzing database, i.e.:
+
+       sz_file_loader -f customers.jsonl
+       sz_file_loader -f reference.jsonl
+       sz_file_loader -f watchlist.jsonl
+
+#### Additional utilities
+
+##### Senzing and the database
+
+Load a single record as a simple test:
+
+    docker compose run tools python dev/add_1_record.py
+
+Purge the database:
+
+    docker compose run tools python dev/db_purge.py
+
+##### S3
+
+You might need to configure an AWS profile before using these S3-related
+utilities. See further down below for how to do that.
+
+Copy a file out of the LocalStack S3 bucket into `~/tmp` on your machine (be
+sure this folder already exists -- on macOS, that would be
+`/Users/yourusername/tmp`):
+
+> [!NOTE]
+> You will need to manually create `/Users/yourusername/tmp` if it
+> doesn't already exist.
+
+    # Here, `hemingway.txt` is the file you wish to retrieve from S3.
+    docker compose run tools python3 dev/s3_get.py hemingway.txt
+
+Purge the LocalStack S3 bucket:
+
+    docker compose run tools python3 dev/s3_purge.py
+
+## Middleware
+
+There are three middleware applications:
+
+- consumer (continually-running service)
+- redoer (continually-running service)
+- exporter (ephemeral container)
+
+### Configuring an AWS profile for LocalStack
+
+To use the middleware (consumer, etc.) with LocalStack, an AWS profile specific
+to LocalStack will be needed.
+
+Your `~/.aws/config` file should have something like:
+
+    [profile localstack]
+    region = us-east-1
+    output = json
+    ignore_configure_endpoint_urls = true
+    endpoint_url = http://localhost:4566
+
+Your `~/.aws/credentials` file should have:
+
+    [localstack]
+    aws_access_key_id=test
+    aws_secret_access_key=test
+
+Generally speaking, the `endpoint_url` argument will be needed when
+instantiating client objects for use with particular LocalStack services, e.g.:
+
+    sess = boto3.Session()
+    if 'AWS_ENDPOINT_URL' in os.environ:
+        return sess.client('s3', endpoint_url=os.environ['AWS_ENDPOINT_URL'])
+    else:
+        return sess.client('s3')
+
+### Consumer
+
+Spinning up the consumer middleware (intended to be a continually-running
+process; in a production scenario, multiple instances could be running
+simultaneously as needed):
+
+```bash
+docker compose run --env AWS_PROFILE=localstack --env \
+  Q_URL="http://sqs.us-east-1.localhost.localstack.cloud:4566/000000000000/sqs-senzing-local-ingest" \
+  --env LOG_LEVEL=INFO consumer
+```
+
+`LOG_LEVEL` is optional; defaults to `INFO`.
+
+### Exporter
+
+Spinning up the exporter middleware (this is intended to be an ephemeral
+container):
+
+```bash
+docker compose run --env AWS_PROFILE=localstack --env S3_BUCKET_NAME=sqs-senzing-local-export \
+  --env LOG_LEVEL=INFO exporter
+```
+
+`LOG_LEVEL` is optional; defaults to `INFO`.
+
+You can view information about files in the LocalStack S3 bucket by visiting
+this URL:
+
+http://localhost:4566/sqs-senzing-local-export
+
 [awslocal]: https://docs.localstack.cloud/aws/integrations/aws-native-tools/aws-cli/#localstack-aws-cli-awslocal
 [localstack]: https://www.localstack.cloud/
 [senzing]: https://senzing.com
````
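The README snippet above picks `endpoint_url` from the `AWS_ENDPOINT_URL` environment variable. As an illustrative sketch (the `client_kwargs` helper is not part of this commit), the decision can be isolated into a function that only builds the keyword arguments, so it can be exercised without boto3 or a running LocalStack:

```python
import os

def client_kwargs():
    """Keyword arguments for boto3's Session.client().

    When AWS_ENDPOINT_URL is set (as it is with the LocalStack profile
    described above), the client is pointed at LocalStack; otherwise the
    defaults send requests to real AWS.
    """
    if 'AWS_ENDPOINT_URL' in os.environ:
        return {'endpoint_url': os.environ['AWS_ENDPOINT_URL']}
    return {}

# Usage (requires boto3):
#   import boto3
#   s3 = boto3.Session().client('s3', **client_kwargs())
```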
File renamed without changes.

middleware/sz_purge.py renamed to dev-scripts/db_purge.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -16,9 +16,9 @@
 sz_factory = sz_core.SzAbstractFactoryCore("sz_factory_1", senzing_engine_configuration_json)
 sz_diagnostic = sz_factory.create_diagnostic()
 
-print('Are you sure you want to purge the repository? If so, type YES:')
+print('Are you sure you want to purge the database? If so, type YES:')
 ans = input('>')
 if ans == 'YES':
     sz_diagnostic.purge_repository()
 else:
-    print('Everything left as-is.')
+    print('Nothing was done. Everything was left as-is.')
```

dev-scripts/s3_get.py

Lines changed: 24 additions & 0 deletions
```diff
@@ -0,0 +1,24 @@
+import os
+import sys
+import boto3
+
+def make_s3_client():
+    try:
+        sess = boto3.Session()
+        return sess.client('s3', endpoint_url=os.environ['AWS_ENDPOINT_URL'])
+    except Exception as e:
+        print(e)
+        sys.exit(1)
+
+def get_file_from_s3(key):
+    '''Get file from S3 and write to /tmp (use docker-compose to map this
+    to desired directory on host machine).'''
+    s3 = make_s3_client()
+    print('Grabbing file...')
+    resp = s3.download_file(os.environ['S3_BUCKET_NAME'], key, '/tmp/'+key)
+    print ('Got file, put in tmp')
+
+print("Starting util_s3_retrieve ...")
+fname = sys.argv[1]
+get_file_from_s3(fname)
+print("Done")
```

dev-scripts/s3_purge.py

Lines changed: 16 additions & 0 deletions
```diff
@@ -0,0 +1,16 @@
+import os
+import boto3
+
+def purge_s3():
+    s3 = boto3.resource('s3', endpoint_url=os.environ['AWS_ENDPOINT_URL'])
+    buck = s3.Bucket(os.environ['S3_BUCKET_NAME'])
+    print('Purging...')
+    buck.objects.all().delete()
+
+print('Are you sure you want to purge the S3 bucket (' + os.environ['S3_BUCKET_NAME'] + ')? If so, type YES:')
+ans = input('>')
+if ans == 'YES':
+    purge_s3()
+    print('Done.')
+else:
+    print('Nothing was done. Everything was left as-is.')
```
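Both `db_purge.py` and `s3_purge.py` gate their destructive call behind the same typed-YES prompt. As a hypothetical refactor (not part of this commit), that pattern could be factored into a single helper:

```python
def confirm_and_run(prompt, action):
    """Run `action` only if the user types YES at the prompt.

    `prompt` and `action` are illustrative names; the purge scripts in
    this commit inline this logic rather than sharing a helper.
    """
    print(prompt + ' If so, type YES:')
    ans = input('>')
    if ans == 'YES':
        action()
        print('Done.')
    else:
        print('Nothing was done. Everything was left as-is.')
```

Requiring the literal string `YES` (rather than `y`/`yes`) makes an accidental confirmation less likely for an irreversible operation.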

docker-compose.yaml

Lines changed: 7 additions & 0 deletions
```diff
@@ -75,6 +75,13 @@ services:
         "CONNECTION": "postgresql://${POSTGRES_USERNAME:-senzing}:${POSTGRES_PASSWORD:-senzing}@db:5432:${POSTGRES_DB:-G2}/?sslmode=disable"
       }
     }
+      AWS_PROFILE: localstack
+      S3_BUCKET_NAME: sqs-senzing-local-export
+      PYTHONPATH: ${PYTHONPATH}:/opt/senzing/er/sdk/python:/home/senzing/dev:/usr/lib/python3/dist-packages
+      PYTHONUNBUFFERED: 1 # Flush buffer - helps with print statements.
+    volumes:
+      - ~/tmp:/tmp
+      - ~/.aws:/home/senzing/.aws
 
   consumer:
     build:
```

middleware/consumer.py

Lines changed: 6 additions & 6 deletions
```diff
@@ -51,9 +51,10 @@ def init():
     '''Returns sqs client object'''
     try:
         sess = _make_boto_session()
-        sqs = sess.client('sqs')
-        log.info(AWS_TAG + 'SQS client object instantiated.')
-        return sqs
+        if 'AWS_ENDPOINT_URL' in os.environ:
+            return sess.client('sqs', endpoint_url=os.environ['AWS_ENDPOINT_URL'])
+        else:
+            return sess.client('sqs')
     except Exception as e:
         log.error(AWS_TAG + str(e))
         sys.exit(1)
@@ -66,9 +67,8 @@ def get_msgs(sqs, q_url):
     - Body -- here, should be the JSONL record as a string
     '''
     while 1:
-        print('waiting for msg')
         try:
-            log.info(AWS_TAG + 'Polling SQS for the next message')
+            log.debug(AWS_TAG + 'Polling SQS for the next message')
             resp = sqs.receive_message(QueueUrl=q_url, MaxNumberOfMessages=1,
                                        WaitTimeSeconds=POLL_SECONDS)
             if 'Messages' in resp and len(resp['Messages']) == 1:
@@ -155,7 +155,7 @@ def go():
         # Get next message.
         msg = next(msgs)
         receipt_handle, body = msg['ReceiptHandle'], msg['Body']
-        log.info('SQS message retrieved, having ReceiptHandle: '
+        log.debug('SQS message retrieved, having ReceiptHandle: '
                  + receipt_handle)
         rcd = json.loads(body)
 
```
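The consumer changes above demote per-message chatter from `log.info` to `log.debug`, which pairs with the commit's new `LOG_LEVEL` environment variable (README: optional, defaults to `INFO`). A minimal sketch of how such a variable is typically wired into Python's `logging` module; the exact wiring inside `consumer.py` may differ:

```python
import logging
import os

def configure_logging():
    """Configure root logging from LOG_LEVEL, defaulting to INFO.

    An unrecognized value also falls back to INFO, so a typo in the
    environment never silences the consumer entirely.
    """
    name = os.environ.get('LOG_LEVEL', 'INFO').upper()
    level = getattr(logging, name, None)
    if not isinstance(level, int):
        level = logging.INFO
    logging.basicConfig(level=level,
                        format='%(asctime)s %(levelname)s %(message)s')
    return level
```

With this scheme, `LOG_LEVEL=DEBUG` re-enables the per-message polling and receipt-handle lines that the commit moved to `log.debug`.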