Releases: Beit-Hatfutsot/mojp-dbs-pipelines
0.2.0
0.1.4
- Add a new pipeline for the external source 'Bagnowka'
- added docker monitoring stack (grafana / cadvisor / influx) - http://devapi.dbs.bh.org.il:3180/dashboard/db/cadvisor
- many fixes to deal with memory problems and other bugs in the clearmash pipeline
- clearmash: limit downloading of related items to items that have a collection and fewer than 50 related items
- add main image url for photoUnits (part of #24)
- #78 improved handling and deletion of not allowed items
- #76 added support for images attached to each doc
- images field is synced to elasticsearch as part of the common schema
- it contains an array of images, like this:
[{"url": "http://url.to.image", "thumbnail_url": "http://url.to.thumbnail.image"}, ... ]
- Modify Bagnowka pipeline (download and convert process) to populate the image and thumbnail urls.
- added support for detecting collection by template_id (#72)
DB / schema changes
- added column display_allowed to clearmash-entities table
- added images attribute in clearmash (index mapping should be updated in Beit-Hatfutsot/dbs-back#215 before deploying)
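As a rough illustration of the images field added in #76, the sketch below builds the array in the shape shown above. The function name `make_images_field` and the input shape (pairs of image url and thumbnail url) are assumptions made for this example, not the pipeline's actual code.

```python
import json


def make_images_field(image_urls):
    """Build the common-schema images array.

    image_urls is a hypothetical input: an iterable of
    (url, thumbnail_url) pairs collected for one document.
    """
    images = []
    for url, thumbnail_url in image_urls:
        if not url:
            # Skip entries that have no main image url.
            continue
        images.append({"url": url, "thumbnail_url": thumbnail_url})
    return images


if __name__ == "__main__":
    images = make_images_field(
        [("http://url.to.image", "http://url.to.thumbnail.image")]
    )
    # Serialized the same way it would appear in the synced document.
    print(json.dumps(images))
```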
0.1.3
- #14 clearmash api and sync should support start date for images
- #45 / #58 / #59 - clearmash api should support field types: related documents, child documents and media galleries
- #49 added support for related_documents field in the common data source and in clearmash
- field name in Elasticsearch is related_documents_* where * is source specific category / type of related document - for clearmash it is the field id, e.g.: related_documents__c6_beit_hatfutsot_bh_base_template_related_place
- value could be null, string of single doc, or array of docs
- each doc is represented by the main ES doc id (e.g. clearmash_261783)
- #64 refactor to pipelines structure, might solve the photoUnits problem
- see the PR for more details: #64
- add postgresql DB + elasticsearch services + misc. docker improvements
- #61 modify clearmash processors to allow efficient download of related docs
- disabled the automatic deploy on push to master because we have long running pipelines which are interrupted on deploy
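The related_documents field from #49 can hold null, a single doc id, or an array of doc ids, so a consumer usually has to normalize it before use. The sketch below shows one way to build the field name and normalize the value. Both helper names are hypothetical, and the clearmash field id used in the usage example is inferred by stripping the related_documents_ prefix from the example field name in these notes.

```python
def related_documents_field_name(category):
    """Return the Elasticsearch field name for a source specific category."""
    return "related_documents_{}".format(category)


def normalize_related_documents(value):
    """Return the related doc ids as a list, whatever shape was stored.

    The stored value may be None, a single doc id string,
    or a list of doc id strings.
    """
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


if __name__ == "__main__":
    # Field id inferred from the example field name in the release notes.
    field = related_documents_field_name(
        "_c6_beit_hatfutsot_bh_base_template_related_place"
    )
    doc = {field: "clearmash_261783"}
    print(field, normalize_related_documents(doc.get(field)))
```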
0.1.2
- #25 Update api to get related docs of an item by field
- #29 change clearmash pipeline to have a separate pipeline for each folder / collection
- #19 add slugs for all items (#31)
- #33 add slugs attribute for ES with all slugs from all languages
- #34 add title_suggest field ensured to always have a value with min length of 1 (if no title, will use value of _)
- #17 only sync items allowed to be shown
- clearmash: only process items that have the right permissions (based on the old BHP logic, adjusted for CM)
- common docs: only sync items that have content in either he or en
- #41 ensure all processors provide details about which document failed when raising exceptions
- added logging of all clearmash api calls in the pipelines dashboard log
- fix failure of pipeline due to items missing many fields (#39)
- allow running a processor only for specific items, for debugging (using the CLEARMASH_OVERRIDE_ITEM_IDS environment variable)
- #15 delete items which weren't synced
- the delete processor runs after sync: it aggregates all ids which were synced, compares them with all ids in ES, and deletes any items which were not in the sync
- this assumes we download all the data on every sync run (which we do for clearmash)
- #35 ensure slug uniqueness (if slug conflict is found, slug is prepended with the item id)
- sync to ES should save dicts as json
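The delete step from #15 comes down to a set difference between the ids found in ES and the ids seen during the sync. The following is a minimal sketch of that comparison only. The function name `find_ids_to_delete` is hypothetical, and how the real processor collects the ids or issues the deletes is not shown here.

```python
def find_ids_to_delete(synced_ids, es_ids):
    """Return ids that exist in ES but were not part of the last sync.

    This is only safe when every sync run downloads all the data,
    otherwise items that were simply not re-downloaded would be deleted.
    """
    return sorted(set(es_ids) - set(synced_ids))


if __name__ == "__main__":
    synced = ["clearmash_1", "clearmash_2"]
    in_es = ["clearmash_1", "clearmash_2", "clearmash_3"]
    for doc_id in find_ids_to_delete(synced, in_es):
        # A real processor would delete the document from ES here.
        print("would delete", doc_id)
```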
deployment
- first, deploy dbs-back 0.13.5 - and create the new index
- then, deploy normally (merge)
- check travis to ensure it deployed to dev
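Returning to the slug uniqueness rule from #35 in this release: when a slug is already taken, the item id is prepended to it. The sketch below illustrates that rule under stated assumptions. The function name, the in-memory set of existing slugs, and the "-" separator are all hypothetical; the release notes do not say how the id and slug are joined.

```python
def ensure_unique_slug(slug, item_id, existing_slugs, separator="-"):
    """Return a slug that is not yet in existing_slugs and record it.

    On conflict the item id is prepended to the slug. The separator
    is an assumption made for this example.
    """
    if slug in existing_slugs:
        slug = "{}{}{}".format(item_id, separator, slug)
    existing_slugs.add(slug)
    return slug


if __name__ == "__main__":
    seen = set()
    print(ensure_unique_slug("place_example", 101, seen))
    # The second item with the same slug gets its id prepended.
    print(ensure_unique_slug("place_example", 102, seen))
```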
0.1.1
- Add lower-case titles for all available languages.
- travis: deploy to dev on push to master
deployment (to dev)
gcloud compute ssh bhs-dev-3
rm mojp-dbs-pipelines
git clone https://github.com/Beit-Hatfutsot/mojp-dbs-pipelines
cd mojp-dbs-pipelines
cp ../mojp-dbs-pipelines-0.0.2/docker-compose.override.yml ./
make docker-build
make docker-start
ssh-keygen -t rsa -b 4096 -C "deploy-mojp-dbs-pipelines" -f /home/bhs/deploy-mojp-dbs-pipelines.id_rsa
echo "command=\"/home/bhs/deploy-mojp-dbs-pipelines.sh \$SSH_ORIGINAL_COMMAND\" `cat /home/bhs/deploy-mojp-dbs-pipelines.id_rsa.pub`" >> ~/.ssh/authorized_keys
echo '#!/usr/bin/env bash
cd /home/bhs/mojp-dbs-pipelines
make deploy' > /home/bhs/deploy-mojp-dbs-pipelines.sh
chmod +x /home/bhs/deploy-mojp-dbs-pipelines.sh
v0.1.0
deployment to dev environment
gcloud compute ssh bhs-dev-3
sudo su -l bhs
wget https://github.com/Beit-Hatfutsot/mojp-dbs-pipelines/archive/v0.0.2.tar.gz -O mojp-dbs-pipelines-v0.0.2.tar.gz
tar -xzvf mojp-dbs-pipelines-v0.0.2.tar.gz
ln -s mojp-dbs-pipelines-0.0.2/ mojp-dbs-pipelines
sudo apt-get remove docker docker-engine docker-compose docker.io
sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable"
sudo apt-get update
sudo apt-get install docker-ce
curl -L https://github.com/docker/compose/releases/download/1.13.0/docker-compose-`uname -s`-`uname -m` | sudo tee /usr/local/bin/docker-compose > /dev/null
sudo chmod +x /usr/local/bin/docker-compose
sudo usermod -aG docker $USER
sudo su -l $USER
docker ps
cd ~/mojp-dbs-pipelines
make docker-build
cp docker-compose.override.yml{.example,}
nano docker-compose.override.yml
I had to set the IP of the elasticsearch server, as there was a problem with using the internal DNS name. Currently it writes to bhs-dev-db and index next-mojp-dev.