This is an application for evaluating parcels for upzoning in California based on California YIMBY laws from 2019-2023.
- Setup steps for your dev environment, including Mamba environment setup.
- Old brew-based setup steps (deprecated).
- static-files.md: how React and static files are served in dev and prod.
When you've completed setup, you're ready to run. You need to run both the frontend and backend servers:

- Make sure the Mamba environment is activated:
  `mamba activate mamba-parsnip`
- Start the frontend dev server. This will watch for changes and rebuild the JS bundle:
  `cd fe && yarn dev`
- Start the Django server:
  `cd be && ./manage.py runserver`
Browse to http://localhost:8000/map or http://localhost:8000/dj/admin and see if things work.
If you haven't loaded any data, you should see an OpenStreetMap map at /map, but you won't see parcels.
Run `./manage.py` with no options to see all the management commands available, organized by Django 'app'.
This lists all commands defined by all Django apps in the project:
- built-in apps (eg "django", "auth", ...)
- third party apps (eg "django_extensions", "silk", ...)
- our apps (eg. "world", "co", "userflows", ...)
In world/management/commands/ and co/management/commands you'll find our custom command-line commands for
doing ETL and various other processing.
Some examples:
- `LOCAL_DB=0 ./manage.py scrape --fetch --no-cache`: daily scraping run. This requires the Wireguard tunnel to the cloud postgres to be running.
- `./manage.py`: list all management commands. The commands we created are in the `world` and `co` apps.
Custom management commands are the easiest way to write python code that interacts with our database and isn't a web app.
You can add a new file under a `management/commands/` directory and it magically becomes a Django mgmt command.
It can be run with `./manage.py <command> <params>`, where `<command>` is the name of the file you created.
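For illustration, here's a minimal sketch of such a command file (the file name `count_parcels.py` and the `Parcel` import are hypothetical, not our actual code):

```python
# world/management/commands/count_parcels.py
from django.core.management.base import BaseCommand

from world.models import Parcel  # assumption: Parcel is defined in the world app


class Command(BaseCommand):
    help = "Print the number of parcels in the DB"

    def handle(self, *args, **options):
        # Invoked as: ./manage.py count_parcels
        self.stdout.write(f"{Parcel.objects.count()} parcels")
```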
We use pytest for testing. Run tests with:
- `poetry run pytest -s` to run all tests, OR
- `poetry run pytest -s -k <match>` to run any class or test matching `<match>`
Any file matching `test*.py` is picked up by pytest. See examples in userflows/tests.py and
lib/co/test_co_eligibility.py for inspiration.
The test DB is empty by default. You can add more data by following the examples below.
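For instance, a minimal test sketch (this assumes pytest-django; the `Parcel` import is an assumption):

```python
import pytest

from world.models import Parcel  # assumption: Parcel lives in the world app


@pytest.mark.django_db
def test_db_starts_empty():
    # No fixtures loaded, so the test DB has no rows.
    assert Parcel.objects.count() == 0
```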
You can get data from the existing DB by dumping it:
./manage.py dumpdata <app.model> --pks pk1,pk2,pk3 --format yaml
We've extended the dumpdata command to support APNs. Any model with an APN field can be looked up by APN,
using the --apns option. For example:
./manage.py dumpdata world.Parcel --apns 4151721900,4472421600,5571022400 --format yaml
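For reference, a minimal sketch of how such a wrapper can be built (illustrative; our actual implementation may differ, and the `apn` field name is an assumption):

```python
# A dumpdata.py placed in an app's management/commands/ shadows the built-in.
from django.apps import apps
from django.core.management.commands import dumpdata


class Command(dumpdata.Command):
    def add_arguments(self, parser):
        super().add_arguments(parser)
        parser.add_argument("--apns", help="comma-separated APNs to dump")

    def handle(self, *app_labels, **options):
        apns = options.pop("apns", None)
        if apns and app_labels:
            # Translate APNs to primary keys, then defer to the stock --pks path.
            model = apps.get_model(app_labels[0])
            pks = model.objects.filter(apn__in=apns.split(",")).values_list("pk", flat=True)
            options["primary_keys"] = ",".join(str(pk) for pk in pks)
        return super().handle(*app_labels, **options)
```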
You can take the dumpdata output and load it as a fixture. See conftest.py, where you can either append to
the existing fixtures or create a new fixture file and load it there.
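For example, a sketch of fixture loading in conftest.py (assumes pytest-django; the fixture path is illustrative):

```python
# conftest.py
import pytest
from django.core.management import call_command


@pytest.fixture(scope="session")
def django_db_setup(django_db_setup, django_db_blocker):
    # Load YAML produced by `./manage.py dumpdata ... --format yaml`.
    with django_db_blocker.unblock():
        call_command("loaddata", "world/fixtures/parcels.yaml")
```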
Tools we use, and how to run them. You should run these before committing code:
- Python:
  - `black .`: autoformat python code (and write changes to files)
  - `ruff check .`: lint python code
- Typescript:
  - `yarn lint`: run eslint
  - `yarn prettier`: autoformat typescript code (and write changes to files)
You don't usually need to do this, but if you need to debug an environment more similar to production, you can run the app in production mode as follows:
- Build frontend files:
  `cd fe && yarn build && cd ..`
- Collect static files:
  `cd be && ./manage.py collectstatic -v3 --noinput`
- Serve with a command line similar to production:
  `DJANGO_ENV=production LOCAL_DB=1 DJANGO_SECRET_KEY=12345 poetry run gunicorn --bind :8080 --workers 3 parsnip.wsgi:application`
It is possible to go even higher fidelity by running in a Docker container.
We use poetry as our python package manager. It installs packages and manages your virtual environment.
A few useful commands:
- `poetry run <command>`: run a single command in the virtual environment (alternative to `poetry shell` for a single command)
- `poetry add <package>`: add a package to the project (and update pyproject.toml and poetry.lock)
- `poetry add --group dev <package>`: add a development-only package
- `poetry env info`: show info about the virtual environment
- `poetry show --tree`: show the dependency tree
- `poetry install`: install dependencies based on pyproject.toml and poetry.lock into a venv
- `poetry shell`: start a shell in the virtual environment, with the right python version and all the packages installed. NOTE: I think this is not necessary with the mamba-based environment.
The app is a Django application at its core.
The major components of the system are:
- DB: Postgres with PostGIS
- App server:
- Django with GeoDjango library (built-in),
- Shapely (geometry manipulation)
- GeoPandas (GeoDataFrame and GeoSeries), which includes pandas for data analysis and shapely for geometry manipulation.
- Cron server:
  - Periodic jobs are run using supercron, which instantiates a separate instance of the Django app server.
- Front end: (deployed in app server)
- React using Parcel for bundling.
- Deck.gl (for mapping using webgl)
- Some older pages using Shapely and react-table (react-table is kind of a nightmare)
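To make the stack concrete, a minimal sketch of how these libraries interact (illustrative only; the `Parcel` import is an assumption):

```python
import geopandas as gpd
from shapely import wkt

from world.models import Parcel  # assumption: the world app's parcel model

# GeoDjango pulls geometries from PostGIS; Shapely re-parses them in memory;
# GeoPandas wraps the result for analysis.
parcels = list(Parcel.objects.all()[:100])
gdf = gpd.GeoDataFrame(
    {"apn": [p.apn for p in parcels]},
    geometry=[wkt.loads(p.geom.wkt) for p in parcels],  # GEOS -> Shapely via WKT
    crs=4326,
)
print(gdf.geometry.area.sum())  # note: area in degrees² unless reprojected
```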
We have separate staging and production environments, at stage-app.turboprop.ai and app.turboprop.ai. Each environment has its own Django server and Postgres+PostGIS DB server running as a set of fly.io VMs.
Both the staging and prod apps are proxied through the Cloudflare CDN. Cloudflare's TLS mode is set to "Full" for each domain so we have connection security between Cloudflare and the client, and between Cloudflare and our Django app server.
You don't need this section if you're using data that's already loaded into our cloud DB. But if you're loading new data, or setting up a new DB, follow these instructions:
- Download Parcels, Building_outlines (under MISCELLANEOUS), Zoning_base_sd, Topos_2014_2Ft_PowayLaMesa ZIP, and Topos_2014_2Ft_LaJolla.gdb files from https://www.sangis.org/ . You'll need a free account. Get the associated PDF files as well, as they are useful in describing what the data means.
- Unzip and put all files in world/data/
- Load the shape files into the DB. NOTE: Use LOCAL_DB=0 (cloud) or LOCAL_DB=1 (local) to specify which DB to load into.
./manage.py load Zoning
./manage.py load Parcel
./manage.py load Buildings
./manage.py load Topography Topo_2014_2Ft_PowayLaMesa.gdb
./manage.py load Topography Topo_2014_2Ft_LaJolla.gdb
./manage.py load Roads
./manage.py load HousingSolutionArea
- Run ETL jobs as necessary, eg:
./manage.py dataprep labels all: (re-)generate labels for zones
If you're adding a new class of GIS data based on a shape file, there are some tools to make it easier.
- Try inspecting your new shape file:
  - `ogrinfo -so <shapefile>.shp`: shows layers
  - `ogrinfo -so <shapefile>.shp <layername>`: examines a layer
- Generate Django models and mapping automatically:
  - Use `./manage.py ogrinspect <shapefile> <ModelName> --srid=4326 --mapping --multi`
  - Add the generated model and mapping to models.py.
- Check which fields are nullable by running `load.py MODEL_NAME --check-nulls`. This will print out which fields have nulls in them. You'll need to add `blank=True, null=True` to the respective models.py fields.
- Perform Django migrations:
  `./manage.py makemigrations && ./manage.py migrate`
- Update the `load.py` management command to load this new type of shape file.
- Execute the load management command as per the Importing Data section above.
See also the Django GIS tutorial here, which shows ogrinspect used this way.
You'll need to manipulate the generated models in a few ways:
- The load still might fail if any data field is empty. You'll need to add `blank=True, null=True` to the models.py fields that can be null, then make and run another migration.
- TIP: To make things easier, you can set a custom start point for the data save, so you don't have to rerun `load.py` from the start. Simply add `fid_range=(START, END)` as an argument to `lm.save()` (see the sketch after this list). For reference: https://docs.djangoproject.com/en/4.0/ref/contrib/gis/layermapping/
- There are no indexes or foreign keys in this model. Depending on how you intend to use it, you should consider adding those. They can be added later, of course.
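Putting it together, a minimal sketch of the model cleanup and resumable load (the model name, fields, and file path are illustrative, not our actual code):

```python
from django.contrib.gis.db import models
from django.contrib.gis.utils import LayerMapping


class FloodZone(models.Model):
    # Fields flagged by --check-nulls get blank=True, null=True:
    name = models.CharField(max_length=80, blank=True, null=True)
    geom = models.MultiPolygonField(srid=4326)


floodzone_mapping = {"name": "NAME", "geom": "MULTIPOLYGON"}


def load():
    lm = LayerMapping(FloodZone, "world/data/FloodZone.shp", floodzone_mapping)
    # fid_range resumes a partially failed load without starting over:
    lm.save(strict=True, verbose=True, fid_range=(1000, 2000))
```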
The local Django app can be pointed at our fly.io DB:
- Set up a Wireguard tunnel from your machine to the cloud DB at fly.io. Fly.io has instructions on this.
- From the deploy/postgres directory, run `fly proxy 15999:5432` to put the production DB at local port 15999.
- Add LOCAL_DB=0 before any `./manage.py` command to use the cloud DB instead of the local DB.
We mostly use this config for running the daily listings scrape + analysis.
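For reference, a sketch of how the LOCAL_DB flag can switch databases in settings.py (an assumption about our settings layout, not the actual file):

```python
import os

LOCAL_DB = os.environ.get("LOCAL_DB", "1") == "1"

DATABASES = {
    "default": {
        "ENGINE": "django.contrib.gis.db.backends.postgis",
        "NAME": "geodjango",
        "HOST": "localhost",
        # Local Postgres listens on 5432; the cloud DB is reached through
        # the `fly proxy 15999:5432` tunnel described above.
        "PORT": "5432" if LOCAL_DB else "15999",
    }
}
```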
You don't need this section for initial setup. But once you have data set up in one DB, and want to set up a remote DB, this info can come in handy. It's much faster to move large tables (like the table from the Parcel.shp file) using these instructions, compared to using Django management commands.
Here are some steps for copying data between databases:
Using a SQL client to create and move a table is pretty fast. Even the 3GB Parcel table can be exported and imported in a few minutes.
Here's an example of copying from your local DB to our cloud DB:
- Make sure the destination table is empty: `DELETE FROM world_*` on the correct DB would work. Just be careful!
- Export the table: `pg_dump -a -t 'world_parcel' geodjango | gzip > world_parcel_dump.sql.gz`, where 'geodjango' is the local DB name and world_parcel is the DB table to export.
- Optional: send the file to the remote machine to be close to the DB. Example for the AcuGIS server:
  `scp -i id_rsa ../world_dump.sql.gz [email protected]:~`
- Load the file into the new DB. Example for the AcuGIS server:
  `gunzip -c world_parcel_dump.sql.gz | psql -U hgiswebg_nils hgiswebg_geodjango`
You can move data with Django commands. But this is very slow (Parcel table would take ~20 hours) if django is running far away from the DB, eg with Django running locally and your DB running in the cloud.
- Make sure the destination table is empty: `DELETE FROM world_parcel` on the correct DB would work. Just be careful!
- `LOCAL_DB=1 ./manage.py dumpdata world.Parcel > parcel.json.gz`: dump the data from the local DB
- `./manage.py loaddata parcel.json.gz`: upload the data into the central DB
We use the LOCAL_DB=1 flag in our django app to select your local DB instance.
Note: it's smart to inspect the json file to make sure no other STDOUT output went into it.
We use Github Actions to run our tests and deploy to fly.io.
The config is in .github/workflows.
Github Actions' runner images are found in this repo. We currently use the Ubuntu 20.04 image, which includes Postgres 14.6.
This blog post has an example of a Github Action configuration with Django and Postgres.
Sometimes we get segmentation faults because of stupid issues with shapely libraries. An example of how to debug that:
- `lldb --arch arm64 python ./manage.py scrape -- --parcel 3091021200 --verbose -v3 --dry-run`
- Inside lldb, run `run` to start the program, and `cont` when it pauses
- When it crashes, run `bt` to get a backtrace
- Editable layers in the map, eg to define a housing unit. Use something like https://github.com/uber/nebula.gl, which works with deck.gl