Community Meeting Notes Archive
The archive of Community Meeting Notes. See the most recent and tentative agenda for the next meeting on hackmd.
- 2022-01-27
- 2021-12-02
- 2021-09-30
- 2021-08-05
- 2021-05-27
- 2021-05-24 GSOC coordination
- 2021-03-25
- 2021-01-14
- 2020-08-20: A second meeting!
- 2020-05-07: A first meeting!
(attending: Martin Fleischmann, Brendan Ward, Thomas Statham, Matt Richards, Levi Wolf, Joris Van den Bossche, Alan Snow)
- Community call time
- Matt is in UTC+10, Brendan UTC-8, Alan is UTC-6, Joris is UTC+1, Martin and Levi are UTC
- Shall we consider different time or switching between them periodically?
- Next time will try later UTC time (20:00 UTC?)
- Ecosystem updates
- GeoPandas
- Shapely 2.0
- pygeos is now merged into Shapely; this breaks GeoPandas but a PR is underway
- Dask-geopandas
- GeoArrow spec
- Pyogrio (need another Conda release)
- GDAL / GeoArrow bridge; use GeoArrow as transport between GDAL/OGR and numpy arrays instead of WKB
- XYZ services: also have made recent updates
- NumFOCUS SDG (S2?)
- Round 1
- Call for Proposals Announcement: February 4, 2022
- Proposal Submission Deadline: March 4, 2022
- Committee Selection Deadline: March 18, 2022
- Notification to Applicants Deadline: April 15, 2022
- Ideas:
- seed funding to start bindings to Google S2 (need to follow up with Benoit); if still interested schedule follow up meeting to work out proposal details
- https://github.com/geopandas/community/issues/10
- GSoC
- New flexible format: contributors determine if they are short or long format
- https://developers.google.com/open-source/gsoc/timeline
- will work with NumFOCUS like last time
- Need to start drafting list of ideas; discuss at next meeting
- Consider including S2 as a project
- pure python I/O (mostly on geopackage side)
- make mapping better
- see the last year https://github.com/geopandas/geopandas/wiki/Google-Summer-of-Code-2021
- Complete outstanding tasks from GSoC 2021
- need notebook demonstrating new stuff (Thomas has ready now, just needs review)
- other minor tasks; see dask-geopandas issues
GeoPython 2022
- Basel Switzerland June 13-15 (hybrid; might have in-person component)
- Talk submission end of Feb, workshops end of March
- Might be good to have a talk on state of GeoPandas and ecosystem
- Might be good to have workshop on dask-geopandas
- Firenze Italy, Aug 22-28
- Deadlines: Talks/papers: end of Feb
- Let's follow up offline with Martin, Joris, Levi
- GeoPandas 0.11 release timeline
- should there be one more release before shapely 2.0 support?
- Issues around GeoDataFrame constructor / active geometry columns
- 0 geometry columns -> no GeoDataFrame
- >=1 geometry columns -> keep GeoDataFrame, even if active is not present
- better handle the case of the active geometry column being present
- better handle crs on the geodataframe
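The constructor rules listed above can be read as a small decision function. The sketch below is plain Python for illustration only, not GeoPandas code: `result_kind` and its arguments are hypothetical names, and reading "no GeoDataFrame" as "fall back to a plain DataFrame" is an assumption.

```python
def result_kind(remaining_columns, geometry_columns, active_geometry):
    """Return (kind, active) for a column subset under the rules above.

    Hypothetical helper: every name here is illustrative, not GeoPandas API.
    """
    kept = [c for c in remaining_columns if c in geometry_columns]
    if not kept:
        # 0 geometry columns -> no GeoDataFrame (assumed: plain DataFrame)
        return ("DataFrame", None)
    # >=1 geometry columns -> keep GeoDataFrame, even if the active
    # geometry column is gone; in that case nothing is marked active.
    active = active_geometry if active_geometry in kept else None
    return ("GeoDataFrame", active)


geoms = {"geometry", "centroid"}
print(result_kind(["name", "pop"], geoms, "geometry"))
print(result_kind(["name", "centroid"], geoms, "geometry"))
print(result_kind(["name", "geometry"], geoms, "geometry"))
```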
- dask-geopandas 0.1 release timeline
- want to include spatial shuffle, documentation updates, Hilbert distance with numpy
via pyogrio would also be nice to include and nearly ready; Joris will look at this again soon
- Hilbert distance: do we want to have this in GEOS C API and just include in Shapely 2.0? Unlikely to be faster having this as a scalar ufunc against GEOS
- TODO: open issue upstream at GEOS
regression in
operation
- dask renames columns in intermediate aggregation results then names them back; this creates a new GeoDataFrame with no geometry, which then fails in subsequent step
- possible dask-geopandas funding from the GDSL
- may be opportunity to fund someone on dask-geopandas
- GeoPandas roadmap
- moved from last meeting
- Some notes about this in the first meeting notes: https://github.com/geopandas/community/wiki/Community-Meeting-Notes-Archive#2020-05-07-a-first-meeting
(attending: Martin Fleischmann, Joris Van den Bossche, Brendan Ward, Benoit Bovy, Alan Snow, Jan Simbera)
- Ecosystem updates
- GeoPandas:
- pyogrio engine (https://github.com/geopandas/geopandas/pull/2225)
- longer term may want to do a hard switch from Fiona to pyogrio; some problems if both are installed via pip (conda is OK)
- may also want to make the backends optional, and install pure python support for Shapefile / geopackage by default (or leave all as optional)
- may want to look into xarray engine loading model
- need to figure out how to build wheels for pyogrio
- Shapely 2.0 / pygeos
- The merge is finally happening! (this will also mean that Shapely main branch is temporarily not working with GeoPandas)
- push new feature development to shapely instead of pygeos
- pygeos 0.12 release coming soon
- once pygeos is fully integrated into Shapely and stable, then archive pygeos; will remove pygeos opt dep. from geopandas by geopandas 1.0
- Dask-GeoPandas
- much of the core functionality, mostly working on spatial partitions
- GeoArrow specification
- First draft: https://github.com/geopandas/geo-arrow-spec/pull/12
in Python?
- goal is to have an Arrow-native way to store geometries instead of WKB for storage; uses compact storage of coordinates
- approach already used by cuSpatial for copying spatial data to GPU
- already some of the basic functionality in pygeos / Shapely 2.0 (get rings, coordinates, etc); requires multiple steps, but already faster than WKB conversion.
- goal is to have one function that does this conversion
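To make the contrast between WKB and compact coordinate storage concrete, here is a minimal stdlib-only sketch. It encodes two LineStrings as WKB blobs and, alternatively, as one flat coordinate list plus offsets. The flat layout is a simplified illustration of the idea and should not be taken as the layout defined in the GeoArrow draft; all function names are hypothetical.

```python
import struct

lines = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)],
    [(5.0, 5.0), (6.0, 7.0)],
]


def linestring_to_wkb(coords):
    """Encode one 2D LineString as little-endian WKB."""
    # byte order flag (1 = little endian), geometry type (2 = LineString),
    # number of points, then x/y doubles for every point
    header = struct.pack("<BII", 1, 2, len(coords))
    body = b"".join(struct.pack("<dd", x, y) for x, y in coords)
    return header + body


def to_flat_layout(geoms):
    """Store all coordinates in one flat list plus per-geometry offsets."""
    xy, offsets = [], [0]
    for coords in geoms:
        for x, y in coords:
            xy.extend((x, y))
        offsets.append(offsets[-1] + len(coords))
    return xy, offsets


wkb_blobs = [linestring_to_wkb(g) for g in lines]
xy, offsets = to_flat_layout(lines)


def get_geometry(i):
    """Read geometry i back out of the flat layout without any parsing."""
    start, stop = offsets[i], offsets[i + 1]
    return [(xy[2 * k], xy[2 * k + 1]) for k in range(start, stop)]


print(offsets)
print(get_geometry(1))
```

With the flat layout every geometry is a slice of one contiguous buffer, whereas each WKB blob carries its own header and has to be decoded individually.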
- Expansion of the team
- Martin's time is restricted in the following months leading to long response times on issues and PRs
- consider using triage approach
- need to formalize approach for adding new committers
- S2 geometry engine
- see this thread for context
- https://github.com/benbovy/pys2index
- overview from Benoit:
- lightweight wrapper for S2 point index with API similar to
- performance in benchmarks so far is quite fast
- would like to have vectorized wrappers for S2
- S2 appears to be actively maintained and about to get additional functionality soon
- used python-xtensor and pybind11 to work with S2 and numpy arrays
- Two possible approaches to integrate:
- own way to store geometries specific to backend engine; convert geometries on the fly to S2 objects as part of specific operation (e.g., predicate)
- R library converts on the fly to GEOS or S2 as needed
- More info: https://r-spatial.github.io/sf/articles/sf7.html
- Wrapper classes:
- pygeos uses Python C extension wrapper for GEOS geometries so that GEOS objects are managed according to Python object lifecycles
- related issue
- consider putting in a request for NumFOCUS small development grant to start building out some of this support (next cycle may open early 2022): https://numfocus.org/programs/small-development-grants
- Formalise and publish a roadmap
- Some notes about this in the first meeting notes: https://github.com/geopandas/community/wiki/Community-Meeting-Notes-Archive#2020-05-07-a-first-meeting
- 2022 meetings schedule
- keep current cycle: last Thurs of uneven months, same UTC time
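The "last Thursday of uneven months" rule can be computed with the standard library. This is a small sketch with hypothetical helper names; the meeting time itself is not modelled because the notes only say "same UTC time".

```python
import calendar
from datetime import date


def last_thursday(year, month):
    """Return the date of the last Thursday in the given month."""
    # monthcalendar() yields weeks as lists of day numbers (0 = padding),
    # Monday first by default, so index calendar.THURSDAY picks Thursdays.
    thursdays = [
        week[calendar.THURSDAY]
        for week in calendar.monthcalendar(year, month)
        if week[calendar.THURSDAY] != 0
    ]
    return date(year, month, thursdays[-1])


def meeting_dates(year):
    """Last Thursday of every odd-numbered month of the year."""
    return [last_thursday(year, month) for month in range(1, 13, 2)]


for day in meeting_dates(2022):
    print(day.isoformat())
```

The first date produced for 2022 is 2022-01-27, which matches the first entry in the archive index.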
(attending: Martin Fleischmann, Joris Van den Bossche, Levi Wolf, Thomas Louf, Brendan Ward, Daniel Alejandro Mesejo-Leon, Imanol)
- 0.10 release
- #2076
API
- #1977 comment
- decision:
takes parameters return_all=True/False, max_distance=None/float value
- use pygeos.nearest_all under the hood unless parameters are such that nearest will suffice
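As an illustration of what the two parameters in the decision could mean, here is a brute-force toy on plain (x, y) points. It is not the GeoPandas or pygeos implementation; the function name `nearest_pairs` is hypothetical, and the semantics (keep all ties when `return_all` is true, drop matches farther than `max_distance`) are an assumed reading of the notes.

```python
import math


def nearest_pairs(left, right, return_all=True, max_distance=None):
    """Return (left_index, right_index, distance) tuples for nearest matches."""
    pairs = []
    for i, (lx, ly) in enumerate(left):
        dists = [math.hypot(lx - rx, ly - ry) for rx, ry in right]
        if not dists:
            continue
        best = min(dists)
        if max_distance is not None and best > max_distance:
            continue  # nothing close enough for this left point
        ties = [j for j, d in enumerate(dists) if d == best]
        if not return_all:
            ties = ties[:1]  # keep a single nearest match
        pairs.extend((i, j, best) for j in ties)
    return pairs


left = [(0.0, 0.0), (10.0, 0.0)]
right = [(1.0, 0.0), (-1.0, 0.0), (10.0, 3.0)]

print(nearest_pairs(left, right))
print(nearest_pairs(left, right, return_all=False))
print(nearest_pairs(left, right, max_distance=2.0))
```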
- sjoin/overlay/clip as methods (https://github.com/geopandas/geopandas/issues/2141)
- decision: discuss more on the issue
- deprecations (https://github.com/geopandas/geopandas/pull/2100)
- decision: merge after 0.10
- approved / partially approved PRs tagged to 0.10:
Add id_as_index argument to GeoDataFrame.from_features
- decision: not ready for 0.10
ENH: expose points_from_xy as a GeoSeries method
- decision: ready for merge
BUG: Fix multipoint clipping
- decision: Joris will review after meeting
- let's release this evening?
- decision: no.
- geopandas.org domain
- any updates regarding ownership?
- need to find someone who can get a direct response from Kelsey
- geopandas/benchmarks repo (or benchmark-data)
- for macro benchmarks that don't fit into the ASV benchmarks
- use issues to nominate datasets to use for benchmarks
- GADM polygons often offer very good variety in terms of points-per-polygon
- railways, municipalities..
- EPSRC grant call
- grant proposal is coming along; due Oct 14
- integrating pygeos and geopandas-dask philosophies into core geopandas plus integration with other libraries
- need
- letter of support from Tom Augspurger (w/ planetary computer) (if anyone has better contact info than his gmail, send to [email protected])
- 2 page resume/CV for @jorisvandenbossche & @martinfleis
- proposal text:
- Sub-project status updates:
- dask-geopandas
- once shuffle is in place will make next release (0.1) and publicize more
- already in planetary computer docker images
- had a Google Summer of Code project on this focused on spatial partitioning methods
- pygeos / Shapely 2.0
- GEOS 3.10 release coming soon; will have some things that we'd like to add
- Shapely 1.8 is ready with all deprecations in place, just needs to be reviewed / released
- will migrate pygeos into shapely after 1.8 is out
- need to coordinate with Sean Gillies re: committers / admin rights
- pyogrio
- after geopandas 0.10, add an engine keyword to read_file / to_file to use
- create issue for this (-> who?)
- longer term (geopandas 1.0) aim to have Fiona replaced by pyogrio
- we can also expose the pyogrio helper functions in geopandas (eg
- xyzservices (https://github.com/geopandas/xyzservices)
- contextily
(attending: Martin Fleischmann, Levi Wolf, Stefanie Lumnitz, Tom Augspurger, Brendan Ward, Joris Van den Bossche, Thomas Statham)
- Microsoft / planetary-computer & tabular data
- https://planetarycomputer.microsoft.com/catalog
- based around STAC; supports Zarr / NetCDF. Working on expanding to tabular support
- wanting to refine recommendations for representing tabular data
- use Parquet format
- need to finalize the implementation of geo-arrow-spec in Geopandas
- metadata is mostly done
- storage currently uses WKB (will always support this as a fallback), planning to revisit this to optimize using Arrow data structures
- would like GDAL to support this as well; longer term want to use the Arrow C data API (both for file formats as well as transport after reading those to downstream libs like Geopandas)
- See Cloud Data Warehouse Geospatial Interoperability
- this is just getting off the ground
NumFOCUS dask-geopandas IO project
- see proposal
- Parquet support is mostly complete
- Feather dataset (https://github.com/geopandas/dask-geopandas/pull/91/)
- Plan is to:
- read bounds from file (already implemented)
- use methods in
to determine partitions (via Hilbert curve distance, etc)
- then read underlying features into those partitions
- Timeline is next ~3 months
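The plan above (read bounds, derive partitions from Hilbert curve distance, then read features per partition) can be sketched in pure Python. `hilbert_distance` is the classic iterative xy-to-distance conversion on a 2**order grid; `assign_partitions` is a hypothetical helper that assumes a non-degenerate total extent and simply cuts the sorted distances into equally sized runs. Neither is taken from dask-geopandas.

```python
def hilbert_distance(x, y, order):
    """Distance of integer cell (x, y) along a Hilbert curve.

    The grid has 2**order cells per side.
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # rotate/flip the quadrant so the sub-curve is oriented correctly
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def assign_partitions(bounds, total_bounds, npartitions, order=8):
    """Map each (minx, miny, maxx, maxy) box to a partition number."""
    minx, miny, maxx, maxy = total_bounds
    n = 1 << order
    distances = []
    for bx0, by0, bx1, by1 in bounds:
        cx, cy = (bx0 + bx1) / 2, (by0 + by1) / 2
        # scale the box centre to integer grid coordinates
        gx = min(n - 1, int((cx - minx) / (maxx - minx) * n))
        gy = min(n - 1, int((cy - miny) / (maxy - miny) * n))
        distances.append(hilbert_distance(gx, gy, order))
    # sort by distance along the curve, then cut into equally sized runs
    ranked = sorted(range(len(bounds)), key=distances.__getitem__)
    size = -(-len(bounds) // npartitions)  # ceiling division
    parts = [0] * len(bounds)
    for rank, idx in enumerate(ranked):
        parts[idx] = rank // size
    return parts


boxes = [(0, 0, 1, 1), (3, 3, 4, 4), (0, 3, 1, 4), (3, 0, 4, 1)]
print(assign_partitions(boxes, (0, 0, 4, 4), npartitions=2, order=1))
```

Because neighbouring cells on the curve are neighbours in space, features that land in the same run tend to be spatially close, which is what makes the distance usable as a partitioning key.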
- limited to UK research staff
- plan is to expand on work in
plus other foundational work around spatial indexes, topologies, top-K nearest geometries
- expression of interest due in a month
- letters of recommendation in late Sept.
- will post public request for comment about the work proposed here
- needs to be driven by community demand
xyzservices release
- https://github.com/geopandas/xyzservices
- takes contextily providers (metadata for tile providers) and puts into dedicated package
- goal is centralized package to be used within the ecosystem
- contextily will be updated soon to use this
- pushing more broadly within ecosystem; other packages starting to use or expressed interest
installation as
to either get minimal dependencies or most dependencies
- https://github.com/geopandas/geopandas/issues/1313
- https://github.com/geopandas/geopandas/issues/1261
- now a
on conda-forge
- need to decide what to do about
installs
- if make it leaner (remove fiona, rtree) will make it much easier to install
- use install options, e.g.,
to add the others
- may want to consider
as optional dep and lazy load
- has good support with pip (getting better very soon with CI wheel builds)
API for tools
vs geopandas.sjoin(left, right)
#1984
- need to define rules for what is a method vs a function
- method approach is more common for pandas
- for
method is preferred
- clip is potentially problematic since it is supported by pandas, but since it is a numeric method not applicable to geometries anyway (currently fails), probably OK to make clip here support only the geometry implementation
- duplicate vs deprecate functional approach?
- in favor of deprecation, though a bit annoying for community since functions are widely used
- short term can pass through functional to method approach to limit duplication of code
- start teaching around the method approach
- next release aim to have method approach, release after mark functional ones as deprecated
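The "pass through functional to method" idea, followed by deprecation one release later, could look roughly like this. `Frame` is a stand-in class that exists only to make the sketch runnable, and the warning text and function names are hypothetical; none of this is GeoPandas code.

```python
import warnings


class Frame:
    """Stand-in for a GeoDataFrame; only here to make the sketch runnable."""

    def __init__(self, rows):
        self.rows = list(rows)

    def sjoin(self, right, predicate=lambda a, b: a == b):
        # the method holds the single implementation
        return [(a, b) for a in self.rows for b in right.rows if predicate(a, b)]


def sjoin(left, right, **kwargs):
    """Functional form that passes straight through to the method."""
    return left.sjoin(right, **kwargs)


def sjoin_deprecated(left, right, **kwargs):
    """What a later release could do: same pass-through plus a warning."""
    warnings.warn(
        "sjoin(left, right) is deprecated; use left.sjoin(right)",
        FutureWarning,
        stacklevel=2,
    )
    return left.sjoin(right, **kwargs)


a, b = Frame([1, 2, 3]), Frame([2, 3, 4])
print(sjoin(a, b) == a.sjoin(b))
```

Keeping the logic in the method means the functional form adds no duplicated code, and the deprecation step only changes the thin wrapper.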
API of matrix binary operations
- https://github.com/geopandas/geopandas/pull/1674
- We now have an implementation based on sparse matrix which works really well for all the use cases
- API:
- always return sparse array (use
package as optional dependency)
- basic support for sparse in dask (using scipy.sparse) but lots of things not yet in place
- have a keyword for sparse backend
or pydatasparse
- single method (predicate is a parameter) vs one method per predicate (which could use the former internally)
- could also have
for everything, and also expose intersects_matrix
since this is most likely used
pyData Global
- possible talk on updates in Geopandas
type hints
- lots of outstanding PRs -> start with reviewing the geoseries.py file
- testing
- may not want to do for next release
- for internal functions, aim to have strict types
0.10 release target
- add
as a highlight in this release
Shapely 2.0
- slowly moving forward
- STRtree discussion resolved
- need to have a shapely 1.8 release first
- branch in shapely using pygeos is ready to merge into master
- numpy warnings -> also ignore in geopandas (TODO Joris)
functions in pygeos as methods on GeoSeries
- https://github.com/geopandas/geopandas/issues/2010
- in favor of doing this; hard requirement on pygeos is fine
(attending Martin, Joris, Stefanie, Thomas, Brendan)
- Dask Summit workshop debrief
Google Summer of Code
- We have one project on dask-geopandas development
- Logistics:
- smaller meetings every week, aim for Thurs 4-5 PM UTC; Martin will setup meetings
- every 2 months a larger GeoPandas meeting
- use Github issues, PRs, GeoPandas gitter, dask-geopandas gitter
- Martin is admin point of contact
- Blog posts from GSOC: these to get linked into NumFOCUS blog
- Goals:
- spatial partitioning
- explore writing out to Parquet?
- need to figure out partitioning methods, e.g., Hilbert curve
- probably want to implement a couple methods: Hilbert, maybe a gridded approach
- first identify some of the options:
- simple grid
- known regions (can do spatial clustering for getting more or less homogeneous sized partitions)
- hilbert curve
- quadtree: might work well, not exposed yet in GEOS C API / pygeos
- strtree: don't have access to nodes / leaves via GEOS C API / pygeos
- storage of partitions
- right now just polygons as a geoseries
- spatial indexing
- also want to make sure this gets done
- only place this is currently used is for writing to Parquet and
coordinate indexer
- good starter PR: simple predicates:
; check for overlap with partition first, before checking geometries within partition
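The "check the partition first" idea can be sketched with bounding boxes only. This is a toy in plain Python under the assumption that each partition stores its overall bounds; `query_partitions` and the data layout are hypothetical and geometries are reduced to boxes to keep it short.

```python
def boxes_overlap(a, b):
    """True if two (minx, miny, maxx, maxy) boxes share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query_partitions(partitions, query_box):
    """Return hits as (partition_id, geometry_index) plus rows inspected.

    `partitions` maps an id to (partition_bounds, list_of_geometry_boxes).
    """
    hits, checked = [], 0
    for pid, (bounds, geoms) in partitions.items():
        if not boxes_overlap(bounds, query_box):
            continue  # whole partition skipped without touching its rows
        for i, geom in enumerate(geoms):
            checked += 1
            if boxes_overlap(geom, query_box):
                hits.append((pid, i))
    return hits, checked


partitions = {
    0: ((0, 0, 10, 10), [(1, 1, 2, 2), (8, 8, 9, 9)]),
    1: ((20, 0, 30, 10), [(21, 1, 22, 2), (28, 8, 29, 9)]),
}
hits, checked = query_partitions(partitions, (0, 0, 3, 3))
print(hits, checked)
```

Only the rows of partition 0 are inspected here; partition 1 is rejected by a single comparison against its bounds.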
- feedback to rejected projects?
NumFocus SDG
- Joris wants to apply for SDG to work on
- Focus more on I/O
- Read large dataset, have
figure out partitioning to files
- Read index and bounding boxes into memory to drive the partitioning, then use the partition bounding boxes or lists of indexes to query out chunks of data
- Optimize parquet: store coordinates instead of WKB
- Feather support? Right now using the dask support for Parquet, not available for Feather in dask; Joris has a prototype Feather file reader for dask
- Convert GDAL directly to Arrow memory format instead of WKB
- maybe do directly in GDAL
- try first in
GeoPandas Blog
- Shall we create GeoPandas blog? We can follow pandas model with an aggregator.
- https://pandas.pydata.org/community/blog/
- Joris to reach out to Kelsey re: domain name for geopandas
API of matrix binary operations
- https://github.com/geopandas/geopandas/pull/1674
- We now have an implementation based on sparse matrix which works really well for all the use cases
- Qs:
- which sparse backend?
or pydatasparse?
- Martin is planning to base the implementation around sparse approach
- Discuss next time
API for interactive plotting
- https://github.com/geopandas/geopandas/issues/1904
- We want pluggable interactive plotting backends. How to do it smoothly?
- interest from some of the plotting backends
- don't really want global config for plot method
- want to keep usage of static and interactive plotting separate, don't clobber the static implementation by using interactive plotting; keep these in separate methods
- add another method:
for interactive maps
- datashader option to HVplot:
- works quite well for large data
- Joris follow up with them: can instance check be expanded to include geopandas geodataframes (via
), not just spatialpandas frames
Community calls
- we have a shared Google Calendar for GeoPandas-related events
- meetings are set to 17:00 UTC every two months (last Thursday)
- new package under geopandas umbrella
- formerly
- https://github.com/geopandas/xyzservices
- planning to have available before next release of geopandas
- will have 2 JSON formats:
- pretty version that includes metadata
- compiled / compressed version that is actually used in code; plan is to create via Github action
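A minimal sketch of keeping one source of truth in two JSON renderings, using only the standard library. The provider entry is made up for illustration, and reading "compiled / compressed" as "minified JSON" is an assumption; the real build step may do more than this.

```python
import json

# Hypothetical provider entry, for illustration only
providers = {
    "ExampleProvider": {
        "url": "https://tiles.example.com/{z}/{x}/{y}.png",
        "attribution": "(C) Example contributors",
        "max_zoom": 19,
    }
}

# human-readable version that is pleasant to review and diff
pretty = json.dumps(providers, indent=2, sort_keys=True)

# compact version without whitespace, intended to be shipped and loaded
compact = json.dumps(providers, separators=(",", ":"), sort_keys=True)

print(len(pretty), len(compact))
```

Both strings decode to the same dictionary, so the compact file can be regenerated from the pretty one in an automated step.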
Ecosystem update
should fully support geopandas-cuspatial dataframe conversion in the next release
Shapely 2.0
- Joris planning to do more on this in June
- main blocking issue is the discussion around STRtree
- Differences in minimum rotated rectangle between Shapely's pure python method and method in GEOS
- Follow up with GEOS team about differences
- OpenCV method same as Shapely
- Also a method in PostGIS; is it the same?
- Brendan: transfer to GeoPandas org
- Other
- Weekly meetings
- Use public channels for discussion / questions (github issues, gitter channel, (specific? -> make a dask-geopandas channel))
- Single Point of Contact (more for administrative questions)
- Martin
- Blog: on NumFOCUS & personal site is fine, no need for GeoPandas branded one
(attending: Martin, Joris, James, Brendan, Sangarshanan, Levi)
Google Summer of Code
- We have submitted 3 project ideas
- Pure Python IO
- Plotting enhancements
- https://github.com/geopandas/geopandas/wiki/Google-Summer-of-Code-2021
- Students should get in touch now and submit proposals within weeks
- students will start applying next Monday
- We need to select students between mid-April and mid-May
- Should we advertise it more? Prospect on possible students?
- TODO: Post on Twitter again (done)
- PySAL: primarily recruits from own students; ~1/2 have been affiliated that way
Community repository
- we have a new geopandas/community repo
- if not package specific or not specific to code (governance, code of conduct), post to this
- if specific to GeoPandas post issues to GeoPandas instead
- use for announcing meetings or proposals (workshops, funding)
- how should we efficiently use it?
- https://github.com/geopandas/community
- TODO: post issue for how to get funding for GeoPandas features or ideas list for potential future grants
Community calls
- shall we switch to some predictable schedule? (Bi-)Monthly?
- start with bimonthly on last Thursday of each month
- TODO: post schedule to community repo
- archive prior call notes to community repo; keep markdown doc for latest meeting
- repository moved to GeoPandas org
- https://github.com/geopandas/dask-geopandas
- Dask-Summit workshop proposal
- In May: https://summit.dask.org/
- submitted proposal around scaling GeoPandas vector operations
- Could have a presentation about current status of dask-geopandas
- Some discussion around spatial partitioning
- Look for ways to collaborate with spatial pandas
- Would be good to do visualization of bigger data
- TODO: add issue in community repo for ideas for this workshop
- First alpha released on PyPI, still needs conda-forge
- Martin: will add to conda-forge
- Biggest needs: spatial index and overlap operations
User-friendly API of matrix binary operations
- would be nice to have "
" in 0.10
- We should agree on the API design; implementation should be straightforward based on
,
- https://github.com/geopandas/geopandas/pull/1674
- returning a list may not be particularly useful
- might be good to have a few example use cases
- does any polygon in input intersect any in right dataframe
- which of them in left dataframe intersects any in right dataframe
- how many intersect
- use outer strategy with sparse argument
- currently don't depend on scipy; makes it harder to use sparse option
- can keep sparse as an optional argument; fall back to full matrix
- another alternative is to use xarray and pydata sparse backend (optional dependencies)
- could just return dense pandas table of left and right indices
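The three example use cases can all be answered from a sparse result that stores only the intersecting pairs. The sketch below uses bounding boxes as stand-ins for polygons and a plain list of (left_index, right_index) pairs as the sparse structure; both choices, and the function names, are assumptions made to keep the example dependency-free.

```python
def bbox_intersects(a, b):
    """Intersection test on (minx, miny, maxx, maxy) boxes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intersects_pairs(left, right):
    """Sparse (COO-style) result: only the True cells, as (i, j) pairs."""
    return [
        (i, j)
        for i, a in enumerate(left)
        for j, b in enumerate(right)
        if bbox_intersects(a, b)
    ]


left = [(0, 0, 2, 2), (5, 5, 6, 6), (10, 10, 11, 11)]
right = [(1, 1, 3, 3), (1.5, 1.5, 5.5, 5.5)]
pairs = intersects_pairs(left, right)

# does any geometry in left intersect any in right?
any_hit = bool(pairs)
# which geometries in left intersect any in right?
left_hits = sorted({i for i, _ in pairs})
# how many right geometries does each left geometry intersect?
counts = [sum(1 for i, _ in pairs if i == k) for k in range(len(left))]

print(pairs, any_hit, left_hits, counts)
```

The pair list is also exactly the "table of left and right indices" mentioned as an alternative return type, so the two options differ mainly in packaging.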
Interactive plotting
- the existing tools are not as friendly as we thought
- folium-based implementation of
mirroring the language of plot()
- https://github.com/martinfleis/geopandas-view
- should it be embedded in GeoPandas? Or as an affiliated project under GeoPandas repo?
- @sangarshanan is willing to help maintain it
- status: most of the stuff supported for static plotting in matplotlib is now supported against folium
- considerations for API:
- plotting backend provider
- namespacing folium / interactive methods to prevent collision with static plotting
- over some threshold do not want to plot in folium
- might be good to look at how
in R handles translation to backend providers
- implementation of backend can be outside GeoPandas; might be easier to have this directly in GeoPandas in order to allow it as a default (not a lot of code)
- will do a bit more work to polish then migrate into GeoPandas
contextily providers module
- there is an idea to convert contextily providers module to a separate package
- both contextily and
could be using it + others
- https://github.com/geopandas/contextily/issues/153 and partially https://github.com/geopandas/contextily/issues/172
Ecosystem update
- pygeos/shapely2.0
- Current blocker: STRtree design (https://github.com/Toblerity/Shapely/pull/1064, https://github.com/Toblerity/Shapely/pull/1094)
- Shapely 1.8 release in prep for the transition; will raise deprecation warnings
- After 1.8, move pygeos code into Shapely; will need to coordinate with pygeos
- pyogrio
- Windows support?
- Do we need something similar as
- we still don't have access to the domain to point it to RTD
- Joris will ping Kelsey J.
- also need to have ownership in Pypi; need to be able to add others
- conda forge:
- anyone can help maintain this
- currently Joris, James, Filipe
NumFOCUS small grants
- do we want to apply for something in the near future?
- anyone has capacity?
- next round likely before summer
- open issue on community repo
User Survey Review
- Let's see what people think
- https://github.com/geopandas/geopandas-user-surveys/pull/1
- make private repo to store private responses
- Some points:
- interactive plotting: more examples
- performance is a consistently mentioned issue
Core dev team organisation
- Have official list of people?
- Mailing list
- Org like https://github.com/dask/community/
- Expanding the team?
- governance questions
- code of conduct
- mediation
- violations of CoC
- adding developers/removing (retiring?) developers
NumFOCUS fiscal sponsorship
- Status of Martin's work
- https://github.com/geopandas/geopandas/pull/1759, https://github.com/geopandas/geopandas/pull/1757
- having an option to depend on shapely only
- pure-Python I/O, no CRS
pyogrio integration
- Discuss integration plan for testing I/O using
instead of fiona
(seeing about 10-16x speedups)
- try to package up on conda forge
- non-GDAL IO
- pygpkg
- pyshp
- GSOC application focusing on non-GDAL IO @martin
pyogrio integration
- think about participating in GSOC 21
- https://opensource.googleblog.com/2020/10/google-summer-of-code-2021-is-bringing.html
- Python GPKG IO project?
- Henrikki is looking for a home for pyrosm (yes to us)
GeoPandas paper
- REGION OA (no APC) journal
- https://openjournals.wu.ac.at/ojs/index.php/region/index
Ecosystem update
- pygeos/shapely2.0
- dask-geopandas
0.9 release
NumFOCUS Documentation project
- I'd like to update you on current development and discuss a bit further steps to decide on priorities and time frame.
- context: https://github.com/geopandas/geopandas/issues/1564
- Martin provided an update on the latest direction in documentation work in https://github.com/geopandas/geopandas/issues/1564
- some examples will move to user guide where they are using the core functions
- for examples gallery may use nbsphinx instead of sphinx-gallery
- Will bulk up installation instructions to help alleviate many of the complaints around installation issues
- will add a longer-term roadmap within the docs
- Going forward, Martin will add examples incrementally but will try to get this reviewed as a larger PR
- New Advanced Guide will include more advanced topics like using spatial index and vectorization
- Will need to add redirects from important pages from existing readthedocs pages to the new documentation structure
Select final logo
- https://github.com/geopandas/geopandas/issues/1405
- Let's make the final decision!
- Go with the one with highest votes
- This will go into a separate PR with all the versions and source files
- Add a page to documentation with the logo and specific colors used
- Share logo back to NumFOCUS
- TODO: update the logo on twitter, etc
GitHub Sponsors
- We may consider using GitHub Sponsor button. Someone recently asked how to support GeoPandas and I was not sure if there is any possibility of a direct (financial) support, apart from donating to NumFOCUS.
- In order to have NumFOCUS accept $ on behalf of GeoPandas, may need to become a fiscally-sponsored project instead of just an affiliated project; Joris will check into this
- For GitHub Sponsor have seen examples of sponsoring individuals; will need to see what it would take to sponsor the larger project
GeoPandas usage / promotion
- Would like to feature groups that use GeoPandas as part of their work, maybe on GeoPandas blog (if there was one)
- Blog: would like to do this outside Sphinx
GeoPandas domain
- Joris will follow up with Kelsey
- Also request PyPI access from Kelsey
Packaging automation
- Can use GitHub Actions to publish packages to PyPI / Conda
- Can derive this from Pydata project
Social media
- Twitter
- Joris is currently maintaining this
- Martin can help with this; Joris will share access
- Example that came up on twitter from COVID-19 dashboards around showing density of points, maybe by hexagon; might want to add something like this as an example in the docs
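A minimal sketch of the point-density-by-hexagon idea mentioned above, assuming planar (projected) coordinates and pointy-top hexagons. The names `hex_cell` and `hex_density` are hypothetical helpers for illustration, not part of GeoPandas; a docs example would more likely build on an existing binning or plotting tool.

```python
import math
from collections import Counter


def hex_cell(x, y, size=1.0):
    """Return the axial (q, r) index of the pointy-top hexagon containing (x, y).

    `size` is the distance from a hexagon's center to any of its corners.
    """
    # Fractional axial coordinates of the point.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    s = -q - r
    # Cube rounding: round each coordinate, then recompute the one with the
    # largest rounding error so the three still sum to zero.
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


def hex_density(points, size=1.0):
    """Count how many (x, y) points fall into each hexagonal cell."""
    return Counter(hex_cell(x, y, size) for x, y in points)


# Made-up sample points: three clustered near the origin, one far away.
points = [(0.0, 0.0), (0.1, -0.2), (0.2, 0.1), (5.0, 5.0)]
counts = hex_density(points)
```

The resulting counts per cell could then be joined to hexagon polygons and plotted as a choropleth.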
GeoPandas academic paper
- Geographical Analysis journal is having a special issue on Open Source Software for Spatial Analysis, edited by Luc Anselin and Serge Rey (both PySAL). We had a small exchange about the possibility of writing a paper about GeoPandas (which is long overdue I'd say) with Joris and Serge on twitter: https://twitter.com/jorisvdbossche/status/1282208649335779328 I feel that this would be a great thing to do, although it naturally takes time to write a proper paper.
- Special issue will require more background documentation & contextualization; not just a description about the project
- Need to position it into the wider ecosystem; directly address how it has advanced spatial analysis in Python
- Could start brainstorming / collecting ideas
- Martin will make a google doc
- Martin will check to see if there is sponsorship from the university for making this open access
- Full fee is $3,000 US
- If we don't go for this, make sure to go after a different publication that allows open access
GeoPandas Survey
- Discuss plan to finish up and post GeoPandas survey: https://docs.google.com/document/d/1caityqUUfgAN2u9VUJN78mTyS3fMYZI-ZvgtfLfio9A/edit?usp=sharing
- Martin will add the GDPR compliance
- Use Google Forms to release this
- Can be individual owner
- Martin can create the form
- Timeframe:
- Would like to launch as soon as possible, aim for sometime in Sept.
GeoPandas 0.9 roadmap
- If we want to release 0.9 in December (we discussed switching to a 6-month release cycle), we could discuss what we want to (ideally) include.
- Binary predicates change - https://gist.github.com/martinfleis/abc7cdbf9f9266bf9ed369080eec7cea
- proposal is to build this on the output of query bulk
- people are normally interested in 2 questions: does my polygon intersect any in the other data frame (not just the one in the same row), and which polygons from the right data frame intersect the one on the left
- (in R) doesn't return series; it returns matrices (sparse / dense). We could have a function that gives more direct access to the sindex bulk query
- general agreement about keeping the existing predicate behavior as is, but adding a new set of methods on GeoSeries to add the cross / matrix oriented approach
- Martin will add a new issue for this with notebook example
- spatial index
- do we want to expose interface to multiple spatial index or abstract base class that can wrap other spatial index implementations
- can revise the issue based on discussion but don't target for 0.9
- revisit once pygeos / shapely 2.0 integration is complete and no longer optional; STRtree will be default as part of that
- Brendan will try to get outstanding pygeos issue to add other predicates to STRtree in for next pygeos version:
- Upcoming pygeos features in next release: mostly around multithreading, adding support for Z values to coordinate ops
- geodetic distance / area calculations
- these were tricky to write to be performant, especially dealing with wrap-around at the poles
- there is project to extract out the S2 ideas into a general purpose library
- Create an example out of this work and put in documentation
- Create an issue about adapting ideas from
- Aim for supporting different spatial backend (e.g., ) after 1.0
- Look into some of the other backends
- cuSpatial:
- want to support interoperability, not sure about supporting different underlying geometry providers / backends
- Longer term, maybe consider making GDAL / Fiona optional (e.g., read data from Parquet...)
- vectorized snap
- e.g., make larger linestring out of 2 disconnected segments
- in GEOS overlay refactor, this will include a precision-based snap
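The binary predicates item above distinguishes the existing row-aligned (elementwise) behavior from a cross / matrix oriented one. A minimal sketch of that difference, using plain bounding boxes as stand-ins for geometries; every function name here is hypothetical and none of this is the GeoPandas API. The brute-force loops are for illustration only, since a real implementation would go through the spatial index.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def boxes_intersect(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intersects_elementwise(left: List[Box], right: List[Box]) -> List[bool]:
    """Row-aligned: compare left[i] only with right[i]."""
    return [boxes_intersect(a, b) for a, b in zip(left, right)]


def intersects_pairs(left: List[Box], right: List[Box]) -> List[Tuple[int, int]]:
    """Sparse form: (left index, right index) for every intersecting pair."""
    return [
        (i, j)
        for i, a in enumerate(left)
        for j, b in enumerate(right)
        if boxes_intersect(a, b)
    ]


def intersects_matrix(left: List[Box], right: List[Box]) -> List[List[bool]]:
    """Dense form: matrix[i][j] is True if left[i] intersects right[j]."""
    return [[boxes_intersect(a, b) for b in right] for a in left]


left = [(0, 0, 2, 2), (10, 10, 11, 11)]
right = [(5, 5, 6, 6), (1, 1, 3, 3)]

elementwise = intersects_elementwise(left, right)
pairs = intersects_pairs(left, right)
matrix = intersects_matrix(left, right)
# "Does my polygon intersect any in the other data frame?"
any_hit = [any(row) for row in matrix]
```

In this made-up data the first left box intersects the second right box, which the elementwise comparison cannot see because the two sit in different rows.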
Future NumFOCUS grants
- I am not aware of the schedule of future funding rounds, but we should be prepared (if anyone has capacity).
- Normally should be 3rd round for this year, but haven't heard yet
- Discuss the current state and future of
- Big work items underway:
- I/O methods: Joris adding Parquet support from geopandas
- making use of spatial partitioning
- Small development grants ideas:
- better documentation
- better integration / leveraging spatial indexes for operations
- small improvements to topological operations (relates operations); elementwise vs all-pairwise
- https://github.com/geopandas/geopandas/issues/1405/
- Joris: check with pandas
- Try different color, otherwise go with it!
Lowering barriers to effective engagement / involving community
- reviewing PR bottlenecks
- time of core maintainers
- huge PRs, can we suggest folks make smaller PRs?
Maintenance bottlenecks
Roadmap (1.0?)
- Shapely 2.0 / pygeos speed-ups
- API for topological operations
- IO
- parquet/feather
- faster GDAL
- databases
- consistent API
- Integrating raster operations
- zonal stats is problematic for large data
- geodetic distance etc (geography)
- visualization
- maybe geoplot becomes an affiliate like contextily
- residentmario may not have time anymore for maintenance
- Vectorized snap feature to other feature
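For the geodetic distance item on the roadmap, a minimal sketch of a great-circle distance using the haversine formula on a spherical Earth. This is an assumption chosen for brevity, not what the notes propose implementing; ellipsoidal methods are more accurate. The function name `haversine_m` and the radius constant are illustrative.

```python
import math

# Mean Earth radius in meters; treating the Earth as a sphere is an approximation.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
    """Great-circle distance in meters between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


# One degree of longitude along the equator, measured two ways:
one_degree = haversine_m(0.0, 0.0, 1.0, 0.0)
across_antimeridian = haversine_m(179.5, 0.0, -179.5, 0.0)
```

Because the formula works on angular differences, the pair straddling the antimeridian gives the same distance as the pair near longitude 0, which a naive planar calculation on lon/lat values would not.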
Do something like http://xarray.pydata.org/en/stable/roadmap.html
- Open an issue for this
places to ask questions vs. filing an issue? document.
- notebooks/examples
Installation issues