Community Meeting Notes Archive

Martin Fleischmann edited this page Sep 26, 2022 · 5 revisions

The archive of Community Meeting Notes. See the most recent notes and the tentative agenda for the next meeting on HackMD.

2022-09-01

Attendees: Martin Fleischmann, Brendan Ward, Joris van den Bossche, Matt Richards

2022-07-28

Attendees: Martin Fleischmann, Brendan Ward, Thomson Comer

  • Preserving sindex in copies of GeoDataFrame (https://github.com/geopandas/geopandas/issues/2510)
    • in favor of the idea but may be very complex
    • with pygeos / shapely 2.0 it is very fast to build the index, but slower with RTree; checking against original may be more computationally expensive than just building new tree
      • need to time it to be sure
  • Discontinued Windows wheels (https://github.com/geopandas/geopandas/issues/2465)
  • Projecting to WGS84 when outputting GeoJSON
    • old PR: https://github.com/geopandas/geopandas/pull/416
    • may need to document that we are following the 2008 spec (no automatic reprojection) rather than RFC 7946 (2016 spec).
    • Ideally would reproject because of latest / formal spec, but GDAL does not do by default
    • Fiona / pyogrio always
    • For to_json, add a flag that by default does what GDAL does (no automatic reprojection) with initial default of None (what is in PR); maybe flip this over a deprecation period to automatic
    • Path forward: work on getting existing PR merged, document inconsistency vs Fiona / Pyogrio
  • M1 / Arm64 installation issues via pip
    • seems to require uninstalling then building binary deps (pygeos, shapely, pyogrio) from source; do we want to note this in the docs?
    • Installation all works fine for Martin using mamba-forge / conda-forge; these are what we should document
    • Pip: generally no wheels available for M1 (pygeos has them)
    • https://github.com/geopandas/geopandas/issues/1816#issuecomment-1003093329
  • modernizing setup.py => pyproject.toml per PEP 621
  • cuSpatial:
    • Thomson joined to represent project
    • wants to have API parity with GeoPandas
    • Working on geoarrow spec
    • Working on I/O between cuSpatial and GeoPandas using geoarrow
    • For feature parity, will need to rewrite core GEOS functionality for CUDA
      • Working on implementing DE-9IM
      • then projective transforms (similar to pyproj)
    • team is largely focused on C++ header only libraries
    • Working on defining I/O between geopandas / GEOS objects and arrow for use in CUDA
    • also wanting to support dask-geopandas integration with cuSpatial
  • adding poetry-specific sections to pyproject.toml?
    • consider adding these in pyogrio as a test for folks that use poetry
  • S2 NumFOCUS grant
    • Follow up with Benoît re: schedule for this
  • Next NumFOCUS grant cycle(s)
    • next proposal submission deadline: September 2, 2022
    • Joris may have a contact to work on a NumFOCUS grant
  • shapely 2.0 update
    • getting close to 2.0a1 release
  • using quadtree to help with partitioning of geometries for dask-geopandas
    • already a python implementation of quadtree
    • could expose a CAPI in GEOS against quadtree
    • helps because you know the rectangular geometries of the quads, but to do this you need to process all the geometries at once
      • could maybe do as a 2 pass operation; create the quadtree structure then have workers populate geometries into it
      • look at the cuSpatial quadtree implementation for inspiration
  • pyogrio update:
    • fix conda-forge for 0.4.1 release; some issues building, Martin will follow up with conda-forge folks
  • next meeting schedule?
    • Martin will work with Joris to set next meeting date
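
The two-pass quadtree idea from this meeting (create the quadtree structure first, then have workers populate geometries into it) could be sketched as follows. This is a minimal pure-Python illustration, assuming each geometry is reduced to a representative point and that cells are half-open; `build_quadtree`, `contains`, and `assign` are hypothetical names, not GEOS, dask-geopandas, or cuSpatial APIs.

```python
from collections import defaultdict


def contains(quad, point):
    """Half-open containment test: xmin <= x < xmax and ymin <= y < ymax."""
    xmin, ymin, xmax, ymax = quad
    x, y = point
    return xmin <= x < xmax and ymin <= y < ymax


def build_quadtree(points, bounds, max_points=2, max_depth=8):
    """Pass 1: split the extent until each quad holds at most max_points.

    Returns the leaf rectangles as (xmin, ymin, xmax, ymax) tuples.
    """
    if len(points) <= max_points or max_depth == 0:
        return [bounds]
    xmin, ymin, xmax, ymax = bounds
    xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
    leaves = []
    for quad in (
        (xmin, ymin, xmid, ymid),
        (xmid, ymin, xmax, ymid),
        (xmin, ymid, xmid, ymax),
        (xmid, ymid, xmax, ymax),
    ):
        inside = [p for p in points if contains(quad, p)]
        leaves.extend(build_quadtree(inside, quad, max_points, max_depth - 1))
    return leaves


def assign(points, leaves):
    """Pass 2: map each point index to the id of the leaf that contains it."""
    partitions = defaultdict(list)
    for i, point in enumerate(points):
        for leaf_id, quad in enumerate(leaves):
            if contains(quad, point):
                partitions[leaf_id].append(i)
                break
    return dict(partitions)


points = [(1, 1), (2, 2), (1, 3), (9, 9), (8, 1), (6, 6)]
leaves = build_quadtree(points, (0, 0, 10, 10))
partitions = assign(points, leaves)
```

Because the leaf rectangles are known after the first pass, the second pass needs no shared state, so each worker could run `assign` on its own chunk of geometries.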

2022-06-02

Attendees: Martin Fleischmann, Joris van den Bossche, Brendan Ward

  • GeoPandas 0.10.3 (aim for next couple of days? Nope: superseded by below)
  • GeoPandas 0.11 (aim for next couple of days)
  • S2 NumFOCUS grant
  • GSoC update
    • no projects selected this year
  • GeoPython 2022
    • dask-geopandas workshop (2 hours)
    • GeoPandas talk
  • FOSS4G
    • state of GeoPandas talk
  • NumFOCUS small development grants (https://numfocus.org/programs/small-development-grants):
    • round 2 closes June 5
    • round 3 closes Sept 2:
      • check in on this in July dev meeting
      • maybe put in for pyogrio GDAL Arrow I/O
  • Ecosystem updates:
    • pyogrio
    • shapely
      • STRtree API: query() returns indices, query_items() returns custom items, query_geoms() returns geometries; add deprecation warning to 1.8 branch
      • Adding new GEOS 3.11 functionality:
        • Consider adding more PRs soon
    • GDAL:
      • GeoParquet / GeoArrow support landed in 3.5.0
      • Arrow I/O interface coming in GDAL C API; pyogrio opt-in if GDAL new enough and pyarrow available
    • GeoParquet spec:
      • 0.4.0 support now in place

2022-04-28

Attendees: Martin Fleischmann, Joris van den Bossche, Levi John Wolf, Matt Richards, Pieter Roggemans, Brendan Ward

  • Ecosystem updates:
    • pyogrio:
      • mixed geometries
        • https://github.com/geopandas/pyogrio/pull/75
        • Fiona always uses "Unknown" for mixed pluralities, ogr2ogr generally preserves existing type
        • by default (allow user to opt-out), try to write (promote if needed), fall back to "unknown"
        • maybe change the geometry_type parameter to instead allow toggling promotion as needed (maybe don't need promote="always"; have a method on GeoDataFrame instead?)
        • maybe promote_to_multi where True = always, False = don't do it, None = do it automatically
        • Consider promoting truly mixed geometries to lowest common representation (e.g., Points + Lines => GeometryCollection)
      • wheel building (GEOS)
      • 0.4 release
      • Issues with GeoPandas integration tests:
        • lots of GeoJSON files used for testing, but these don't match because pyogrio sets correct integer types
        • GDAL uses int32 by default whereas Fiona is using int64 by default
      • Missing features:
        • write support for datetime
        • allow write support for None geometry values
    • GeoParquet
      • 0.2 requirement for winding order rolled back in 0.3
      • CRS now optional and by default assumed to be OGC:CRS84
      • Could do a GeoPandas patch release for matching 0.1 spec; maybe do so for 0.3 but still ambiguous default re: missing CRS
    • Shapely 2.0
      • STRtree changes are merged
      • a few aliases still need to be updated
  • GSoC update
    • selection process ongoing (note that we cannot publish the decision yet)
    • for future GSoC, may get better response if we have more specific tasks (similar in focus to PySAL); GeoPandas roadmap might help too
    • try harder to get potential GSoC contributors to contribute something small early on while they are in the pre-proposal timeframe
  • NumFOCUS SDG update
    • S2 spherical geometry pilot grant received!
  • GeoPandas 0.11
    • would be nice to get pyogrio 0.4 in
    • if possible, get random sampling PR merged
    • given limited capacity, not try to pull in too many other features
  • Next community meeting:
    • consider moving to first week of June; aim to dedicate it to roadmap
    • cancel following one at end of June?
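
The tri-state `promote_to_multi` option discussed for pyogrio in this meeting can be illustrated with a small decision function. This is only a sketch of the rule as noted above (True = always, False = never, None = automatic, with a fallback to "Unknown"); `resolve_geometry_type` is a hypothetical helper, not pyogrio's actual implementation.

```python
PROMOTABLE = {"Point", "LineString", "Polygon"}


def resolve_geometry_type(geom_types, promote_to_multi=None):
    """Pick one layer geometry type for the feature geometry types given.

    promote_to_multi: True = always promote to the Multi* type,
    False = never promote, None = promote only when single and multi
    variants of the same base type are mixed.
    """
    types = set(geom_types)
    # Strip the "Multi" prefix to find the underlying base types.
    bases = {t[5:] if t.startswith("Multi") else t for t in types}
    if len(bases) != 1:
        return "Unknown"  # truly mixed, e.g. Point + LineString
    base = next(iter(bases))
    if base not in PROMOTABLE:
        return base  # e.g. GeometryCollection has no Multi* variant
    mixed = len(types) == 2  # both base and Multi + base are present
    if promote_to_multi is True or (mixed and promote_to_multi is None):
        return "Multi" + base
    if mixed:
        return "Unknown"  # promotion was explicitly disabled
    return next(iter(types))


print(resolve_geometry_type(["Polygon", "MultiPolygon"]))
print(resolve_geometry_type(["Polygon", "MultiPolygon"], promote_to_multi=False))
```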

2022-03-31

(attending: Joris van den Bossche, Martin Fleischmann, Brendan Ward)

  • Ecosystem updates
    • pyogrio 0.4 release
      • release without arm64
      • aim for release next week
      • should be able to finish the integration into GeoPandas
    • shapely 2.0
    • xyz services keeps getting updated with new services every 1-2 months
    • dask-geopandas 0.1.0 released
    • GDAL: initial parquet / arrow driver support: https://github.com/OSGeo/gdal/pull/5477
  • GeoPandas roadmap
  • GSoC update
    • Applications open on April 4, closes on April 19
  • NumFOCUS SDG update
    • Benoit submitted proposal for Google S2 spherical geometry pilot
  • GeoPandas 0.11 release
    • can refresh pure-Python shapefile I/O PR and include in release (would be a different engine)
    • make pyogrio opt-in via engine keyword if installed; make default if installed in 0.12
    • need to decide whether renaming index_right to the right dataframe's index name in the spatial join result should make it into the release; want to get the other performance updates from the PR in either way
  • growing maintenance backlog
    • need more help reviewing PRs
  • Plan to have monthly GeoPandas meeting
  • Next meeting (2022/04/28) will be at 20:00 UTC
    • consider having every other meeting at 17:00 UTC
  • Dask-geopandas meeting times and frequency
    • stop until next round of GSoC
  • Website
    • right now sphinx is both documentation and GeoPandas homepage
    • may want to consider splitting out GeoPandas documentation from homepage, so that can update homepage without having to push a release
    • put docs at docs.geopandas.org
    • may want to check w/ NumFOCUS if they can manage the domain
  • NumFOCUS
    • Need to investigate what it takes to become a fiscally sponsored project
      • Need roadmap and governance (but we have code of conduct)

2022-01-27

(attending: Martin Fleischmann, Brendan Ward, Thomas Statham, Matt Richards, Levi Wolf, Joris Van den Bossche, Alan Snow)

  • Community call time
    • Matt is in UTC+10, Brendan UTC-8, Alan is UTC-6, Joris is UTC+1, Martin and Levi are UTC
    • Shall we consider different time or switching between them periodically?
    • Next time will try later UTC time (20:00 UTC?)
  • Ecosystem updates
  • NumFOCUS SDG (S2?)
    • Round 1
      • Call for Proposals Announcement: February 4, 2022
      • Proposal Submission Deadline: March 4, 2022
      • Committee Selection Deadline: March 18, 2022
      • Notification to Applicants Deadline: April 15, 2022
    • Ideas:
  • GSoC
  • GeoPython 2022
    • Basel Switzerland June 13-15 (hybrid; might have in-person component)
    • Talk submission end of Feb, workshops end of March
    • Might be good to have a talk on state of GeoPandas and ecosystem
    • Might be good to have workshop on dask-geopandas
  • FOSS4G
    • Firenze Italy, Aug 22-28
    • Deadlines: Talks/papers: end of Feb
    • Let's follow up offline with Martin, Joris, Levi
  • GeoPandas 0.11 release timeline
    • should there be one more release before shapely 2.0 support?
    • Issues around GeoDataFrame constructor / active geometry columns
      • 0 geometry columns -> no GeoDataFrame
      • >=1 geometry columns -> keep GeoDataFrame, even if active is not present
        • better handle the case of the active geometry column being present
        • better handle crs on the geodataframe
  • dask-geopandas 0.1 release timeline
    • want to include spatial shuffle, documentation updates, Hilbert distance with numpy
      • read_file via pyogrio would also be nice to include and nearly ready; Joris will look at this again soon
    • Hilbert distance: do we want to have this in GEOS C API and just include in Shapely 2.0? Unlikely to be faster having this as a scalar ufunc against GEOS
      • TODO: open issue upstream at GEOS
    • regression in dissolve operation
      • dask renames columns in intermediate aggregation results then names them back; this creates a new GeoDataFrame with no geometry, which then fails in subsequent step
  • possible dask-geopandas funding from the GDSL
    • may be opportunity to fund someone on dask-geopandas
  • GeoPandas roadmap

2021-12-02

(attending: Martin Fleischmann, Joris Van den Bossche, Brendan Ward, Benoit Bovy, Alan Snow, Jan Simbera)

  • Ecosystem updates
    • GeoPandas:
      • pyogrio engine (https://github.com/geopandas/geopandas/pull/2225)
        • longer term may want to do a hard switch from Fiona to pyogrio; some problems if both are installed via pip (conda is OK)
        • may also want to make the backends optional, and install pure python support for Shapefile / geopackage by default (or leave all as optional)
        • may want to look into xarray engine loading model
        • need to figure out how to build wheels for pyogrio
    • Shapely 2.0 / pygeos
      • The merge is finally happening! (this will also mean that Shapely main branch is temporarily not working with GeoPandas)
      • push new feature development to shapely instead of pygeos
      • pygeos 0.12 release coming soon
      • once pygeos is fully integrated into Shapely and stable, then archive pygeos; will remove pygeos opt dep. from geopandas by geopandas 1.0
    • Dask-GeoPandas
      • much of the core functionality, mostly working on spatial partitions
    • GeoArrow specification
      • First draft: https://github.com/geopandas/geo-arrow-spec/pull/12
      • __geo_arrow_interface__ in Python?
      • goal is to have arrow native way to store geometries instead of WKB for storage, uses compact storage of coordinates
      • approach already used by cuSpatial for copying spatial data to GPU
      • already some of the basic functionality in pygeos / Shapely 2.0 (get rings, coordinates, etc); requires multiple steps, but already faster than WKB conversion.
      • goal is to have one function that does this conversion
  • Expansion of the team
    • Martin's time is restricted in the following months leading to long response times on issues and PRs
    • consider using triage approach
    • need to formalize approach for adding new committers
  • S2 geometry engine
    • see this thread for context
    • https://github.com/benbovy/pys2index
    • overview from Benoit:
      • lightweight wrapper for S2 point index with API similar to scipy.spatial.cKDTree
      • performance in benchmarks so far is quite fast
      • would like to have vectorized wrappers for S2
      • S2 appears to be actively maintained and about to get additional functionality soon
      • used python-xtensor and pybind11 to work with S2 and numpy arrays
    • Two possible approaches to integrate:
      • own way to store geometries specific to backend engine; convert geometries on the fly to S2 objects as part of specific operation (e.g., predicate)
    • Wrapper classes:
      • pygeos uses Python C extension wrapper for GEOS geometries so that GEOS objects are managed according to Python object lifecycles
    • related issue
    • consider putting in a request for NumFOCUS small development grant to start building out some of this support (next cycle may open early 2022): https://numfocus.org/programs/small-development-grants
  • Formalise and publish a roadmap
  • 2022 meetings schedule
    • keep current cycle: last Thursday of odd-numbered months, same UTC time
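
The GeoArrow goal noted in this meeting (storing coordinates compactly instead of as WKB) rests on flat coordinate buffers plus offset arrays. The sketch below shows that idea for polygons in plain Python lists; it is an illustration of the nested-list layout only, not the exact memory layout of the draft spec, and `polygons_to_offsets` and `get_polygon` are hypothetical names.

```python
def polygons_to_offsets(polygons):
    """Flatten polygons (lists of rings of (x, y) tuples) into three arrays."""
    coords = []         # flat list of (x, y) pairs for all rings
    ring_offsets = [0]  # ring r spans coords[ring_offsets[r]:ring_offsets[r + 1]]
    geom_offsets = [0]  # polygon j spans rings geom_offsets[j]:geom_offsets[j + 1]
    for rings in polygons:
        for ring in rings:
            coords.extend(ring)
            ring_offsets.append(len(coords))
        geom_offsets.append(len(ring_offsets) - 1)
    return coords, ring_offsets, geom_offsets


def get_polygon(j, coords, ring_offsets, geom_offsets):
    """Rebuild polygon j from the flat arrays without any parsing step."""
    return [
        coords[ring_offsets[r]:ring_offsets[r + 1]]
        for r in range(geom_offsets[j], geom_offsets[j + 1])
    ]


# A square with one hole, and a triangle without holes.
square = [
    [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)],
    [(1, 1), (2, 1), (2, 2), (1, 1)],
]
triangle = [[(10, 10), (12, 10), (11, 12), (10, 10)]]

coords, ring_offsets, geom_offsets = polygons_to_offsets([square, triangle])
```

Since every coordinate sits in one contiguous buffer, such a structure can be copied or handed to another library without decoding each geometry, which is the motivation given in the notes.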

2021-09-30

(attending: Martin Fleischmann, Joris Van den Bossche, Levi Wolf, Thomas Louf, Brendan Ward, Daniel Alejandro Mesejo-Leon, Imanol)

  • 0.10 release
  • geopandas.org domain
    • any updates regarding ownership?
    • need to find someone who can get a direct response from Kelsey
  • geopandas/benchmarks repo (or benchmark-data)
    • for macro benchmarks that don't fit into the ASV benchmarks
    • use issues to nominate datasets to use for benchmarks
    • GADM polygons often offer very good variety in terms of points-per-polygon
    • railways, municipalities..
  • EPSRC grant call
    • grant proposal is coming along; due Oct 14
    • integrating pygeos and geopandas-dask philosophies into core geopandas plus integration with other libraries
    • need
      • letter of support from Tom Augspurger (w/ planetary computer) (if anyone has better contact info than his gmail, send to [email protected])
      • 2 page resume/CV for @jorisvandenbossche & @martinfleis
    • proposal text:
  • Sub-project status updates:
    • dask-geopandas
      • once shuffle is in place will make next release (0.1) and publicize more
      • already in planetary computer docker images
      • had a Google Summer of Code project on this focused on spatial partitioning methods
    • pygeos / Shapely 2.0
      • GEOS 3.10 release coming soon; will have some things that we'd like to add
      • Shapely 1.8 is ready with all deprecations in place, just needs to be reviewed / released
      • will migrate pygeos into shapely after 1.8 is out
      • need to coordinate with Sean Gillies re: committers / admin rights
    • pyogrio
      • after geopandas 0.10, add an engine keyword to read_file / to_file to use
        • create issue for this (-> who?)
      • longer term (geopandas 1.0) aim to have Fiona replaced by pyogrio
      • we can also expose the pyogrio helper functions in geopandas (eg list_layers())
    • xyzservices (https://github.com/geopandas/xyzservices)
    • contextily

2021-08-05

(attending: Martin Fleischmann, Levi Wolf, Stefanie Lumnitz, Tom Augspurger, Brendan Ward, Joris Van den Bossche, Thomas Statham)

  • Microsoft / planetary-computer & tabular data
    • https://planetarycomputer.microsoft.com/catalog
    • based around STAC; supports Zarr / NetCDF. Working on expanding to tabular support
    • wanting to refine recommendations for representing tabular data
      • use Parquet format
      • need to finalize the implementation of geo-arrow-spec in GeoPandas
        • metadata is mostly done
        • storage currently uses WKB (will always support this as a fallback), planning to revisit this to optimize using Arrow data structures
        • would like GDAL to support this as well; longer term want to use the Arrow C data API (both for file formats as well as transport after reading those to downstream libs like GeoPandas)
    • See Cloud Data Warehouse Geospatial Interoperability
      • this is just getting off the ground
  • NumFOCUS dask-geopandas IO project
    • see proposal
    • Parquet support is mostly complete
    • Feather dataset (https://github.com/geopandas/dask-geopandas/pull/91/)
    • Plan is to:
      • read bounds from file (already implemented)
      • use methods in dask-geopandas to determine partitions (via Hilbert curve distance, etc)
      • then read underlying features into those partitions
    • Timeline is next ~3 months
  • funding
  • xyzservices release
    • https://github.com/geopandas/xyzservices
    • takes contextily providers (metadata for tile providers) and puts into dedicated package
      • goal is centralized package to be used within the ecosystem
      • contextily will be updated soon to use this
    • pushing more broadly within ecosystem; other packages starting to use or expressed interest
  • installation as geopandas and geopandas-base to either get minimal dependencies or most dependencies
  • API for tools
    • gdf.sjoin(other) vs geopandas.sjoin(left, right) #1984
    • need to define rules for what is a method vs a function
      • method approach is more common for pandas
      • for dask-geopandas method is preferred
      • clip is potentially problematic since supported by pandas, but since it is a numeric method not applicable to geometries anyway (currently fails), probably OK to make clip here support only geometry implementation
    • duplicate vs deprecate functional approach?
      • in favor of deprecation, though a bit annoying for community since functions are widely used
      • short term can pass through functional to method approach to limit duplication of code
      • start teaching around the method approach
      • next release aim to have method approach, release after mark functional ones as deprecated
  • API of matrix binary operations
    • https://github.com/geopandas/geopandas/pull/1674
    • We now have an implementation based on sparse matrix which works really well for all the use cases
    • API:
      • always return sparse array (use sparse package as optional dependency)
        • basic support for sparse in dask (using scipy.sparse) but lots of things not yet in place
        • have a keyword for sparse backend scipy.sparse or pydata sparse?
      • single method (predicate is a parameter) vs one method per predicate (which could use the former internally)
        • could also have predicate_matrix for everything, and also expose intersects_matrix since this is most likely used
  • pyData Global
    • possible talk on updates in GeoPandas
  • type hints
    • lots of outstanding PRs -> start with reviewing the geoseries.py file
    • testing
    • may not want to do for next release
    • for internal functions, aim to have strict types
  • 0.10 release target
    • add explore as a highlight in this release
  • Shapely 2.0
    • slowly moving forward
    • STRtree discussion resolved
    • need to have a shapely 1.8 release first
    • branch in shapely using pygeos is ready to merge into master
    • numpy warnings -> also ignore in geopandas (TODO Joris)
  • functions in pygeos as methods on GeoSeries
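
The I/O plan from this meeting (read bounds, determine partitions via Hilbert curve distance, then read features into those partitions) can be sketched in pure Python. The Hilbert distance below is the standard iterative xy-to-distance algorithm; the partitioning step is an assumption for illustration (bounding-box centers, equal-sized chunks), and `hilbert_distance` and `partition_by_hilbert` are hypothetical names rather than dask-geopandas functions.

```python
def hilbert_distance(x, y, order=16):
    """Distance along a Hilbert curve covering a 2**order x 2**order grid."""
    n = 1 << order
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition_by_hilbert(bounds, npartitions, order=16):
    """Assign each (xmin, ymin, xmax, ymax) box to a partition id."""
    centers = [((b[0] + b[2]) / 2, (b[1] + b[3]) / 2) for b in bounds]
    xs, ys = zip(*centers)
    x0, y0 = min(xs), min(ys)
    width = max(max(xs) - x0, max(ys) - y0) or 1.0
    side = (1 << order) - 1
    # Scale centers onto the integer grid, then measure curve distance.
    distances = [
        hilbert_distance(
            int((x - x0) / width * side), int((y - y0) / width * side), order
        )
        for x, y in centers
    ]
    ranked = sorted(range(len(bounds)), key=distances.__getitem__)
    size = -(-len(bounds) // npartitions)  # ceiling division
    partition = [0] * len(bounds)
    for rank, i in enumerate(ranked):
        partition[i] = rank // size
    return partition


bounds = [(0, 0, 1, 1), (100, 100, 101, 101), (1, 0, 2, 1), (99, 100, 100, 101)]
partition = partition_by_hilbert(bounds, npartitions=2)
```

Only the bounds are needed to compute the assignment, so the features themselves can be read afterwards, one partition at a time.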

2021-05-27

(attending Martin, Joris, Stefanie, Thomas, Brendan)

  • dask-geopandas
    • Dask Summit workshop debrief
    • Google Summer of Code
      • We have one project on dask-geopandas development
        • Logistics:
          • smaller meetings every week, aim for Thurs 4-5 PM UTC; Martin will setup meetings
          • every 2 months a larger GeoPandas meeting
          • use Github issues, PRs, GeoPandas gitter, dask-geopandas gitter
          • Martin is admin point of contact
          • Blog posts from GSOC: these to get linked into NumFOCUS blog
        • Goals:
          • spatial partitioning
            • explore writing out to Parquet?
            • need to figure out partitioning methods, e.g., Hilbert curve
            • probably want to implement a couple methods: Hilbert, maybe a gridded approach
            • first identify some of the options:
              • simple grid
              • known regions (can do spatial clustering for getting more or less homogeneous sized partitions)
              • hilbert curve
              • quadtree: might work well, not exposed yet in GEOS C API / pygeos
              • strtree: don't have access to nodes / leaves via GEOS C API / pygeos
            • storage of partitions
              • right now just polygons as a geoseries
          • spatial indexing
            • also want to make sure this gets done
            • only place this is currently used is for writing to Parquet and cx coordinate indexer
            • good starter PR: simple predicates: intersects; check for overlap with partition first, before checking geometries within partition
      • feedback to rejected projects?
    • NumFOCUS SDG
      • Joris wants to apply for SDG to work on dask-geopandas
      • Focus more on I/O
        • Read large dataset, have dask-geopandas figure out partitioning to files
        • Read index and bounding boxes into memory to drive the partitioning, then use the partition bounding boxes or lists of indexes to query out chunks of data
        • Optimize parquet: store coordinates instead of WKB
        • Feather support? Right now using the dask support for Parquet, not available for Feather in dask; Joris has a prototype Feather file reader for dask
        • Convert GDAL directly to Arrow memory format instead of WKB
          • maybe do directly in GDAL
          • try first in pyogrio
  • GeoPandas Blog
  • API of matrix binary operations
    • https://github.com/geopandas/geopandas/pull/1674
    • We now have an implementation based on sparse matrix which works really well for all the use cases
    • Qs:
      • API
      • which sparse backend? scipy.sparse or pydata sparse?
      • Martin is planning to base the implementation around sparse approach
      • Discuss next time
  • API for interactive plotting
    • https://github.com/geopandas/geopandas/issues/1904
    • We want pluggable interactive plotting backends. How to do it smoothly?
      • interest from some of the plotting backends
      • don't really want global config for plot method
      • want to keep usage of static and interactive plotting separate, don't clobber the static implementation by using interactive plotting; keep these in separate methods
      • add another method: explore / view for interactive maps
    • datashader option to HVplot:
      • works quite well for large data
      • Joris follow up with them: can instance check be expanded to include geopandas geodataframes (via dask-geopandas), not just spatialpandas frames
  • Community calls
    • we have a shared Google Calendar for GeoPandas-related events
    • meetings are set to 17:00 UTC every two months (last Thursday)
  • xyzservices
    • new package under geopandas umbrella
    • formerly contextily.providers
    • https://github.com/geopandas/xyzservices
    • planning to have available before next release of geopandas
    • will have 2 JSON formats:
      • pretty version that includes metadata
      • compiled / compressed version that is actually used in code; plan is to create via Github action
  • Ecosystem update
    • cuSpatial should fully support geopandas-cuspatial dataframe conversion in the next release
  • Shapely 2.0
    • Joris planning to do more on this in June
    • main blocking issue is the discussion around STRtree
    • Differences in minimum rotated rectangle between Shapely's pure python method and method in GEOS
      • Follow up with GEOS team about differences
      • OpenCV method same as Shapely
      • Also a method in PostGIS - is it the same?
  • Pyogrio
    • Brendan: transfer to GeoPandas org
  • Other
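
The starter idea for spatial indexing noted in this meeting (check for overlap with the partition first, before checking the geometries within it) might look like the sketch below. It assumes geometries are represented only by their bounding boxes, so the "predicate" here is a box intersection test rather than a real GEOS predicate; `bbox_intersects` and `query_partitions` are hypothetical names.

```python
def bbox_intersects(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query_partitions(partitions, query_box):
    """Return hits as (partition_id, geometry_id) and the number of checks.

    partitions: list of (partition_bounds, [geometry_bounds, ...]) tuples.
    """
    hits = []
    checked = 0
    for pid, (partition_bounds, geometries) in enumerate(partitions):
        if not bbox_intersects(partition_bounds, query_box):
            continue  # the whole partition is skipped without touching its geometries
        for gid, geometry_bounds in enumerate(geometries):
            checked += 1
            if bbox_intersects(geometry_bounds, query_box):
                hits.append((pid, gid))
    return hits, checked


partitions = [
    ((0, 0, 10, 10), [(1, 1, 2, 2), (8, 8, 9, 9)]),
    ((20, 20, 30, 30), [(21, 21, 22, 22)]),
]
hits, checked = query_partitions(partitions, (0, 0, 3, 3))
```

In this example the second partition is never opened, so only two of the three geometries are tested.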

2021-05-24 GSOC coordination

  • Weekly meetings
  • Use public channels for discussion / questions (github issues, gitter channel, (specific? -> make a dask-geopandas channel))
  • Single Point of Contact (more for administrative questions)
    • Martin
  • Blog: on NumFOCUS & personal site is fine, no need for GeoPandas branded one

2021-03-25

(attending: Martin, Joris, James, Brendan, Sangarshanan, Levi)

  • Google Summer of Code
    • We have submitted 3 project ideas
    • Students should get in touch now and submit proposals within weeks
      • students will start applying next Monday
      • We need to select students between mid-April and mid-May
    • Should we advertise it more? Prospect on possible students?
      • TODO: Post on Twitter again (done)
      • PySAL: primarily recruits from own students; ~1/2 have been affiliated that way
  • Community repository
    • we have a new geopandas/community repo
      • if not package-specific or not specific to code (governance, code of conduct), post to this repo
      • if specific to GeoPandas post issues to GeoPandas instead
      • use for announcing meetings or proposals (workshops, funding)
    • how should we efficiently use it?
    • https://github.com/geopandas/community
    • TODO: post issue for how to get funding for GeoPandas features or ideas list for potential future grants
  • Community calls
    • shall we switch to some predictable schedule? (Bi-)Monthly?
    • start with bimonthly on last Thursday of each month
      • TODO: post schedule to community repo
    • archive prior call notes to community repo; keep markdown doc for latest meeting
  • dask-geopandas
    • repository moved to GeoPandas org
    • https://github.com/geopandas/dask-geopandas
    • Dask-Summit workshop proposal
      • In May: https://summit.dask.org/
      • submitted proposal around scaling GeoPandas vector operations
      • Could have a presentation about current status of dask-geopandas
      • Some discussion around spatial partitioning
      • Look for ways to collaborate with spatial pandas
      • Would be good to do visualization of bigger data
      • TODO: add issue in community repo for ideas for this workshop
    • First alpha released on PyPI, still needs conda-forge
      • Martin: will add to conda-forge
      • Biggest needs: spatial index and overlap operations
  • User-friendly API of matrix binary operations
    • would be nice to have "intersects_matrix" in 0.10
    • We should agree on the API design; implementation should be straightforward based on query_bulk
    • https://github.com/geopandas/geopandas/pull/1674
    • returning a list maybe not particularly useful
    • might be good to have a few example use cases
      • does any polygon in input intersect any in right dataframe
      • which of them in left dataframe intersects any in right dataframe
      • how many intersect
    • use outer strategy with sparse argument
      • currently don't depend on scipy; makes it harder to use sparse option
      • can keep sparse as an optional argument; fall back to full matrix
    • another alternative is to use xarray and pydata sparse backend (optional dependencies)
    • could just return dense pandas table of left and right indices
  • Interactive plotting
    • the existing tools are not as friendly as we thought
    • folium-based implementation of GeoDataFrame.view() mirroring the language of plot()
    • https://github.com/martinfleis/geopandas-view
    • should it be embedded in GeoPandas? Or as an affiliated project under GeoPandas repo?
    • @sangarshanan is willing to help maintain it
    • status: most of the stuff supported for static plotting in matplotlib is now supported against folium
    • considerations for API:
      • plotting backend provider
      • namespacing folium / interactive methods to prevent collision with static plotting
      • over some threshold do not want to plot in folium
      • might be good to look at how sf in R handles translation to backend providers
      • implementation of backend can be outside GeoPandas; might be easier to have this directly in GeoPandas in order to allow it as a default (not a lot of code)
    • will do a bit more work to polish then migrate into GeoPandas
  • contextily providers module
  • Ecosystem update
  • geopandas.org
    • we still don't have access to the domain to point it to RTD
      • Joris will ping Kelsey J.
    • also need to have ownership in Pypi; need to be able to add others
    • conda forge:
      • anyone can help maintain this
      • currently Joris, James, Filipe
  • NumFOCUS small grants
    • do we want to apply for something in the near future?
    • does anyone have capacity?
    • next round likely before summer
    • open issue on community repo
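
The example use cases listed under the matrix binary operations item can all be answered from a list of (left, right) index pairs, which is roughly what a bulk spatial index query yields. The sketch below uses bounding boxes as stand-ins for geometries and a brute-force pair search in place of an index; `intersects_pairs` and `to_dense` are hypothetical names, not the API proposed in the linked PR.

```python
def bbox_intersects(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intersects_pairs(left, right):
    """Sparse result: (left_index, right_index) for every intersecting pair."""
    return [
        (i, j)
        for i, a in enumerate(left)
        for j, b in enumerate(right)
        if bbox_intersects(a, b)
    ]


def to_dense(pairs, n_left, n_right):
    """Dense fallback: full boolean matrix with one row per left geometry."""
    matrix = [[False] * n_right for _ in range(n_left)]
    for i, j in pairs:
        matrix[i][j] = True
    return matrix


left = [(0, 0, 2, 2), (5, 5, 6, 6), (10, 10, 12, 12)]
right = [(1, 1, 3, 3), (1.5, 1.5, 11, 11)]

pairs = intersects_pairs(left, right)
matrix = to_dense(pairs, len(left), len(right))

any_intersects = bool(pairs)                  # does anything intersect?
left_with_hits = sorted({i for i, _ in pairs})  # which left rows intersect any right
counts = [sum(row) for row in matrix]         # how many intersect, per left row
```

The pair list is the compact form; the dense matrix grows with the product of both lengths, which is why the notes weigh a sparse return type against a full matrix.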

2021-03-25

(attending: Martin, Joris, James, Brendan, Sangarshanan, Levi)

  • Google Summer of Code
    • We have submitted 3 project ideas
    • Students should get in touch now and submit proposals within weeks
      • students will start applying next Monday
      • We need to select students between mid-April and mid-May
    • Should we advertise it more? Prospect on possible students?
      • TODO: Post on Twitter again (done)
      • PySAL: primarily recruits from own students; ~1/2 have been affiliated that way
  • Community repository
    • we have a new geopandas/community repo
      • if not package specific to not specific to code, governance, code of conduct, post to this
      • if specific to GeoPandas post issues to GeoPandas instead
      • use for announcing meetings or proposals (workshops, funding)
    • how should we efficiently use it?
    • https://github.com/geopandas/community
    • TODO: post issue for how to get funding for GeoPandas features or ideas list for potential future grants
  • Community calls
    • shall we switch to some predictable schedule? (Bi-)Monthly?
    • start with bimonthly on last Thursday of each month
      • TODO: post schedule to community repo
    • archive prior call notes to community repo; keep markdown doc for latest meeting
  • dask-geopandas
    • repository moved to GeoPandas org
    • https://github.com/geopandas/dask-geopandas
    • Dask-Summit workshop proposal
      • In May: https://summit.dask.org/
      • submitted proposal around scaling GeoPandas vector operations
      • Could have a presentation about current status of dask-geopandas
      • Some discussion around spatial partitioning
      • Look for ways to collaborate with spatial pandas
      • Would be good to do visualization of bigger data
      • TODO: add issue in community repo for ideas for this workshop
    • First alpha released on PyPI, still needs conda-forge
      • Martin: will add to conda-forge
      • Biggest needs: spatial index and overlap operations
  • User-friendly API of matrix binary operations
    • would be nice to have "intersects_matrix" in 0.10
    • We should agree on the API design; implementation should be straightforward based on query_bulk
    • https://github.com/geopandas/geopandas/pull/1674
    • returning a list maybe not particularly useful
    • might be good to have a few example use cases
      • does any polygon in input intersect any in right dataframe
      • which of them in left dataframe intersects any in right dataframe
      • how many intersect
    • use outer strategy with sparse argument
      • currently don't depend on scipy; makes it harder to use sparse option
      • can keep sparse as an optional argument; fall back to full matrix
    • another alternative is to use xarray and pydata sparse backend (optional dependencies)
    • could just return dense pandas table of left and right indices
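
The three example use cases above can all be answered from a plain list of (left, right) index pairs, which is roughly the shape of result a bulk spatial index query produces. The following is a minimal sketch in pure Python; the `pairs` input and the helper names `summarize_pairs` and `to_dense` are hypothetical and are not an existing or agreed GeoPandas API.

```python
from collections import Counter


def summarize_pairs(pairs, n_left):
    """Summarize (left_idx, right_idx) intersection pairs per left row.

    Assumes each pair appears at most once.
    """
    counts = Counter(left for left, _ in pairs)
    # How many right geometries each left geometry intersects.
    how_many = [counts.get(i, 0) for i in range(n_left)]
    # Whether each left geometry intersects any right geometry.
    intersects_any = [c > 0 for c in how_many]
    return intersects_any, how_many


def to_dense(pairs, n_left, n_right):
    """Expand the sparse pair list into a full boolean matrix."""
    matrix = [[False] * n_right for _ in range(n_left)]
    for left, right in pairs:
        matrix[left][right] = True
    return matrix


# Hypothetical query output: left 0 hits right 1 and 2, left 2 hits right 0.
pairs = [(0, 1), (0, 2), (2, 0)]
intersects_any, how_many = summarize_pairs(pairs, n_left=3)
dense = to_dense(pairs, n_left=3, n_right=3)
```

Keeping the pair list as the primary result and deriving the dense matrix only on request is one way to avoid a hard dependency on a sparse matrix library.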
  • Interactive plotting
    • the existing tools are not as friendly as we thought
    • folium-based implementation of GeoDataFrame.view() mirroring the language of plot()
    • https://github.com/martinfleis/geopandas-view
    • should it be embedded in GeoPandas? Or as an affiliated project under GeoPandas repo?
    • @sangarshanan is willing to help maintain it
    • status: most of the stuff supported for static plotting in matplotlib is now supported against folium
    • considerations for API:
      • plotting backend provider
      • namespacing folium / interactive methods to prevent collision with static plotting
      • over some threshold do not want to plot in folium
      • might be good to look at how sf in R handles translation to backend providers
      • implementation of backend can be outside GeoPandas; might be easier to have this directly in GeoPandas in order to allow it as a default (not a lot of code)
    • will do a bit more work to polish then migrate into GeoPandas
  • contextily providers module
  • Ecosystem update
  • geopandas.org
    • we still don't have access to the domain to point it to RTD
      • Joris will ping Kelsey J.
    • also need to have ownership on PyPI; need to be able to add others
    • conda forge:
      • anyone can help maintain this
      • currently Joris, James, Filipe
  • NumFOCUS small grants
    • do we want to apply for something in the near future?
    • does anyone have capacity?
    • next round likely before summer
    • open issue on community repo

2021-01-14

2020-08-20: A second meeting!

Agenda

  • NumFOCUS Documentation project

    • I'd like to update you on current development and discuss a bit further steps to decide on priorities and time frame.
    • context: https://github.com/geopandas/geopandas/issues/1564
    • Martin provided an update on the latest direction in documentation work in https://github.com/geopandas/geopandas/issues/1564
      • some examples will move to user guide where they are using the core functions
      • for examples gallery may use nbsphinx instead of sphinx-gallery
      • Will bulk up installation instructions to help alleviate many of the complaints around installation issues
      • will add a longer-term roadmap within the docs
    • Going forward, Martin will add examples incrementally but will try to get this reviewed as a larger PR
    • New Advanced Guide will include more advanced topics like using spatial index and vectorization
    • Will need to add redirects from important pages from existing readthedocs pages to the new documentation structure
  • Select final logo

    • https://github.com/geopandas/geopandas/issues/1405
    • Let's make the final decision!
      • Go with the one with highest votes
    • This will go into a separate PR with all the versions and source files
    • Add a page to documentation with the logo and specific colors used
    • Share logo back to NumFOCUS
    • TODO: update the logo on twitter, etc
  • GitHub Sponsors

    • We may consider using GitHub Sponsor button. Someone recently asked how to support GeoPandas and I was not sure if there is any possibility of a direct (financial) support, apart from donating to NumFOCUS.
    • In order to have NumFOCUS accept $ on behalf of GeoPandas, may need to become a fiscally-sponsored project instead of just an affiliated project; Joris will check into this
    • For GitHub Sponsor have seen examples of sponsoring individuals; will need to see what it would take to sponsor the larger project
  • GeoPandas usage / promotion

    • Would like to feature groups that use GeoPandas as part of their work, maybe on GeoPandas blog (if there was one)
    • Blog: would like to do this outside Sphinx
  • GeoPandas domain

    • Joris will follow up with Kelsey
      • Also request PyPI access from Kelsey
  • Packaging automation

    • Can use GitHub Actions to publish packages to PyPI / Conda
    • Can derive this from Pydata project
  • Social media

    • Twitter
      • Joris is currently maintaining this
      • Martin can help with this; Joris will share access
      • Example that came up on twitter from COVID-19 dashboards around showing density of points, maybe by hexagon; might want to add something like this as an example in the docs
  • GeoPandas academic paper

    • Geographical Analysis journal is having a special issue on Open Source Software for Spatial Analysis, edited by Luc Anselin and Serge Rey (both PySAL). We had a small exchange about the possibility of writing a paper about GeoPandas (which is long overdue I'd say) with Joris and Serge on twitter (https://twitter.com/jorisvdbossche/status/1282208649335779328). I feel that this would be a great thing to do, although it naturally takes time to write a proper paper.
    • Special issue will require more background documentation & contextualization; not just a description about the project
    • Need to position it into the wider ecosystem; directly address how it has advanced spatial analysis in Python
    • Could start brainstorming / collecting ideas
    • Martin will make a google doc
    • Martin will check to see if there is sponsorship from the university for making this open access
      • Full fee is $3,000 US
    • If we don't go for this, make sure to go after a different publication that allows open access
  • GeoPandas Survey

  • GeoPandas 0.9 roadmap

    • If we want to release 0.9 in December (we discussed switching to a 6-month release cycle), we could discuss what we want to (ideally) include.
    • Binary predicates change - https://gist.github.com/martinfleis/abc7cdbf9f9266bf9ed369080eec7cea
      • proposal is to build this on the output of query bulk
      • people are normally interested in 2 questions: does my polygon intersect any in the other data frame (not just the one in the same row), and which polygons from the right data frame intersect the one on the left
      • sf (in R) doesn't return series, it returns matrices (sparse / dense)
      • could have a function that gives more direct access to sindex bulk query
      • general agreement about keeping the existing predicate behavior as is, but adding a new set of methods on GeoSeries to add the cross / matrix oriented approach
      • Martin will add a new issue for this with notebook example
    • spatial index
      • do we want to expose interface to multiple spatial index or abstract base class that can wrap other spatial index implementations
      • can revise the issue based on discussion but don't target for 0.9
      • revisit once pygeos / shapely 2.0 integration is complete and no longer optional; STRtree will be default as part of that
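
One way to picture the abstract base class option discussed above is a small interface that each spatial index backend implements. This is only a sketch under stated assumptions: `BaseSpatialIndex`, `BruteForceIndex`, and the bounding-box tuple format are hypothetical names chosen for illustration, not part of GeoPandas.

```python
from abc import ABC, abstractmethod


class BaseSpatialIndex(ABC):
    """Hypothetical common interface; boxes are (minx, miny, maxx, maxy)."""

    @abstractmethod
    def query(self, box):
        """Return positions of indexed boxes that intersect `box`."""


class BruteForceIndex(BaseSpatialIndex):
    """Reference backend that checks every box; a real one would wrap a tree."""

    def __init__(self, boxes):
        self._boxes = list(boxes)

    def query(self, box):
        minx, miny, maxx, maxy = box
        # Two boxes intersect when they overlap (or touch) on both axes.
        return [
            i
            for i, (bx0, by0, bx1, by1) in enumerate(self._boxes)
            if bx0 <= maxx and bx1 >= minx and by0 <= maxy and by1 >= miny
        ]


index = BruteForceIndex([(0, 0, 1, 1), (2, 2, 3, 3), (0.5, 0.5, 2.5, 2.5)])
hits = index.query((0.9, 0.9, 1.1, 1.1))
```

Code written against `BaseSpatialIndex` would not need to know which backend is in use, which is the main appeal of this option over exposing each index implementation directly.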
    • Brendan will try to get outstanding pygeos issue to add other predicates to STRtree in for next pygeos version:
    • Upcoming pygeos features in next release: mostly around multithreading, adding support for Z values to coordinate ops
    • geodetic distance / area calculations
      • these were tricky to write in a performant way, especially dealing with wraparound at the poles
      • there is project to extract out the S2 ideas into a general purpose library
      • Create an example out of this work and put in documentation
      • Create an issue about adapting ideas from sf
      • Aim for supporting different spatial backend (e.g., S2) after 1.0
      • Look into some of the other backends
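
As a starting point for a documentation example, geodetic distance on a sphere can be sketched with the haversine formula. This is a spherical approximation only, not the approach used by sf or S2; the function name and the mean Earth radius constant are assumptions made for this sketch.

```python
import math

# Mean Earth radius in metres; treating the Earth as a sphere is an assumption.
EARTH_RADIUS_M = 6_371_008.8


def haversine_distance(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
    """Great-circle distance in metres between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


# A quarter of the way around the equator.
quarter = haversine_distance(0, 0, 90, 0)
# Crossing the antimeridian gives the same result as the equivalent short hop.
across = haversine_distance(179, 0, -179, 0)
```

Because the formula works on angular differences through squared sines, the pair of points straddling the antimeridian yields the short two-degree distance rather than a trip around the globe.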
    • cuSpatial:
      • want to support interoperability, not sure about supporting different underlying geometry providers / backends
    • Longer term, maybe consider making GDAL / Fiona optional (e.g., read data from Parquet...)
    • vectorized snap
      • e.g., make larger linestring out of 2 disconnected segments
      • in GEOS overlay refactor, this will include a precision-based snap
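
The linestring example above can be illustrated for a single pair of segments. This sketch works on plain coordinate lists and is not vectorized; `snap_and_join` and its `tolerance` parameter are hypothetical and do not reflect the GEOS precision-based snap.

```python
import math


def snap_and_join(line_a, line_b, tolerance):
    """Join line_b onto the end of line_a if the gap between them is small.

    Lines are lists of (x, y) tuples. Returns the joined coordinate list,
    or None when the facing endpoints are farther apart than `tolerance`.
    """
    end, start = line_a[-1], line_b[0]
    if math.dist(end, start) > tolerance:
        return None
    # Snap line_b's first vertex onto line_a's last vertex, then drop it
    # so the shared vertex appears only once.
    return list(line_a) + list(line_b[1:])


joined = snap_and_join([(0, 0), (1, 0)], [(1.001, 0), (2, 0)], tolerance=0.01)
too_far = snap_and_join([(0, 0), (1, 0)], [(1.5, 0), (2, 0)], tolerance=0.01)
```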
  • Future NumFOCUS grants

    • I am not aware of the schedule of future funding rounds, but we should be prepared (if anyone has capacity).
      • Normally should be 3rd round for this year, but haven't heard yet
  • dask-geopandas

    • Discuss the current state and future of dask-geopandas.
    • Big work items underway:
      • I/O methods: Joris adding Parquet support from geopandas
      • making use of spatial partitioning

2020-05-07: A first meeting!

  • NumFOCUS

    • Small development grants ideas:
      • better documentation
      • better integration / leveraging spatial indexes for operations
      • small improvements to topological operations (relates operations); elementwise vs all-pairwise
  • Logo

  • Lowering barriers to effective engagement / involving community

    • reviewing PR bottlenecks
      • time of core maintainers
      • huge PRs, can we suggest folks make smaller PRs?
  • Maintenance bottlenecks

  • Roadmap (1.0?)

    • Shapely 2.0 / pygeos speed-ups
    • API for topological operations
    • IO
      • parquet/feather
      • faster GDAL
      • databases
      • consistent API
    • Integrating raster operations
      • zonal stats is problematic for large data
    • geodetic distance etc (geography)
    • visualization
      • maybe geoplot becomes an affiliate like contextily
      • residentmario may not have time anymore for maintenance
    • Vectorized snap feature to other feature
  • Do something like http://xarray.pydata.org/en/stable/roadmap.html

    • Open an issue for this
  • places to ask questions vs. filing an issue? document.

  • Documentation

    • notebooks/examples
  • Installation issues