Community Meeting Notes Archive

Martin Fleischmann edited this page Jan 27, 2022 · 5 revisions

The archive of Community Meeting Notes. See hackmd for the most recent notes and the tentative agenda for the next meeting.

2022-01-27

(attending: Martin Fleischmann, Brendan Ward, Thomas Statham, Matt Richards, Levi Wolf, Joris Van den Bossche, Alan Snow)

  • Community call time
    • Matt is in UTC+10, Brendan UTC-8, Alan is UTC-6, Joris is UTC+1, Martin and Levi are UTC
    • Shall we consider a different time, or alternate between times periodically?
    • Next time will try later UTC time (20:00 UTC?)
  • Ecosystem updates
  • NumFOCUS SDG (S2?)
    • Round 1
      • Call for Proposals Announcement: February 4, 2022
      • Proposal Submission Deadline: March 4, 2022
      • Committee Selection Deadline: March 18, 2022
      • Notification to Applicants Deadline: April 15, 2022
    • Ideas:
  • GSoC
  • GeoPython 2022
    • Basel Switzerland June 13-15 (hybrid; might have in-person component)
    • Talk submission end of Feb, workshops end of March
    • Might be good to have a talk on state of GeoPandas and ecosystem
    • Might be good to have workshop on dask-geopandas
  • Foss4G
    • Firenze Italy, Aug 22-28
    • Deadlines: Talks/papers: end of Feb
    • Let's follow up offline with Martin, Joris, Levi
  • GeoPandas 0.11 release timeline
    • should there be one more release before shapely 2.0 support?
    • Issues around GeoDataFrame constructor / active geometry columns
      • 0 geometry columns -> no GeoDataFrame
      • >=1 geometry columns -> keep GeoDataFrame, even if active is not present
        • better handle the case of the active geometry column being present
        • better handle crs on the geodataframe
  • dask-geopandas 0.1 release timeline
    • want to include spatial shuffle, documentation updates, Hilbert distance with numpy
      • read_file via pyogrio would also be nice to include and nearly ready; Joris will look at this again soon
    • Hilbert distance: do we want to have this in GEOS C API and just include in Shapely 2.0? Unlikely to be faster having this as a scalar ufunc against GEOS
      • TODO: open issue upstream at GEOS
    • regression in dissolve operation
      • dask renames columns in intermediate aggregation results then names them back; this creates a new GeoDataFrame with no geometry, which then fails in subsequent step
  • possible dask-geopandas funding from the GDSL
    • may be opportunity to fund someone on dask-geopandas
  • GeoPandas roadmap
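
The Hilbert distance mentioned under the dask-geopandas 0.1 items can be illustrated with a minimal sketch. This is not the dask-geopandas implementation: it is the textbook bit-manipulation algorithm in scalar pure Python, and the names hilbert_distance, to_cell and sort_by_hilbert are hypothetical. A numpy version would apply the same bit operations to whole arrays at once.

```python
def hilbert_distance(x, y, order):
    """Distance along a Hilbert curve of the integer cell (x, y).

    The curve fills a 2**order by 2**order grid, so both x and y must lie
    in range(2**order).
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def to_cell(value, lo, hi, order):
    """Scale a coordinate in [lo, hi] to an integer cell index on the grid."""
    n = 1 << order
    if hi == lo:
        return 0
    return min(n - 1, int((value - lo) / (hi - lo) * n))


def sort_by_hilbert(points, order=16):
    """Return (x, y) points sorted by Hilbert distance within their extent."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)

    def key(p):
        return hilbert_distance(
            to_cell(p[0], xmin, xmax, order),
            to_cell(p[1], ymin, ymax, order),
            order,
        )

    return sorted(points, key=key)
```

Sorting geometry centroids (or bounding-box centres) this way keeps spatially close features near each other in the ordering, which is what makes the distance useful as a key for a spatial shuffle.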

2021-12-02

(attending: Martin Fleischmann, Joris Van den Bossche, Brendan Ward, Benoit Bovy, Alan Snow, Jan Simbera)

  • Ecosystem updates
    • GeoPandas:
      • pyogrio engine (https://github.com/geopandas/geopandas/pull/2225)
        • longer term may want to do a hard switch from Fiona to pyogrio; some problems if both are installed via pip (conda is OK)
        • may also want to make the backends optional, and install pure python support for Shapefile / geopackage by default (or leave all as optional)
        • may want to look into xarray engine loading model
        • need to figure out how to build wheels for pyogrio
    • Shapely 2.0 / pygeos
      • The merge is finally happening! (this will also mean that Shapely main branch is temporarily not working with GeoPandas)
      • push new feature development to shapely instead of pygeos
      • pygeos 0.12 release coming soon
      • once pygeos is fully integrated into Shapely and stable, then archive pygeos; will remove pygeos opt dep. from geopandas by geopandas 1.0
    • Dask-GeoPandas
      • much of the core functionality is in place; now mostly working on spatial partitions
    • GeoArrow specification
      • First draft: https://github.com/geopandas/geo-arrow-spec/pull/12
      • __geo_arrow_interface__ in Python?
      • goal is to have arrow native way to store geometries instead of WKB for storage, uses compact storage of coordinates
      • approach already used by cuSpatial for copying spatial data to GPU
      • already some of the basic functionality in pygeos / Shapely 2.0 (get rings, coordinates, etc); requires multiple steps, but already faster than WKB conversion.
      • goal is to have one function that does this conversion
  • Expansion of the team
    • Martin's time is restricted in the following months leading to long response times on issues and PRs
    • consider using triage approach
    • need to formalize approach for adding new committers
  • S2 geometry engine
    • see this thread for context
    • https://github.com/benbovy/pys2index
    • overview from Benoit:
      • lightweight wrapper for S2 point index with API similar to scipy.spatial.cKDTree
      • performance in benchmarks so far is quite fast
      • would like to have vectorized wrappers for S2
      • S2 appears to be actively maintained and about to get additional functionality soon
      • used python-xtensor and pybind11 to work with S2 and numpy arrays
    • Two possible approaches to integrate:
      • own way to store geometries specific to backend engine; convert geometries on the fly to S2 objects as part of specific operation (e.g., predicate)
    • Wrapper classes:
      • pygeos uses Python C extension wrapper for GEOS geometries so that GEOS objects are managed according to Python object lifecycles
    • related issue
    • consider putting in a request for NumFOCUS small development grant to start building out some of this support (next cycle may open early 2022): https://numfocus.org/programs/small-development-grants
  • Formalise and publish a roadmap
  • 2022 meetings schedule
    • keep current cycle: last Thurs of odd-numbered months, same UTC time

2021-09-30

(attending: Martin Fleischmann, Joris Van den Bossche, Levi Wolf, Thomas Louf, Brendan Ward, Daniel Alejandro Mesejo-Leon, Imanol)

  • 0.10 release
  • geopandas.org domain
    • any updates regarding ownership?
    • need to find someone who can get a direct response from Kelsey
  • geopandas/benchmarks repo (or benchmark-data)
    • for macro benchmarks that don't fit into the ASV benchmarks
    • use issues to nominate datasets to use for benchmarks
    • GADM polygons often offer very good variety in terms of points-per-polygon
    • railways, municipalities..
  • EPSRC grant call
    • grant proposal is coming along; due Oct 14
    • integrating pygeos and geopandas-dask philosophies into core geopandas plus integration with other libraries
    • need
      • letter of support from Tom Augspurger (w/ planetary computer) (if anyone has better contact info than his gmail, send to [email protected])
      • 2 page resume/CV for @jorisvandenbossche & @martinfleis
    • proposal text:
  • Sub-project status updates:
    • dask-geopandas
      • once shuffle is in place will make next release (0.1) and publicize more
      • already in planetary computer docker images
      • had a Google Summer of Code project on this focused on spatial partitioning methods
    • pygeos / Shapely 2.0
      • GEOS 3.10 release coming soon; will have some things that we'd like to add
      • Shapely 1.8 is ready with all deprecations in place, just needs to be reviewed / released
      • will migrate pygeos into shapely after 1.8 is out
      • need to coordinate with Sean Gillies re: committers / admin rights
    • pyogrio
      • after geopandas 0.10, add an engine keyword to read_file / to_file to use
        • create issue for this (-> who?)
      • longer term (geopandas 1.0) aim to have Fiona replaced by pyogrio
      • we can also expose the pyogrio helper functions in geopandas (eg list_layers())
    • xyzservices (https://github.com/geopandas/xyzservices)
    • contextily

2021-08-05

(attending: Martin Fleischmann, Levi Wolf, Stefanie Lumnitz, Tom Augspurger, Brendan Ward, Joris Van den Bossche, Thomas Statham)

  • Microsoft / planetary-computer & tabular data
    • https://planetarycomputer.microsoft.com/catalog
    • based around STAC; supports Zarr / NetCDF. Working on expanding to tabular support
    • wanting to refine recommendations for representing tabular data
      • use Parquet format
      • need to finalize the implementation of geo-arrow-spec in Geopandas
        • metadata is mostly done
        • storage currently uses WKB (will always support this as a fallback), planning to revisit this to optimize using Arrow data structures
        • would like GDAL to support this as well; longer term want to use the Arrow C data API (both for file formats as well as transport after reading those to downstream libs like Geopandas)
    • See Cloud Data Warehouse Geospatial Interoperability
      • this is just getting off the ground
  • NumFOCUS dask-geopandas IO project
    • see proposal
    • Parquet support is mostly complete
    • Feather dataset (https://github.com/geopandas/dask-geopandas/pull/91/)
    • Plan is to:
      • read bounds from file (already implemented)
      • use methods in dask-geopandas to determine partitions (via Hilbert curve distance, etc)
      • then read underlying features into those partitions
    • Timeline is next ~3 months
  • funding
  • xyzservices release
    • https://github.com/geopandas/xyzservices
    • takes contextily providers (metadata for tile providers) and puts into dedicated package
      • goal is centralized package to be used within the ecosystem
      • contextily will be updated soon to use this
    • pushing more broadly within ecosystem; other packages starting to use or expressed interest
  • installation as geopandas and geopandas-base to either get minimal dependencies or most dependencies
  • API for tools
    • gdf.sjoin(other) vs geopandas.sjoin(left, right) #1984
    • need to define rules for what is a method vs a function
      • method approach is more common for pandas
      • for dask-geopandas method is preferred
      • clip is potentially problematic since supported by pandas, but since it is a numeric method not applicable to geometries anyway (currently fails), probably OK to make clip here support only geometry implementation
    • duplicate vs deprecate functional approach?
      • in favor of deprecation, though a bit annoying for community since functions are widely used
      • short term can pass through functional to method approach to limit duplication of code
      • start teaching around the method approach
      • next release aim to have method approach, release after mark functional ones as deprecated
  • API of matrix binary operations
    • https://github.com/geopandas/geopandas/pull/1674
    • We now have an implementation based on sparse matrix which works really well for all the use cases
    • API:
      • always return sparse array (use sparse package as optional dependency)
        • basic support for sparse in dask (using scipy.sparse) but lots of things not yet in place
        • have a keyword for sparse backend scipy.sparse or pydata sparse?
      • single method (predicate is a parameter) vs one method per predicate (which could use the former internally)
        • could also have predicate_matrix for everything, and also expose intersects_matrix since this is most likely used
  • pyData Global
    • possible talk on updates in Geopandas
  • type hints
    • lots of outstanding PRs -> start with reviewing the geoseries.py file
    • testing
    • may not want to do for next release
    • for internal functions, aim to have strict types
  • 0.10 release target
    • add explore as a highlight in this release
  • Shapely 2.0
    • slowly moving forward
    • STRtree discussion resolved
    • need to have a shapely 1.8 release first
    • branch in shapely using pygeos is ready to merge into master
    • numpy warnings -> also ignore in geopandas (TODO Joris)
  • functions in pygeos as methods on GeoSeries
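
The short-term plan under "API for tools" (pass the functional form through to the method, then deprecate the function a release later) could look roughly like the sketch below. Frame is a hypothetical stand-in for a GeoDataFrame that joins 1D intervals instead of geometries, and the deprecated flag is an illustrative assumption, not an existing GeoPandas parameter.

```python
import warnings


class Frame:
    """Hypothetical stand-in for a GeoDataFrame: rows of (label, (lo, hi))."""

    def __init__(self, rows):
        self.rows = list(rows)

    def sjoin(self, other):
        """Method form: pair up labels whose intervals overlap."""
        return [
            (a, b)
            for a, (alo, ahi) in self.rows
            for b, (blo, bhi) in other.rows
            if alo <= bhi and blo <= ahi
        ]


def sjoin(left, right, deprecated=False):
    """Functional form that only forwards to the method (no duplicated logic)."""
    if deprecated:
        # A later release could switch this on to steer users to the method.
        warnings.warn(
            "sjoin(left, right) is deprecated; use left.sjoin(right)",
            FutureWarning,
            stacklevel=2,
        )
    return left.sjoin(right)
```

Because the function holds no logic of its own, both spellings stay in sync while teaching material moves to the method form.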

2021-05-27

(attending: Martin, Joris, Stefanie, Thomas, Brendan)

  • dask-geopandas
    • Dask Summit workshop debrief
    • Google Summer of Code
      • We have one project on dask-geopandas development
        • Logistics:
          • smaller meetings every week, aim for Thurs 4-5 PM UTC; Martin will setup meetings
          • every 2 months a larger GeoPandas meeting
          • use Github issues, PRs, GeoPandas gitter, dask-geopandas gitter
          • Martin is admin point of contact
          • Blog posts from GSOC: these to get linked into NumFOCUS blog
        • Goals:
          • spatial partitioning
            • explore writing out to Parquet?
            • need to figure out partitioning methods, e.g., Hilbert curve
            • probably want to implement a couple methods: Hilbert, maybe a gridded approach
            • first identify some of the options:
              • simple grid
              • known regions (can do spatial clustering for getting more or less homogeneous sized partitions)
              • hilbert curve
              • quadtree: might work well, not exposed yet in GEOS C API / pygeos
              • strtree: don't have access to nodes / leaves via GEOS C API / pygeos
            • storage of partitions
              • right now just polygons as a geoseries
          • spatial indexing
            • also want to make sure this gets done
            • only place this is currently used is for writing to Parquet and cx coordinate indexer
            • good starter PR: simple predicates: intersects; check for overlap with partition first, before checking geometries within partition
      • feedback to rejected projects?
    • NumFocus SDG
      • Joris wants to apply for SDG to work on dask-geopandas
      • Focus more on I/O
        • Read large dataset, have dask-geopandas figure out partitioning to files
        • Read index and bounding boxes into memory to drive the partitioning, then use the partition bounding boxes or lists of indexes to query out chunks of data
        • Optimize parquet: store coordinates instead of WKB
        • Feather support? Right now using the dask support for Parquet, not available for Feather in dask; Joris has a prototype Feather file reader for dask
        • Convert GDAL directly to Arrow memory format instead of WKB
          • maybe do directly in GDAL
          • try first in pyogrio
  • GeoPandas Blog
  • API of matrix binary operations
    • https://github.com/geopandas/geopandas/pull/1674
    • We now have an implementation based on sparse matrix which works really well for all the use cases
    • Qs:
      • API
      • which sparse backend? scipy.sparse or pydata sparse?
      • Martin is planning to base the implementation around sparse approach
      • Discuss next time
  • API for interactive plotting
    • https://github.com/geopandas/geopandas/issues/1904
    • We want pluggable interactive plotting backends. How to do it smoothly?
      • interest from some of the plotting backends
      • don't really want global config for plot method
      • want to keep usage of static and interactive plotting separate, don't clobber the static implementation by using interactive plotting; keep these in separate methods
      • add another method: explore / view for interactive maps
    • datashader option to HVplot:
      • works quite well for large data
      • Joris follow up with them: can instance check be expanded to include geopandas geodataframes (via dask-geopandas), not just spatialpandas frames
  • Community calls
    • we have a shared Google Calendar for GeoPandas-related events
    • meetings are set to 17:00 UTC every two months (last Thursday)
  • xyzservices
    • new package under geopandas umbrella
    • formerly contextily.providers
    • https://github.com/geopandas/xyzservices
    • planning to have available before next release of geopandas
    • will have 2 JSON formats:
      • pretty version that includes metadata
      • compiled / compressed version that is actually used in code; plan is to create via Github action
  • Ecosystem update
    • cuSpatial should fully support geopandas-cuspatial dataframe conversion in the next release
  • Shapely 2.0
    • Joris planning to do more on this in June
    • main blocking issue is the discussion around STRtree
    • Differences in minimum rotated rectangle between Shapely's pure python method and method in GEOS
      • Follow up with GEOS team about differences
      • OpenCV method same as Shapely
      • Also a method in PostGIS - is it the same?
  • Pyogrio
    • Brendan: transfer to GeoPandas org
  • Other
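
The "good starter PR" idea from the spatial indexing notes (check for overlap with the partition first, before checking geometries within the partition) can be sketched as follows. This is an illustration only: geometries are reduced to (minx, miny, maxx, maxy) boxes, partitions are plain lists, and all function names are hypothetical rather than part of dask-geopandas.

```python
def boxes_intersect(a, b):
    """True if two (minx, miny, maxx, maxy) boxes touch or overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def bounds_of(boxes):
    """Bounding box of a non-empty list of boxes."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def intersects_partitioned(partitions, query):
    """Return ((partition, row) hits, number of partitions actually scanned)."""
    # In practice the partition bounds would be stored, not recomputed per query.
    partition_bounds = [bounds_of(boxes) for boxes in partitions]
    hits = []
    scanned = 0
    for p, (bounds, boxes) in enumerate(zip(partition_bounds, partitions)):
        if not boxes_intersect(bounds, query):
            continue  # cheap rejection: nothing in this partition can match
        scanned += 1
        for r, box in enumerate(boxes):
            if boxes_intersect(box, query):
                hits.append((p, r))
    return hits, scanned
```

The benefit depends on how well the partitions separate in space: with spatially sorted data most partitions are rejected by the first check and never loaded.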

2021-05-24 GSOC coordination

  • Weekly meetings
  • Use public channels for discussion / questions (github issues, gitter channel, (specific? -> make a dask-geopandas channel))
  • Single Point of Contact (more for administrative questions)
    • Martin
  • Blog: on NumFOCUS & personal site is fine, no need for GeoPandas branded one

2021-03-25

(attending: Martin, Joris, James, Brendan, Sangarshanan, Levi)

  • Google Summer of Code
    • We have submitted 3 project ideas
    • Students should get in touch now and submit proposals within weeks
      • students will start applying next Monday
      • We need to select students between mid-April and mid-May
    • Should we advertise it more? Prospect on possible students?
      • TODO: Post on Twitter again (done)
      • PySAL: primarily recruits from own students; ~1/2 have been affiliated that way
  • Community repository
    • we have a new geopandas/community repo
      • if not specific to a package or to code (e.g., governance, code of conduct), post to this repo
      • if specific to GeoPandas post issues to GeoPandas instead
      • use for announcing meetings or proposals (workshops, funding)
    • how should we efficiently use it?
    • https://github.com/geopandas/community
    • TODO: post issue for how to get funding for GeoPandas features or ideas list for potential future grants
  • Community calls
    • shall we switch to some predictable schedule? (Bi-)Monthly?
    • start with bimonthly, on the last Thursday of the month
      • TODO: post schedule to community repo
    • archive prior call notes to community repo; keep markdown doc for latest meeting
  • dask-geopandas
    • repository moved to GeoPandas org
    • https://github.com/geopandas/dask-geopandas
    • Dask-Summit workshop proposal
      • In May: https://summit.dask.org/
      • submitted proposal around scaling GeoPandas vector operations
      • Could have a presentation about current status of dask-geopandas
      • Some discussion around spatial partitioning
      • Look for ways to collaborate with spatial pandas
      • Would be good to do visualization of bigger data
      • TODO: add issue in community repo for ideas for this workshop
    • First alpha released on PyPI, still needs conda-forge
      • Martin: will add to conda-forge
      • Biggest needs: spatial index and overlap operations
  • User-friendly API of matrix binary operations
    • would be nice to have "intersects_matrix" in 0.10
    • We should agree on the API design; implementation should be straightforward based on query_bulk
    • https://github.com/geopandas/geopandas/pull/1674
    • returning a list maybe not particularly useful
    • might be good to have a few example use cases
      • does any polygon in input intersect any in right dataframe
      • which of them in left dataframe intersects any in right dataframe
      • how many intersects
    • use outer strategy with sparse argument
      • currently don't depend on scipy; makes it harder to use sparse option
      • can keep sparse as an optional argument; fall back to full matrix
    • another alternative is to use xarray and pydata sparse backend (optional dependencies)
    • could just return dense pandas table of left and right indices
  • Interactive plotting
    • the existing tools are not as friendly as we thought
    • folium-based implementation of GeoDataFrame.view() mirroring the language of plot()
    • https://github.com/martinfleis/geopandas-view
    • should it be embedded in GeoPandas? Or as an affiliated project under GeoPandas repo?
    • @sangarshanan is willing to help maintain it
    • status: most of the stuff supported for static plotting in matplotlib is now supported against folium
    • considerations for API:
      • plotting backend provider
      • namespacing folium / interactive methods to prevent collision with static plotting
      • over some threshold do not want to plot in folium
      • might be good to look at how sf in R handles translation to backend providers
      • implementation of backend can be outside GeoPandas; might be easier to have this directly in GeoPandas in order to allow it as a default (not a lot of code)
    • will do a bit more work to polish then migrate into GeoPandas
  • contextily providers module
  • Ecosystem update
  • geopandas.org
    • we still don't have access to the domain to point it to RTD
      • Joris will ping Kelsey J.
    • also need to have ownership in Pypi; need to be able to add others
    • conda forge:
      • anyone can help maintain this
      • currently Joris, James, Filipe
  • NumFOCUS small grants
    • do we want to apply for something in the near future?
    • does anyone have capacity?
    • next round likely before summer
    • open issue on community repo
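
The example use cases listed under the matrix binary operations item can all be answered from a table of (left index, right index) pairs, which is the "dense table of left and right indices" option. The sketch below assumes such pairs are already available (for instance from a bulk spatial index query); the function names are hypothetical and not part of GeoPandas.

```python
from collections import Counter


def any_intersects(pairs):
    """Does any left geometry intersect any right geometry?"""
    return len(pairs) > 0


def left_with_hits(pairs):
    """Which left indices intersect at least one right geometry?"""
    return sorted({left for left, _ in pairs})


def hit_counts(pairs, n_left):
    """How many right geometries each left index intersects (0 if none)."""
    counts = Counter(left for left, _ in pairs)
    return [counts.get(i, 0) for i in range(n_left)]
```

For example, with pairs = [(0, 1), (0, 2), (2, 0)] and three left geometries, the three questions give True, [0, 2] and [2, 0, 1]. A sparse matrix return value would carry the same information as these pairs.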

2021-01-14

2020-08-20: A second meeting!

Agenda

  • NumFOCUS Documentation project

    • I'd like to update you on current development and discuss a bit further steps to decide on priorities and time frame.
    • context: https://github.com/geopandas/geopandas/issues/1564
    • Martin provided an update on the latest direction in documentation work in https://github.com/geopandas/geopandas/issues/1564
      • some examples will move to user guide where they are using the core functions
      • for examples gallery may use nbsphinx instead of sphinx-gallery
      • Will bulk up installation instructions to help alleviate many of the complaints around installation issues
      • will add a longer-term roadmap within the docs
    • Going forward, Martin will add examples incrementally but will try to get this reviewed as a larger PR
    • New Advanced Guide will include more advanced topics like using spatial index and vectorization
    • Will need to add redirects from important pages from existing readthedocs pages to the new documentation structure
  • Select final logo

    • https://github.com/geopandas/geopandas/issues/1405
    • Let's make the final decision!
      • Go with the one with highest votes
    • This will go into a separate PR with all the versions and source files
    • Add a page to documentation with the logo and specific colors used
    • Share logo back to NumFOCUS
    • TODO: update the logo on twitter, etc
  • GitHub Sponsors

    • We may consider using GitHub Sponsor button. Someone recently asked how to support GeoPandas and I was not sure if there is any possibility of a direct (financial) support, apart from donating to NumFOCUS.
    • In order to have NumFOCUS accept $ on behalf of GeoPandas, may need to become a fiscally-sponsored project instead of just an affiliated project; Joris will check into this
    • For GitHub Sponsor have seen examples of sponsoring individuals; will need to see what it would take to sponsor the larger project
  • GeoPandas usage / promotion

    • Would like to feature groups that use GeoPandas as part of their work, maybe on GeoPandas blog (if there was one)
    • Blog: would like to do this outside Sphinx
  • GeoPandas domain

    • Joris will follow up with Kelsey
      • Also request PyPi access from Kelsey
  • Packaging automation

    • Can use GitHub Actions to publish packages to PyPi / Conda
    • Can derive this from Pydata project
  • Social media

    • Twitter
      • Joris is currently maintaining this
      • Martin can help with this; Joris will share access
      • Example that came up on twitter from COVID-19 dashboards around showing density of points, maybe by hexagon; might want to add something like this as an example in the docs
  • GeoPandas academic paper

    • Geographical Analysis journal is having a special issue on Open Source Software for Spatial Analysis, edited by Luc Anselin and Serge Rey (both PySAL). We had a small exchange about the possibility of writing a paper about GeoPandas (which is long overdue I'd say) with Joris and Serge on twitter: https://twitter.com/jorisvdbossche/status/1282208649335779328 I feel that this would be a great thing to do, although it naturally takes time to write a proper paper.
    • Special issue will require more background documentation & contextualization; not just a description about the project
    • Need to position it into the wider ecosystem; directly address how it has advanced spatial analysis in Python
    • Could start brainstorming / collecting ideas
    • Martin will make a google doc
    • Martin will check to see if there is sponsorship from the university for making this open access
      • Full fee is $3,000 US
    • If we don't go for this, make sure to go after a different publication that allows open access
  • GeoPandas Survey

  • GeoPandas 0.9 roadmap

    • If we want to release 0.9 in December (we discussed switching to a 6-month release cycle), we could discuss what we want to (ideally) include.
    • Binary predicates change - https://gist.github.com/martinfleis/abc7cdbf9f9266bf9ed369080eec7cea
      • proposal is to build this on the output of query bulk
      • people normally interested in 2 questions: does my polygon intersect any in the other data frame (not just same line), which polygons from right data frame are intersected with the one on the left
      • sf (in R) doesn't return series, they return matrices (sparse / dense)
      • could have a function that gives more direct access to sindex bulk query
      • general agreement about keeping the existing predicate behavior as is, but adding a new set of methods on GeoSeries to add the cross / matrix oriented approach
      • Martin will add a new issue for this with notebook example
    • spatial index
      • do we want to expose interface to multiple spatial index or abstract base class that can wrap other spatial index implementations
      • can revise the issue based on discussion but don't target for 0.9
      • revisit once pygeos / shapely 2.0 integration is complete and no longer optional; STRtree will be default as part of that
    • Brendan will try to get outstanding pygeos issue to add other predicates to STRtree in for next pygeos version:
    • Upcoming pygeos features in next release: mostly around multithreading, adding support for Z values to coordinate ops
    • geodetic distance / area calculations
      • it was tricky to write these to be performant, dealing with wrap-around at the poles
      • there is project to extract out the S2 ideas into a general purpose library
      • Create an example out of this work and put in documentation
      • Create an issue about adapting ideas from sf
      • Aim for supporting different spatial backend (e.g., S2) after 1.0
      • Look into some of the other backends
    • cuSpatial:
      • want to support interoperability, not sure about supporting different underlying geometry providers / backends
    • Longer term, maybe consider making GDAL / Fiona optional (e.g., read data from Parquet...)
    • vectorized snap
      • e.g., make larger linestring out of 2 disconnected segments
      • in GEOS overlay refactor, this will include a precision-based snap
  • Future NumFOCUS grants

    • I am not aware of the schedule of future funding rounds, but we should be prepared (if anyone has capacity).
      • Normally should be 3rd round for this year, but haven't heard yet
  • dask-geopandas

    • Discuss the current state and future of dask-geopandas.
    • Big work items underway:
      • I/O methods: Joris adding Parquet support from geopandas
      • making use of spatial partitioning

2020-05-07: A first meeting!

  • NumFOCUS

    • Small development grants ideas:
      • better documentation
      • better integration / leveraging spatial indexes for operations
      • small improvements to topological operations (relates operations); elementwise vs all-pairwise
  • Logo

  • Lowering barriers to effective engagement / involving community

    • reviewing PR bottlenecks
      • time of core maintainers
      • huge PRs, can we suggest folks make smaller PRs?
  • Maintenance bottlenecks

  • Roadmap (1.0?)

    • Shapely 2.0 / pygeos speed-ups
    • API for topological operations
    • IO
      • parquet/feather
      • faster GDAL
      • databases
      • consistent API
    • Integrating raster operations
      • zonal stats is problematic for large data
    • geodetic distance etc (geography)
    • visualization
      • maybe geoplot becomes an affiliate like contextily
      • residentmario may not have time anymore for maintenance
    • Vectorized snap feature to other feature
  • Do something like http://xarray.pydata.org/en/stable/roadmap.html

    • Open an issue for this
  • places to ask questions vs. filing an issue? document.

  • Documentation

    • notebooks/examples
  • Installation issues
