Community Meeting Notes Archive
The archive of Community Meeting Notes. See the most recent and tentative agenda for the next meeting on hackmd.
(attending: Martin, Joris, James, Brendan, Sangarshanan, Levi)
-
Google Summer of Code
- We have submitted 3 project ideas
- Pure Python IO
- Plotting enhancements
- dask-geopandas
- https://github.com/geopandas/geopandas/wiki/Google-Summer-of-Code-2021
- Students should get in touch now and submit proposals within weeks
- students will start applying next Monday
- We need to select students between mid-April and mid-May
- Should we advertise it more? Reach out to prospective students?
- TODO: Post on Twitter again (done)
- PySAL: primarily recruits from own students; ~1/2 have been affiliated that way
-
Community repository
- we have a new geopandas/community repo
- if a topic is not specific to a package or to code (e.g., governance, code of conduct), post it to this repo
- if specific to GeoPandas post issues to GeoPandas instead
- use for announcing meetings or proposals (workshops, funding)
- how should we efficiently use it?
- https://github.com/geopandas/community
- TODO: post issue for how to get funding for GeoPandas features or ideas list for potential future grants
-
Community calls
- shall we switch to some predictable schedule? (Bi-)Monthly?
- start with bimonthly on last Thursday of each month
- TODO: post schedule to community repo
- archive prior call notes to community repo; keep markdown doc for latest meeting
-
dask-geopandas
- repository moved to GeoPandas org
- https://github.com/geopandas/dask-geopandas
- Dask-Summit workshop proposal
- In May: https://summit.dask.org/
- submitted proposal around scaling GeoPandas vector operations
- Could have a presentation about current status of dask-geopandas
- Some discussion around spatial partitioning
- Look for ways to collaborate with spatial pandas
- Would be good to do visualization of bigger data
- TODO: add issue in community repo for ideas for this workshop
- First alpha released on PyPI, still needs conda-forge
- Martin: will add to conda-forge
- Biggest needs: spatial index and overlap operations
-
User-friendly API of matrix binary operations
- would be nice to have `intersects_matrix` in 0.10
- We should agree on the API design; implementation should be straightforward based on `query_bulk`
- https://github.com/geopandas/geopandas/pull/1674
- returning a list maybe not particularly useful
- might be good to have a few example use cases
- does any polygon in input intersect any in right dataframe
- which of them in left dataframe intersects any in right dataframe
- how many intersect
- use outer strategy with sparse argument
- currently don't depend on scipy; makes it harder to use sparse option
- can keep sparse as an optional argument; fall back to full matrix
- another alternative is to use xarray and pydata sparse backend (optional dependencies)
- could just return dense pandas table of left and right indices
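The three example use cases above can all be derived from a single list of (left, right) index pairs. The sketch below only illustrates that idea: axis-aligned boxes stand in for real geometries, a brute-force double loop stands in for the spatial index, and every name in it (`boxes_intersect`, `pairs`, `hits_per_left`, `dense`) is hypothetical, not a proposed GeoPandas API.

```python
from collections import Counter


def boxes_intersect(a, b):
    """True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    return a[0] <= b[2] and a[2] >= b[0] and a[1] <= b[3] and a[3] >= b[1]


# Toy stand-ins for the geometries of a left and a right dataframe.
left = [(0, 0, 2, 2), (5, 5, 6, 6), (1, 1, 3, 3)]
right = [(1, 1, 4, 4), (10, 10, 11, 11)]

# All intersecting (left position, right position) pairs. This is assumed to be
# similar in spirit to what a bulk spatial index query would return.
pairs = [
    (i, j)
    for i, a in enumerate(left)
    for j, b in enumerate(right)
    if boxes_intersect(a, b)
]

# Use case 1: does any left geometry intersect any right geometry?
any_hit = bool(pairs)

# Use case 2: which left geometries intersect at least one right geometry?
left_hits = sorted({i for i, _ in pairs})

# Use case 3: how many right geometries does each left geometry intersect?
counts = Counter(i for i, _ in pairs)
hits_per_left = [counts.get(i, 0) for i in range(len(left))]

# Dense fallback: a full boolean matrix, usable when no sparse backend is installed.
dense = [[False] * len(right) for _ in left]
for i, j in pairs:
    dense[i][j] = True
```

Keeping the pairs as the primary result means a sparse matrix, a dense matrix, or a table of left and right indices can each be built from it on demand.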
-
Interactive plotting
- the existing tools are not as friendly as we thought
- folium-based implementation of `GeoDataFrame.view()` mirroring the language of `plot()`
- https://github.com/martinfleis/geopandas-view
- should it be embedded in GeoPandas? Or as an affiliated project under GeoPandas repo?
- @sangarshanan is willing to help maintain it
- status: most of the stuff supported for static plotting in matplotlib is now supported against folium
- considerations for API:
- plotting backend provider
- namespacing folium / interactive methods to prevent collision with static plotting
- over some threshold do not want to plot in folium
- might be good to look at how `sf` in R handles translation to backend providers
- implementation of backend can be outside GeoPandas; might be easier to have this directly in GeoPandas in order to allow it as a default (not a lot of code)
- will do a bit more work to polish then migrate into GeoPandas
-
contextily providers module
- there is an idea to convert contextily providers module to a separate package
- both contextily and `view()` could be using it + others
- https://github.com/geopandas/contextily/issues/153 and partially https://github.com/geopandas/contextily/issues/172
-
Ecosystem update
- pygeos/shapely2.0
- Current blocker: STRtree design (https://github.com/Toblerity/Shapely/pull/1064, https://github.com/Toblerity/Shapely/pull/1094)
- Shapely 1.8 release in prep for the transition; will raise deprecation warnings
- After 1.8, move pygeos code into Shapely; will need to coordinate with pygeos
- pyogrio
- Windows support?
- Do we need something similar to `fiona.Env`?
-
geopandas.org
- we still don't have access to the domain to point it to RTD
- Joris will ping Kelsey J.
- also need to have ownership on PyPI; need to be able to add others
- conda-forge:
- anyone can help maintain this
- currently Joris, James, Filipe
-
NumFOCUS small grants
- do we want to apply for something in the near future?
- does anyone have capacity?
- next round likely before summer
- open issue on community repo
-
User Survey Review
- Let's see what people think
- https://github.com/geopandas/geopandas-user-surveys/pull/1
- make private repo to store private responses
- Some points:
- interactive plotting: more examples
- performance is a consistently mentioned issue
-
Core dev team organisation
- Have official list of people?
- Mailing list
- Org like https://github.com/dask/community/
- Expanding the team?
- governance questions
- code of conduct
- mediation
- violations of CoC
- adding developers/removing (retiring?) developers
-
NumFOCUS fiscal sponsorship
-
Documentation
- Status of Martin's work
- https://github.com/geopandas/geopandas/pull/1759, https://github.com/geopandas/geopandas/pull/1757
-
geopandas-base
- having an option to depend on shapely only
- pure-Python I/O, no CRS
-
IO
-
pyogrio integration
- Discuss integration plan for testing I/O using `pyogrio` instead of `fiona` (seeing about 10-16x speedups)
- try to package up on conda-forge
- non-GDAL IO
- pygpkg
- pyshp
- GSOC application focusing on non-GDAL IO @martin
-
GSOC
- think about participating in GSOC 21
- https://opensource.googleblog.com/2020/10/google-summer-of-code-2021-is-bringing.html
- Python GPKG IO project?
-
pyrosm
- Henrikki is looking for a home for pyrosm (yes to us)
-
GeoPandas paper
- REGION OA (no APC) journal
- https://openjournals.wu.ac.at/ojs/index.php/region/index
-
Ecosystem update
- pygeos/shapely2.0
- dask-geopandas
-
0.9 release
-
NumFOCUS Documentation project
- I'd like to update you on current development and briefly discuss further steps to decide on priorities and time frame.
- context: https://github.com/geopandas/geopandas/issues/1564
- Martin provided an update on the latest direction in documentation work in https://github.com/geopandas/geopandas/issues/1564
- some examples will move to user guide where they are using the core functions
- for the examples gallery, may use nbsphinx instead of sphinx-gallery
- Will bulk up installation instructions to help alleviate many of the complaints around installation issues
- will add a longer-term roadmap within the docs
- Going forward, Martin will add examples incrementally but will try to get this reviewed as a larger PR
- New Advanced Guide will include more advanced topics like using spatial index and vectorization
- Will need to add redirects from important pages from existing readthedocs pages to the new documentation structure
-
Select final logo
- https://github.com/geopandas/geopandas/issues/1405
- Let's make the final decision!
- Go with the one with highest votes
- This will go into a separate PR with all the versions and source files
- Add a page to documentation with the logo and specific colors used
- Share logo back to NumFOCUS
- TODO: update the logo on twitter, etc
-
GitHub Sponsors
- We may consider using the GitHub Sponsors button. Someone recently asked how to support GeoPandas and I was not sure if there is any possibility of direct (financial) support, apart from donating to NumFOCUS.
- In order to have NumFOCUS accept $ on behalf of GeoPandas, may need to become a fiscally-sponsored project instead of just an affiliated project; Joris will check into this
- For GitHub Sponsors, we have seen examples of sponsoring individuals; will need to see what it would take to sponsor the larger project
-
GeoPandas usage / promotion
- Would like to feature groups that use GeoPandas as part of their work, maybe on GeoPandas blog (if there was one)
- Blog: would like to do this outside Sphinx
-
GeoPandas domain
- Joris will follow up with Kelsey
- Also request PyPI access from Kelsey
-
Packaging automation
- Can use GitHub Actions to publish packages to PyPI / Conda
- Can derive this from Pydata project
-
Social media
- Twitter
- Joris is currently maintaining this
- Martin can help with this; Joris will share access
- Example that came up on twitter from COVID-19 dashboards around showing density of points, maybe by hexagon; might want to add something like this as an example in the docs
-
GeoPandas academic paper
- Geographical Analysis journal is having a special issue on Open Source Software for Spatial Analysis, edited by Luc Anselin and Serge Rey (both PySAL). We had a small exchange with Joris and Serge on Twitter about the possibility of writing a paper about GeoPandas (which is long overdue, I'd say): https://twitter.com/jorisvdbossche/status/1282208649335779328 I feel that this would be a great thing to do, although it naturally takes time to write a proper paper.
- Special issue will require more background documentation & contextualization; not just a description about the project
- Need to position it into the wider ecosystem; directly address how it has advanced spatial analysis in Python
- Could start brainstorming / collecting ideas
- Martin will make a google doc
- Martin will check to see if there is sponsorship from the university for making this open access
- Full fee is $3,000 US
- If we don't go for this, make sure to go after a different publication that allows open access
-
GeoPandas Survey
- Discuss plan to finish up and post GeoPandas survey: https://docs.google.com/document/d/1caityqUUfgAN2u9VUJN78mTyS3fMYZI-ZvgtfLfio9A/edit?usp=sharing
- Martin will add the GDPR compliance
- Use Google Forms to release this
- Can be individual owner
- Martin can create the form
- Timeframe:
- Would like to launch as soon as possible, aim for sometime in Sept.
-
GeoPandas 0.9 roadmap
- If we want to release 0.9 in December (we discussed switching to a 6-month release cycle), we could discuss what we want to (ideally) include.
- Binary predicates change - https://gist.github.com/martinfleis/abc7cdbf9f9266bf9ed369080eec7cea
- proposal is to build this on the output of query bulk
- people normally interested in 2 questions: does my polygon intersect any in the other data frame (not just same line), which polygons from right data frame are intersected with the one on the left
- `sf` (in R) doesn't return series; it returns matrices (sparse / dense)
- could have a function that gives more direct access to sindex bulk query
- general agreement about keeping the existing predicate behavior as is, but adding a new set of methods on GeoSeries to add the cross / matrix oriented approach
- Martin will add a new issue for this with notebook example
- spatial index
- do we want to expose an interface to multiple spatial indexes, or an abstract base class that can wrap other spatial index implementations?
- can revise the issue based on discussion but don't target for 0.9
- revisit once pygeos / shapely 2.0 integration is complete and no longer optional; STRtree will be default as part of that
- Brendan will try to get outstanding pygeos issue to add other predicates to STRtree in for next pygeos version:
- Upcoming pygeos features in next release: mostly around multithreading, adding support for Z values to coordinate ops
- geodetic distance / area calculations
- it was tricky to write these to be performant while dealing with wrap-around at the poles
- there is project to extract out the S2 ideas into a general purpose library
- Create an example out of this work and put in documentation
- Create an issue about adapting ideas from `sf`
- Aim for supporting different spatial backend (e.g., `S2`) after 1.0
- Look into some of the other backends
- cuSpatial:
- want to support interoperability, not sure about supporting different underlying geometry providers / backends
- Longer term, maybe consider making GDAL / Fiona optional (e.g., read data from Parquet...)
- vectorized snap
- e.g., make larger linestring out of 2 disconnected segments
- in GEOS overlay refactor, this will include a precision-based snap
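For the spatial index question in the roadmap above, one way to picture the "abstract base class that can wrap other spatial index implementations" option is the sketch below. It is an illustration under stated assumptions, not an existing or agreed design: `BaseSpatialIndex` and `BruteForceIndex` are hypothetical names, bounding boxes stand in for geometries, and the brute-force class only marks the place where a wrapper around a real index would go.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List, Tuple

Bounds = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


class BaseSpatialIndex(ABC):
    """Hypothetical common interface; not an existing GeoPandas class."""

    @abstractmethod
    def query(self, bounds: Bounds) -> List[int]:
        """Return positions of indexed items whose bounds intersect `bounds`."""

    def query_bulk(self, many_bounds: Iterable[Bounds]) -> List[Tuple[int, int]]:
        """Return (input position, indexed position) pairs for all inputs."""
        pairs = []
        for i, bounds in enumerate(many_bounds):
            pairs.extend((i, j) for j in self.query(bounds))
        return pairs


class BruteForceIndex(BaseSpatialIndex):
    """Placeholder backend that scans every item; a real wrapper would delegate."""

    def __init__(self, items: Iterable[Bounds]):
        self._items = list(items)

    def query(self, bounds: Bounds) -> List[int]:
        minx, miny, maxx, maxy = bounds
        return [
            j
            for j, (x0, y0, x1, y1) in enumerate(self._items)
            if x0 <= maxx and x1 >= minx and y0 <= maxy and y1 >= miny
        ]


index = BruteForceIndex([(0, 0, 1, 1), (2, 2, 3, 3), (0.5, 0.5, 2.5, 2.5)])
single = index.query((0, 0, 0.6, 0.6))
bulk = index.query_bulk([(0, 0, 0.6, 0.6), (10, 10, 11, 11), (2.4, 2.4, 2.6, 2.6)])
```

With this shape, a backend only has to implement `query`; the bulk variant comes from the base class unless a backend overrides it with something faster.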
-
Future NumFOCUS grants
- I am not aware of the schedule of future funding rounds, but we should be prepared (if anyone has capacity).
- Normally should be 3rd round for this year, but haven't heard yet
-
dask-geopandas
- Discuss the current state and future of `dask-geopandas`.
- Big work items underway:
- I/O methods: Joris adding Parquet support from geopandas
- making use of spatial partitioning
-
NumFOCUS
- Small development grants ideas:
- better documentation
- better integration / leveraging spatial indexes for operations
- small improvements to topological operations (relates operations); elementwise vs all-pairwise
-
Logo
- https://github.com/geopandas/geopandas/issues/1405/
- Joris: check with pandas
- Try different color, otherwise go with it!
-
Lowering barriers to effective engagement / involving community
- reviewing PR bottlenecks
- time of core maintainers
- huge PRs, can we suggest folks make smaller PRs?
-
Maintenance bottlenecks
-
Roadmap (1.0?)
- Shapely 2.0 / pygeos speed-ups
- API for topological operations
- IO
- parquet/feather
- faster GDAL
- databases
- consistent API
- Integrating raster operations
- zonal stats is problematic for large data
- geodetic distance etc (geography)
- visualization
- maybe geoplot becomes an affiliate like contextily
- residentmario may not have time anymore for maintenance
- Vectorized snap feature to other feature
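As background for the geodetic distance item in the roadmap above, the sketch below shows the haversine great-circle formula on a sphere. It is a generic textbook formula, not how GeoPandas or any of the libraries mentioned in these notes computes distances. The function name `haversine_m` and the choice of a mean Earth radius are assumptions made for the illustration, and a spherical model is less accurate than an ellipsoidal one.

```python
import math

# Mean Earth radius in metres (spherical approximation; an assumption of this sketch).
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


# Two points one degree apart across the antimeridian: a planar difference of
# the longitudes would give 359 degrees, the spherical formula gives 1 degree.
across_antimeridian = haversine_m(179.5, 0.0, -179.5, 0.0)

# At a pole, every longitude is the same point.
at_pole = haversine_m(0.0, 90.0, 123.0, 90.0)
```

The two sample calls correspond to the wrap-around cases that make planar calculations on longitude/latitude coordinates misleading.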
-
Do something like http://xarray.pydata.org/en/stable/roadmap.html
- Open an issue for this
-
places to ask questions vs. filing an issue? document.
-
Documentation
- notebooks/examples
-
Installation issues