Research Request - Map of California Transit Facilities #1439

Open
evansiroky opened this issue Apr 7, 2025 · 9 comments · May be fixed by #1451
Assignees
Labels
research request Issues that serve as a request for research (summary and handoff)

Comments

@evansiroky
Member

Research Question

As researchers interested in potential charging infrastructure siting,
I want to view a map of all transit facilities in California,
So that we can get a sense of potential infrastructure sites in California.

How will this research be used?

See above user story.

Stakeholders & End-Users

Gillian and others interested in charging infrastructure.

Metrics

Probably need to do batch geocoding.

Data sources

https://www.transit.dot.gov/ntd/data-product/2023-annual-database-facility-inventory

Deliverables

A shapefile showing all transit facilities in California.

Timeline of deliverables

Soon.

@evansiroky evansiroky added the research request Issues that serve as a request for research (summary and handoff) label Apr 7, 2025
@KatrinaMKaiser
Member

KatrinaMKaiser commented Apr 7, 2025

Do we have a Google Maps API key associated with our project? I see that we do.

@KatrinaMKaiser
Member

This should be pretty quick if we can get the API key to work and use Google's Python package. Assigning @mrtopsyt, with an assist from @csuyat-dot if needed for reading in the NTD data.

@mrtopsyt
Contributor

mrtopsyt commented Apr 10, 2025

Here's a first draft showing LA, rendered using Folium (the Python Leaflet wrapper). I was a little unsure about (a) what points to include, (b) what metadata to display, and (c) how individual points should be displayed.

[Image: screenshot of the draft Folium map of LA]

  • a) Many sites contain a large number of distinct facilities; for instance, LA Union Station has multiple fixed guideway stations, a bus loop, many parking lots, admin buildings, ticket offices, etc. While there is a flag for whether a facility is a "Section of a Larger Facility", it doesn't seem to be implemented in a way that can be used for filtering, since in some cases all components of a facility have it set and in others none do. My question here is:

    • Should we limit our list to only certain categories, and if so which ones?
  • b) The inventory contains a lot of metadata, although agencies are inconsistent about reporting it. A complete list is below. Right now we have a subset (see screenshot) shown in tooltips, and we use "Facility Type" for color coding with the below categories. My questions are:

    • Are there any specific fields that should be displayed in tooltips?
    • Does the color-coding scheme below look OK? I will provide a key in a later map; it's a little annoying to add one in Folium, so I've saved that for after this review.

Metadata:

['State/Parent NTD ID', 'NTD ID', 'Agency Name', 'Reporter Type',
       'Reporting Module', 'Sponsor NTD ID', 'Sponsor Name',
       'Primary Mode Served', 'Secondary Modes Served', 'Facility ID',
       'Facility Type', 'Facility Name', 'Street Address', 'City', 'State',
       'ZIP Code', 'Latitude', 'Longitude', 'Non-Agency Mode Served',
       'Private Modes Served', 'Administrative/Maintenance Facility Flag',
       'Passenger/Parking Facility Flag', 'Square Feet',
       'Number of Parking Spaces', 'Section of a Larger Facility',
       'Year Built or Reconstructed as New',
       'Percent Agency Capital Responsibility', 'Cross Agency Facility Flag',
       'Condition Assessment Date', 'Condition Assessment',
       'Separate Asset Flag', 'Notes', 'address_only', 'geometry',
       'geometry_geocoded', 'address', 'geocode_success', 'partial_match',
       'location_type']

Categories:

TYPES_BY_CATEGORY = {
    CATEGORIES.SURFACE_PARKING: ["Surface Parking Lot"],
    CATEGORIES.PARKING_STRUCTURE: ["Parking Structure"],
    CATEGORIES.TRANSIT_STATION: [
        "At-Grade Fixed Guideway Station", 
        "Simple At-Grade Platform Station", 
        "Bus Transfer Center", 
        "Elevated Fixed Guideway Station", 
        "Underground Fixed Guideway Station", 
        "Exclusive Platform Station", 
        "Ferryboat Terminal",
    ],
    CATEGORIES.ADMINISTRATIVE: ["Administrative Office / Sales Office", "Revenue Collection Facility"],
    CATEGORIES.MAINTENANCE: [
        "General Purpose Maintenance Facility / Depot",
        "Maintenance Facility (Service and Inspection)",
        "Vehicle Washing Facility",
        "Heavy Maintenance & Overhaul (Backshop)",
        "Vehicle Testing Facility",
        "Vehicle Blow-Down Facility"
    ],
    CATEGORIES.FUELING: ["Vehicle Fueling Facility"],
    CATEGORIES.OTHER_ADMIN_MAINTENANCE: [
        "Other, Administrative & Maintenance (describe in Notes)", 
        "Combined Administrative and Maintenance Facility (describe in Notes)",
    ],
    CATEGORIES.OTHER_PASSENGER_PARKING: ["Other, Passenger or Parking (describe in Notes)"],
    CATEGORIES.OTHER_UNCATEGORIZED: []
}
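To color-code by "Facility Type", the mapping above presumably has to be read in the other direction (type to category). Here is a minimal sketch of that lookup. The `CATEGORIES` enum is a hypothetical stand-in, since the thread uses its member names without showing its definition, and the mapping is abbreviated for illustration.

```python
from enum import Enum


class CATEGORIES(Enum):
    # Hypothetical definition: the thread references these names
    # but does not show how CATEGORIES is declared.
    SURFACE_PARKING = "surface_parking"
    TRANSIT_STATION = "transit_station"
    FUELING = "fueling"
    OTHER_UNCATEGORIZED = "other_uncategorized"


# Abbreviated copy of the mapping above, for illustration only.
TYPES_BY_CATEGORY = {
    CATEGORIES.SURFACE_PARKING: ["Surface Parking Lot"],
    CATEGORIES.TRANSIT_STATION: ["Bus Transfer Center", "Ferryboat Terminal"],
    CATEGORIES.FUELING: ["Vehicle Fueling Facility"],
    CATEGORIES.OTHER_UNCATEGORIZED: [],
}

# Invert once so each facility type resolves to its category directly.
CATEGORY_BY_TYPE = {
    facility_type: category
    for category, facility_types in TYPES_BY_CATEGORY.items()
    for facility_type in facility_types
}


def categorize(facility_type):
    """Return the category for a "Facility Type" value.

    Unknown or missing types fall back to OTHER_UNCATEGORIZED.
    """
    return CATEGORY_BY_TYPE.get(facility_type, CATEGORIES.OTHER_UNCATEGORIZED)
```

Falling back to `OTHER_UNCATEGORIZED` would explain why that category has an empty list in the mapping, though that is a guess about the intent.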
  • c) Right now we're just displaying facilities as points, preferentially using the provided latitude / longitude and filling in missing coordinates with geocoding. This does make the map look a bit crowded, but I think it's probably fine. A couple of other options we could try would be to:
    • Use the Google Building Attributes api to attempt to display polygons for buildings rather than points, but it would contain inaccuracies, especially for elements of maintenance facilities and for stations that are contained in buildings.
    • Use a clustering algorithm to group points within the same category. I'm not sure what the best algorithm choice would be, and this would lead to inaccuracies, but it could be worth trying. If you think this is a good idea, let me know if you have any suggestions.
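As one possible starting point for the clustering option, here is a rough sketch that bins points into a fixed grid per category and replaces each bin with its centroid. This is plain grid binning, not a recommendation of a specific algorithm; the function name, the input tuple shape, and the `cell_deg` default are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_by_grid(points, cell_deg=0.005):
    """Group (category, lat, lon) points that share a category and a grid cell.

    Returns one dict per cluster with the category, centroid, and point count.
    cell_deg is an assumed cell size; 0.005 degrees of latitude is roughly 550 m.
    """
    cells = defaultdict(list)
    for category, lat, lon in points:
        # Floor division assigns each point to a cell; points near a cell
        # edge can land in different clusters, which is a known weakness.
        key = (category, int(lat // cell_deg), int(lon // cell_deg))
        cells[key].append((lat, lon))

    clusters = []
    for (category, _, _), members in cells.items():
        count = len(members)
        clusters.append({
            "category": category,
            "lat": sum(lat for lat, _ in members) / count,
            "lon": sum(lon for _, lon in members) / count,
            "count": count,
        })
    return clusters
```

If the goal is only to declutter the interactive map, Folium's `MarkerCluster` plugin may be enough on its own, since it groups markers by zoom level without changing the underlying data.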

One additional note: this is definitely a useful dataset, but it misses some facilities. A couple of examples (both of which could be useful for EV charging) that I've noticed in SoCal are the Pico-Rimpau Transit Center and this layover facility in Hollywood (right next to LA's most transit-friendly Trader Joe's!). I'm not sure whether both are missing because they're on private property, but it does feel like a weakness of the dataset.

@evansiroky
Member Author

evansiroky commented Apr 10, 2025

No need to limit the number of facilities at this point. Color coding looks great! I like the categorization. Is there a shapefile with the data that we can take a look at?

Also, good callout about the incompleteness. We'll see if we should take a look at that issue later depending on stakeholder feedback.

@mrtopsyt
Contributor

Thanks! Apologies for the delay, Friday was my RDO. I'm working on getting a shapefile now (right now I only generated it for LA to avoid usage constraints on the Geocoding API), and I'll ping you @evansiroky once it's done.
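One way to stay under geocoding usage limits when scaling from LA to the whole state is to geocode each distinct address only once. The sketch below assumes a caller-supplied `geocode_one` function (hypothetical; it would wrap whatever client the notebook actually uses) and a simple whitespace and case normalization that may be too aggressive or too weak for real NTD addresses.

```python
def normalize(address):
    """Collapse whitespace and uppercase so trivially different spellings match."""
    return " ".join(address.split()).upper()


def geocode_batch(addresses, geocode_one, cache=None):
    """Geocode each distinct address once and return {original address: result}.

    geocode_one is a hypothetical callable taking an address string and
    returning a result (or None when nothing is found). Passing in a cache
    dict lets results be reused across runs.
    """
    cache = {} if cache is None else cache
    for address in addresses:
        key = normalize(address)
        if key not in cache:
            cache[key] = geocode_one(key)
    return {address: cache[normalize(address)] for address in addresses}
```

Persisting the cache between runs (for example as JSON) would also avoid paying for the same lookups again after a notebook restart.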

@mrtopsyt mrtopsyt linked a pull request Apr 15, 2025 that will close this issue
@mrtopsyt
Contributor

Hi again @evansiroky! I've added a Draft PR with links to a GCS bucket with a shapefile and GeoJSON. The GeoJSON is probably more useful, since the character limit for Shapefiles was causing some issues, but I provided both depending on the needs of the end-users. The map I posted a screenshot of can be generated using ntd/ntd_trasit_facilities/ntd_map.ipynb.
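The comment does not say which Shapefile limit caused trouble. Assuming it was the 10-character cap on field names in the .dbf table, a quick check like the sketch below shows which inventory columns would end up sharing a name after naive truncation. The helper name is made up for illustration, and real writers may rename colliding fields differently.

```python
from collections import defaultdict

# dBASE field names in a shapefile's .dbf table are capped at 10 characters.
DBF_FIELD_NAME_LIMIT = 10


def truncation_collisions(columns, limit=DBF_FIELD_NAME_LIMIT):
    """Map each truncated name to the original columns that would share it."""
    by_truncated = defaultdict(list)
    for name in columns:
        by_truncated[name[:limit]].append(name)
    # Keep only truncated names claimed by more than one column.
    return {short: names for short, names in by_truncated.items() if len(names) > 1}


# A subset of the metadata columns listed earlier in the thread.
columns = [
    "Facility ID",
    "Facility Type",
    "Facility Name",
    "Condition Assessment Date",
    "Condition Assessment",
    "Notes",
]
collisions = truncation_collisions(columns)
```

Renaming columns to short, unique names before export is one way to keep the shapefile usable alongside the GeoJSON.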

Let me know if you have any comments. I left it as a draft PR for now to get feedback on whether there are additional features we need or if any of the fields displayed should change. If you think it looks good I can assign it to @csuyat-dot to provide a code review.

I have noticed some issues with the geocoding from Google due to typos made by the agencies (you may notice that the Scotts Valley Transit Center appears in UCSC because of a geocoding issue). Depending on the importance of accuracy here, I'm happy to go through and review this some more. I'm not sure whether there are options we can set to make a false positive less likely when a result can't be found.
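One low-effort way to narrow the manual review is to flag rows using the `geocode_success`, `partial_match`, and `location_type` columns already in the dataset. The sketch below assumes those columns carry the values the Google Geocoding API reports (a partial-match flag, and a location type such as `ROOFTOP` or `APPROXIMATE`); which location types count as trustworthy is an assumption to tune, not something stated in the thread.

```python
# Assumed threshold: treat only the two most precise location types as trusted.
TRUSTED_LOCATION_TYPES = {"ROOFTOP", "RANGE_INTERPOLATED"}


def needs_review(row):
    """Return True when a geocoded row looks unreliable enough to check by hand.

    row is a dict keyed by the dataset's column names.
    """
    if not row.get("geocode_success"):
        return True  # no result at all
    if row.get("partial_match"):
        return True  # the geocoder could not match the full address
    return row.get("location_type") not in TRUSTED_LOCATION_TYPES


rows = [
    {"Facility Name": "A", "geocode_success": True, "partial_match": False, "location_type": "ROOFTOP"},
    {"Facility Name": "B", "geocode_success": True, "partial_match": True, "location_type": "ROOFTOP"},
    {"Facility Name": "C", "geocode_success": True, "partial_match": False, "location_type": "APPROXIMATE"},
]
to_review = [row["Facility Name"] for row in rows if needs_review(row)]
```

This would not catch every bad match, since a typo can still resolve cleanly to the wrong address, but it shrinks the set that needs eyes.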

@evansiroky
Member Author

Thanks for sharing @mrtopsyt! I had a hard time extracting the shapes zipfile, but was able to load the GeoJSON into ArcGIS. I mentioned this to Gillian and she was happy to learn about the progress. Let's wait and see what additional questions we have about this data before doing anything further.

@mrtopsyt
Contributor

Thanks @evansiroky! Sorry about the shapefile, I may have made a mistake in uploading it but I'll look into correcting it later if a shapefile is still desired. Let me know once we get any feedback so I can make adjustments!

@mrtopsyt
Contributor

mrtopsyt commented May 6, 2025

Just as a note, I noticed the NTD now produces an interactive map of the Facilities Inventory. I'm not sure how useful the actual visualization is, since it doesn't show much metadata, but they do have a way you can export the geocode results as a csv. If we want to move further with that, it could be something to validate against, although I can't find any published methods that say what their geocoding engine is.
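If we do validate against the NTD export, a simple first pass would be to measure the distance between our coordinates and theirs for each facility and flag large gaps. The sketch below uses the haversine formula and assumes both sources can be keyed by a shared facility identifier, which has not been confirmed for the NTD CSV; the 250 m threshold is arbitrary.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def disagreements(ours, theirs, threshold_m=250):
    """Return {facility id: distance in meters} where the two sources differ.

    ours and theirs map a facility id (assumed to be shared) to (lat, lon).
    Facilities missing from either source are skipped, not flagged.
    """
    flagged = {}
    for facility_id, (lat, lon) in ours.items():
        if facility_id in theirs:
            distance = haversine_m(lat, lon, *theirs[facility_id])
            if distance > threshold_m:
                flagged[facility_id] = distance
    return flagged
```

Agreement between the two would not prove either is correct, since both could share the same agency typo, but disagreements are a cheap list of places to look first.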
