A list of publicly available datasets with real-time data maintained by the team at bytewax.io

bytewax/awesome-public-real-time-datasets

Awesome Public Real-Time Datasets and Sources

This list is inspired by awesome public datasets, but covers real-time datasets and sources, which are normally accessed via HTTP or WebSockets.

The list is separated into Free and Paid and broken into subsections based on loose categories.

Free

Finance/Crypto

  • Coinbase Market Data - Coinbase WebSocket feed for market data, including level 2 order book data.
  • Blockchain transactions - Provides real-time notifications about new transactions and blocks.
  • Yahoo Finance wss://streamer.finance.yahoo.com/ - This is not advertised in the developer documentation, but it is discoverable because the websocket is used to update their website.
  • Finnhub - Limited free usage, with premium data sources also available.
  • CoinCheck - a cryptocurrency API that has a WebSocket interface (in beta)
  • Alpaca Markets - Real-time and historical market data via HTTP and WebSocket.
  • SEC EDGAR - The SEC offers real-time streaming access to regulatory filings (like 10-K, 10-Q, 8-K) as well as real-time XBRL financial data via RESTful APIs and RSS feeds.
  • Binance - WebSocket API that delivers real-time cryptocurrency trading data and order book updates
  • OANDA - HTTP based FOREX rates stream through the OANDA API.
  • CoinCap - Provides real-time pricing and market activity for over 1,000 cryptocurrencies
  • Polygon.io - Provides real‑time stock market and cryptocurrency data from all US exchanges via REST and WebSocket endpoints.
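Several of the feeds above deliver level 2 order book data as a stream of price-level changes. As a rough sketch of how a consumer might keep a local book current, the snippet below applies such changes to an in-memory dictionary. The `(side, price, size)` tuple shape, the `"buy"`/`"sell"` side names, and the rule that a size of zero removes a level are illustrative assumptions, not the message format of any specific provider; check each provider's documentation for the real schema.

```python
from decimal import Decimal
from typing import Dict, Iterable, Optional, Tuple

# side ("buy" or "sell") -> {price: size}; this layout is a hypothetical choice
Book = Dict[str, Dict[Decimal, Decimal]]
Change = Tuple[str, Decimal, Decimal]


def apply_l2_updates(book: Book, changes: Iterable[Change]) -> Book:
    """Apply (side, price, size) changes in place; a size of 0 drops the level."""
    for side, price, size in changes:
        levels = book.setdefault(side, {})
        if size == 0:
            levels.pop(price, None)
        else:
            levels[price] = size
    return book


def best_bid_ask(book: Book) -> Tuple[Optional[Decimal], Optional[Decimal]]:
    """Return (highest buy price, lowest sell price), or None for an empty side."""
    bids = book.get("buy", {})
    asks = book.get("sell", {})
    return (max(bids) if bids else None, min(asks) if asks else None)
```

Prices are held as `Decimal` rather than `float` so that price levels compare exactly when used as dictionary keys.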

Transportation

  • Open Rail Data - A collection of APIs that provide data relating to the UK rail network, including reference data, train timetables, and live service updates. The live data is streamed using the STOMP protocol.
  • GBFS New York - GBFS is the standard for bike share data with many locations around the world. Find more information here
  • Open Sky Flight - Data from the Open Sky API via an HTTP endpoint. Supports real-time data but not streaming, so you need to poll continually.
  • Open Glider Network - The OGN provides real-time traffic for gliders and other light aircraft. You can use an OGN client like python-ogn-client to connect to OGN servers, parse the APRS messages and push them to a broker like Kafka for streaming processing.
  • MTA GTFS Feed - Transit data in GTFS format for transit systems like NYC subway and Caltrain.
  • NY 511 live camera data - This live camera data requires some scraping to use. From this list of cameras you can then source the individual camera id and then request the timestamped image or most recent image by building the url like - https://511ny.org/map/Cctv/<image-id-goes-here>
  • Transport for London (TfL) - live data about the tube, buses, and more
  • Norwegian Coastal Administration - AIS data from vessels within the Norwegian economic zone and the protection zones off Svalbard and Jan Mayen.
  • German Traffic Data - German real-time traffic information
  • Swiss Traffic & Public Transport Data - Various real-time transport data from Switzerland, such as road traffic, status of EV charging stations, shared mobility services, and live arrivals/departures of public transport
  • Transport for NSW API - Real‑time public transport data (buses, trains, ferries) available for New South Wales, Australia
  • Ireland National Transport Authority - Real-time update stream for services provided by Dublin Bus, Bus Éireann, and Go-Ahead Ireland.
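Some of the sources above, such as Open Sky and the NY 511 cameras, are real-time but not streaming, so a client has to poll them. A minimal sketch of that pattern follows. The base URL is the one quoted in the NY 511 entry; the helper names (`camera_image_url`, `poll`, `fetch_bytes`) are hypothetical, and the camera id format and a sensible polling interval are assumptions to confirm against the provider's terms.

```python
import time
import urllib.request
from typing import Callable, Iterator, TypeVar
from urllib.parse import quote

T = TypeVar("T")

# URL pattern quoted in the NY 511 entry; the id is appended to it.
CCTV_BASE_URL = "https://511ny.org/map/Cctv/"


def camera_image_url(image_id: str) -> str:
    """Build the image URL for one camera id (id format is an assumption)."""
    return CCTV_BASE_URL + quote(str(image_id), safe="")


def fetch_bytes(url: str, timeout_s: float = 10.0) -> bytes:
    """Fetch a URL once and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.read()


def poll(
    fetch: Callable[[], T],
    interval_s: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Call fetch() max_polls times, sleeping interval_s between calls."""
    for i in range(max_polls):
        if i:
            sleep(interval_s)
        yield fetch()
```

A caller might combine the pieces as `poll(lambda: fetch_bytes(camera_image_url(some_id)), interval_s=60, max_polls=10)`, where `some_id` is taken from the camera list. Passing `sleep` as a parameter keeps the loop testable without real delays.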

Information

  • Wikimedia - SSE event stream of recent changes to Wikimedia Foundation pages.
  • Seismic Data - Seismic Portal provides a websocket interface to real-time seismic events.
  • Open Weather API - Current weather data available free at rate of 1 request per second.
  • Clima Cell - Real-time weather data in a free or paid API.
  • NOAA Buoy Data - Real-time buoy data from NOAA
  • NOAA Weather Data - Live Weather Data API from NOAA
  • Redfin Real Estate - Pull up-to-date data from Redfin's unofficial API.
  • EPA Airnow data - Air quality data hosted by the EPA.
  • UK Flood Data - UK government real-time API for flood data.
  • US Energy Grid Data - Real-time grid information for the US energy grid
  • USGS Earthquake Real-time Feed - Live seismological data feed to know about earthquakes as they happen
  • News API - Aggregator that pulls headlines and articles from dozens of news outlets worldwide in near real time via API. It offers a free tier (with rate limits)
  • New York Times Newswire API - The Times Newswire API provides an up-to-the-minute stream of articles published on NYTimes.com.
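The Wikimedia entry above is delivered as Server-Sent Events (SSE), a line-based text format in which `data:` lines carry the payload and a blank line ends each event. The sketch below parses such a stream from any iterable of text lines, assuming each event's data is a JSON document. The function name `iter_sse_json` is hypothetical, and the code handles only the `data` field and comment lines, not `id`, `retry`, or reconnection.

```python
import json
from typing import Any, Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[Any]:
    """Yield the decoded JSON payload of each SSE event found in `lines`."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
            continue
        if line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # One optional leading space after the colon is not part of the value.
            data_parts.append(value[1:] if value.startswith(" ") else value)
```

In practice the lines would come from a long-lived HTTP response decoded as UTF-8. Multiple `data:` lines within one event are joined with newlines before decoding.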

IoT

  • ThingSpeak IoT Public Channels - Crowdsourced IoT channels of users publishing various IoT sensor data in real-time. Accessible via REST API or MQTT API.

Cybersecurity

  • Certstream - Certstream provides a publicly accessible real‐time feed of certificate transparency logs, delivering live updates on SSL/TLS certificate issuance as it occurs.
  • URLhaus - Community-driven repository for real-time malicious URL data, offering actionable threat intelligence to block phishing and malware.
  • CISA Automated Indicator Sharing (AIS) - US government-led service that enables public and private organizations to exchange machine-readable threat indicators in real time.
  • Open Threat Exchange (OTX) - Community-driven threat intelligence platform that streams real-time data on malicious IPs, domains, and URLs through the OTX DirectConnect API.

Other

Paid

Finance/Crypto

  • IEX Trading - IEX was created in response to questionable trading practices that had become widely used across traditional exchanges. Their API provides streaming stock market data.
  • NYSE Cloud Streaming - Provides real-time access to high-quality NYSE exchange data feeds, streaming directly in the cloud using Kafka format.
  • Alpha Vantage Market News & Sentiment - Live market news & sentiment data from selected news outlets covering stocks, cryptocurrencies, forex, and a wide range of topics such as fiscal policy, mergers & acquisitions, IPOs, etc.
  • Bloomberg - The Bloomberg Market Data Feed (B-PIPE) provides consolidated, normalized market data in real time.

Cybersecurity

  • Kaspersky Threat Data Feeds - Kaspersky Threat Data Feeds deliver continuously updated, real‐time threat intelligence by aggregating data from diverse sources.
  • Bitdefender Threat Intelligence Feeds - A comprehensive suite of real-time, curated threat data streams that deliver actionable insights on malicious domains, IP addresses, URLs, file hashes, and vulnerabilities.
  • ANY.RUN Threat Intelligence Feeds - Near real-time feeds generated from interactive malware sandbox sessions, providing deep insights into malware behaviors and associated indicators of compromise (IoCs).
  • Deepinfo Data Feeds - Real-time updates on domain registrations, subdomain discoveries, and DNS changes accessible via REST-based web services.

Transportation

  • AIS Data - Maritime, aviation, and weather data available via Spire.
  • FlightAware Firehose - Real-time data feed of global aircraft ADS-B positions and flight status.
  • Spire Data Services - Global maritime vessel tracking, combining Satellite automatic identification systems (AIS), Terrestrial AIS, and Dynamic AIS.

Information

  • PurpleAir Air Quality Data - Developer API for accessing purple air sensor data.
  • NewsAPI - NewsAPI tracks headlines in 7 categories across over 50 countries, and at over a hundred top publications and blogs, in near real time. Free developer version with a 24-hour delay available.
  • X (Twitter) - X provides a streaming interface for research or enterprise.
  • Bing News Search API - Bing News Search API returns fresh news results from various sources in near real time
  • Reuters News API - Low latency news feed from Reuters
  • webz.io - webz.io provides an API with real-time information from thousands of news sites, blogs, and discussion forums.
  • Mediastack - JSON API delivering worldwide news, headlines and blog articles in real-time

Sports

  • Sports Livescores - Developer API of TheSportsDB that gives you access to livescores
  • Sportradar Sports Data - Global live data of 80 sports, 500 sport leagues and 750k events a year (free 30 day trial available)
