Intake plugin for HTML tables.

    pip install -e git+https://github.com/compilerla/intake-html-table@main#egg=intake-html-table

Or

    git clone https://github.com/compilerla/intake-html-table
    cd intake-html-table/
    pip install -e .

See examples/notebook.ipynb or view on nbviewer for more.
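
As a rough orientation, a catalog file that uses this plugin's drivers could be loaded from Python along the following lines. This is a minimal sketch, not taken from the project: the helper name `read_source` and the file name `catalog.yml` are hypothetical, and it assumes the `html_table` source supports the usual intake `read()` method.

```python
from pathlib import Path


def read_source(catalog_path, source_name):
    """Open an intake catalog file and read one named source from it.

    Hypothetical helper for illustration; not part of intake-html-table.
    """
    path = Path(catalog_path)
    if not path.is_file():
        raise FileNotFoundError(f"catalog file not found: {path}")

    # Imported here so the check above works even before intake is installed.
    import intake

    cat = intake.open_catalog(str(path))
    # Assumption: the source exposes intake's standard read() method.
    return cat[source_name].read()


# Example call (needs network access and the plugin installed):
#   data = read_source("catalog.yml", "table")
```

Here `"table"` would be the name of a source defined under `sources:` in the catalog file, as in the catalog entries described in the next section.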
From an intake catalog
Use the html_table driver to read data from HTML tables. Pass additional kwargs to pandas.read_html():

    metadata:
      version: 1
    sources:
      table:
        description: Read from an HTML table with id=data, skipping the first 2 rows
        driver: html_table
        args:
          urlpath: "https://example.com/"
          attr:
            id: data
          skiprows: 2

Use the apache_dir driver to read a catalog from an Apache Server directory:

    metadata:
      version: 1
    sources:
      ncei:
        description: National Centers for Environmental Information data catalog
        driver: apache_dir
        args:
          urlpath: "https://www.ncei.noaa.gov/data/"

Run the test suite (from the root of the repository):

    coverage run -m pytest

To view the coverage report with indicators for untested (missed) lines:

    coverage report -m

To update the README badge from the latest test run:

    coverage-badge -f -o tests/coverage.svg

The -f argument ensures the existing badge is overwritten.
Tests also run via GitHub Actions on events against the main branch.