Corpus data and analysis for CuRIAM.
CuRIAM stands for Corpus re Interpretation and Metalanguage. For information about the corpus, see the paper:
Corpus re Interpretation and Metalanguage in Supreme Court Opinions (arXiv, to appear at LREC-COLING 2024)
The full corpus data is available here and recommended for programmatic exploration of the data.
data/main/annotated contains the exports from the annotation software. The verticalised *.tsv files can be used to manually explore the texts and their annotations.
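For a quick first look at the verticalised files before opening them by hand, a minimal sketch like the following may help. It assumes the repository has been cloned and that the *.tsv files sit somewhere under data/main/annotated. The helper name `preview_tsv_files` is hypothetical and not part of the curiam package, and nothing is assumed about the column layout: rows are returned as plain lists of strings.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_tsv_files(root, max_rows=5):
    """Return {path relative to root: first max_rows rows} for every *.tsv under root.

    Hypothetical helper for illustration; it does not interpret the columns.
    """
    root = Path(root)
    previews = {}
    for path in sorted(root.rglob("*.tsv")):
        with path.open(newline="", encoding="utf-8") as handle:
            # QUOTE_NONE keeps quote characters in the opinion text as ordinary characters.
            reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
            previews[str(path.relative_to(root))] = list(islice(reader, max_rows))
    return previews
```

Calling `preview_tsv_files("data/main/annotated")` from the repository root and printing the result should show the first few rows of each file, which is usually enough to see which columns hold tokens and which hold annotations.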
- Create a conda environment.

      $ conda env create -f environment.yml
      $ conda activate curiam

- Install the curiam package locally.

      $ pip install --upgrade build
      $ pip install -e .

- If you plan on running the gamma agreement calculations, install pygamma-agreement separately.

      $ sudo apt-get update
      $ sudo apt install coinor-libcbc-dev
      $ pip install "pygamma-agreement[cbc]"

  For Apple Silicon, installing cvxopt via conda and then running pip install pygamma-agreement without cbc may work, but I haven't tested it. If it does work, it may be slow.
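After the steps above, one way to confirm that the environment is set up is to check that the packages can be found by Python. The sketch below is an illustration only: `is_installed` is a hypothetical helper, and the import names `curiam` and `pygamma_agreement` are assumptions based on the package names used in the install commands.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def report(module_names):
    """Map each module name to whether it can be located in this environment."""
    return {name: is_installed(name) for name in module_names}
```

Running `report(["curiam", "pygamma_agreement"])` inside the activated conda environment should return `True` for each package that was installed. A `False` for `pygamma_agreement` is expected if the optional gamma agreement step was skipped.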
TODO
TODO
