Skip to content

TwoRavens/raven-metadata-service

Repository files navigation

Build Status


TwoRavens Metadata Service

Service to produce TwoRavens metadata.

More detailed documentation is available at: https://tworavens.github.io/TwoRavens/Metadata/

Install

The preprocess library, without the web service, is available via pypi: https://pypi.org/project/tworavens-preprocess/

pip install tworavens-preprocess

Manual install

Prerequisites:

  • python3
  • virtualenvwrapper
  • bz2 (sudo apt install or brew install libbz2-dev) Open a Terminal:
git clone https://github.com/TwoRavens/raven-metadata-service.git
cd ~/raven-metadata-service
mkvirtualenv preprocess
pip install -r requirements/30_preprocess_web.txt

Usage

Within Python

Preprocess a single file:

from raven_preprocess.preprocess_runner import PreprocessRunner

run_info = PreprocessRunner.load_from_file('input/path/my-data-file.csv')
if not run_info.success:
    print(run_info.err_msg)
else:
    runner = run_info.result_obj

    # show the JSON (string)
    print(runner.get_final_json(indent=4))

    # retrieve the data as an OrderedDict
    metadata = runner.get_final_dict()

    # iterate through the variables
    for vkey, vinfo in metadata['variables'].items():
        print('-' * 40)
        print(f'--- {vkey} ---')
        print('nature:', vinfo['nature'])
        print('invalidCount:', vinfo['invalidCount'])
        print('validCount:', vinfo['validCount'])
        print('uniqueCount:', vinfo['uniqueCount'])
        print('median:', vinfo['median'])
        print('etc...')

or

from raven_preprocess.preprocess import run_preprocess
run_preprocess('path-to-input-file.csv')

Preprocess a single file, write output to file:

from raven_preprocess.preprocess import run_preprocess
run_preprocess('path-to-input-file.csv', 'path-to-OUTPUT-file.json')

Using the wrapper from the manual install

Open a Terminal:

cd ~/raven-metadata-service/preprocess/raven_preprocess
workon preprocess

Preprocess a single file, write output to screen:

python preprocess.py [input file]

Example:

python preprocess.py ../../test_data/fearonLaitin.csv

Preprocess a single file, write output to file:

python preprocess.py [input file name] [output file]

Example:

python preprocess.py ../../test_data/fearonLaitin.csv /tmp/fearonLaitin.json

Both ways accept an --old-format flag which will convert the output to be the same as that available on Harvard Dataverse.

python preprocess.py ../../test_data/fearonLaitin.csv --old-format

More examples can be found in the documentation.

Run tests

The test suite is available for development purposes when installed manually.

cd ~/raven-metadata-service/preprocess/code
python -m unittest

Authors

  • Kripanshu Bhargava
  • Vito D'Orazio
  • James Honaker
  • Aaron Lebo
  • Yu Lin
  • Raman Prasad
  • Michael Shoemate

Community Guidelines

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

License

Apache 2.0

About

service to produce TwoRavens metadata

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 8