Hydra

Multi-headed connector for rapid data extraction from new environments.

Description

Hydra is a rapid extraction tool for use in novel data infrastructure environments. Its purpose is to provide a wide selection of simple data source handlers which can be manipulated and combined to create multi-headed one-shot data extractions for processing in a more powerful cloud environment.
Given only environment credentials, Hydra will assist with:
+ Enumerating all data elements for target servers
+ Providing summary statistics for later reconciliation
+ Investigating specific elements
+ Marking elements for extraction
+ Efficient bulk transfer to a cloud or local storage solution
+ Post-transfer reconciliation

Hydra's purpose is NOT to be a set-and-forget data pipeline as this is better suited to existing pipeline tools such as Spotify's Luigi.
Instead Hydra should be used to build a solid POC from which a better optimised and tailored pipeline can be built. It balances between
full-featured libraries for single data sources (i.e. TableauServerClient, AWS CLI), and simple data extractors which limit access 
to the underlying datasource (i.e. Pandas' IO Tools). This minimises the upfront investment required to develop an inital analytics 
solution while also avoiding the need for an expensive re-build as the project matures. Simply fork the Hydra repository and expand its 
pre-built connectors to suit your tailored pipeline. Hydra's handlers are simple to modify by design and will make use of the most 
extensible data connector available for any given target data source. 

There are two modules implemented by Hydra:
    1) Inbound - for initial connection and investigation of data sources  
    2) Outbound - for efficient bulk transfer of selected data elements  
These are designed for compatibility with the following technologies:  
    + Flat files (currently only CSV)  
    + IBM Notes (formerly Lotus Notes)  
    + SQL Alchemy - MSSQL & PostgreSQL  
    + Tableau Server Repository  
    + SAP (when integrated with 'SAPsucker')  
    + AWS - S3 & RDS

Setup

In progress.

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
connectors		connectors
server_scripts		server_scripts
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
api_extractor_config.py		api_extractor_config.py
api_extractor_utils.py		api_extractor_utils.py
example_bulk_insert.py		example_bulk_insert.py
exec_procedure.py		exec_procedure.py
hydra.png		hydra.png
main.py		main.py
results_generator.py		results_generator.py
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Hydra

Description

Setup

About

Uh oh!

Releases

Packages

Languages

License

SigmaAdvancedAnalytics/hydra

Folders and files

Latest commit

History

Repository files navigation

Hydra

Description

Setup

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages