data-engineer-yogesh/ongoing-clinical-trials-analytics

Ongoing-clinical-trials-analytics

An end-to-end data pipeline for ongoing clinical trials, built on Databricks. It ingests data from the ClinicalTrials.gov API into Delta Lake (Bronze → Silver → Gold) and prepares analytics-ready datasets for sponsors, conditions, and locations.
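To make the Bronze → Silver → Gold flow concrete, here is a minimal sketch in plain Python. It is not the repository's code: in the actual project these layers would be Delta tables processed on Databricks, and plain lists and dicts stand in for them here. The sample records and NCT IDs are invented, the function names (`get_path`, `to_silver`, `to_gold`) are hypothetical, and the nested field names are assumed to follow the ClinicalTrials.gov study JSON layout.

```python
from collections import Counter

# Bronze: raw records kept as received (invented sample data).
# Includes a duplicate, a record without a sponsor, and one without an ID.
BRONZE = [
    {"protocolSection": {
        "identificationModule": {"nctId": "NCT00000001"},
        "statusModule": {"overallStatus": "RECRUITING"},
        "sponsorCollaboratorsModule": {"leadSponsor": {"name": " Acme Pharma "}},
        "conditionsModule": {"conditions": ["Asthma", "COPD"]}}},
    {"protocolSection": {
        "identificationModule": {"nctId": "NCT00000001"},
        "statusModule": {"overallStatus": "RECRUITING"},
        "sponsorCollaboratorsModule": {"leadSponsor": {"name": " Acme Pharma "}},
        "conditionsModule": {"conditions": ["Asthma", "COPD"]}}},
    {"protocolSection": {
        "identificationModule": {"nctId": "NCT00000002"},
        "statusModule": {"overallStatus": "RECRUITING"},
        "sponsorCollaboratorsModule": {"leadSponsor": {"name": "Acme Pharma"}},
        "conditionsModule": {"conditions": ["Asthma"]}}},
    {"protocolSection": {
        "identificationModule": {"nctId": "NCT00000003"},
        "statusModule": {"overallStatus": "RECRUITING"},
        "conditionsModule": {"conditions": ["Diabetes"]}}},
    {"protocolSection": {}},
]


def get_path(record, *keys, default=None):
    """Walk nested dicts, returning `default` if any key is missing."""
    current = record
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def to_silver(bronze_records):
    """Silver: flatten, trim text, and keep one row per trial ID."""
    rows = {}
    for raw in bronze_records:
        nct_id = get_path(raw, "protocolSection", "identificationModule", "nctId")
        if not nct_id:
            continue  # no usable key, so the record cannot be deduplicated
        sponsor = get_path(raw, "protocolSection", "sponsorCollaboratorsModule",
                           "leadSponsor", "name") or ""
        rows[nct_id] = {
            "nct_id": nct_id,
            "status": get_path(raw, "protocolSection", "statusModule",
                               "overallStatus"),
            "sponsor": sponsor.strip() or "UNKNOWN",
            "conditions": get_path(raw, "protocolSection", "conditionsModule",
                                   "conditions") or [],
        }
    return list(rows.values())


def to_gold(silver_rows):
    """Gold: analytics-ready counts per sponsor and per condition."""
    sponsors = Counter(row["sponsor"] for row in silver_rows)
    conditions = Counter(c for row in silver_rows for c in row["conditions"])
    return {
        "trials_per_sponsor": dict(sponsors),
        "trials_per_condition": dict(conditions),
    }


silver = to_silver(BRONZE)
gold = to_gold(silver)
print(gold)
```

The design point is the separation of concerns: Bronze keeps the raw payload untouched so it can be reprocessed, Silver applies cleaning and deduplication once, and Gold holds only the aggregates that dashboards read.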

This project is ideal for anyone who is:

  • Learning Databricks and Delta Lake hands-on
  • Exploring real-world data pipelines and analytics projects
  • Interested in mastering data governance, orchestration, and CI/CD in Databricks

Learning Objectives

  • Unity Catalog – managing data access and security
  • Databricks Connections – integrating REST APIs and external sources
  • Service Principals & User Groups – enterprise-level user management
  • Delta Live Tables (DLT) – building reliable ETL pipelines
  • Databricks Jobs, Pipelines & Dashboards – orchestration and reporting
  • CI/CD Deployment – using Databricks Asset Bundles (DAB)
  • GitHub Actions – automating deployments for Databricks

License

This project is intended for learning and demonstration purposes.
