HashMark is a lightweight, serverless platform that lets you prove a digital file existed at a specific point in time without revealing its contents. When a file is uploaded to Azure Blob Storage, an Azure Function computes cryptographic hashes and stores a tamper‑evident record in Azure Table Storage. A simple HTTP API later verifies whether a given file (or hash) was previously recorded.
- Cloud Platform: Microsoft Azure
- IaC: Terraform
- Compute: Azure Functions (Python)
- Storage: Azure Blob Storage (uploads) + Azure Table Storage (hash records)
- Libraries: azure-functions, azure-data-tables, requests-toolbelt
- Architecture: Serverless, event-driven (Blob upload → Hash → Persist)
- Project Overview
- Features
- System Architecture
- Infrastructure Setup
- Source Code Structure
- Installation and Deployment
- API Documentation
- Configuration and Environment
- Troubleshooting and FAQ
- License and Legal
- Problem: Traditional notarization is slow, costly, and requires sharing document contents. Many workflows only need to prove a file existed unchanged at a point in time.
- Solution: HashMark computes and stores cryptographic file digests on upload and exposes verification APIs. Only hashes and minimal metadata are stored—never file contents—preserving privacy while enabling integrity proofs.
- Value:
  - Low‑cost, automatic, and privacy‑preserving proof‑of‑existence
  - Serverless scaling and pay‑per‑use economics
  - Simple API integration and straightforward operations
- Use cases:
  - Legal and compliance document sealing
  - Intellectual property and authorship proofs
  - Supply‑chain and audit logs
  - Medical images and research data integrity
High‑level architecture (ASCII):
+-----------------------------+           +-----------------------+
|      Client / Uploader      |           |  Verifier / API User  |
+--------------+--------------+           +-----------+-----------+
               |                                      |
               | 1) Upload file                       | 4) POST /api/verify (hash or file)
               v                                      v
   +-----------+------------+            +------------+------------+
   |   Azure Blob Storage   |            | Azure Functions (HTTP)  |
   |   Container: uploads   |            | Route: /api/verify      |
   +-----------+------------+            +------------+------------+
               | 2) Blob-created trigger              |
               v                                      |
+--------------+----------------+                     |
| Azure Functions (Blob Trigger)|                     |
| Compute SHA-256/SHA3, store   |                     |
+--------------+----------------+                     |
               | 3) Upsert record                     |
               v                                      |
  +------------+------------+                         |
  |   Azure Table Storage   |<------------------------+
  | Hash + metadata + times |
  +-------------------------+
- Automatic hashing on upload: SHA‑256 (and SHA3‑256 stored alongside) computed on every blob written to uploads/.
- Verification API: Verify by providing either a SHA‑256 hash or uploading a file via multipart form.
- Privacy‑preserving: Only digests and metadata are stored; original file bytes remain in private blob storage and are not returned.
- Detailed results: Responses include timestamps, file size, and storage row identifiers.
- Serverless scale: Azure Functions on a consumption plan scale to demand.
- JSON responses: Simple, consistent API payloads.
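The hashing step can be illustrated with Python's standard hashlib module. This is a minimal sketch, not the project's actual function body: the helper name compute_digests is hypothetical, and only the field names sha256_hash and sha3_hash are taken from the record layout described in this README.

```python
import hashlib


def compute_digests(data: bytes) -> dict:
    """Return hex-encoded SHA-256 and SHA3-256 digests of the given bytes.

    Hypothetical helper: the real trigger may read and hash the blob differently.
    """
    return {
        "sha256_hash": hashlib.sha256(data).hexdigest(),
        "sha3_hash": hashlib.sha3_256(data).hexdigest(),
    }


if __name__ == "__main__":
    digests = compute_digests(b"example file contents")
    print(digests["sha256_hash"])
    print(digests["sha3_hash"])
```

Both digests are 64 hexadecimal characters long, which is why the verification API expects a 64-hex SHA‑256 value.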
- Data flow:
  - A file is uploaded to the Blob Storage container uploads.
  - A Blob Trigger Function reads the bytes, computes SHA‑256 and SHA3‑256, and writes a record to Table Storage with partition and row keys, timestamps, and metadata.
  - A user later calls the HTTP endpoint with a hash or an uploaded file; the Function queries Table Storage for a match and returns verification details.
- Components:
  - Azure Blob Storage: private uploads container for ingestion
  - Azure Functions: Blob trigger (uploads/{name}) and HTTP route /api/verify
  - Azure Table Storage: records containing sha256_hash, sha3_hash, file size, timestamps, and keys
- Events & triggers: Blob creation triggers hashing; HTTP trigger performs verification.
- Security model:
  - Storage accounts are private; only Functions access data via connection strings
  - No file contents are exposed via the API; only hashes and metadata are returned
  - Secrets are stored in app settings
- Scalability:
  - Consumption plan autoscales Function instances
  - Table queries are partition‑aware; partitioning by date improves scan performance
  - Blob and Table Storage scale horizontally within Azure limits
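To make the date-partitioned record layout concrete, here is a rough sketch of how a Table Storage entity could be assembled. PartitionKey and RowKey are the standard Azure Table key properties; the helper name build_record and the row key layout (epoch seconds, underscore, first 8 hex characters of the SHA‑256) are assumptions inferred from the sample API response, not confirmed behaviour of the deployed function.

```python
import hashlib
from datetime import datetime, timezone


def build_record(blob_name: str, data: bytes, uploaded_at: datetime) -> dict:
    """Build a hash record keyed by upload date.

    Hypothetical helper. `uploaded_at` must be timezone-aware; it is
    normalised to UTC before any key or timestamp is derived from it.
    """
    ts = uploaded_at.astimezone(timezone.utc)
    sha256_hex = hashlib.sha256(data).hexdigest()
    return {
        # One partition per calendar day keeps date-scoped queries narrow.
        "PartitionKey": ts.strftime("%Y-%m-%d"),
        # Assumed layout: <epoch seconds>_<first 8 hex chars of SHA-256>.
        "RowKey": f"{int(ts.timestamp())}_{sha256_hex[:8]}",
        "original_filename": blob_name,
        "file_size": len(data),
        "upload_timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "sha256_hash": sha256_hex,
        "sha3_hash": hashlib.sha3_256(data).hexdigest(),
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(build_record("uploads/mydoc.pdf", b"example", now))
```

A dictionary of this shape is what azure-data-tables accepts as an entity, so it could be passed to an upsert call; that wiring is left out here because it needs a live storage account.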
Terraform modules provision:
- Resource Group, Storage Account, uploads Blob container, and Table Storage (<name_prefix>Records)
- Linux Function App on a consumption plan (Python 3.12)
- Optional Service Principal with Contributor role for CI/CD
Resource naming and inputs (from terraform/variables.tf):
- name_prefix (string): Prefix for all resource names
- region (string): Azure region
- storage_account_name (string): Globally unique storage account name for main data
- subscription_id (string): Target subscription
Primary outputs include:
- Storage connection string
- Table name
- Resource group information
- Function‑scoped storage account
- Optional GitHub Actions SP credentials (sensitive)
Typical Terraform workflow using terraform.tfvars:
# 0) Prepare variables file (run in repo root)
touch terraform/terraform.tfvars
# Edit terraform/terraform.tfvars with your values
# 1) Authenticate
az login
az account set --subscription <SUBSCRIPTION_ID>
# 2) Initialize and review plan (from ./terraform)
cd terraform
terraform init
terraform plan # vars are loaded from terraform.tfvars automatically
# 3) Apply
terraform apply -auto-approve
Optional: keep multiple env files and use -var-file:
terraform plan -var-file="envs/dev.tfvars"
terraform apply -var-file="envs/dev.tfvars"
Cost and sizing:
- Consumption Functions: pay per execution and GB‑s
- Storage: minimal costs for blob + table operations; budget for egress if applicable
Permissions and CLI:
- Requires Owner/Contributor on the target subscription or delegated RG
- AzureAD permissions are needed if creating a Service Principal
hashmark/
├── terraform/                # Terraform configurations (RG, Storage, Function, SP)
│   ├── main.tf
│   ├── variables.tf
│   ├── provider.tf
│   └── modules/
├── hashmark-func/            # Azure Functions source
│   ├── function_app.py       # blob trigger + /api/verify
│   ├── host.json             # Functions host configuration
│   └── requirements.txt      # Python deps
├── .github/                  # (CI/CD) workflows and templates
└── README.md                 # This file
Prerequisites:
- Python 3.12 (matches Function App setting)
- Azure CLI ≥ 2.50, Terraform ≥ 1.6
- Azure subscription with permissions
- Node.js (optional) for Azure Functions Core Tools UI
Deploy infrastructure with Terraform: see Infrastructure Setup.
Deploy the Function App:
- GitHub Actions (recommended): builds and publishes on push.
- Run terraform output -json gh_actions_sp_credentials, reformat the output as valid JSON, and save it as a new GitHub environment secret named AZURE_CREDENTIALS.
Environment variables (App Settings):
- FUNCTIONS_WORKER_RUNTIME=python
- AzureWebJobsStorage=<connection string to main storage>
- TABLE_NAME=<name_prefix>Records
Verification checklist:
- Blob container uploads exists and is private
- Table <name_prefix>Records exists
- HTTP POST /api/verify returns 400 for an invalid content type and 200 for valid requests
Base URL: https://<function-app-name>.azurewebsites.net
- POST /api/verify
- Content types: application/json (hash‑only) or multipart/form-data (file upload)
Request (hash‑only):
POST /api/verify HTTP/1.1
Content-Type: application/json

{"hash":"<64-hex-sha256>"}
Response (200):
{
  "success": true,
  "verification": {
    "exists": true,
    "file_details": {
      "original_filename": "uploads/mydoc.pdf",
      "file_size": 12345,
      "upload_timestamp": "2024-01-01T12:00:00Z",
      "sha256_hash": "...",
      "verification_count": 3,
      "last_verified": "2024-01-02T09:30:00Z"
    },
    "proof_details": {
      "partition_key": "2024-01-01",
      "row_key": "1704100800_deadbeef",
      "storage_status": "verified"
    }
  },
  "request_info": {
    "provided_hash": "...",
    "verification_timestamp": "2024-01-02T09:30:00Z"
  }
}
Errors:
- 400: Unsupported content type, or missing/invalid hash
- 500: Internal server error
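The 400 conditions listed above can be sketched as a small validation step that runs before any table lookup. This is an illustration under stated assumptions: the function name check_request, the mode field, and the error payload shape are hypothetical; only the accepted content types and the 64-hex SHA‑256 requirement come from this README.

```python
import json
import re

# A SHA-256 digest is exactly 64 hexadecimal characters.
SHA256_HEX = re.compile(r"[0-9a-fA-F]{64}")


def check_request(content_type: str, body: bytes):
    """Return (status_code, payload) for an incoming /api/verify request.

    Hypothetical helper: the real handler may structure errors differently.
    """
    # Ignore parameters such as "; charset=utf-8" or "; boundary=...".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "multipart/form-data":
        # File uploads are hashed server-side; form parsing is out of scope here.
        return 200, {"mode": "file"}
    if media_type != "application/json":
        return 400, {"success": False, "error": "Unsupported content type"}
    try:
        value = json.loads(body).get("hash")
    except (ValueError, AttributeError):
        # Covers malformed JSON, bad encoding, and non-object JSON bodies.
        return 400, {"success": False, "error": "Invalid JSON body"}
    if not isinstance(value, str) or not SHA256_HEX.fullmatch(value):
        return 400, {"success": False, "error": "Missing or invalid hash"}
    # Normalise case so lookups match however the digest was stored.
    return 200, {"mode": "hash", "hash": value.lower()}
```

Lower-casing the digest is a design choice in this sketch; it only helps if stored digests are lower-case too, which is what hashlib's hexdigest produces.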
Curl examples:
# Hash-only
curl -X POST "https://<function-app-name>.azurewebsites.net/api/verify" \
  -H "Content-Type: application/json" \
  -d '{"hash":"<64-hex-sha256>"}'
# Multipart file upload
curl -X POST "https://<function-app-name>.azurewebsites.net/api/verify" \
  -F "file=@filename"
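For clients that prefer not to send file bytes at all, the hash can be computed locally and submitted as a hash-only request. The sketch below uses only the Python standard library; the helper names sha256_of_file and build_verify_request are hypothetical, and the base URL is whatever your Function App exposes.

```python
import hashlib
import json
import urllib.request


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_verify_request(base_url: str, sha256_hex: str) -> urllib.request.Request:
    """Prepare (but do not send) a hash-only POST to /api/verify."""
    body = json.dumps({"hash": sha256_hex}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/verify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send the request against a deployed app:
#   request = build_verify_request("https://<function-app-name>.azurewebsites.net",
#                                  sha256_of_file("mydoc.pdf"))
#   with urllib.request.urlopen(request) as response:
#       result = json.load(response)
```

Building the request separately from sending it keeps the hashing and payload logic testable without network access.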
Environment variables / App Settings used by the Function:
- FUNCTIONS_WORKER_RUNTIME: set to python
- AzureWebJobsStorage: connection string to the primary Storage Account used by both blob and table access
- TABLE_NAME: name of the Azure Table to persist/query (e.g., <name_prefix>Records)
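One way to fail fast on a misconfigured app is to check these settings at startup. The following is a minimal sketch: the helper load_settings and the REQUIRED_SETTINGS tuple are hypothetical and not part of the published function code.

```python
import os
from typing import Mapping, Optional

# Setting names as listed in this README.
REQUIRED_SETTINGS = ("FUNCTIONS_WORKER_RUNTIME", "AzureWebJobsStorage", "TABLE_NAME")


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the required settings, raising if any is missing or empty.

    Reads from os.environ by default; pass a mapping to check other sources.
    """
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_SETTINGS if not source.get(name)]
    if missing:
        raise RuntimeError("Missing app settings: " + ", ".join(missing))
    return {name: source[name] for name in REQUIRED_SETTINGS}
```

Accepting an explicit mapping makes the check easy to exercise in tests without touching the real environment.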
Templates to include:
hashmark-func/local.settings.json.template (do not commit secrets):
{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "AzureWebJobsStorage": "<connection-string>",
    "TABLE_NAME": "<prefix>Records"
  }
}
- Blob trigger not firing:
  - Ensure uploads go to the container uploads (exact name)
  - Verify AzureWebJobsStorage and that the Function app has access
- 400 "Unsupported content type":
  - Use application/json (hash‑only) or multipart/form-data (file upload)
- Hash not found but file was uploaded:
  - Allow time for the blob trigger to run (it can lag behind the upload, particularly after the app has been idle on a consumption plan); confirm trigger success in logs; check the Table name
- Permissions errors with Terraform SP:
  - Confirm AzureAD provider permissions and subscription role assignments
- Performance tips:
  - Prefer hash‑only verification for very large files; enable HTTP keep‑alive on clients
- License: MIT (see LICENSE)
- Dependencies: azure-functions, azure-data-tables, requests-toolbelt (see hashmark-func/requirements.txt)
- Privacy: Only hashes and metadata are stored; operators should publish a privacy notice
- Terms: If hosting for others, publish acceptable use and rate limits