feat: add nvidia flare demo
chore: fix linting issues

feat: add namespace variable and remove environment variable

feat: change namespace as variable

fix: reviews

chore(deps): bump super-linter/super-linter from 7.0.0 to 7.1.0 (#477)

Bumps [super-linter/super-linter](https://github.com/super-linter/super-linter) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/super-linter/super-linter/releases)
- [Changelog](https://github.com/super-linter/super-linter/blob/main/CHANGELOG.md)
- [Commits](super-linter/super-linter@v7.0.0...v7.1.0)

---
updated-dependencies:
- dependency-name: super-linter/super-linter
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

fix!: use an external git repository (#475)

- Don't enable the Cloud Source Repositories API.
- Don't provision a Cloud Source Repository.
- Get the Config Sync Git repository with an input variable.
- Let users specify the credentials type to authenticate Config Sync
  with their repositories.

Fix #449

chore: move manifests to workload-pkg folder

feat: create nvflare terraform module

chore: gitignore terraform stuff (#480)

feat: allow config sync to access source repository for synchronization (#479)

fix: minor changes to comply with pr review

Co-authored-by: Laurent Grangeau <[email protected]>

feat: Removing the private DNS zone for CSR

feat: enabling HTTP load balancing for ASM

feat: add nvflare demo deployment

chore: manually build the example container image (#481)

Add instructions to build the distributed TensorFlow Federated example
in the example README, instead of having Terraform orchestrate the build
and push process.

This simplifies the root Terraform module, by removing something that's
needed only for an example, and it's also unlikely for users to have a
build pipeline orchestrated this way.

feat: add nvflare demo deployment

feat: add nvflare demo

feat: refactor to have same extension everywhere

feat: add templating of manifest resources

feat: add nvflare example
Laurent Grangeau committed Sep 10, 2024
1 parent 1e90155 commit 4aac500
Showing 43 changed files with 1,332 additions and 156 deletions.
12 changes: 10 additions & 2 deletions .devcontainer/devcontainer.json
@@ -1,12 +1,20 @@
{
"name": "Python 3",
-"image": "ghcr.io/super-linter/super-linter:v7.0.0",
+"image": "ghcr.io/super-linter/super-linter:v7.1.0",
"customizations": {
"vscode": {
"settings": {
"editor.defaultFormatter": "esbenp.prettier-vscode",
"editor.formatOnSave": true,
"editor.formatOnSaveMode": "file",
"editor.wordWrap": "off",
-"prettier.resolveGlobalModules": true
+"prettier.resolveGlobalModules": true,
+"[terraform]": {
+"editor.defaultFormatter": "hashicorp.terraform"
+},
+"[terraform-vars]": {
+"editor.defaultFormatter": "hashicorp.terraform"
+}
},
"extensions": [
"DavidAnson.vscode-markdownlint",
18 changes: 18 additions & 0 deletions .github/dependabot.yaml
@@ -44,6 +44,24 @@ updates:
package-ecosystem: "pip"
schedule:
interval: "daily"
- directory: "/examples/federated-learning/tff/nvflare/container-image"
commit-message:
prefix: "chore(deps)"
package-ecosystem: "docker"
schedule:
interval: "daily"
- directory: "/examples/federated-learning/tff/nvflare/container-image"
commit-message:
prefix: "chore(deps)"
package-ecosystem: "pip"
schedule:
interval: "daily"
- directory: "/examples/federated-learning/tff/nvflare"
commit-message:
prefix: "chore(deps)"
package-ecosystem: "pip"
schedule:
interval: "daily"
- directory: "/terraform"
commit-message:
prefix: "chore(deps)"
10 changes: 10 additions & 0 deletions .github/workflows/build-container-images.yaml
@@ -30,6 +30,13 @@ jobs:
contents: read
runs-on: ubuntu-latest
steps:
- name: Free Disk space
shell: bash
run: |
sudo rm -rf /usr/local/lib/android || true
sudo rm -rf /usr/share/dotnet || true
sudo rm -rf /opt/ghc || true
sudo rm -rf /usr/local/.ghcup || true
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- uses: actions/checkout@v4
@@ -70,3 +77,6 @@ jobs:
context-directory: examples/federated-learning/tff/distributed-fl-simulation-k8s/container-image
# Don't run an instance of this because it needs instrumentation
run: false
- name: nvflare
context-directory: examples/federated-learning/tff/nvflare/container-image
run: true
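
The `Free Disk space` step added to this workflow deletes preinstalled toolchains from the runner to make room for the image builds. As a rough sketch of the same idea, the hypothetical helper `free_path` below (not part of this commit) reports how much space a path held before deleting it, and skips paths that are not present. It omits `sudo`, which the workflow uses because the real targets are system directories.

```shell
# Hypothetical helper, not part of the commit: delete a path and report
# roughly how much space it occupied. Missing paths are skipped quietly.
free_path() {
  local path="$1"
  local size_kb
  if [ -e "${path}" ]; then
    # du -sk prints "<size in KiB><TAB><path>"; keep only the size.
    size_kb="$(du -sk "${path}" | cut -f1)"
    rm -rf -- "${path}"
    echo "removed ${path} (${size_kb} KiB)"
  else
    echo "skipped ${path} (not present)"
  fi
}
```

Comparing the output of `df -h /` before and after such a step shows whether the cleanup frees enough space on the runner.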
4 changes: 2 additions & 2 deletions .github/workflows/pipeline.yaml
@@ -51,7 +51,7 @@ jobs:
- name: Load super-linter configuration
run: cat config/lint/super-linter.env >> "$GITHUB_ENV"
- name: Super-Linter
-uses: super-linter/super-linter@v7.0.0
+uses: super-linter/super-linter@v7.1.0
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

@@ -77,7 +77,7 @@ jobs:
- name: Load super-linter fix mode configuration
run: cat config/lint/super-linter-fix-mode.env >> "$GITHUB_ENV"
- name: Super-Linter
-uses: super-linter/super-linter@v7.0.0
+uses: super-linter/super-linter@v7.1.0
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Commit and push linting fixes
24 changes: 24 additions & 0 deletions .gitignore
@@ -4,3 +4,27 @@ super-linter.log

# GitHub Actions leftovers
github_conf

# Terraform gitignore
# Ref: https://github.com/github/gitignore/blob/main/Terraform.gitignore

# Local .terraform directories
**/.terraform/*

# .tfstate files
*.tfstate
*.tfstate.*

# Crash log files
crash.log
crash.*.log

# Ignore override files as they are usually used to override resources locally and so
# are not checked in
override.tf
override.tf.json
*_override.tf
*_override.tf.json

# Ignore transient lock info files created by terraform apply
.terraform.tfstate.lock.info
15 changes: 13 additions & 2 deletions README.md
@@ -34,6 +34,7 @@ To deploy this blueprint you need:
- The `serviceusage.googleapis.com` API must be enabled on the project. For more
information about enabling APIs, see
[Enabling and disabling services](https://cloud.google.com/service-usage/docs/enable-disable)
- A Git repository to store the environment configuration.

You create the infrastructure using Terraform. The blueprint uses a local [Terraform backend](https://www.terraform.io/docs/language/settings/backends/configuration.html),
but we recommend configuring a [remote backend](https://www.terraform.io/language/settings/backends/configuration#backend-types)
@@ -127,10 +128,16 @@ Users and teams managing tenant apps should not have permissions to change clust
1. Initialize the following Terraform variables:

```hcl
-project_id = # Google Cloud project ID where to provision resources with the blueprint.
-acm_repository_path = # Path on the host running Terraform to store the GKE descriptors to configure the cluster
+project_id = # Google Cloud project ID where to provision resources with the blueprint.
+acm_repository_path = # Path on the host running Terraform to store environment configuration
+acm_repository_url = # URL of the repository to store environment configuration
+acm_secret_type = # Secret type to authenticate with the Config Sync Git repository
+acm_source_repository_fqdns = # FQDNs of source repository for Config Sync to allow in the Network Firewall Policy
```

For more information about setting `acm_secret_type`, see
[Grant access to Git](https://cloud.google.com/kubernetes-engine/enterprise/config-sync/docs/how-to/installing-config-sync#git-creds-secret).

If you don't provide all the necessary inputs, Terraform will exit with an
error, and will provide information about the missing inputs. For example,
you can create a Terraform variables initialization file and set inputs there.
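
As a minimal sketch of the variables initialization file mentioned above, the following script writes a `terraform.tfvars` file, which Terraform loads automatically from the working directory. Every value is a hypothetical placeholder. The `"token"` secret type is one of the types that Config Sync documents, and treating `acm_source_repository_fqdns` as a list of strings is an assumption based on the variable name, so check the variable declarations in the module before using this.

```shell
# Write placeholder values for the five inputs listed in the README.
# All values are hypothetical; replace them before running Terraform.
tfvars_file="terraform.tfvars"

cat > "${tfvars_file}" <<'EOF'
project_id                  = "example-project-id"
acm_repository_path         = "/tmp/example-acm-repository"
acm_repository_url          = "https://github.com/example-org/example-config-repo"
acm_secret_type             = "token"
# Assumed to be a list of strings; check the declared variable type.
acm_source_repository_fqdns = ["github.com"]
EOF

echo "Wrote ${tfvars_file}"
```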
@@ -145,6 +152,9 @@ Users and teams managing tenant apps should not have permissions to change clust

The provisioning process may take about 15 minutes to complete.

1. [Grant the Config Sync agent access to the Git repository](https://cloud.google.com/kubernetes-engine/enterprise/config-sync/docs/how-to/installing-config-sync#git-creds-secret)
where the environment configuration will be stored.

1. Wait for the GKE cluster to be reported as ready in the [GKE Kubernetes clusters dashboard](https://cloud.google.com/kubernetes-engine/docs/concepts/dashboards#kubernetes_clusters).
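
For the step above that grants the Config Sync agent access to the Git repository, the following is a sketch of the token-based case, following the form shown in the linked Config Sync documentation. The helper name `create_git_creds_secret` is hypothetical. The Secret name `git-creds` and the namespace `config-management-system` are the ones used in that documentation, and it is an assumption that this blueprint expects the same names. The function is only defined here and runs nothing until you call it with `kubectl` configured for the cluster.

```shell
# Hypothetical helper: create the Secret that Config Sync reads when the
# secret type is "token". Assumes kubectl already targets the GKE cluster.
create_git_creds_secret() {
  local username="$1"
  local token="$2"
  kubectl create secret generic git-creds \
    --namespace="config-management-system" \
    --from-literal=username="${username}" \
    --from-literal=token="${token}"
}

# Usage (not executed here):
#   create_git_creds_secret "example-user" "${GIT_TOKEN}"
```

Reading the token from an environment variable, as in the usage comment, keeps it out of the script itself. Other secret types, such as `ssh`, need different keys in the Secret.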

### Next steps
@@ -154,6 +164,7 @@ To familiarize yourself with the environment that you provisioned, you can also deploy
the following examples in the GKE cluster:

- [Distributed TensorFlow Federated training](./examples/federated-learning/tff/distributed-fl-simulation-k8s/README.md)
- [NVIDIA FLARE training](./examples/federated-learning/tff/nvflare/README.md)

Federated learning is typically split into cross-silo and cross-device federated learning. In cross-silo federated computation, the participating members are organizations or companies, and the number of members is usually small (for example, no more than a hundred).

4 changes: 4 additions & 0 deletions assets/nvflare.svg