feat: add nvidia flare demo
chore: fix linting issues

feat: add namespace variable and remove environment variable

feat: change namespace as variable

fix: reviews

chore(deps): bump super-linter/super-linter from 7.0.0 to 7.1.0 (#477)

Bumps [super-linter/super-linter](https://github.com/super-linter/super-linter) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/super-linter/super-linter/releases)
- [Changelog](https://github.com/super-linter/super-linter/blob/main/CHANGELOG.md)
- [Commits](super-linter/super-linter@v7.0.0...v7.1.0)

---
updated-dependencies:
- dependency-name: super-linter/super-linter
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

fix!: use an external git repository (#475)

- Don't enable the Cloud Source Repositories API.
- Don't provision a Cloud Source Repository.
- Get the Config Sync Git repository with an input variable.
- Let users specify the credentials type to authenticate Config Sync
  with their repositories.

Fix #449

chore: move manifests to workload-pkg folder

feat: create nvflare terraform module

chore: gitignore terraform stuff (#480)

feat: allow config sync to access source repository for synchronization (#479)

fix: minor changes to comply with pr review

Co-authored-by: Laurent Grangeau <[email protected]>

feat: Removing the private DNS zone for CSR

feat: enabling HTTP load balancing for ASM

feat: add nvflare demo deployment

chore: manually build the example container image (#481)

Add instructions to build the distributed TensorFlow Federated example
in the example README, instead of having Terraform orchestrate the build
and push process.

This simplifies the root Terraform module, by removing something that's
needed only for an example, and it's also unlikely for users to have a
build pipeline orchestrated this way.

feat: add nvflare demo deployment

feat: add nvflare demo

feat: refactor to have same extension everywhere

feat: add templating of manifest resources

feat: add nvflare example

fix: lint errors

fix: lint errors

fix: lint errors
Laurent Grangeau committed Sep 10, 2024
1 parent a187cf5 commit d726931
Showing 37 changed files with 549 additions and 266 deletions.
12 changes: 10 additions & 2 deletions .devcontainer/devcontainer.json
@@ -1,12 +1,20 @@
{
"name": "Python 3",
-"image": "ghcr.io/super-linter/super-linter:v7.0.0",
+"image": "ghcr.io/super-linter/super-linter:v7.1.0",
"customizations": {
"vscode": {
"settings": {
"editor.defaultFormatter": "esbenp.prettier-vscode",
"editor.formatOnSave": true,
"editor.formatOnSaveMode": "file",
"editor.wordWrap": "off",
-"prettier.resolveGlobalModules": true
+"prettier.resolveGlobalModules": true,
+"[terraform]": {
+"editor.defaultFormatter": "hashicorp.terraform"
+},
+"[terraform-vars]": {
+"editor.defaultFormatter": "hashicorp.terraform"
+}
},
"extensions": [
"DavidAnson.vscode-markdownlint",
24 changes: 24 additions & 0 deletions .gitignore
@@ -4,3 +4,27 @@ super-linter.log

# GitHub Actions leftovers
github_conf

# Terraform gitgnore
# Ref: https://github.com/github/gitignore/blob/main/Terraform.gitignore

# Local .terraform directories
**/.terraform/*

# .tfstate files
*.tfstate
*.tfstate.*

# Crash log files
crash.log
crash.*.log

# Ignore override files as they are usually used to override resources locally and so
# are not checked in
override.tf
override.tf.json
*_override.tf
*_override.tf.json

# Ignore transient lock info files created by terraform apply
.terraform.tfstate.lock.info
15 changes: 13 additions & 2 deletions README.md
@@ -34,6 +34,7 @@ To deploy this blueprint you need:
- The `serviceusage.googleapis.com` must be enabled on the project. For more
information about enabling APIs, see
[Enabling and disabling services](https://cloud.google.com/service-usage/docs/enable-disable)
- A Git repository to store the environment configuration.

You create the infastructure using Terraform. The blueprint uses a local [Terraform backend](https://www.terraform.io/docs/language/settings/backends/configuration.html),
but we recommend to configure a [remote backend](https://www.terraform.io/language/settings/backends/configuration#backend-types)
@@ -127,10 +128,16 @@ Users and teams managing tenant apps should not have permissions to change clust
1. Initialize the following Terraform variables:

```hcl
-project_id = # Google Cloud project ID where to provision resources with the blueprint.
-acm_repository_path = # Path on the host running Terraform to store the GKE descriptors to configure the cluster
+project_id = # Google Cloud project ID where to provision resources with the blueprint.
+acm_repository_path = # Path on the host running Terraform to store environment configuration
+acm_repository_url = # URL of the repository to store environment configuration
+acm_secret_type = # Secret type to authenticate with the Config Sync Git repository
+acm_source_repository_fqdns = # FQDNs of source repository for Config Sync to allow in the Network Firewall Policy
```

For more information about setting `acm_secret_type`, see
[Grant access to Git](https://cloud.google.com/kubernetes-engine/enterprise/config-sync/docs/how-to/installing-config-sync#git-creds-secret).

If you don't provide all the necessary inputs, Terraform will exit with an
error, and will provide information about the missing inputs. For example,
you can create a Terraform variables initialization file and set inputs there.
@@ -145,6 +152,9 @@ Users and teams managing tenant apps should not have permissions to change clust

The provisioning process may take about 15 minutes to complete.

1. [Grant the Config Sync agent access to the Git repository](https://cloud.google.com/kubernetes-engine/enterprise/config-sync/docs/how-to/installing-config-sync#git-creds-secret)
where the environment configuration will be stored.

1. Wait for the GKE cluster to be reported as ready in the [GKE Kuberentes clusters dashboard](https://cloud.google.com/kubernetes-engine/docs/concepts/dashboards#kubernetes_clusters).

### Next steps
@@ -154,6 +164,7 @@ To familiarize with the environment that you provisioned, you can also deploy
the following examples in the GKE cluster:

- [Distributed TensorFlow Federated training](./examples/federated-learning/tff/distributed-fl-simulation-k8s/README.md)
- [Nvflare training](./examples/federated-learning/tff/nvflare/README.md)

Federated learning is typically split into Cross-silo and Cross-device federated learning. Cross-silo federated computation is where the participating members are organizations or companies, and the number of members is usually small (e.g., within a hundred).
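
As an illustration of the README change in this commit, the new input variables could be set in a Terraform variables file along the following lines. This is only a sketch: the file name `terraform.tfvars` and every value below are hypothetical placeholders, `ssh` is one of the secret types that Config Sync documents, and the list type assumed for `acm_source_repository_fqdns` is a guess based on the variable's description rather than on the module's definition.

```hcl
# terraform.tfvars (hypothetical values; replace each one with your own)

# Google Cloud project ID where the blueprint provisions resources.
project_id = "my-project-id"

# Path on the host running Terraform to store environment configuration.
acm_repository_path = "/home/user/acm-repository"

# URL of the Git repository that stores the environment configuration.
acm_repository_url = "git@github.com:example-org/example-config-repo.git"

# Secret type used to authenticate with the Config Sync Git repository.
acm_secret_type = "ssh"

# FQDNs of the source repository to allow in the Network Firewall Policy.
# Assumed to be a list of strings; check the variable definition in the module.
acm_source_repository_fqdns = ["github.com"]
```

Terraform loads a file named `terraform.tfvars` from the working directory automatically, so no extra command-line flag would be needed with this layout.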
