Commit 2fcc9ce

Move kettle from k8s-gubernator to kubernetes-public

Signed-off-by: Davanum Srinivas <[email protected]>

1 parent dcdd96d · commit 2fcc9ce

18 files changed: +83 −125 lines
Lines changed: 27 additions & 0 deletions

```diff
@@ -0,0 +1,27 @@
+periodics:
+- name: metrics-kettle
+  cluster: k8s-infra-prow-build-trusted
+  interval: 1h
+  decorate: true
+  extra_refs:
+  - org: kubernetes
+    repo: test-infra
+    base_ref: master
+  spec:
+    serviceAccountName: k8s-triage
+    containers:
+    - image: gcr.io/k8s-staging-test-infra/bigquery:v20240205-69ac5748ba
+      args:
+      - ./kettle/monitor.py
+      - --stale=6
+      - --table
+      - k8s_infra_kettle:build.all
+      - k8s_infra_kettle:build.week
+      - k8s_infra_kettle:build.day
+  annotations:
+    testgrid-num-failures-to-alert: '6'
+    testgrid-alert-stale-results-hours: '12'
+    testgrid-dashboards: sig-testing-misc
+    testgrid-alert-email: [email protected], [email protected]
+    testgrid-broken-column-threshold: '0.5'
+    description: Monitors Kettle's BigQuery database freshness.
```

config/jobs/kubernetes/test-infra/test-infra-periodics.yaml

Lines changed: 0 additions & 26 deletions

```diff
@@ -35,32 +35,6 @@ periodics:
     testgrid-broken-column-threshold: '0.5'
     description: Runs `make test verify` on the test-infra repo every hour

-- name: metrics-kettle
-  interval: 1h
-  decorate: true
-  extra_refs:
-  - org: kubernetes
-    repo: test-infra
-    base_ref: master
-  spec:
-    serviceAccountName: triage
-    containers:
-    - image: gcr.io/k8s-staging-test-infra/bigquery:v20240205-69ac5748ba
-      args:
-      - ./kettle/monitor.py
-      - --stale=6
-      - --table
-      - k8s-gubernator:build.all
-      - k8s-gubernator:build.week
-      - k8s-gubernator:build.day
-  annotations:
-    testgrid-num-failures-to-alert: '6'
-    testgrid-alert-stale-results-hours: '12'
-    testgrid-dashboards: sig-testing-misc
-    testgrid-alert-email: [email protected], [email protected]
-    testgrid-broken-column-threshold: '0.5'
-    description: Monitors Kettle's BigQuery database freshness.
-
 - name: job-migration-todo-report
   decorate: true
   interval: 24h
```

docs/architecture.dot

Lines changed: 1 addition & 1 deletion

```diff
@@ -23,7 +23,7 @@ digraph G {
   Gubernator [href="https://gubernator.k8s.io"]
   "Testgrid (closed)" [href="https://testgrid.k8s.io"]
   Deck [href="https://prow.k8s.io"]
-  BigQuery [href="https://bigquery.cloud.google.com/table/k8s-gubernator:build.week"]
+  BigQuery [href="https://bigquery.cloud.google.com/table/k8s_infra_kettle:build.week"]

   subgraph cluster_Prow {
     label="Prow"
```

docs/architecture.svg

Lines changed: 1 addition & 1 deletion (image; diff not rendered)

kettle/Makefile

Lines changed: 6 additions & 6 deletions

```diff
@@ -20,18 +20,18 @@ IMG = gcr.io/k8s-testimages/kettle
 TAG := $(shell date +v%Y%m%d)-$(shell git describe --tags --always --dirty)

 # These are the usual GKE variables.
-PROJECT ?= k8s-gubernator
-ZONE ?= us-west1-b
-CLUSTER ?= g8r
+PROJECT ?= kubernetes-public
+ZONE ?= us-central1
+CLUSTER ?= aaa

 get-cluster-credentials:
-	kubectl config use-context gke_k8s-gubernator_us-west1-b_g8r || gcloud container clusters get-credentials "$(CLUSTER)" --project="$(PROJECT)" --zone="$(ZONE)"
+	kubectl config use-context gke_kubernetes-public_us-central1_aaa || gcloud container clusters get-credentials "$(CLUSTER)" --project="$(PROJECT)" --zone="$(ZONE)"

 push-prod:
-	../../../hack/make-rules/go-run/arbitrary.sh run ./images/builder --project=k8s-testimages --scratch-bucket=gs://k8s-testimages-scratch --build-dir=. kettle/
+	../hack/make-rules/go-run/arbitrary.sh run ./images/builder --project=k8s-staging-infra-tools --scratch-bucket=gs://k8s-testimages-scratch --build-dir=. kettle/

 push:
-	../../../hack/make-rules/go-run/arbitrary.sh run ./images/builder --project=k8s-testimages --allow-dirty --build-dir=. kettle/
+	../hack/make-rules/go-run/arbitrary.sh run ./images/builder --project=k8s-staging-infra-tools --allow-dirty --build-dir=. kettle/

 deploy: get-cluster-credentials
 	sed "s/:latest/:$(TAG)/g" deployment.yaml | kubectl apply -f - --record
```
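The context hardcoded in `get-cluster-credentials` follows the naming scheme gcloud uses when it writes a GKE kubeconfig entry, `gke_<project>_<zone>_<cluster>`. A quick sanity check that the new Makefile defaults compose into the context the recipe expects:

```shell
# Compose the kubectl context name from the updated Makefile defaults.
# gcloud names GKE contexts gke_<PROJECT>_<ZONE>_<CLUSTER>.
PROJECT=kubernetes-public
ZONE=us-central1
CLUSTER=aaa
CONTEXT="gke_${PROJECT}_${ZONE}_${CLUSTER}"
echo "$CONTEXT"
```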

kettle/OVERVIEW.md

Lines changed: 4 additions & 4 deletions

```diff
@@ -29,9 +29,9 @@ Flags:
 # Create JSON Results and Upload
 This stage gets run for each [BigQuery] table that Kettle is tasked with uploading data to. Typically looking like either:
 - Fixed Time: `pypy3 make_json.py --days <num> | pv | gzip > build_<table>.json.gz`
-  and `bq load --source_format=NEWLINE_DELIMITED_JSON --max_bad_records={MAX_BAD_RECORDS} k8s-gubernator:build.<table> build_<table>.json.gz schema.json`
+  and `bq load --source_format=NEWLINE_DELIMITED_JSON --max_bad_records={MAX_BAD_RECORDS} k8s_infra_kettle:build.<table> build_<table>.json.gz schema.json`
 - All Results: `pypy3 make_json.py | pv | gzip > build_<table>.json.gz`
-  and `bq load --source_format=NEWLINE_DELIMITED_JSON --max_bad_records={MAX_BAD_RECORDS} k8s-gubernator:build.<table> build_<table>.json.gz schema.json`
+  and `bq load --source_format=NEWLINE_DELIMITED_JSON --max_bad_records={MAX_BAD_RECORDS} k8s_infra_kettle:build.<table> build_<table>.json.gz schema.json`

 ### Make Json
 `make_json.py` prepares an incremental table to track builds it has emitted to BQ. This table is named `build_emitted_<days>` (if days flag passed) or `build_emitted` otherwise. *This is important because if you change the days AND NOT the table being uploaded to, you will get duplicate results. If the `--reset_emitted` flag is passed, it will refresh the incremental table for fresh data. It then walks all of the builds to fetch within `<days>` or since epoch if unset, and dumps each as a json object to a build `tar.gz`.
@@ -49,6 +49,6 @@ After all historical data has been uploaded, Kettle enters a Streaming phase. It
 - inserts it into the tables (from flag)
 - adds the data to the respective incremental tables

-[BigQuery]: https://console.cloud.google.com/bigquery?utm_source=bqui&utm_medium=link&utm_campaign=classic&project=k8s-gubernator
+[BigQuery]: https://console.cloud.google.com/bigquery?utm_source=bqui&utm_medium=link&utm_campaign=classic&project=k8s-infra-kettle
 [Buckets]: https://github.com/kubernetes/test-infra/blob/master/kettle/buckets.yaml
-[Schema]: https://github.com/kubernetes/test-infra/blob/master/kettle/schema.json
+[Schema]: https://github.com/kubernetes/test-infra/blob/master/kettle/schema.json
```
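The OVERVIEW text in this diff notes that `make_json.py` names its incremental tracking table from the `--days` flag, which is why changing the days without changing the target table yields duplicates. A minimal sketch of that naming rule (the helper name is hypothetical, not code from `make_json.py`):

```python
def emitted_table(days=None):
    """Name of the incremental table used to track already-emitted builds.

    Hypothetical helper mirroring the behavior described in OVERVIEW.md:
    'build_emitted_<days>' when --days is passed, 'build_emitted' otherwise.
    """
    return f'build_emitted_{days}' if days else 'build_emitted'

print(emitted_table())    # build_emitted
print(emitted_table(7))   # build_emitted_7
```

This makes the duplicate-results warning concrete: two runs with different `--days` values track progress in different incremental tables, so both will happily re-upload the same builds to one BigQuery table.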

kettle/README.md

Lines changed: 12 additions & 12 deletions

````diff
@@ -4,12 +4,12 @@ This collects test results scattered across a variety of GCS buckets,
 stores them in a local SQLite database, and outputs newline-delimited
 JSON files for import into BigQuery. *See [overview](./OVERVIEW.md) for more details.*

-Results are stored in the [k8s-gubernator:build BigQuery dataset][Big Query Tables],
+Results are stored in the [k8s_infra_kettle:build BigQuery dataset][Big Query Tables],
 which is publicly accessible.

 # Deploying

-Kettle runs as a pod in the `k8s-gubernator/g8r` cluster. To drop into it's context, run `<root>$ make -C kettle get-cluster-credentials`
+Kettle runs as a pod in the `kubernetes-public/aaa` cluster. To drop into it's context, run `<root>$ make -C kettle get-cluster-credentials`

 If you change:

@@ -18,7 +18,7 @@ If you change:
 - any code: **Run from root** deploy with `make -C kettle push update`, revert with `make -C kettle rollback` if it fails
   - `push` builds the continer image and pushes it to the image registry
   - `update` sets the image of the existing kettle *Pod* which triggers a restart cycle
-    - this will build the image to [Google Container Registry](https://console.cloud.google.com/gcr/images/k8s-gubernator/GLOBAL/kettle?project=k8s-gubernator&organizationId=433637338589&gcrImageListsize=30)
+    - this will build the image to [Google Container Registry](https://console.cloud.google.com/gcr/images/kubernetes-public/GLOBAL/kettle)
 - See [Makefile](Makefile) for details

 #### Note:
@@ -63,23 +63,23 @@ You can watch the pod startup and collect data from various GCS buckets by looki
 ```sh
 kubectl logs -f $(kubectl get pod -l app=kettle -oname)
 ```
-or access [log history](https://console.cloud.google.com/logs/query?project=k8s-gubernator) with the Query: `resource.labels.container_name="kettle"`.
+or access [log history](https://console.cloud.google.com/logs/query?project=kubernetes-public) with the Query: `resource.labels.container_name="kettle"`.

 It might take a couple of hours to be fully functional and start updating BigQuery. You can always go back to the [Gubernator BigQuery page][Big Query All] and check to see if data collection has resumed. Backfill should happen automatically.

 #### Kettle Staging

 `Kettle Staging` uses a similar deployment to `Kettle` with the following differences
-- [100G SSD](https://console.cloud.google.com/compute/disksDetail/zones/us-west1-b/disks/kettle-data-staging?folder=&organizationId=&project=k8s-gubernator) vs 1001G in production
+- [100G SSD](https://console.cloud.google.com/compute/disksDetail/zones/us-central1/disks/kettle-data-staging?folder=&organizationId=&project=kubernetes-public) vs 1001G in production
 - Limit option for number of builds to pull from each job bucket (Default 1000 each). Set via BUILD_LIMIT env in [deployment-staging.yaml](./deployment-staging.yaml).
-- writes to [build.staging](https://console.cloud.google.com/bigquery?project=k8s-gubernator&page=table&t=all&d=build&p=k8s-gubernator&redirect_from_classic=true) table only. This differs from production that writes to three tables `build.all`, `build.day`, and `build.week`.
+- writes to [build.staging](https://console.cloud.google.com/bigquery?project=kubernetes-public&page=table&t=all&d=build&p=kubernetes-public&redirect_from_classic=true) table only. This differs from production that writes to three tables `build.all`, `build.day`, and `build.week`.


 It can be deployed with `make -C kettle deploy-staging`. If already deployed, you may just run `make -C kettle update-staging`.

 #### Adding Fields

-To add fields to the BQ table, Visit the [k8s-gubernator:build BigQuery dataset][Big Query Tables] and Select the table (Ex. Build > All). Schema -> Edit Schema -> Add field. As well as update [schema.json](./schema.json)
+To add fields to the BQ table, Visit the [k8s_infra_kettle:build BigQuery dataset][Big Query Tables] and Select the table (Ex. Build > All). Schema -> Edit Schema -> Add field. As well as update [schema.json](./schema.json)

 ## Adding Buckets

@@ -118,21 +118,21 @@ gcloud pubsub subscriptions create <subscription name> --topic=gcs-changes --top
 ```

 ### Auth
-For kettle to have permission, kettle's user needs access. When updating or changing a [Subscription] make sure to add `kettle@k8s-gubernator.iam.gserviceaccount.com` as a `PubSub Editor`.
+For kettle to have permission, kettle's user needs access. When updating or changing a [Subscription] make sure to add `kettle@kubernetes-public.iam.gserviceaccount.com` as a `PubSub Editor`.
 ```
 gcloud pubsub subscriptions add-iam-policy-binding \
   projects/kubernetes-jenkins/subscriptions/kettle-staging \
-  --member=serviceAccount:kettle@k8s-gubernator.iam.gserviceaccount.com \
+  --member=serviceAccount:kettle@kubernetes-public.iam.gserviceaccount.com \
   --role=roles/pubsub.editor
 ```

 # Known Issues

 - Occasionally data from Kettle stops updating, we suspect this is due to a transient hang when contacting GCS ([#8800](https://github.com/kubernetes/test-infra/issues/8800)). If this happens, [restart kettle](#restarting)

-[Big Query Tables]: https://console.cloud.google.com/bigquery?utm_source=bqui&utm_medium=link&utm_campaign=classic&project=k8s-gubernator
-[Big Query All]: https://console.cloud.google.com/bigquery?project=k8s-gubernator&page=table&t=all&d=build&p=k8s-gubernator
-[Big Query Staging]: https://console.cloud.google.com/bigquery?project=k8s-gubernator&page=table&t=staging&d=build&p=k8s-gubernator
+[Big Query Tables]: https://console.cloud.google.com/bigquery?utm_source=bqui&utm_medium=link&utm_campaign=classic&project=kubernetes-public
+[Big Query All]: https://console.cloud.google.com/bigquery?project=kubernetes-public&page=table&t=all&d=build&p=kubernetes-public
+[Big Query Staging]: https://console.cloud.google.com/bigquery?project=kubernetes-public&page=table&t=staging&d=build&p=kubernetes-public
 [PubSub]: https://cloud.google.com/pubsub/docs
 [Subscriptions]: https://console.cloud.google.com/cloudpubsub/subscription/list?project=kubernetes-jenkins
 [Topic Creation]: https://cloud.google.com/storage/docs/reporting-changes#enabling
````

kettle/deployment-staging.yaml

Lines changed: 2 additions & 2 deletions

```diff
@@ -3,7 +3,7 @@ apiVersion: v1
 kind: ServiceAccount
 metadata:
   annotations:
-    iam.gke.io/gcp-service-account: kettle@k8s-gubernator.iam.gserviceaccount.com
+    iam.gke.io/gcp-service-account: kettle@kubernetes-public.iam.gserviceaccount.com
   name: kettle
 ---
 apiVersion: apps/v1
@@ -23,7 +23,7 @@ spec:
       serviceAccountName: kettle
       containers:
       - name: kettle-staging
-        image: gcr.io/k8s-testimages/kettle:latest
+        image: gcr.io/k8s-staging-infra-tools/kettle:latest
         imagePullPolicy: Always
         env:
         - name: BUILD_LIMIT
```

kettle/deployment.yaml

Lines changed: 12 additions & 3 deletions

```diff
@@ -2,14 +2,18 @@
 apiVersion: v1
 kind: ServiceAccount
 metadata:
-  annotations:
-    iam.gke.io/gcp-service-account: [email protected]
   name: kettle
+  namespace: kettle
+  labels:
+    app: kettle
+  annotations:
+    iam.gke.io/gcp-service-account: [email protected]
 ---
 apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: kettle
+  namespace: kettle
 spec:
   replicas: 1
   selector:
@@ -23,13 +27,18 @@ spec:
       serviceAccountName: kettle
       containers:
       - name: kettle
-        image: gcr.io/k8s-testimages/kettle:latest
+        image: gcr.io/k8s-staging-infra-tools/kettle:latest
         imagePullPolicy: Always
         env:
         - name: DEPLOYMENT
          value: prod
         - name: SUBSCRIPTION_PATH
           value: kubernetes-jenkins/gcs-changes/kettle-filtered
+        resources:
+          requests:
+            memory: 4Gi
+          limits:
+            memory: 12Gi
         volumeMounts:
         - name: data
           mountPath: /data
```

kettle/pv.yaml

Lines changed: 7 additions & 25 deletions

```diff
@@ -1,40 +1,22 @@
-kind: PersistentVolume
-apiVersion: v1
+kind: StorageClass
+apiVersion: storage.k8s.io/v1
 metadata:
-  labels:
-    app: kettle
-  name: kettle-data
-spec:
-  capacity:
-    storage: 3001Gi
-  accessModes:
-  - ReadWriteOnce
-  persistentVolumeReclaimPolicy: Retain
-  gcePersistentDisk:
-    pdName: kettle-data
-    fsType: ext4
+  name: ssd
+provisioner: kubernetes.io/gce-pd
+parameters:
+  type: pd-ssd
 ---
 kind: PersistentVolumeClaim
 apiVersion: v1
 metadata:
   labels:
     app: kettle
   name: kettle-data
+  namespace: kettle
 spec:
   accessModes:
   - ReadWriteOnce
   resources:
     requests:
       storage: 3001Gi
   storageClassName: ssd
-  volumeName: kettle-data
----
-apiVersion: storage.k8s.io/v1
-kind: StorageClass
-metadata:
-  name: ssd
-provisioner: kubernetes.io/gce-pd
-parameters:
-  type: pd-ssd
-allowVolumeExpansion: true
-reclaimPolicy: Delete
```

kettle/stream.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -320,7 +320,7 @@ def get_options(argv):
     )
     parser.add_argument(
         '--dataset',
-        help='BigQuery dataset (e.g. k8s-gubernator:build)'
+        help='BigQuery dataset (e.g. k8s_infra_kettle:build)'
     )
     parser.add_argument(
         '--tables',
```
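The `--dataset` flag's help string uses BigQuery's `project:dataset` form, which the rename changes from `k8s-gubernator:build` to `k8s_infra_kettle:build`. A sketch of splitting such a spec into its two parts (the helper is illustrative, not code from `stream.py`):

```python
def split_dataset(spec):
    """Split a BigQuery 'project:dataset' spec, e.g. 'k8s_infra_kettle:build'.

    Illustrative helper, not taken from stream.py.
    """
    project, sep, dataset = spec.partition(':')
    if not sep or not dataset:
        raise ValueError(f'expected project:dataset, got {spec!r}')
    return project, dataset

print(split_dataset('k8s_infra_kettle:build'))
```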

kettle/update.py

Lines changed: 6 additions & 6 deletions

```diff
@@ -63,23 +63,23 @@ def main():

     if os.getenv('DEPLOYMENT', 'staging') == "prod":
         call(f'{mj_cmd} {mj_ext} --days {DAY} | pv | gzip > build_day.json.gz')
-        call(f'{bq_cmd} {bq_ext} k8s-gubernator:build.day build_day.json.gz schema.json')
+        call(f'{bq_cmd} {bq_ext} k8s_infra_kettle:build.day build_day.json.gz schema.json')

         call(f'{mj_cmd} {mj_ext} --days {WEEK} | pv | gzip > build_week.json.gz')
-        call(f'{bq_cmd} {bq_ext} k8s-gubernator:build.week build_week.json.gz schema.json')
+        call(f'{bq_cmd} {bq_ext} k8s_infra_kettle:build.week build_week.json.gz schema.json')

         # TODO: (MushuEE) #20024, remove 30 day limit once issue with all uploads is found
         call(f'{mj_cmd} --days {MONTH} | pv | gzip > build_all.json.gz')
-        call(f'{bq_cmd} k8s-gubernator:build.all build_all.json.gz schema.json')
+        call(f'{bq_cmd} k8s_infra_kettle:build.all build_all.json.gz schema.json')

         call(f'python3 stream.py --poll {SUB_PATH} ' \
-             f'--dataset k8s-gubernator:build ' \
+             f'--dataset k8s_infra_kettle:build ' \
              f'--tables all:{MONTH} day:{DAY} week:{WEEK} --stop_at=1')
     else:
         call(f'{mj_cmd} | pv | gzip > build_staging.json.gz')
-        call(f'{bq_cmd} k8s-gubernator:build.staging build_staging.json.gz schema.json')
+        call(f'{bq_cmd} k8s_infra_kettle:build.staging build_staging.json.gz schema.json')
         call(f'python3 stream.py --poll {SUB_PATH} ' \
-             f'--dataset k8s-gubernator:build --tables staging:0 --stop_at=1')
+             f'--dataset k8s_infra_kettle:build --tables staging:0 --stop_at=1')

 if __name__ == '__main__':
     os.chdir(os.path.dirname(__file__))
```
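As the `update.py` diff shows, the prod branch loads three tables under the renamed dataset while the staging branch loads one. A sketch of the `bq load` target names each branch now produces (the helper and structure are illustrative, not code from `update.py`):

```python
# Targets of the bq load calls in update.py after the rename.
# Illustrative sketch; update.py itself interpolates these into shell commands.
DATASET = 'k8s_infra_kettle:build'
PROD_TABLES = ['day', 'week', 'all']
STAGING_TABLES = ['staging']

def load_targets(deployment='staging'):
    """Table references loaded for a given DEPLOYMENT value."""
    tables = PROD_TABLES if deployment == 'prod' else STAGING_TABLES
    return [f'{DATASET}.{t}' for t in tables]

print(load_targets('prod'))
# ['k8s_infra_kettle:build.day', 'k8s_infra_kettle:build.week', 'k8s_infra_kettle:build.all']
```

These are exactly the three tables the new `metrics-kettle` periodic passes to `kettle/monitor.py` via `--table`, so the monitor and the uploader stay in agreement after the rename.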

metrics/README.md

Lines changed: 0 additions & 3 deletions

```diff
@@ -56,9 +56,6 @@ jqfilter: |
 * weekly-consistency - compute overall weekly consistency for PRs
   - [Config](configs/weekly-consistency-config.yaml)
   - [weekly-consistency-latest.json](http://storage.googleapis.com/k8s-metrics/weekly-consistency-latest.json)
-* istio-job-flakes - compute overall weekly consistency for postsubmits
-  - [Config](configs/istio-flakes.yaml)
-  - [istio-job-flakes-latest.json](http://storage.googleapis.com/k8s-metrics/istio-job-flakes-latest.json)

 ## Adding a new metric
```
metrics/configs/istio-flakes.yaml

Lines changed: 0 additions & 31 deletions
This file was deleted.
