-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve & simplify valgrind testing #278
base: master
Are you sure you want to change the base?
Conversation
* Get rid of valgrind code from fab infra because using this infra for valgrind testing was both unnecessary and this was heavily complicatings the things. Instead, a simpler infra that we can also use in our local machines seems much more usuful and easier to maintain & understand. * Also remove fabfile/pg_report_vg_query.patch since the Postgres patch that we depend on to report the query string that causes the memory error has been merged into Postgres as of PG16.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some pointers to optimize the docker image a bit. I have no real understanding of how we use valgrind, so would be good to get someone involved in day-to-day citus to checkout and try to use this change to run a valgrind test before merging - hence the lack of approval for now.
@@ -0,0 +1,37 @@ | |||
FROM ubuntu:24.04 | |||
|
|||
RUN apt-get update && apt-get install -y git |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apt update
pulls in a lot of data that sticks around on this layer, making the image uselessly bigger. It is common to apt clean
after you have installed something eg: https://github.com/citusdata/citus/blob/117bd1d04fe55b4001012b3923216d5df0fda10e/.devcontainer/Dockerfile#L43C5-L43C14
There are a couple of more occurances of apt update && apt install ...
down below
RUN ./bin/pgenv config init | ||
RUN sed -i 's/^PGENV_CONFIGURE_OPTIONS=.*/PGENV_CONFIGURE_OPTIONS=([0]="--enable-debug" [1]="--enable-depend" [2]="--enable-cassert" [3]="CFLAGS=-ggdb -Og -fno-omit-frame-pointer -DUSE_VALGRIND")/' ./config/default.conf | ||
RUN sed -i "s/^PGENV_MAKE_OPTIONS=.*/PGENV_MAKE_OPTIONS=([0]='-sj$(nproc)')/" ./config/default.conf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit hard to read what config you are running, could you copy the config file in?
eg https://github.com/citusdata/citus/blob/117bd1d04fe55b4001012b3923216d5df0fda10e/.devcontainer/Dockerfile#L63
This adds an extra file, but makes it easier to understand what the config is, that we are using.
RUN git clone https://github.com/theory/pgenv | ||
|
||
WORKDIR /pgenv | ||
RUN git fetch && git checkout v1.3.6b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as per https://stackoverflow.com/questions/8932389/git-shallow-clone-to-specific-tag
RUN git clone https://github.com/theory/pgenv | |
WORKDIR /pgenv | |
RUN git fetch && git checkout v1.3.6b | |
RUN git clone --branch v1.3.6b --depth 1 https://github.com/theory/pgenv |
less code, less data, instantly in the state you want it in
RUN sed -i 's/^PGENV_CONFIGURE_OPTIONS=.*/PGENV_CONFIGURE_OPTIONS=([0]="--enable-debug" [1]="--enable-depend" [2]="--enable-cassert" [3]="CFLAGS=-ggdb -Og -fno-omit-frame-pointer -DUSE_VALGRIND")/' ./config/default.conf | ||
RUN sed -i "s/^PGENV_MAKE_OPTIONS=.*/PGENV_MAKE_OPTIONS=([0]='-sj$(nproc)')/" ./config/default.conf | ||
|
||
RUN apt-get update && apt-get install -y bzip2 build-essential libicu-dev pkg-config flex bison liblz4-dev libreadline-dev libzstd-dev libxslt1-dev zlib1g-dev zlib1g-dev libxml2-utils docbook-xsl xsltproc valgrind |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when installing multiple packages it is easier maintainable if you put every package on a new line and sort them. Keeping an extra \
at the end of the last package as well for easier diffing in the future, but that is also enforced by the apt clean you want to have in the same RUN
as the update and install.
after each package is on a separate line use your editor to sort the packages. Future edits should remember to insert in sorted position. This will also identify double mentions of the same package, like zlib1g-dev
in your list.
|
||
RUN apt-get update && apt-get install -y bzip2 build-essential libicu-dev pkg-config flex bison liblz4-dev libreadline-dev libzstd-dev libxslt1-dev zlib1g-dev zlib1g-dev libxml2-utils docbook-xsl xsltproc valgrind | ||
ARG PG_VERSION | ||
RUN ./bin/pgenv build $PG_VERSION && ./bin/pgenv switch $PG_VERSION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you will want to add some makeflags here to speed up the build to use all available cores.
RUN git clone https://github.com/citusdata/citus | ||
WORKDIR /citus | ||
ARG CITUS_VERSION | ||
RUN git fetch && git checkout $CITUS_VERSION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably use the same --branch ... --depth 1
as above
WORKDIR / | ||
|
||
# install python3 as it's required by Citus test suite | ||
RUN apt-get update && apt-get install -y python3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason to not add this to the bigger install above?
@@ -0,0 +1,37 @@ | |||
FROM ubuntu:24.04 | |||
|
|||
RUN apt-get update && apt-get install -y git |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible I would try to get all installs in a single run, unless there is a good reason to have them at different layers throughout the file
docker cp "$container_name":/citus/src/test/regress/regression.diffs "$test_artifacts_dir/" | ||
docker cp "$container_name":/citus/src/test/regress/regression.out "$test_artifacts_dir/" | ||
docker cp "$container_name":/citus/src/test/regress/citus_valgrind_test_log.txt "$test_artifacts_dir/" | ||
docker cp "$container_name":/citus/src/test/regress/gdb_core_backtraces "$test_artifacts_dir/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given that we have copied out the files, would it make sense to remove the container after the run? If there is reason to keep it lets comment why we are not removing the container.
especially when running many schedules it pollutes the docker environment of the eng quite a bit.
create_cluster.sh
and its friends can still be used to create an Azure VM but they are not a must now. See the changes in README.md for more details.Note to reviewer - although the PR touches nine files, this PR mainly introduces following three files and updates the README on the usage of valgrind testing infra. The other 5 files are only modified to remove the old infra.
The dockerfile to build a simple docker image that can be used to run valgrind tests.
This has nothing but Postgres & Citus built with the flags that are needed to properly run valgrind tests and valgrind itself.
The script that we run within the container to run the valgrind test for the specified test schedule.
We could set the entrypoint of the container to this script but we keep it as a separate script for simplicity.
The host side script that we use to build the container and run the valgrind tests through the container.
I strongly recommend reading the relevant section of the README from this PR to better understand the changes.