Profile Guided Optimization (PGO)
Profile-guided Optimization is a compiler optimization technique that involves collecting typical execution data, including possible branches, during program execution. This collected data is then used to optimize various aspects of the code, such as inlining, conditional branches, machine code layout, and register allocation.
According to the
More information about PGO in Databend you can read in
Prerequisites
Before you build Databend with PGO, make sure the following requirements have been met:
- Install the PGO helper binary by adding the llvm-tools-preview component to your toolchain with rustup:
$ rustup component add llvm-tools-preview
- Install cargo-pgo, which makes it easier to use PGO to optimize Rust binaries:
$ cargo install cargo-pgo
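Before moving on, it can help to confirm that the tools are actually reachable from your shell. The following is a minimal sketch rather than anything shipped with Databend or cargo-pgo: the function names have_tool and check_pgo_prereqs are hypothetical, and the check relies only on the POSIX command -v builtin. It does not verify llvm-tools-preview itself, since that component is installed into the rustup toolchain directory rather than onto PATH.

```shell
# Hypothetical helper: succeeds if the named command can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: report every missing prerequisite instead of
# stopping at the first one. Returns 0 only if all tools were found.
check_pgo_prereqs() {
    missing=0
    for tool in rustup cargo cargo-pgo; do
        if ! have_tool "$tool"; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

After sourcing these functions, running check_pgo_prereqs lists anything that is absent. For a fuller report, including whether the LLVM tools were found, cargo-pgo provides the cargo pgo info subcommand.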
Building Databend with PGO
Follow the steps below to build Databend:
- Download the source code.
git clone https://github.com/datafuselabs/databend.git
- Install dependencies and compile the source code.
cd databend
make setup -d
export PATH=$PATH:~/.cargo/bin
- Build Databend with cargo-pgo. Due to an issue with PyO3, we need to skip bendpy during the build.
cargo pgo build -- --workspace --exclude bendpy
- Import the dataset and run a typical query workload.
# Run Databend in standalone mode, or you can try running it in cluster mode.
BUILD_PROFILE=<target-tuple>/release ./scripts/ci/deploy/databend-query-standalone.sh
# Databend's SQL logic tests are used here as a demonstration; adjust this step to match your target workload.
ulimit -n 10000; ulimit -s 16384; cargo run -p databend-sqllogictests --release -- --enable_sandbox --parallel 16 --no-fail-fast
# Stop the running databend instance to dump profile data.
killall databend-query
killall databend-meta
tip
- You need to check the platform triple corresponding to your current build and replace <target-tuple> above. For example: x86_64-unknown-linux-gnu.
- For more precise profiles, run it with the following environment variable: LLVM_PROFILE_FILE=./target/pgo-profiles/%m_%p.profraw.
- Build an optimized Databend.
cargo pgo optimize -- --workspace --exclude bendpy
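The <target-tuple> placeholder used in the steps above corresponds to the host field printed by rustc -vV when you are building for the machine you are running on. As a small sketch (the function name host_triple_from is hypothetical and not provided by Databend or cargo-pgo), the value can be extracted like this:

```shell
# Hypothetical helper: read `rustc -vV` output on stdin and print the
# value of its "host:" line, for example x86_64-unknown-linux-gnu.
host_triple_from() {
    sed -n 's/^host: //p'
}

# Usage (assumes rustc is installed and you are not cross-compiling):
#   BUILD_PROFILE="$(rustc -vV | host_triple_from)/release"
```

If you cross-compile, the host triple is not the one you want; in that case set the value to the target you actually passed to the build.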