Gaining efficiency in echo server performance by appropriate design and implementation choices.

Fenix-125/hpc-echo-server

Gaining efficiency in echo server performance by appropriate design and implementation choices

Author: Yuriy Pasichnyk

This repository is a part of a bachelor's thesis.

Thesis title: Performance analysis of synchronous and asynchronous parallel network server implementations using the C++ language

Description

The two main paradigms for implementing parallel network servers are synchronous and asynchronous. After an overview of these methodologies and implementation choices, the most representative versions of a stateful TCP echo server were designed and implemented. The versions and their specifications are listed below:

  • echo_server_simple_threaded -- synchronous multithreaded
    • A separate thread is used per client.
    • Blocking I/O is used.
  • echo_server_simple -- hybrid-synchronous single-threaded
    • The term "hybrid" denotes that an asynchronous syscall (the poll syscall) is used.
    • Blocking I/O is used.
  • echo_server_custom_thread_pool -- hybrid-synchronous multithreaded
    • The term "hybrid" denotes that an asynchronous syscall (the poll syscall) is used.
    • A custom thread pool distributes work between worker threads.
    • Blocking I/O is used.
  • echo_server_boost_asio -- asynchronous single-threaded
    • This implementation uses the Boost asynchronous library.
    • Non-blocking I/O is used.
  • echo_server_boost_asio_threaded -- asynchronous multithreaded
    • This implementation also uses the Boost asynchronous library.
    • Non-blocking I/O is used.
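
To make the hybrid-synchronous idea from the list more concrete, here is a minimal sketch of one iteration of a poll-driven echo loop. It is not taken from the repository: the function names echo_once and poll_and_echo, the 4096-byte buffer, and the use of MSG_NOSIGNAL (Linux-specific) are assumptions chosen for illustration. poll reports which descriptors are ready, and the read and write on each ready descriptor are then ordinary blocking calls.

```cpp
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <cstddef>
#include <vector>

// Hypothetical helper: echo whatever is currently readable on fd.
// Returns bytes echoed, 0 if the peer closed the connection, -1 on error.
ssize_t echo_once(int fd) {
    char buf[4096];  // buffer size is an arbitrary choice for this sketch
    ssize_t n = ::recv(fd, buf, sizeof buf, 0);
    if (n <= 0) return n;
    ssize_t sent = 0;
    while (sent < n) {  // send may write fewer bytes than requested
        ssize_t w = ::send(fd, buf + sent, static_cast<std::size_t>(n - sent),
                           MSG_NOSIGNAL);
        if (w < 0) return -1;
        sent += w;
    }
    return n;
}

// Hypothetical helper: one iteration of the event loop.
// Waits until at least one descriptor is ready (or the timeout expires),
// echoes on every ready descriptor, and drops descriptors whose peer
// closed or failed. Returns the number of descriptors that were served.
int poll_and_echo(std::vector<pollfd>& fds, int timeout_ms) {
    int ready = ::poll(fds.data(), static_cast<nfds_t>(fds.size()), timeout_ms);
    if (ready <= 0) return ready;  // 0: timeout, -1: error
    int served = 0;
    for (std::size_t i = 0; i < fds.size();) {
        if (fds[i].revents & (POLLIN | POLLHUP | POLLERR)) {
            if (echo_once(fds[i].fd) <= 0) {
                ::close(fds[i].fd);
                fds[i] = fds.back();  // swap-remove; re-check slot i next
                fds.pop_back();
                continue;
            }
            ++served;
        }
        ++i;
    }
    return served;
}
```

In a full server the listening socket would also sit in the pollfd vector, and newly accepted clients would be appended to it. In a thread-pool variant, the ready descriptors could be handed to worker threads instead of being served inline.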

All versions support Google Logging. In non-debug builds, logging is reduced for performance reasons. The log output is written to separate files in the newly created ./logs directory.

Server Requirements

  • Send back received data from the client
  • Hold the client session until the client terminates it
  • Use TCP as the transport-layer protocol
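
The requirements above can be sketched as a per-client session loop, roughly what a thread-per-client server would run for each accepted connection. This is an illustrative sketch, not the repository's code: the name run_session, the 4096-byte buffer, and the MSG_NOSIGNAL flag (Linux-specific) are assumptions.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <cstddef>

// Hypothetical session handler: echo everything received on client_fd
// until the client closes the connection, then close our end.
// Returns the total number of bytes echoed.
long long run_session(int client_fd) {
    char buf[4096];  // buffer size is an arbitrary choice for this sketch
    long long total = 0;
    for (;;) {
        ssize_t n = ::recv(client_fd, buf, sizeof buf, 0);  // blocking read
        if (n <= 0) break;  // 0: client terminated the session, <0: error
        ssize_t sent = 0;
        while (sent < n) {  // send may write fewer bytes than requested
            ssize_t w = ::send(client_fd, buf + sent,
                               static_cast<std::size_t>(n - sent), MSG_NOSIGNAL);
            if (w < 0) {
                ::close(client_fd);
                return total;
            }
            sent += w;
        }
        total += n;
    }
    ::close(client_fd);
    return total;
}
```

The loop only exits when recv reports end-of-stream or an error, which is one way to satisfy the "hold the session until the client terminates it" requirement.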

Prerequisites

The dependencies for the apt and apk Linux package managers are listed in the corresponding files in the dependencies directory. An example of installing the dependencies with apt is shown below:

$ apt update && apt upgrade
$ xargs apt install -y < ./dependencies/apt.txt

Compilation

Compilation is automated by the compile.sh script. Run the script with the -h option to see its help. To compile all the versions with optimization and install them to the ./bin directory (created automatically), use the command below:

$ bash ./compile.sh

Usage

The compiled executables can be run as a regular user, as shown below for the different versions:

$ ./echo_server_simple_threaded
$ ./echo_server_simple
$ ./echo_server_custom_thread_pool
$ ./echo_server_boost_asio
$ ./echo_server_boost_asio_threaded

Testing description

The performance testing of these versions was done using the Fortio open source testing tool with the parameters listed below:

  • "-qps 0" -- send as many queries per second as possible
  • "-t 60s" -- test duration of 60 seconds
  • "-c " -- client number parameter
  • "-payload-size 64" -- set the client message size to 64 bytes
  • "-uniform" -- de-synchronize parallel clients' requests uniformly

The server was run on a PC with the characteristics listed below:

| Characteristic | Value |
| --- | --- |
| CPU Architecture | x86_64 |
| CPU Model name | 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz |
| Logical CPUs | 8 |
| Physical CPUs | 4 |
| CPU max MHz | 4200 |
| CPU min MHz | 400 |
| CPU Byte Order | Little Endian |
| L1d cache | 192 KiB (4 instances) |
| L1i cache | 128 KiB (4 instances) |
| L2 cache | 5 MiB (4 instances) |
| L3 cache | 8 MiB (1 instance) |
| RAM | 12.0 GB |
| RAM type | DDR4 SDRAM |
| OS | Ubuntu 20.04 LTS |
| OS Kernel | Linux Kernel 5.4 |
| NIC | Gigabit Ethernet LAN |

The load for the server was generated from three PCs, which were connected to it over a Gigabit Ethernet network through a Cisco switch. The metrics used to measure performance are:

  • Number of connected clients
  • Throughput
  • Latency
  • CPU consumption
  • Memory consumption
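
As a rough illustration of how the throughput and latency-percentile metrics can be derived from raw measurements, here is a small sketch. It uses the nearest-rank percentile definition, which is an assumption made for this example and is not necessarily how Fortio computes its reported percentiles. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: nearest-rank percentile of latency samples.
// p is in percent (for example 90, 99 or 99.9). The samples are taken by
// value so the caller's vector is left unsorted.
double percentile_nearest_rank(std::vector<double> samples, double p) {
    if (samples.empty()) return 0.0;
    std::sort(samples.begin(), samples.end());
    const double n = static_cast<double>(samples.size());
    double r = std::ceil(p * n / 100.0);  // 1-based rank
    if (r < 1.0) r = 1.0;                 // clamp to the valid rank range
    if (r > n) r = n;
    return samples[static_cast<std::size_t>(r) - 1];
}

// Hypothetical helper: throughput as completed requests per second.
double throughput_qps(std::size_t completed_requests, double duration_seconds) {
    if (duration_seconds <= 0.0) return 0.0;
    return static_cast<double>(completed_requests) / duration_seconds;
}
```

With this definition, the 99th percentile is the smallest sample that is greater than or equal to 99% of all samples, so it becomes informative only when the run collects many samples.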

Performance Visualizations

The visualizations below show the data collected from the Fortio load tests.

Throughput with respect to the number of clients

chart-throughput-1

Average latency with respect to the number of clients

chart-avg-latency-1

90th percentile latency with respect to the number of clients

chart-latency-persentile-90-1

99th percentile latency with respect to the number of clients

chart-latency-persentile-99-1

99.9th percentile latency with respect to the number of clients

chart-latency-persentile-999-1

CPU usage with respect to the number of clients

chart-cpu-1

Memory usage with respect to the number of clients

chart-mem-1
