Remote Desktop Server Sample
The Remote Desktop Server sample implements a low-latency streaming server with the following functionality:
- Video Capture from the specified display using either the AMD proprietary Direct Display Capture or Microsoft Desktop Duplication API (Windows only) at the specified frame rate or when the image is updated (AMD capture only)
- Desktop audio capture from the default audio output
- Streaming compressed video using h.264 (AVC), h.265 (HEVC) or AV1 video codec (subject to hardware-accelerated video encoder availability) to one or several concurrently connected clients at the specified resolution and bitrate
- Standard Dynamic Range (SDR) and High Dynamic Range (HDR) streaming when supported by the OS and codec (HEVC and AV1 on Windows only)
- Streaming compressed audio using AAC or OPUS audio codec to one or several concurrently connected clients at the specified sampling rate, bitrate and channel layout
- Receiving input events from the keyboard, mouse and game controller connected to the client and injecting them into the server's input queues
- Optional AES encryption of all network traffic using a pre-shared password to generate the encryption key
- Running interactively or as a background process
- Extensive logging
NOTE: The sample requires a compatible AMD GPU or APU to run. Currently this includes:
- the "Navi" family discrete GPUs (RX-5000, RX-6000, RX-7000-series and newer)
- Ryzen 5000-series APUs and newer.
- Older GPUs, such as the RX Vega and Polaris series (RX-470/480/580/590), might work with limited functionality.
- AMD does not test Streaming SDK on GPUs not supported by the current display drivers - use at your own risk.
Run the RemoteDesktopServer executable with the following command line parameters (command line parameters are not case-sensitive; an example invocation is shown after the list):
- -MonitorID <index> - specifies a zero-based monitor ID to capture video from. Default: 0.
- -Capture [AMD|DD] - Windows only. Selects the display capture API. Specify AMD for AMD Direct Capture or DD for Microsoft Desktop Duplication API. Default: AMD.
- -CaptureMode [Framerate|Present] - specifies the capture mode when AMD Direct Display Capture is used. Framerate configures AMD Direct Capture to produce a sequence of frames at the specified frame rate. Present triggers capture only when the image on the display is updated. This option is not applicable when Microsoft Desktop Duplication API is used. Default: Framerate.
- -Framerate <fps> - specifies the frame rate video will be captured at when AMD Direct Display Capture is used. This option is not applicable when Microsoft Desktop Duplication API is used. Default: 60.
- -VideoAPI [DX12|DX11] - specifies the graphical API used by the server's video pipeline. Windows only. Default: DX12.
- -VideoCodec [AVC|H264|HEVC|H265|AV1] - specifies the video codec to be used for video compression. H264 and H265 are aliases for AVC and HEVC respectively. Default: HEVC.
- -VideoBitrate <bitrate> - specifies the average bitrate in bits-per-second the video stream will be encoded at. Note that this is the target average bitrate and is not guaranteed to be constant - for example, a static image on the server's desktop might compress to a much lower bitrate, while a highly dynamic game scene may produce spikes above the specified bitrate. This behavior is normal and is determined by the nature of lossy video compression. For more information about choosing the appropriate bitrate, please refer to the Configuration Considerations section. Default: 20000000 (20Mbps).
- -Resolution <width,height> - specifies the resolution of the encoded video stream. Please refer to the Configuration Considerations section for more details. Default: 1920,1080.
- -PreserveAspectRatio [true|false] - specifies whether the aspect ratio of the server's display should be preserved when it is different from the aspect ratio of the encoded video stream. Default: true.
- -Hdr [true|false] - enables High Dynamic Range (HDR) streaming. HDR allows for more natural-looking video when enabled and the client is connected to an HDR-capable monitor. NOTE: HDR must be enabled on both the server and the client at the OS level for accurate color reproduction. Default: false.
- -AudioCodec [AAC|OPUS] - specifies the audio codec to be used for audio compression. Default: AAC.
- -AudioSamplingRate <rate> - specifies the sampling rate in Hertz. When the sampling rate of the default audio output is different from the one specified, the captured audio stream will be resampled to the specified rate. Default: 44100.
- -AudioBitrate <rate> - specifies the bitrate of the compressed audio stream in bits-per-second. Default: 256000.
- -AudioChannels <channel_layout> - specifies the number of audio channels and their layout. When the channel layout of the audio output is different from the one specified, audio will be remixed to the specified layout. The following values are accepted:
- 1: single-channel monophonic stream.
- 2: 2-channel stereo (left + right)
- 2.1: 2.1-channel stereo (left + right + subwoofer)
- 3: 3-channel stereo (left + center + right)
- 3.1: 3.1-channel stereo (left + center + right + subwoofer)
- 5: 5-channel surround (left + center + right + left surround + right surround)
- 5.1: 5.1-channel surround (left + center + right + left surround + right surround + subwoofer)
- -Protocol [UDP|TCP] - specifies whether streaming will be performed over UDP or TCP. NOTE: UDP generally produces lower latency, especially over slower networks, however UDP streaming is less reliable and is subject to packet loss. For more information please refer to the Configuration Considerations section. Also note that when TCP is enabled, UDP also remains active and the client has a choice of using either protocol. Default: UDP.
- -Port <port> - specifies the UDP and TCP port on which the server listens for incoming connections. When TCP is enabled, the same port number is used for both TCP and UDP. Ensure that the inbound traffic to this port is allowed for the current network type by the firewall. Default: 1235.
- -Hostname <name> - specifies the host name under which the server will be visible to clients.
- -DatagramSize <size> - datagram size in bytes. Applicable only when streaming over UDP. For more information please refer to the Configuration Considerations section. Default: 65507.
- -BindInterface <ip address or *> - specifies the IP address of the network interface the server will communicate through. You can specify either an IP address of a specific NIC or "*" to use the default route. Use this parameter when the server has more than one network interface and is behind a restrictive firewall where inbound streaming traffic is blocked on the default route. Default: *.
- -Connections <number of connections> - specifies the number of concurrent connections the server is allowed to accept. Default: 1.
- -Encrypted [true|false] - enables AES encryption of all network traffic. Default: false.
- -Pass <passphrase> - specifies a pre-shared passphrase used to generate the encryption key when AES encryption is enabled. It must not be blank. This option is ignored when encryption is disabled.
- -LogFile <file path> - specifies the path to the log file. Default: RemoteDesktopServer.log located in the executable's directory.
- -Interactive [true|false] - specifies whether the server is run in the interactive mode or as a background process. Default: true.
- -Shutdown <port> - shuts down a server running in background mode. The port value must match the value specified in the -Port parameter when the server was started.
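
For example, the following hypothetical invocation (parameter values are purely illustrative; on Windows the executable name may carry an .exe extension) streams the primary display as 1080p HEVC video over encrypted UDP with 5.1-channel AAC audio:

```
RemoteDesktopServer -MonitorID 0 -VideoCodec HEVC -Resolution 1920,1080 -Framerate 60 -VideoBitrate 20000000 -AudioCodec AAC -AudioChannels 5.1 -Protocol UDP -Port 1235 -Encrypted true -Pass MySecretPassphrase
```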
This section provides a high-level overview of the Remote Desktop Server sample's code. The sample's source code is located in the samples/RemoteDesktopServer/ directory. This overview is not meant to be a reference, but is designed to provide guidance as you navigate the source code and help you understand the most important and not-so-obvious aspects of the sample. We suggest that you follow the source code while reading this section.
The server application is implemented in the RemoteDesktopServer class. This is a base class which contains all platform-agnostic functionality. All Windows-specific functionality is implemented in the RemoteDesktopServerWin class derived from RemoteDesktopServer. This approach allows for cleaner code, not polluted with platform-specific #ifdef statements.
These classes are responsible for the application initialization and termination, command line parsing and starting the video and audio capture components, the video and audio encoding pipelines and the network server.
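
The split between platform-agnostic and platform-specific code follows the usual base/derived pattern. The sketch below is a hypothetical illustration of that structure (none of the member or method names are taken from the actual sources):

```cpp
// Hypothetical illustration of the platform split - not the actual class definitions.
class RemoteDesktopServer
{
public:
    virtual ~RemoteDesktopServer() = default;

    bool Init(int argc, char** argv);       // hypothetical: parse the command line, create pipelines
    void Run();                             // hypothetical: main server loop

protected:
    virtual bool InitPlatform() = 0;        // hypothetical hook implemented per platform
    virtual void TerminatePlatform() = 0;   // hypothetical hook implemented per platform
};

#ifdef _WIN32
// Windows-specific functionality (DirectX devices, Desktop Duplication, etc.) lives in the
// derived class, keeping the base class free of platform-specific #ifdefs.
class RemoteDesktopServerWin : public RemoteDesktopServer
{
protected:
    bool InitPlatform() override;
    void TerminatePlatform() override;
};
#endif
```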
Video and audio capture and the respective encoding pipelines are implemented in the AVStreamer class. This class is responsible for running threads that capture video frames from the specified display and audio buffers from the default audio output. Video and audio capture threads are implemented by the AVStreamer::VideoCaptureThread and AVStreamer::AudioCaptureThread classes respectively. They perform video and audio capture by calling the QueryOutput method of the m_VideoCapture and m_AudioCapture class members respectively. These objects implement the amf::AMFComponent interface.
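
Conceptually, each capture thread polls its capture component for new output. The following is a simplified, hypothetical sketch of such a polling loop built around amf::AMFComponent::QueryOutput(); the function name and surrounding structure are illustrative and do not appear in the sample's sources, and the include path may differ in your tree:

```cpp
#include <atomic>
#include <chrono>
#include <thread>

#include "public/include/components/Component.h"   // amf::AMFComponent, amf::AMFData (path may vary)

// Hypothetical capture polling loop: repeatedly query the capture component for new frames/buffers.
void CaptureLoop(amf::AMFComponentPtr capture, std::atomic<bool>& keepRunning)
{
    while (keepRunning)
    {
        amf::AMFDataPtr data;
        AMF_RESULT result = capture->QueryOutput(&data);    // AMF_REPEAT means no new frame/buffer yet
        if (result == AMF_OK)
        {
            // Hand the captured frame/buffer to the encoding pipeline here
            // (the real sample submits it to its video/audio output objects).
        }
        else if (result == AMF_REPEAT)
        {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));  // nothing captured yet - poll again
        }
        else
        {
            break;  // capture error - terminate the thread
        }
    }
}
```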
A server can supply multiple video and audio streams originating from either different or the same sources, encoded using different codecs or at different bitrates, scaled to different resolutions, sampling rates or audio channel layouts. Clients can subscribe to streams of their choice, with one video and one audio stream being the default ones. The Remote Desktop Server sample streams only one video and one audio stream; nevertheless, it demonstrates the entire process of subscribing to a video and an audio stream.
The AVStreamer object itself implements the SenderCallback interfaces through which it receives notifications about clients subscribing to and unsubscribing from certain video and audio streams. These are the ssdk::transport_common::ServerTransport::VideoSenderCallback and ssdk::transport_common::ServerTransport::AudioSenderCallback respectively. It also implements the ssdk::transport_common::ServerTransport::VideoStatsCallback interface which aggregates statistics collected by all currently connected clients. These interfaces are declared in the ssdk::transport_common::ServerTransport class located in the sdk/transports/transport_common/ServerTransport.h header file.
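
A class playing the role of AVStreamer therefore derives from these callback interfaces. The sketch below is hypothetical and only shows the inheritance relationship; the actual virtual methods to override are declared by the SDK in ServerTransport.h and are intentionally not reproduced here:

```cpp
#include "transports/transport_common/ServerTransport.h"   // relative to sdk/; adjust to your include paths

// Hypothetical sketch: a streamer-like class implementing the sender and statistics callbacks.
class MyStreamer :
    public ssdk::transport_common::ServerTransport::VideoSenderCallback,
    public ssdk::transport_common::ServerTransport::AudioSenderCallback,
    public ssdk::transport_common::ServerTransport::VideoStatsCallback
{
    // Override the subscription/unsubscription and statistics notification methods declared
    // by each callback interface to be notified when clients subscribe, unsubscribe or report stats.
};
```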
In addition, the AVStreamer class implements the ssdk::util::QoS::QoSCallback interface, which receives notifications from the Quality-of-Service (QoS) object defined in sdk/util/QoS/QoS.h. This object constantly monitors the condition of the communication channel and the depth of the encoder and decoder queues, and dynamically adjusts stream parameters to ensure the best balance between image quality and latency. For more information about the QoS implementation please refer to the Quality-of-Service section.
Both video frames and audio buffers carry timestamps that determine when these frames or buffers need to be presented or played. Since video and audio are captured independently of each other and video frames and audio packets are transferred across the network in separate messages, they are subject to various random delays which can desynchronize them. Different aspects of this synchronization need to be performed on both the server and the client. Video and audio timestamps are calibrated on the server by the m_TimestampCalibrator class member of the ssdk::util::TimestampCalibrator type. The implementation of m_TimestampCalibrator is located in the sdk/util/pipeline/TimestampCalibrator.* files.
The Remote Desktop Server sample transmits a single compressed video stream. Video compression and the necessary pre-processing, such as color space conversion and scaling, are performed by the m_VideoOutput class member of the ssdk::video::MonoscopicVideoOutput type, which encapsulates an optional scaler/color space converter and a video encoder. The source code for the ssdk::video::MonoscopicVideoOutput class is located in the sdk/video/MonoscopicVideoOutput.* files. For more information about video transmitter pipelines please refer to the Implementing Video Transmitter Pipelines section.
The input of the m_VideoOutput object is a sequence of video frames. Its output is a compressed video stream, which is then passed to the Network Transport. Since the server allows for multiple concurrent connections receiving the same video stream, the video stream might need to be replicated for each connection. For most codecs the stream must be preceded by an initialization block, commonly referred to as Extra Data, which contains information about the codec, stream resolution, codec-specific profile and other parameters required to initialize the decoder. Also, since most high compression ratio codecs such as h.264, HEVC or AV1 utilize temporal compression mechanisms, i.e. subsequent frames can reference similar looking parts of previous frames, such video streams can only be decoded starting from self-contained frames which do not reference any previous frames. Such frames are known as index (I-frames, IDR-frames) or key frames, depending on the codec. Since secondary clients can connect at any moment, they would likely be connected mid-stream and would not be able to decode it correctly unless an index/key frame is sent to them immediately after connection. Therefore, it is necessary to request an index/key frame from the video pipeline every time a new client connects to a video stream, with the Extra Data initialization block sent to the newly connected client prior to that. All this is implemented by the m_VideoTransmitterAdapter member of type ssdk::video::TransmitterAdapter. The source code of the ssdk::video::TransmitterAdapter class is located in the sdk/video/VideoTransmitterAdapter.* files. This class provides the necessary glue between the video pipeline and the Network Transport Server object.
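
The sequence performed for a newly subscribed client can be summarized by the following sketch. All names here are hypothetical and exist only to illustrate the steps described above; the actual logic lives in sdk/video/VideoTransmitterAdapter.*:

```cpp
// Hypothetical illustration of how a new video subscriber is brought up to speed.
// None of these names come from the SDK - they only mirror the steps described in the text.
using SessionHandle = int;                   // placeholder for the SDK's session identifier type

void SendExtraData(SessionHandle session);   // send the codec initialization block (Extra Data)
void RequestKeyFrame();                      // ask the encoder to produce a self-contained frame

void OnClientSubscribed(SessionHandle session)
{
    SendExtraData(session);   // the client's decoder must be initialized with Extra Data first
    RequestKeyFrame();        // a new client can only start decoding from an I/IDR/key frame
    // From here on, every encoded frame is replicated to all subscribed sessions, including this one.
}
```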
Likewise, the audio pipeline is responsible for preprocessing and compressing the audio stream and passing it to the Network Transport Server object. All aspects of video compression explained above are equally applicable to audio, therefore the audio pipeline is implemented in a way very similar to video. The encoding and the necessary resampling are performed by the m_AudioOutput object of the ssdk::audio::AudioOutput type, which receives raw audio buffers captured by the audio capture thread. The encoded audio stream is passed to the m_AudioTransmitterAdapter object of the ssdk::audio::TransmitterAdapter type, which, in turn, passes it down to the Network Transport Server for each client session subscribed to this audio stream. The implementation of the ssdk::audio::TransmitterAdapter class is located in the sdk/audio/AudioTransmitterAdapter.* files.
Since the video and audio codecs are configurable parameters, the corresponding encoder objects are created in the RemoteDesktopServer class according to the information passed via command line arguments. A pointer to the video encoder object is stored in the m_VideoEncoder member of the RemoteDesktopServer class. It is also passed to the AVStreamer object implementing the video pipeline. Likewise, a pointer to the audio encoder object is stored in the m_AudioEncoder member of the RemoteDesktopServer class and is passed to the AVStreamer object during initialization.
Streaming SDK uses AMF encoder components for video and audio encoding. Each codec is configured through properties set on the encoder object. Since these properties are different for each codec, Streaming SDK provides a container class responsible for encoder configuration for every video and audio codec it supports. These container classes for video and audio encoders are located under the sdk/video/encoders/ and sdk/audio/encoders/ directories respectively.
The AMD Network Protocol, as well as other common protocols used for video and audio streaming, such as, for example, RTP and WebRTC, provides a way for the server to declare the video and audio streams it transmits. Parameters of these streams, such as codec used, resolution, bitrate and frame rate for video and codec, bitrate and sampling rate, the number of audio channels and their layout, are announced by the server to the connecting clients. Each stream transmitted by the server is declared using the ssdk::transport_common::VideoStreamDescriptor for video and ssdk::transport_common::AudioStreamDescriptor for audio. These descriptors are created and stored in the RemoteDesktopServer class' m_VideoStreamDescriptor and m_AudioStreamDescriptor members respectively. They are populated from the command line arguments passed to the application and are used to configure the pipelines accordingly.
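
To give a rough idea of what such a descriptor carries, the sketch below shows an illustrative structure populated from parsed command-line values. The field names are hypothetical; consult the actual ssdk::transport_common::VideoStreamDescriptor declaration for the real interface:

```cpp
#include <cstdint>
#include <string>

// Illustrative stand-in for a video stream descriptor - not the SDK's actual type.
struct IllustrativeVideoStreamDescriptor
{
    std::string codec;       // e.g. "HEVC", taken from -VideoCodec
    int64_t     bitrate;     // from -VideoBitrate, bits per second
    int32_t     width;       // from -Resolution
    int32_t     height;      // from -Resolution
    float       framerate;   // from -Framerate
};

// Hypothetical helper populating the descriptor from already-parsed command-line options.
IllustrativeVideoStreamDescriptor MakeDescriptorFromCmdLine()
{
    IllustrativeVideoStreamDescriptor desc{};
    desc.codec     = "HEVC";
    desc.bitrate   = 20000000;
    desc.width     = 1920;
    desc.height    = 1080;
    desc.framerate = 60.0f;
    return desc;
}
```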
The Remote Desktop Server sample uses the AMD Network Transport protocol to communicate with the client(s). The server component of the AMD Network Transport is implemented by the ssdk::transport_amd::ServerTransportImpl class, derived from the ssdk::transport_common::ServerTransport class. The implementation of the ssdk::transport_amd::ServerTransportImpl class is located in the sdk/transports/transport_amd/ServerTransportImpl.* files. For more information about the AMD Network Transport please refer to the AMD Network Transport section.
If you wish to replace the AMD Transport protocol with another protocol, derive your own implementation class from ssdk::transport_common::ServerTransport and instantiate it instead of ssdk::transport_amd::ServerTransportImpl. Follow the instructions outlined in the Implementing Custom Protocols section.
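
A custom transport would follow the same inheritance pattern. The minimal skeleton below is a hypothetical sketch; the actual set of pure virtual methods to override is declared in sdk/transports/transport_common/ServerTransport.h:

```cpp
#include "transports/transport_common/ServerTransport.h"   // relative to sdk/; adjust to your include paths

// Hypothetical skeleton of a custom transport implementation replacing the AMD Network Transport.
class MyCustomServerTransport : public ssdk::transport_common::ServerTransport
{
public:
    // Override the connection management, stream announcement and video/audio/message send
    // methods declared by ServerTransport, mapping them onto your own protocol
    // (e.g. RTP, WebRTC or a proprietary one), then instantiate this class in place of
    // ssdk::transport_amd::ServerTransportImpl.
};
```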
The Remote Desktop Server sample receives user input events from the Network Transport and injects them into the corresponding input queues. All input devices on the server are represented by a set of "driver" classes derived from the ssdk::ctls::ControllerBaseSvr class, defined in sdk/controllers/server/ControllerBaseSvr.h. All controllers are managed by the ssdk::ctls::ControllerManagerSvr class, which is responsible for instantiation of individual controller objects, querying their state, parsing event strings received from the Network Transport layer and distributing them to their respective "drivers".