
DDS Protocol Testing Practice and Problem Analysis


In the previous article, we provided a detailed introduction to the strategy, methods, and tools for DDS protocol testing. This article aims to further explore how to utilize these methods and tools to construct a practical testing environment, execute tests, and uncover various potential issues that may arise.


Introduction to the Protocol Stack Under Test


The protocol stack under test in this evaluation is an open-source DDS product that is widely used in the automotive industry.


In recent years, with the continuous development and maturity of the open-source software community, more and more automotive manufacturers have begun to favor open-source products when selecting DDS protocol stacks. Compared to commercial DDS protocol stacks, open-source products offer significant cost advantages. Additionally, using open-source products allows automotive manufacturers to have greater autonomy, enabling them to customize and optimize the protocol stack according to their own needs.


However, there is no such thing as a free lunch. Choosing an open-source DDS protocol stack also means that users must take responsibility for the product quality. Unlike commercial products, open-source products usually do not have dedicated teams responsible for quality assurance. Therefore, users of open-source DDS protocol stacks must invest additional effort and resources, or even establish dedicated software teams, to comprehensively evaluate, test, and maintain the selected open-source products to ensure that their functionality and performance meet the rigorous requirements of the automotive industry.


In this article, we aim to identify such potential issues precisely, providing a practical reference for automotive industry users when selecting an open-source DDS protocol stack.


Building the Test Environment


The DDS protocol stack under test is deployed on an x86 server running the Ubuntu operating system. This deployment provides a simple, stable, and consistent runtime environment for the DDS protocol stack, avoiding unreliable test results caused by resource constraints or network misconfiguration.

 

To comprehensively evaluate the functionality and performance of the DDS protocol stack, we have deployed two specially designed test applications on top of DDS. The main purpose of these two applications is to simulate the calls to DDS interfaces by real-world applications, verifying that DDS can handle various requests correctly and return the expected results. This approach allows us to thoroughly examine whether the DDS interfaces comply with the OMG DDS standard and meet the specific requirements of the automotive industry.
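To make this concrete, below is a minimal sketch of the kind of publisher-side call sequence such a test application drives, written against the classic DCPS C++ API as defined in the OMG DDS specification. The type Foo and its FooTypeSupport/FooDataWriter companions stand in for an IDL-generated type; exact headers, namespaces, default-QoS constants, and the way the factory singleton is obtained all vary between DDS implementations.

```cpp
#include "FooTypeSupport.h"  // placeholder for the IDL-generated type-support header

// One interaction the test applications exercise: create the full entity
// chain, publish a sample, and tear everything down, checking each result.
int run_publisher_sequence()
{
    // Factory access is implementation-specific; the spec names it get_instance().
    DDS::DomainParticipantFactory_var factory =
        DDS::DomainParticipantFactory::get_instance();

    DDS::DomainParticipant_var dp = factory->create_participant(
        0, PARTICIPANT_QOS_DEFAULT, 0, DDS::STATUS_MASK_NONE);
    if (!dp) return -1;

    FooTypeSupport_var ts = new FooTypeSupport();
    if (ts->register_type(dp, "Foo") != DDS::RETCODE_OK) return -1;

    DDS::Topic_var topic = dp->create_topic(
        "TestTopic", "Foo", TOPIC_QOS_DEFAULT, 0, DDS::STATUS_MASK_NONE);
    DDS::Publisher_var pub = dp->create_publisher(
        PUBLISHER_QOS_DEFAULT, 0, DDS::STATUS_MASK_NONE);
    DDS::DataWriter_var dw = pub->create_datawriter(
        topic, DATAWRITER_QOS_DEFAULT, 0, DDS::STATUS_MASK_NONE);

    FooDataWriter_var writer = FooDataWriter::_narrow(dw);
    Foo sample;                        // populate the test payload here
    DDS::ReturnCode_t rc = writer->write(sample, DDS::HANDLE_NIL);

    // Tear down: interface tests also verify that deletion succeeds cleanly.
    dp->delete_contained_entities();
    factory->delete_participant(dp);
    return rc == DDS::RETCODE_OK ? 0 : -1;
}
```

The host-side scripts then compare the observed return codes and entity states against the behavior required by the specification.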

 

To achieve automated control and management of the testing process, we have developed a set of dedicated test scripts on the host machine. These scripts are responsible for sending various test instructions to the DDS test applications, orchestrating and scheduling the behavior of the test applications according to predefined logic. The testing process is fully automated, requiring no manual intervention, ensuring consistency and repeatability of the testing process.

 

Furthermore, to facilitate engineers in managing and monitoring test cases, the host machine software also provides an intuitive graphical interface. Through this interface, testers can easily create, edit, and organize test cases, as well as monitor the execution status and results of the tests in real-time.



Figure 1: Setting up the DDS Testing Environment


It is important to emphasize that the testing environment can be flexibly configured to the specific needs of users. For example, deploying the DDS test applications on one or more real ECUs can help uncover systemic issues such as network configuration problems, compatibility problems, or performance problems. These cover firewall settings, IP addresses, port numbers, TSN constraints, time synchronization, and network topology; compatibility between the DDS middleware and different hardware and software platforms; and metrics such as throughput, latency, and resource utilization. Such a comprehensive evaluation allows us to assess the reliability, compatibility, and performance of DDS distributed systems in real-world application scenarios, providing valuable insights for system development and optimization.


Introduction to Test Cases

 

The test cases cover all software interfaces defined in the OMG DDS specification, totaling 406 test cases. The contents are as follows:

• Interface behavior testing, covering behavior under normal invocation and fault behavior under erroneous invocation, totaling 353 test cases (a sketch of both patterns follows this list).

• QoS testing, which includes functional testing of various QoS defined in OMG DDS, totaling 53 test cases.
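As a sketch of these two patterns, the example below pairs a normal invocation with an erroneous one for set_default_topic_qos and get_default_topic_qos, again using the classic DCPS C++ API. The CHECK macro is a stand-in for the test framework's own verdict reporting; a conforming stack must accept the valid QoS, return it unchanged, and reject the inconsistent QoS with RETCODE_INCONSISTENT_POLICY.

```cpp
// Stand-in for the framework's assertion/verdict mechanism.
#define CHECK(cond) do { if (!(cond)) return false; } while (0)

bool test_set_default_topic_qos(DDS::DomainParticipant* dp)
{
    // Normal invocation: a valid QoS must be accepted and read back unchanged.
    DDS::TopicQos qos;
    CHECK(dp->get_default_topic_qos(qos) == DDS::RETCODE_OK);
    qos.history.kind  = DDS::KEEP_LAST_HISTORY_QOS;
    qos.history.depth = 5;
    CHECK(dp->set_default_topic_qos(qos) == DDS::RETCODE_OK);

    DDS::TopicQos readback;
    CHECK(dp->get_default_topic_qos(readback) == DDS::RETCODE_OK);
    CHECK(readback.history.depth == 5);

    // Erroneous invocation: HISTORY depth greater than
    // RESOURCE_LIMITS.max_samples_per_instance is self-inconsistent, so the
    // specification requires RETCODE_INCONSISTENT_POLICY and no effect.
    qos.history.depth = 10;
    qos.resource_limits.max_samples_per_instance = 5;
    CHECK(dp->set_default_topic_qos(qos) == DDS::RETCODE_INCONSISTENT_POLICY);
    return true;
}
```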

Regarding performance testing, since the performance of DDS largely depends on the performance and resource situation of the hardware platform, as well as the scheduling and management mechanism of the operating system, the performance test results obtained on the test server may vary significantly from the actual system performance. Therefore, performance testing is not included in this test.

 

Analysis of Test Results


Overview


This test executed a total of 406 test cases, with 194 passing and 212 failing, resulting in a pass rate of 47.78%. Below is the pass status of each module.



Figure 2: Overview of Test Results


Example of Issues


Functionality Missing


Below is a table listing some of the interfaces missing from the current version of the DDS under test. These missing interfaces directly or indirectly affect the functionality and performance of DDS distributed systems. Importantly, the developer documentation may not explicitly indicate that these interfaces are unimplemented, which can pose risks to system stability and reliability. Users therefore need to remain vigilant about these potential deficiencies when using DDS and take appropriate preventive measures. They should also participate actively in open-source community discussions and monitor product release notes so that they can promptly learn about and supplement the functionality of these critical interfaces, reducing uncertainty and risk in system operation.

 

Table 1: Example of Functionality Missing Issues

• get_discovered_participants, get_discovered_participant_data: These two interfaces relate to the discovery mechanism. They retrieve information about DomainParticipants discovered in the DDS domain, such as QoS policies and names, allowing applications to query and understand the configuration and capabilities of other DomainParticipants. The tested DDS lacks this functionality.

• ignore_participant, ignore_publication, ignore_subscription: These operations allow applications to dynamically ignore specific DomainParticipants, Publications, or Subscriptions in the DDS domain, refining and optimizing the data distribution process. Their main purpose is to improve efficiency and performance by eliminating unnecessary data exchange. The tested DDS lacks this functionality.

• begin_coherent_changes, end_coherent_changes: These operations allow a Publisher to apply a series of data changes as a coherent set, ensuring that Subscribers receive and process the changes as a whole, thereby maintaining data consistency and integrity. This mainly affects the Presentation QoS. The tested DDS lacks this functionality.

• copy_from_topic_qos: This interface copies the QoS settings of one object to another. The tested DDS lacks this functionality.

• The DATAWRITER_QOS_USE_TOPIC_QOS feature of create_datawriter: This feature allows the Topic's QoS to be used directly when creating a DataWriter. The tested DDS lacks this functionality.

• ...
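For reference, the sketch below shows how an application would typically use the discovery-related interfaces from this table on a stack that implements them per the OMG specification; the parameter dp denotes an already-created DomainParticipant.

```cpp
#include <cstdint>

// Query the participants this DomainParticipant has discovered, inspect
// their builtin-topic data, and optionally ignore one of them.
void inspect_discovered_participants(DDS::DomainParticipant* dp)
{
    DDS::InstanceHandleSeq handles;
    if (dp->get_discovered_participants(handles) != DDS::RETCODE_OK) return;

    for (uint32_t i = 0; i < handles.length(); ++i) {
        DDS::ParticipantBuiltinTopicData data;
        if (dp->get_discovered_participant_data(data, handles[i]) ==
            DDS::RETCODE_OK) {
            // Inspect the remote participant's key, user_data, QoS, etc.
        }
        // To suppress all further interaction with a remote participant:
        // dp->ignore_participant(handles[i]);
    }
}
```

On the tested stack these calls are unavailable, so applications that rely on them for diagnostics or data-flow pruning must be redesigned or the stack extended.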



Figure 3: Example of Test Report for Missing Functionality Issues

 

Behavior Errors


When developers invoke specific APIs according to the official documentation, the behavior of the software may not align with what is described in the documentation. This inconsistency can manifest as incorrect results, unexpected side effects, or no response at all. In such cases, developers often have to invest significant time to troubleshoot and locate the issue.


Table 2: Example of API Behavioral Errors

• lookup_instance: This interface returns a handle that uniquely identifies a specific data instance. In practice, it does not return the correct handle.

• set_default_topic_qos, get_default_topic_qos: These interfaces set and retrieve the default Topic QoS policies. When an invalid QoS is set, the set operation still returns success, or the value subsequently retrieved does not match the value that was set.

• wait_for_acknowledgments: For reliable data transmission, this interface is used to block until all previously written data has been acknowledged by the matched DataReaders. In practice, it returns without blocking even when the data has not been received.

• get_liveliness_lost_status: This interface allows a DataWriter to retrieve its "liveliness lost" status, which records how often the DataWriter failed to assert its liveliness to subscribers within the configured interval. In practice, the interface does not return the correct count of liveliness losses.

• TimeBasedFilterQos: This QoS policy controls the minimum interval between data samples delivered to a DataReader, reducing the burden on the application when the data publishing rate exceeds its processing capacity. In practice, the DataReader does not filter data according to this rule.
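To show how such a behavioral expectation is checked, the sketch below captures the wait_for_acknowledgments finding: with unacknowledged reliable data outstanding, a conforming implementation should block and, on expiry, return RETCODE_TIMEOUT rather than returning immediately (signature per the OMG specification).

```cpp
bool test_wait_for_acknowledgments(DDS::DataWriter* dw)
{
    // 5 seconds, 0 nanoseconds.
    DDS::Duration_t max_wait = { 5, 0 };

    // ... precondition: reliable samples written, no reader has acknowledged ...

    DDS::ReturnCode_t rc = dw->wait_for_acknowledgments(max_wait);

    // Expected: the call blocks for up to max_wait and returns RETCODE_TIMEOUT.
    // Observed defect: it returns at once even though nothing was acknowledged.
    return rc == DDS::RETCODE_TIMEOUT;
}
```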

 


Figure 4: Example of Behavior Fault Testing Report

 

Abnormal Termination


This type of issue refers to the DDS middleware terminating abnormally when an application calls a specific interface under certain scenarios. Such issues are severe because they are hard to discover, diagnose, and fix: the exceptions are typically triggered only under specific conditions, such as particular data patterns, concurrency levels, or resource usage, making them difficult to reproduce in regular testing. Moreover, even when a problem is successfully identified, fixing it is equally challenging, requiring precise modifications to complex code logic without adversely affecting other DDS functionality.

 

As the DDS middleware serves as foundational software for the whole system, its stability is crucial to overall system operation. Instability in foundational software cascades to upper-layer applications and end users, significantly degrading the quality of the entire system and the user experience. Addressing abnormal termination issues in DDS middleware is therefore not only a technical challenge of improving software quality but also a vital part of ensuring overall system stability and reliability.

 

Table 3: Examples of DDS Software Abnormal Termination Issues

• find_topic: Create a Topic named "TopicName", then create a DomainParticipant (DP) and start a thread in which the DP calls find_topic for "TopicName" with a 10-second timeout, waits for the timeout to expire, and then deletes all created entities. When the DP itself is finally deleted, DDS terminates abnormally.

• create_datareader, create_datawriter: DDS terminates abnormally when create_datawriter or create_datareader is called with null parameters.

• get_datareaders: DDS throws a memory allocation exception when get_datareaders is called with a keyed data type as the input parameter.

• notify_datareaders: The DomainParticipantFactory (DPF) sets entity_factory.autoenable_created_entities to false so that created entities are not enabled by default. The application then calls set_qos, creates a DomainParticipant, Publisher, Subscriber, Topic, DataWriter, and DataReader, and finally invokes the Subscriber's notify_datareaders; at this point DDS terminates abnormally.

• on_offered_deadline_missed: DDS terminates abnormally when the on_offered_deadline_missed callback is invoked in a scenario where the deadline-missed event is triggered.

• ...
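As an illustration, here is a simplified single-threaded sketch of the find_topic scenario from the first row, using the classic DCPS C++ API with error handling elided. In the tested stack, the final participant deletion is where the abnormal termination occurred.

```cpp
// Simplified reproduction of the find_topic teardown crash.
void reproduce_find_topic_crash()
{
    DDS::DomainParticipantFactory_var factory =
        DDS::DomainParticipantFactory::get_instance();
    DDS::DomainParticipant_var dp = factory->create_participant(
        0, PARTICIPANT_QOS_DEFAULT, 0, DDS::STATUS_MASK_NONE);

    // Search for "TopicName" with a 10-second timeout and let it expire
    // (find_topic returns nil if no matching topic appears in time).
    DDS::Duration_t timeout = { 10, 0 };
    DDS::Topic_var found = dp->find_topic("TopicName", timeout);
    (void)found;  // expected to be nil after the timeout in this scenario

    // Delete all contained entities, then the participant itself; this
    // final deletion is where the tested DDS terminated abnormally.
    dp->delete_contained_entities();
    factory->delete_participant(dp);
}
```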


Summary


After reading this article, readers should have gained a general understanding of DDS protocol testing and the potential issues it may encounter.


In agile development, software requirements continue to grow, and the scale and complexity of software systems are constantly increasing. However, many critical issues (especially performance issues) only emerge when the software reaches a certain scale and complexity. Once these issues are discovered, the cost of fixing them is often very high because any modifications to the underlying software may affect the entire system.


In contrast, if realistic application scenarios can be simulated early in the project and comprehensive functional and performance testing of DDS can be conducted, development teams can gain in-depth insights into the behavior characteristics of DDS, identify software defects, and recognize performance bottlenecks, thus enabling timely adjustments to the design and optimization of the implementation. This "upfront" testing approach not only significantly reduces the cost of later fixes but also improves the quality and reliability of the entire system, helping the system to meet future challenges.


This article introduced the DDS protocol testing tool developed by Nanjing Zhenrong Technology Co., Ltd. (hereinafter "Zhenrong Technology"). Zhenrong Technology has been committed to the independent development of DDS products and their related toolchains for the past decade and has achieved the highest market share in key industries in China. The testing tool itself has been iterated continuously over nearly a decade of DDS development, reflecting its maturity and reliability. The collaboration between Zhenrong Technology and Polelink aims to bring this tool to the automotive industry, helping clients establish DDS testing capabilities, providing high-quality testing services and related training, and accelerating the adoption of DDS in the automotive industry.