Data Intellect
The timeline of guidelines published by the European Union[1], the UK government[2] and the US government[3] reveals an increasing focus on software safety. Such recommendations and guidelines are usually the prelude to enforceable regulations. In the last decade, comparable regulations, such as the European General Data Protection Regulation (GDPR) and the Markets in Financial Instruments Directive II (MiFID II), not only monopolised internal software development efforts, but also dramatically increased the demand for consulting services to implement the required changes within regulatory timeframes. Data Intellect’s data-driven strategy is to inform its clients of potential challenges ahead, clarify the nature of those challenges, address them and eventually turn them into opportunities.
Software safety entails various concerns such as:
… but also bounds safety and lifetime safety.
The view taken by government regulation bodies — use memory safe languages
Quoting the NSA Cybersecurity Information Sheet
The NSA information sheet advises organizations to consider making a strategic shift from programming languages that provide little or no inherent memory protection, to a memory safe language where possible.
An illusion of safety — the pitfalls in adopting simplistic views
Memory safe languages, such as Java, C#, or Rust, undoubtedly mitigate important software safety issues by managing memory on behalf of the developer. In memory unsafe languages such as C and C++, however, user code is indeed vulnerable to bugs and hacking attempts, particularly in the absence of bounds checks.
That said, Java and C# ensure memory safety, but not object lifetime safety. In effect, memory safe languages only guarantee that developer code handles allocated memory; the underlying object might no longer hold legitimate values (i.e. “use after dispose/finalise”). While those scenarios arguably occur less often than in memory unsafe languages, they demonstrate that software safety is not as simple as ruling out a particular language.
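The "use after dispose" hazard is a logical one, so it can be sketched in any language. The C++ example below is only an illustration under stated assumptions: `Session`, `close()` and `greet()` are hypothetical names, not taken from any library. After `close()`, the object still occupies valid memory, yet it no longer represents a usable resource, so the only safe option is to make that state explicit.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical resource wrapper used purely for illustration.
// After close(), the object is still valid memory, but it no longer
// holds a legitimate value: the logical equivalent of "use after dispose".
class Session {
public:
    explicit Session(std::string user) : user_(std::move(user)) {}

    // Plays the role of Dispose()/close() in a managed language.
    void close() { user_.reset(); }

    // Returns std::nullopt instead of stale data once the session is closed.
    std::optional<std::string> greet() const {
        if (!user_) {
            return std::nullopt;
        }
        return "hello " + *user_;
    }

private:
    std::optional<std::string> user_;
};
```

Returning `std::optional` forces the caller to consider the closed state; a design that silently returned the last known value would compile and run in a memory safe language too, which is the point made above.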
Engineering Tools, trends, and Data Intellect’s views for a safer software world
The last decade has seen a reversal of the trends initiated at the beginning of the century. While object-oriented software has improved modularisation, modern software is built on top of data structures and algorithms rather than on sky-high-level programming languages suitable only for infinite hardware resources. The logical next step is to learn how to communicate safely with the hardware rather than:
Safety critical systems and memory unsafe languages are not mutually exclusive. In the automotive industry, AUTOSAR recognises that the performance offered by C++ should not come at the expense of software and functional safety. MISRA shares the same vision for C/C++ software safety by restricting those languages to a subset of safe usages and programming constructs. The C++ Core Guidelines aim to improve software safety by a combination of:
Herb Sutter is notably working on a safer flavour of C++, Cpp2[4].
Software safety can be significantly improved by leveraging existing tools
Knowledge from safety critical industries can be leveraged in the financial industry
Safety critical industries such as the automotive and aviation industries already have very strong software development safety rules.
Strive for compile time checks, rather than runtime checks
C++ has a competitive advantage when it comes to type safety. On top of strong typing, C++20 constraints and concepts break compilation when invalid operations are attempted on the data structures and functions under their control.
Whenever possible:
Use static memory allocation instead of dynamic memory allocation
Allocating static, fixed size chunks of memory at startup seems to take us back to the Fortran world. But is it better to deal with a memory allocation failure at startup, or during execution[5]? In the same way that business units are allocated fixed (and limited) resources, why can’t a program be allocated all the resources it needs at startup?
Embedded real-time systems, which only moderately enjoy undefined runtime behaviour, tend to avoid dynamic memory allocation because of the non-deterministic response times it involves, a concern shared in low latency trading.
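One possible way to approximate "allocate everything at startup" in standard C++17 is a polymorphic memory resource backed by a fixed buffer. The sketch below is an assumption-laden illustration, not a prescribed design: `StartupArena` and `kArenaBytes` are hypothetical names and the 64 KiB budget is arbitrary. Using `std::pmr::null_memory_resource()` as the upstream means that exceeding the budget raises `std::bad_alloc` instead of silently falling back to the heap.

```cpp
#include <array>
#include <cstddef>
#include <memory_resource>

// Hypothetical budget: all the memory this component may ever use.
inline constexpr std::size_t kArenaBytes = 64 * 1024;

// Owns a fixed buffer and hands it out through a monotonic resource.
// Nothing is requested from the heap after construction.
class StartupArena {
public:
    StartupArena()
        : resource_(storage_.data(), storage_.size(),
                    std::pmr::null_memory_resource()) {}

    // Pass this pointer to std::pmr containers.
    std::pmr::memory_resource* resource() { return &resource_; }

private:
    // Declared before resource_ so that it is initialised first.
    alignas(std::max_align_t) std::array<std::byte, kArenaBytes> storage_;
    std::pmr::monotonic_buffer_resource resource_;
};
```

A container such as `std::pmr::vector<int> v(arena.resource());` then draws from the arena only. Note that a monotonic resource never reuses released memory, so this fits data whose size is known up front rather than collections that churn.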
Replace dynamic collections with fixed size collections
Fixed size circular buffers are typically used in the gaming industry[6] because they provide higher performance than dynamic collections. Fixed size collections such as std::array rule out expensive memory resizing operations such as those offered by std::vector.
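A minimal sketch of a fixed size circular buffer built on `std::array` is shown below. It is not the SG14 ring proposal; `RingBuffer`, `push` and `pop` are hypothetical names, and the sketch assumes `T` is default-constructible and copyable. When the buffer is full, `push` reports failure instead of resizing.

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Illustrative fixed-capacity FIFO; storage lives inside the object.
template <typename T, std::size_t Capacity>
class RingBuffer {
    static_assert(Capacity > 0, "capacity must be positive");

public:
    // Returns false when the buffer is full; it never allocates or grows.
    bool push(const T& value) {
        if (size_ == Capacity) {
            return false;
        }
        slots_[(head_ + size_) % Capacity] = value;
        ++size_;
        return true;
    }

    // Removes and returns the oldest element, or std::nullopt when empty.
    std::optional<T> pop() {
        if (size_ == 0) {
            return std::nullopt;
        }
        T value = slots_[head_];
        head_ = (head_ + 1) % Capacity;
        --size_;
        return value;
    }

    std::size_t size() const { return size_; }

private:
    std::array<T, Capacity> slots_{};
    std::size_t head_ = 0;  // index of the oldest element
    std::size_t size_ = 0;  // number of stored elements
};
```

Because the capacity is a template parameter, the memory footprint is known at compile time, and the caller must decide explicitly what to do when the buffer is full.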
Replace pointer-based collections with flat collections
Cache friendly flat collections, such as the C++23 std::flat_map and std::flat_set container adapters (headers <flat_map> and <flat_set>), store their elements in contiguous arrays instead of linking nodes through pointers. They can therefore benefit from hardware cache locality and prefetching. By contrast, node-based standard collections such as std::map and std::set reach neighbouring nodes through pointers. They have no control over where those neighbours sit in memory, which in turn defeats hardware caching strategies.
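Since `<flat_map>` is not yet available in every toolchain, the idea can be sketched with a sorted `std::vector` and binary search. This is a simplification under stated assumptions: `FlatIntMap` is a hypothetical class with fixed key and value types, and it stores key/value pairs in a single vector, whereas `std::flat_map` keeps keys and values in separate underlying containers.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Illustrative flat map: entries are kept sorted by key in one contiguous
// array, so a lookup is a binary search over adjacent memory.
class FlatIntMap {
public:
    void insert_or_assign(int key, double value) {
        auto it = lower_bound(key);
        if (it != entries_.end() && it->first == key) {
            it->second = value;                // key already present
        } else {
            entries_.emplace(it, key, value);  // keeps the vector sorted
        }
    }

    std::optional<double> find(int key) const {
        auto it = std::lower_bound(
            entries_.begin(), entries_.end(), key,
            [](const Entry& e, int k) { return e.first < k; });
        if (it == entries_.end() || it->first != key) {
            return std::nullopt;
        }
        return it->second;
    }

    std::size_t size() const { return entries_.size(); }

private:
    using Entry = std::pair<int, double>;

    std::vector<Entry>::iterator lower_bound(int key) {
        return std::lower_bound(
            entries_.begin(), entries_.end(), key,
            [](const Entry& e, int k) { return e.first < k; });
    }

    std::vector<Entry> entries_;
};
```

The trade-off is that insertion shifts elements and is linear in the worst case, so this layout suits read-heavy data; calling `reserve` on the vector at startup would also avoid reallocation.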
Chapter 12 of Introduction to Algorithms highlights how to use one dimensional or multidimensional arrays to represent linked lists.
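The array representation of a linked list can be sketched as follows. This is an illustration in the spirit of the textbook technique, not a transcription of it: `IndexLinkedList` is a hypothetical name, links are array indices rather than pointers, and unused slots are chained into a free list so that no allocation occurs after construction.

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Singly linked list of ints stored in two parallel arrays.
// next_[i] holds the index of the node following slot i.
template <std::size_t Capacity>
class IndexLinkedList {
public:
    IndexLinkedList() {
        // Chain every slot into the free list; the last one points to kNil.
        for (std::size_t i = 0; i < Capacity; ++i) {
            next_[i] = i + 1;
        }
    }

    // Returns false when every slot is in use.
    bool push_front(int value) {
        if (free_ == kNil) {
            return false;
        }
        const std::size_t slot = free_;
        free_ = next_[slot];   // take a slot from the free list
        value_[slot] = value;
        next_[slot] = head_;   // link it in front of the current head
        head_ = slot;
        return true;
    }

    // Removes and returns the first element, or std::nullopt when empty.
    std::optional<int> pop_front() {
        if (head_ == kNil) {
            return std::nullopt;
        }
        const std::size_t slot = head_;
        head_ = next_[slot];
        next_[slot] = free_;   // return the slot to the free list
        free_ = slot;
        return value_[slot];
    }

private:
    static constexpr std::size_t kNil = Capacity;  // "null" index
    std::array<int, Capacity> value_{};
    std::array<std::size_t, Capacity> next_{};
    std::size_t head_ = kNil;
    std::size_t free_ = 0;
};
```

All nodes live in one contiguous block, and an out-of-range link can be detected by comparing an index against `Capacity`, which is not possible with a raw pointer.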
Data Intellect’s technical proposition boils down to design guidelines that enable both software safety and high performance, such as:
[1] European Commission | Proposal for a Regulation on cybersecurity requirements for products with digital elements – Cyber Resilience Act | 15 September 2022 | https://digital-strategy.ec.europa.eu/en/library/cyber-resilience-act
[2] UK Government Cabinet Office | Policy paper – Government Cyber Security Strategy: 2022 to 2030 | 25 January 2022 | https://www.gov.uk/government/publications/government-cyber-security-strategy-2022-to-2030
[3] United States National Security Agency | Cybersecurity Information Sheet – Software Memory Safety | 15 November 2022 | https://media.defense.gov/2022/Nov/10/2003112742/-1/-1/0/CSI_SOFTWARE_MEMORY_SAFETY.PDF
[4] Herb Sutter in the CppCast podcast, Episode 357, “Cpp2, with Herb Sutter”, published Friday, 31 March 2023
[5] Note that under the default Linux overcommit policy, a C++ process that exhausts memory is typically terminated by the OOM killer rather than given the chance to handle a std::bad_alloc exception.
[6] SG14 (ISO C++ study group dedicated to programmers in the games, embedded and financial domains) proposal for Rings, 2016
[7] https://www.weforum.org/reports/global-risks-report-2023/in-full