Multi-threading concepts are important: atomics, locks, the issues with different designs, and how to make things thread safe. Cache locality is another huge thing these days. Asynchronous architectures and callbacks are what you will be dealing with every day. What is cache locality? How do multicore systems keep their caches in sync? How do you get around this problem? Why are signals slow and why is context switching bad? What exactly happens during a context switch?
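To make the cache locality question concrete, here is a minimal C++ sketch. The `Matrix` type and both function names are hypothetical, invented for this illustration. Both functions compute the same sum; they differ only in the order they touch memory. On typical hardware the row-major walk tends to be faster for large matrices because it reads consecutive addresses, but the actual difference depends on the matrix size, the cache sizes and the compiler, so measure before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: a rows x cols matrix stored contiguously, row by row.
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;  // element (r, c) lives at data[r * cols + c]

    Matrix(std::size_t r, std::size_t c, double value)
        : rows(r), cols(c), data(r * c, value) {}
};

// Visits elements in the order they are laid out in memory, so every
// cache line that is fetched gets fully used before the next one is needed.
double sum_row_major(const Matrix& m) {
    assert(m.data.size() == m.rows * m.cols);
    double total = 0.0;
    for (std::size_t r = 0; r < m.rows; ++r) {
        for (std::size_t c = 0; c < m.cols; ++c) {
            total += m.data[r * m.cols + c];
        }
    }
    return total;
}

// Same arithmetic, but consecutive accesses are `cols` elements apart,
// so a large matrix touches a different cache line on almost every access.
double sum_column_major(const Matrix& m) {
    assert(m.data.size() == m.rows * m.cols);
    double total = 0.0;
    for (std::size_t c = 0; c < m.cols; ++c) {
        for (std::size_t r = 0; r < m.rows; ++r) {
            total += m.data[r * m.cols + c];
        }
    }
    return total;
}
```

Timing the two functions on a matrix much larger than the last-level cache is a simple way to see the effect on a given machine.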


Kinds of parallelism
- Bit level
- Instruction level (ILP)
- Data (DLP/SIMD)
- Task (TLP/MIMD)

See YouTube/MIT - parallel processing.

Examples
- Distributed processing over networks
- Multiple CPUs
- Multiple cores
- Pipelines (a deeper and wider pipeline means more control hazards)
- ILP - instruction-level parallelism (at best x2 speed-up)
- MLP - memory-level parallelism: a computer architecture term for the ability to have multiple memory operations pending at the same time, in particular cache misses or translation lookaside buffer (TLB) misses
- Loop unrolling
- Out-of-order (OoO) execution of multiple instructions simultaneously
- Single Instruction, Multiple Data (SIMD) operations in vector registers
- Multiple CPU cores on the same chip
- Speculative execution
- Branch prediction versus branch target prediction
- SSE and AVX
- Moore’s law hits the roof
- OpenMP
- C++ AMP - Accelerated Massive Parallelism
- Pluralsight - High-performance Computing in C++
- SMOP - small matter of programming: multiple cores are the way we’re heading, working out how to use them is the difficult part
- Vector processing - think about it like explicitly managing giant cache lines
- GPGPU
- Advanced Vector Extensions (AVX) - xmm, ymm and zmm registers

Amdahl’s law
Amdahl’s law shows that the maximum speed-up achievable by parallelising a pipeline is limited by the proportion of the work that can be done in parallel: the serial remainder sets a floor on the run time, no matter how many cores are added.
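Amdahl’s law is commonly written as speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of workers. The following is a minimal C++ sketch of that formula; the function names `amdahl_speedup` and `amdahl_limit` are hypothetical, chosen for this illustration.

```cpp
#include <cassert>

// Speed-up predicted by Amdahl's law for a workload where
// `parallel_fraction` (0..1) of the run time can be spread over `workers`.
double amdahl_speedup(double parallel_fraction, double workers) {
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(workers >= 1.0);
    const double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / workers);
}

// Upper bound on the speed-up as the number of workers grows without limit.
// Undefined for a fully parallel workload (parallel_fraction == 1).
double amdahl_limit(double parallel_fraction) {
    assert(parallel_fraction >= 0.0 && parallel_fraction < 1.0);
    return 1.0 / (1.0 - parallel_fraction);
}
```

For example, with 90% of the work parallelisable, `amdahl_speedup(0.9, 10.0)` gives roughly 5.3 and `amdahl_limit(0.9)` gives roughly 10: the remaining 10% of serial work caps the gain regardless of core count.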