I gave an internal presentation at Dropbox (sorry, the video can't be shared publicly) about engineering software for performance. Here are a bunch of resources that went into the presentation.

Similar Presentations

The Economics of Performance

Humans

Vision

Touch

Reaction Times

Response Time

Perception of Time

Computers

Latency

Examples of High-Performance Code

CPU architecture

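In the talk I intentionally left out some detail: technically, the branch predictor and the branch target predictor are different things. The branch predictor guesses whether a conditional branch will be taken; the branch target predictor guesses the address a taken branch will jump to.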
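To make the distinction concrete, here is a toy model in Python. It is only a sketch, not how any real CPU does it: the class names, the 2-bit saturating counter, and the "remember the last target" table are all illustrative assumptions. One predictor answers "is this branch taken?", the other answers "where does this branch go?".

```python
class DirectionPredictor:
    """Toy 2-bit saturating counter: predicts taken vs. not taken."""

    def __init__(self):
        # 0-1 predict "not taken", 2-3 predict "taken"; start at weakly taken.
        self.counter = 2

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


class TargetPredictor:
    """Toy target table: remembers the last target seen for each branch address."""

    def __init__(self):
        self.last_target = {}

    def predict(self, branch_addr):
        # Returns None if this branch has never been seen.
        return self.last_target.get(branch_addr)

    def update(self, branch_addr, target):
        self.last_target[branch_addr] = target


def count_direction_misses(outcomes):
    """Count mispredicted taken/not-taken outcomes for a single branch."""
    predictor = DirectionPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


def count_target_misses(branch_addr, targets):
    """Count mispredicted targets for a single indirect branch."""
    predictor = TargetPredictor()
    misses = 0
    for target in targets:
        if predictor.predict(branch_addr) != target:
            misses += 1
        predictor.update(branch_addr, target)
    return misses
```

In this model a loop branch that is taken nine times and then falls through mispredicts only once, while an indirect branch that alternates between two targets mispredicts every time. The two structures fail on different kinds of code, which is why they are worth keeping apart.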

Agner Fog has amazing resources for CPU optimization, including his famous x86 instruction tables. It's helpful to scan the latencies and reciprocal throughputs of common instructions.
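As a rough guide to reading those two columns: latency is how long a dependent instruction has to wait for a result, while reciprocal throughput is the average number of cycles per instruction when many independent instructions of the same kind run back to back. The sketch below is a back-of-the-envelope model only. The function names are made up, the numbers in the usage lines are hypothetical rather than taken from the tables, and it ignores everything else going on in the pipeline.

```python
def dependent_chain_cycles(n, latency):
    """Each instruction needs the previous result, so the latencies add up."""
    return n * latency


def independent_cycles(n, latency, reciprocal_throughput):
    """Independent instructions overlap: one starts every
    `reciprocal_throughput` cycles, and the last one still needs its
    full latency to finish. Assumes n >= 1."""
    return (n - 1) * reciprocal_throughput + latency


# Hypothetical instruction: latency 4 cycles, reciprocal throughput 0.5 cycles.
chain = dependent_chain_cycles(1000, latency=4)
parallel = independent_cycles(1000, latency=4, reciprocal_throughput=0.5)
```

With these made-up numbers the dependent chain costs about eight times as many cycles as the independent version, which is the kind of gap the two columns help you spot.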

Haswell

Apple A9

Caches, Memory, and Atomics

Memory Bandwidth

Branch Prediction

Miscellaneous