Parallel Programming & Architectures




Parallel processing accelerates computationally heavy workloads in areas such as artificial intelligence, machine learning, big data processing, scientific simulation, signal processing, and autonomous robotics. Parallel platforms range from large-scale systems to tiny wearable devices. This course introduces parallel computing and covers algorithm development, programming techniques, and architectures.


Example Applications:

Large Scale: Big Data, Heavy Simulations

Embedded Systems: Machine Learning, Signal Processing, Autonomous Robotics

Ultra Low Power: Wearable Devices






C Programming




1. Motivation and demanding applications
2. Types of parallelism: ILP, DLP, TLP
3. Components of parallel architectures
4. Process scheduling
5. Shared-memory multi-threading
6. Pthread library
7. Busy waiting, mutex, semaphore
8. SIMD hardware and programming
9. GPGPU architecture
10. CUDA programming model
11. Data-parallel algorithms
12. Algorithm optimization techniques
13. Other frameworks such as OpenCL
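
To give a flavor of topics 5 to 7 (shared-memory multi-threading, the Pthread library, and mutexes), here is a minimal C sketch. It is an illustration only, not course material: the names `shared_t`, `worker`, and `count_with_mutex` are hypothetical. Several threads increment one shared counter, and a mutex serializes the updates so that none are lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical shared state: one counter guarded by one mutex. */
typedef struct {
    long counter;          /* shared data updated by every thread */
    pthread_mutex_t lock;  /* protects counter */
    int increments;        /* number of increments each thread performs */
} shared_t;

/* Thread body: add 1 to the shared counter `increments` times. */
static void *worker(void *arg) {
    shared_t *s = arg;
    for (int i = 0; i < s->increments; i++) {
        pthread_mutex_lock(&s->lock);    /* enter critical section */
        s->counter++;
        pthread_mutex_unlock(&s->lock);  /* leave critical section */
    }
    return NULL;
}

/* Start num_threads workers, wait for all of them, return the final count. */
long count_with_mutex(int num_threads, int increments) {
    assert(num_threads > 0 && increments >= 0);

    shared_t s = { .counter = 0, .increments = increments };
    pthread_mutex_init(&s.lock, NULL);

    pthread_t *threads = malloc((size_t)num_threads * sizeof *threads);
    assert(threads != NULL);

    for (int t = 0; t < num_threads; t++) {
        int rc = pthread_create(&threads[t], NULL, worker, &s);
        assert(rc == 0);
        (void)rc;  /* silence unused warning when NDEBUG is defined */
    }
    for (int t = 0; t < num_threads; t++) {
        pthread_join(threads[t], NULL);  /* wait for each worker to finish */
    }

    pthread_mutex_destroy(&s.lock);
    free(threads);
    return s.counter;
}
```

With the mutex in place, `count_with_mutex(4, 100000)` returns 400000 on every run. Removing the lock and unlock calls would introduce a data race, and the result could then come out smaller. On most systems the code is compiled with the `-pthread` flag.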




1. Peter Pacheco, An Introduction to Parallel Programming, 2nd Edition (2020) or 1st Edition (2011).
2. Nicholas Wilt, The CUDA Handbook, 2nd Edition (2020) or 1st Edition (2013).




Exam (12), Programming Exercises (8)




Matin Hashemi