Is it possible to reduce the context switch time? Measure the overhead of context switching on the GPU. Perhaps by maintaining a process table that is always up to date? Context switching is highly expensive - it reduces productivity and effectiveness. And the fatigue that builds up from all of this energy loss is what heavily demotivates us, and it can potentially cause mental burnout. Because there is more context than just the registers. Saying that the time taken by a context switch is pure overhead is IMHO an excessive simplification; it is like saying that the time to do an addition is overhead. This fact rings especially true for product managers, as we can only make the best decision by deeply understanding the context behind the decision. One of the best ways to deal with context switching is to design your schedule so that you avoid it. Context-switching speed varies from machine to machine (it depends on memory speed, the number of registers to be copied, and the existence of special instructions). Reduced STL usage: another optimisation which has context switching implications is switching from using our "buffer lists" to using the new "buffer chains". Method Profiling says that 'context switch' - Inclusive Real Time is 100% and takes about 1510 ms. Is it possible to reduce it somehow? Context-switch times are highly dependent on hardware support. I understand there are processors on which you can read / write all the registers in a single block of memory. The need to support a variety of hard and soft real-time, as well as best-effort, applications in a multimedia computing environment requires an operating system framework that: (1) enables different schedulers to be employed for different application classes, and (2) provides protection between the various classes of applications. A context switch or a function call?
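For the question of how to measure the overhead at all, one common trick is a ping-pong test: two threads wake each other through a pair of pipes, and the mean round-trip time gives a rough upper bound on the cost of two switches. The sketch below is only an illustration under several assumptions. The function name `measure_round_trip` and the iteration count are made up for this example. On a multi-core machine the two threads may run on different cores, so the figure reflects wake-up latency more than a true context switch. The number also includes pipe system-call cost and Python interpreter overhead.

```python
import os
import threading
import time


def measure_round_trip(iterations: int = 10_000) -> float:
    """Return the mean time in seconds for one ping-pong round trip.

    Hypothetical helper: each round trip forces the main thread and the
    worker thread to block and wake each other once.
    """
    ping_read, ping_write = os.pipe()  # main thread -> worker
    pong_read, pong_write = os.pipe()  # worker -> main thread

    def worker() -> None:
        for _ in range(iterations):
            os.read(ping_read, 1)        # block until the main thread pings
            os.write(pong_write, b"x")   # wake the main thread

    thread = threading.Thread(target=worker)
    thread.start()

    start = time.perf_counter()
    for _ in range(iterations):
        os.write(ping_write, b"x")       # wake the worker
        os.read(pong_read, 1)            # block until the worker answers
    elapsed = time.perf_counter() - start

    thread.join()
    for fd in (ping_read, ping_write, pong_read, pong_write):
        os.close(fd)
    return elapsed / iterations


if __name__ == "__main__":
    per_round_trip = measure_round_trip()
    print(f"mean round trip: {per_round_trip * 1e6:.2f} microseconds")
```

The result varies from machine to machine and run to run, which is consistent with the point that switch cost depends on the hardware. Pinning both threads to one core (where the operating system allows it) would bring the figure closer to an actual switch cost.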
When a thread tries to acquire a lock that is already held by another thread, it has little choice but to poll several times, hoping the holder will release it within a very short time, then give up and do a context switch. The key reason context switching is bad is that it takes time and effort to get back into focus. Why doesn't Linux use the hardware context switch via the TSS? How can a CPU save its register state in a context switch? The context switch time depends on the registers you have to save / restore. The context switch rate is significantly lower when you have spare processors - the scheduler doesn't need to "switch" any busy processor, it can use an idle one. If there is a context switch every 10 ms and each task is left to run for 9.9 ms, then out of every 10 ms period, 99% is spent running the tasks and 1% is spent in context switches. A context switch happens when a process's CPU time slice expires or an interrupt occurs. We switch-task, rapidly shifting from one thing to another, interrupting ourselves unproductively, and losing time in the process. Process Management in Multiprocessor Operating Systems using Class Hierarchical Design; A Hierarchical CPU Scheduler for Multimedia Operating Systems. When it occurs (the device-not-available exception raised by the first FPU instruction after a task switch), you save the old thread's FPU state, load the current thread's FPU state, reset CR0.TS and resume execution at that FPU instruction. With fewer projects, Tractionites can devote larger blocks of focused allocation to each project. It is fundamental to implementing any kind of "multi-tasked" system. The time per context switch keeps going up as the working set size increases, but beyond a certain point the benchmark becomes dominated by memory accesses and is no longer actually testing the overhead of a context switch; it is simply testing the performance of the memory subsystem.
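The 10 ms figure above is plain arithmetic, and a few lines make the relationship explicit. This is a minimal sketch, assuming a fixed scheduling period and a fixed per-switch cost. The function name `switch_overhead_fraction` and its parameters are invented for illustration, and a real scheduler has neither a constant period nor a constant switch cost.

```python
def switch_overhead_fraction(period_ms: float, switch_cost_ms: float) -> float:
    """Fraction of each scheduling period spent switching rather than running.

    Assumes exactly one context switch per period (hypothetical model).
    """
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    if not 0 <= switch_cost_ms <= period_ms:
        raise ValueError("switch_cost_ms must lie between 0 and period_ms")
    return switch_cost_ms / period_ms


if __name__ == "__main__":
    # One switch every 10 ms, of which the task runs for 9.9 ms,
    # leaves 0.1 ms for the switch itself.
    period_ms = 10.0
    switch_cost_ms = period_ms - 9.9
    overhead = switch_overhead_fraction(period_ms, switch_cost_ms)
    print(f"overhead: {overhead:.1%}, useful work: {1 - overhead:.1%}")
```

Under this model, halving the period doubles the overhead fraction, which is one reason shorter time slices trade throughput for responsiveness.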