Synchronization (computer science)

From Wikipedia, the free encyclopedia

In computer science, synchronization refers to one of two distinct but related concepts: synchronization of processes and synchronization of data. Process synchronization refers to the idea that multiple processes are to join up or handshake at a certain point, in order to reach an agreement or commit to a certain sequence of actions. Data synchronization refers to the idea of keeping multiple copies of a dataset coherent with one another, or of maintaining data integrity. Process synchronization primitives are commonly used to implement data synchronization.
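
As a minimal sketch of process synchronization in the handshake sense, the following Java fragment (with class and worker names invented for the example) uses a CyclicBarrier so that several threads wait for one another at a common point before any of them proceeds.

  import java.util.concurrent.CyclicBarrier;

  // Illustrative handshake: three workers must all reach the barrier
  // before any of them is allowed to continue past it.
  public class BarrierHandshake {
      public static void main(String[] args) {
          final int parties = 3;
          CyclicBarrier barrier = new CyclicBarrier(parties,
                  () -> System.out.println("All workers reached the handshake point"));

          for (int i = 0; i < parties; i++) {
              final int id = i;
              new Thread(() -> {
                  System.out.println("Worker " + id + " doing its own work");
                  try {
                      barrier.await();   // block until every worker has arrived
                  } catch (Exception e) {
                      Thread.currentThread().interrupt();
                      return;
                  }
                  System.out.println("Worker " + id + " continues after the handshake");
              }).start();
          }
      }
  }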

Thread or process synchronization

Thread synchronization or serialization, strictly defined, is the application of particular mechanisms to ensure that two concurrently executing threads or processes do not execute specific portions of a program, referred to as critical sections, at the same time. If one thread has begun to execute a serialized portion of the program, any other thread trying to execute this portion must wait until the first thread finishes. If such synchronization measures are not taken, the result can be a race condition, in which variable values depend on the timing of thread or process context switches.
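
As a sketch of how a critical section is protected in practice, the following Java fragment (class name and iteration counts invented for the example) guards a shared counter with an intrinsic lock; without the synchronized block, two threads incrementing the counter concurrently could lose updates.

  // Two threads increment a shared counter. The synchronized block makes
  // the read-modify-write a critical section, so the final count is
  // deterministic rather than dependent on context-switch timing.
  public class CriticalSectionDemo {
      private static int counter = 0;
      private static final Object lock = new Object();

      public static void main(String[] args) throws InterruptedException {
          Runnable task = () -> {
              for (int i = 0; i < 100_000; i++) {
                  synchronized (lock) {   // only one thread at a time in here
                      counter++;
                  }
              }
          };
          Thread t1 = new Thread(task);
          Thread t2 = new Thread(task);
          t1.start();
          t2.start();
          t1.join();
          t2.join();
          System.out.println("counter = " + counter);  // always 200000 with the lock
      }
  }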

Synchronization is used to control access to shared state both in small-scale multiprocessing systems, such as multithreaded environments and multiprocessor computers, and in distributed systems consisting of thousands of units, such as banking and database systems and web servers.

See also

  • Lock (computer science) and mutex

Data synchronization

A distinct but related concept is that of data synchronization. This refers to the need to keep multiple copies of a set of data coherent with one another.

Examples include:

  • File synchronization, such as syncing a hand-held MP3 player to a desktop computer (a minimal one-way sync sketch follows this list).
  • Cluster file systems, which are file systems that maintain data or indexes in a coherent fashion across a whole computing cluster.
  • Cache coherency, maintaining multiple copies of data in sync across multiple caches.
  • RAID, where data is written in a redundant fashion across multiple disks, so that the loss of any one disk does not lead to a loss of data.
  • Database replication, where copies of data on a database are kept in sync, despite possible large geographical separation.
  • Journaling, a technique used by many modern file systems to make sure that file metadata are updated on a disk in a coherent, consistent manner.
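
As a rough sketch of one-way file synchronization (mirroring a source directory into a target directory), the following Java fragment copies any file that is missing or older in the target. The directory names are placeholders, and real synchronization tools also handle deletions, conflicts, and partial transfers.

  import java.io.IOException;
  import java.nio.file.*;
  import java.util.stream.Stream;

  // Hypothetical one-way sync: make "target-dir" contain at least the
  // files of "source-dir", copying files that are missing or out of date.
  public class SimpleFileSync {
      public static void sync(Path source, Path target) throws IOException {
          try (Stream<Path> paths = Files.walk(source)) {
              for (Path src : (Iterable<Path>) paths::iterator) {
                  Path dst = target.resolve(source.relativize(src));
                  if (Files.isDirectory(src)) {
                      Files.createDirectories(dst);
                  } else if (!Files.exists(dst)
                          || Files.getLastModifiedTime(src)
                                  .compareTo(Files.getLastModifiedTime(dst)) > 0) {
                      Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING,
                              StandardCopyOption.COPY_ATTRIBUTES);
                  }
              }
          }
      }

      public static void main(String[] args) throws IOException {
          sync(Paths.get("source-dir"), Paths.get("target-dir"));
      }
  }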

Mathematical foundations

Synchronization was originally a process-based concept whereby a lock could be obtained on an object. Its primary usage was in databases. There are two types of (file) lock: read-only and read-write. Read-only locks may be obtained by many processes or threads. Read-write locks are exclusive, as they may only be held by a single process or thread at a time.
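
The read-only/read-write distinction corresponds to reader-writer locks in many threading libraries. As a sketch, the following Java fragment uses ReentrantReadWriteLock to let many readers proceed concurrently while writers get exclusive access; the guarded table and its method names are invented for the example.

  import java.util.HashMap;
  import java.util.Map;
  import java.util.concurrent.locks.ReentrantReadWriteLock;

  // Illustrative reader-writer locking: many threads may read the map
  // at once, but a writer excludes both readers and other writers.
  public class SharedTable {
      private final Map<String, String> table = new HashMap<>();
      private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

      public String lookup(String key) {
          rw.readLock().lock();          // shared (read-only) lock
          try {
              return table.get(key);
          } finally {
              rw.readLock().unlock();
          }
      }

      public void update(String key, String value) {
          rw.writeLock().lock();         // exclusive (read-write) lock
          try {
              table.put(key, value);
          } finally {
              rw.writeLock().unlock();
          }
      }
  }
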
Although locks were derived for file databases, data is also shared in memory between processes and threads. Sometimes more than one object (or file) is locked at a time. If the locks are not acquired simultaneously, or always in the same order, the acquisitions of different processes or threads can overlap and cause a deadlock.
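
To make the deadlock scenario concrete, the sketch below shows the usual remedy of acquiring multiple locks in a single global order: if one thread took lock A then lock B while another took lock B then lock A, each could end up waiting on the other forever. The account class, its ids, and the transfer method are hypothetical.

  // Transfer between two accounts, each guarded by its own intrinsic
  // lock. Acquiring the locks in a fixed global order (here, by id)
  // prevents the circular wait that causes deadlock.
  public class Account {
      private final long id;
      private long balance;

      public Account(long id, long balance) {
          this.id = id;
          this.balance = balance;
      }

      public static void transfer(Account from, Account to, long amount) {
          Account first = from.id < to.id ? from : to;    // consistent order
          Account second = from.id < to.id ? to : from;
          synchronized (first) {
              synchronized (second) {
                  from.balance -= amount;
                  to.balance += amount;
              }
          }
      }
  }
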
The locks built into Java and Ada at the language level are exclusive only, as they are thread-based and rely on the compare-and-swap processor instruction (see mutex).
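
As a sketch of how compare-and-swap can underpin an exclusive lock, the following Java fragment builds a toy spinlock from AtomicBoolean.compareAndSet. Production mutexes queue and block waiting threads rather than busy-waiting, so this only illustrates the principle; the class name is invented.

  import java.util.concurrent.atomic.AtomicBoolean;

  // Toy spinlock: compare-and-swap flips the flag from false to true
  // atomically, so at most one thread can hold the lock at a time.
  public class SpinLock {
      private final AtomicBoolean locked = new AtomicBoolean(false);

      public void lock() {
          // Busy-wait until the CAS succeeds; real mutexes block instead.
          while (!locked.compareAndSet(false, true)) {
              Thread.onSpinWait();
          }
      }

      public void unlock() {
          locked.set(false);
      }
  }
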
An abstract mathematical foundation for synchronization primitives is given by the history monoid. There are also many higher-level theoretical devices, such as process calculi and Petri nets, which can be built on top of the history monoid.
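
As a rough sketch of one standard presentation of the history monoid, take one alphabet per process and embed each event into a direct product of free monoids, so that an event shared by several processes must occur simultaneously in all of their components; the notation below is illustrative rather than a full formal treatment.

  \text{Given process alphabets } A_1, \dots, A_n, \text{ let } P = A_1^{*} \times \cdots \times A_n^{*}.
  \text{For each event } a \in \textstyle\bigcup_k A_k, \text{ define } \sigma(a) = (x_1, \dots, x_n), \qquad
  x_k = \begin{cases} a & \text{if } a \in A_k, \\ \varepsilon & \text{otherwise.} \end{cases}
  \text{The history monoid } H(A_1, \dots, A_n) \text{ is the submonoid of } P \text{ generated by } \{\sigma(a)\}.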
