Parallelism. Using Parallel.For and ConcurrentBag.
Parallelism is the technique of running multiple calculations at the same time to speed up a program. Historically, parallel code was hard to write, because it required low-level manipulation of threads and locks.
A program can often run faster if you allow it to execute multiple calculations at the same time. For example, you might have a program where you need to check how many orders each customer has. Instead of looping through the customers one at a time, you could check on multiple customers at the same time by using something like Parallel.For or Parallel.ForEach.
Code example:
    // Requires: using System.Collections.Concurrent;
    //           using System.Collections.Generic;
    //           using System.Threading.Tasks;
    private IEnumerable<Orders> MyMethod(List<Orders> orders)
    {
        // Collect the results in a ConcurrentBag so that multiple threads can add to it safely.
        var result = new ConcurrentBag<Orders>();

        Parallel.ForEach(orders, item =>
        {
            // Some data manipulation
            result.Add(new Orders(/* constructor parameters */));
        });

        return result;
    }
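For the order-counting scenario described earlier, an index-based Parallel.For can also work. The following is only a minimal sketch: the OrderCounter class, the CountOrders method, and the data shape (one list of order IDs per customer) are hypothetical names chosen for illustration, not part of any real API.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class OrderCounter
{
    // Hypothetical data shape: ordersByCustomer[i] holds the order IDs of customer i.
    public static int[] CountOrders(IReadOnlyList<List<int>> ordersByCustomer)
    {
        var counts = new int[ordersByCustomer.Count];

        // Each iteration writes only to its own slot in the array,
        // so no lock or concurrent collection is needed here.
        Parallel.For(0, ordersByCustomer.Count, i =>
        {
            counts[i] = ordersByCustomer[i].Count;
        });

        return counts;
    }

    public static void Main()
    {
        var data = new List<List<int>>
        {
            new List<int> { 101, 102 },
            new List<int>(),
            new List<int> { 103, 104, 105 },
        };

        int[] counts = CountOrders(data);
        Console.WriteLine(string.Join(", ", counts)); // prints "2, 0, 3"
    }
}
```

Because every iteration owns exactly one array element, the results also stay in customer order, which a bag would not guarantee.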
The .NET Framework makes writing parallel code much simpler than before. .NET Framework 4 introduced a runtime, class library types, and diagnostic tools that help developers write safe and efficient parallel code.
Below are some of these tools and enhancements, each of which is covered in Microsoft’s documentation:
- Task Parallel Library (TPL) Provides documentation for the System.Threading.Tasks.Parallel class, which includes parallel versions of For and ForEach loops, and also for the System.Threading.Tasks.Task class, which represents the preferred way to express asynchronous operations.
- Parallel LINQ (PLINQ) A parallel implementation of LINQ to Objects that significantly improves performance in many scenarios.
- Data Structures for Parallel Programming Provides links to documentation for thread-safe collection classes, lightweight synchronization types, and types for lazy initialization.
- Parallel Diagnostic Tools Provides links to documentation for Visual Studio debugger windows for tasks and parallel stacks, and for the Concurrency Visualizer.
- Custom Partitioners for PLINQ and TPL Describes how partitioners work and how to configure the default partitioners or create a new partitioner.
- Task Schedulers Describes how schedulers work and how the default schedulers may be configured.
- Lambda Expressions in PLINQ and TPL Provides a brief overview of lambda expressions in C# and Visual Basic, and shows how they are used in PLINQ and the Task Parallel Library.
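To give a feel for how PLINQ and lambda expressions fit together, here is a small sketch. The PlinqSketch class and SumOfEvenSquares method are made-up names for illustration; the only PLINQ-specific call is AsParallel(), which opts an ordinary LINQ query into parallel execution.

```csharp
using System;
using System.Linq;

public static class PlinqSketch
{
    // Sums the squares of the even numbers in [1, n] using PLINQ.
    public static long SumOfEvenSquares(int n)
    {
        return Enumerable.Range(1, n)
            .AsParallel()                  // opt in to parallel execution
            .Where(x => x % 2 == 0)        // lambda: keep even numbers
            .Select(x => (long)x * x)      // lambda: square each one
            .Sum();
    }

    public static void Main()
    {
        Console.WriteLine(SumOfEvenSquares(10)); // prints 220
    }
}
```

For a query this small the parallel version is unlikely to beat plain LINQ; the point is only to show the shape of the code.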
The benefits
The benefit of parallel programming is the ability to execute multiple instructions at the same time. This can make your program faster, because work that would otherwise run sequentially takes less total time. While this is a great way to speed up your code, you should still consider other options as well, and avoid using the framework’s parallelism features before you understand them. Believe me, I know from personal experience, unfortunately.
The disadvantages
The disadvantages of parallel code are higher CPU usage (something to be aware of) and the potential for bugs when using collection objects that aren’t thread-safe. Thread-safe means that multiple threads can access the shared data at the same time without corrupting it. When using something like Parallel.For or Parallel.ForEach, store results in a thread-safe collection such as ConcurrentBag<T>. Bags are useful for storing objects when ordering doesn’t matter, and unlike sets, bags support duplicates. If you need your collection to be ordered, remember to sort it after converting it to a List<T>.
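The bag-then-sort pattern can be sketched roughly as follows. BagSketch and SquaresInOrder are hypothetical names, and squaring integers stands in for whatever real work each item needs.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BagSketch
{
    public static List<int> SquaresInOrder(IEnumerable<int> inputs)
    {
        var bag = new ConcurrentBag<int>();

        // Many threads may call Add at once; ConcurrentBag<T> handles that safely.
        // A plain List<int> is not safe for concurrent writes.
        Parallel.ForEach(inputs, x => bag.Add(x * x));

        // The bag has no defined order, so sort after copying it to a list.
        List<int> result = bag.ToList();
        result.Sort();
        return result;
    }

    public static void Main()
    {
        List<int> squares = SquaresInOrder(new[] { 3, 1, 2, 2 });
        Console.WriteLine(string.Join(", ", squares)); // prints "1, 4, 4, 9"
    }
}
```

Note that the duplicate input 2 produces two 4s in the output, since a bag keeps duplicates.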
As with everything else, test your code and find out whether using the Task Parallel Library or PLINQ is the right choice for your scenario. While it might seem that running things in parallel will always be faster, this isn’t always true: for small workloads, the overhead of partitioning the work and coordinating threads can outweigh the gain. Read more about it here.
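One rough way to check is to time both versions on your own data. The sketch below assumes a made-up Work function as a stand-in for real per-item work, and uses Stopwatch for a quick comparison; a proper benchmark would need warm-up runs and repeated measurements.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class TimingSketch
{
    // Hypothetical stand-in for real per-item work.
    public static double Work(int x) => Math.Sqrt(x) * Math.Sin(x);

    public static double SumSequential(int[] data) => data.Select(x => Work(x)).Sum();

    public static double SumParallel(int[] data) => data.AsParallel().Select(x => Work(x)).Sum();

    public static void Main()
    {
        int[] data = Enumerable.Range(1, 1000000).ToArray();

        var sw = Stopwatch.StartNew();
        double sequential = SumSequential(data);
        sw.Stop();
        Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        double parallel = SumParallel(data);
        sw.Stop();
        Console.WriteLine($"Parallel:   {sw.ElapsedMilliseconds} ms");

        // Floating-point sums may differ slightly because the order of additions changes.
        Console.WriteLine($"Difference: {Math.Abs(sequential - parallel)}");
    }
}
```

The timings depend on the machine and the workload, so no expected output is shown.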
Happy coding!