Tuesday, September 12, 2017

Waiting for a task to finish and harvesting Result in .NET 4

The techniques below work for both code tasks and facade tasks.

Explicitly Waiting for a single task

Sometimes when working with tasks you need to wait for a computation to complete before you can do something with the result (such as writing it to the screen), as shown below.

Task t = Task.Factory.StartNew( /* code */ );
t.Wait();
Console.WriteLine(t.Status);

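Wait() returns normally only if the task ran to completion; for a canceled or faulted task it throws an AggregateException, which must be caught before the status can be printed. The final status is one of the following: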
  • RanToCompletion
  • Canceled
  • Faulted
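
To make the three outcomes concrete, here is a minimal sketch of waiting on a task and reading its final status. The class name StatusDemo and the helper WaitAndGetStatus are hypothetical names introduced for illustration; the try/catch is there because Wait() throws for faulted and canceled tasks.

```csharp
using System;
using System.Threading.Tasks;

static class StatusDemo
{
    // Hypothetical helper: waits for the task and returns its final status.
    // Wait() throws AggregateException for faulted or canceled tasks,
    // so the exception is caught (and thereby observed) here.
    public static TaskStatus WaitAndGetStatus(Task t)
    {
        try
        {
            t.Wait();
        }
        catch (AggregateException)
        {
            // Nothing to do; the status below tells us what happened.
        }
        return t.Status;
    }

    static void Main()
    {
        Task ok = Task.Factory.StartNew(() => { /* succeeds */ });
        Task bad = Task.Factory.StartNew(
            new Action(() => { throw new InvalidOperationException("boom"); }));

        Console.WriteLine(WaitAndGetStatus(ok));   // RanToCompletion
        Console.WriteLine(WaitAndGetStatus(bad));  // Faulted
    }
}
```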

Explicitly Waiting for Multiple tasks (Ordered)

decimal min = 0;
decimal max = 0;
Task t_min = Task.Factory.StartNew(() => {min = data.Price.Min();});
Task t_max = Task.Factory.StartNew(() => {max = data.Price.Max();});

t_min.Wait();
t_max.Wait();

Console.WriteLine(min);
Console.WriteLine(max);

Note: You can also call Wait() inside a task if the tasks depend on each other, but this reduces the benefits of parallelism.

Harvesting Result (Implicitly Waiting for Multiple tasks Sequentially)

While there is nothing wrong with explicitly waiting, there is a cleaner way to do it with less code.

Task<decimal> t_min = Task.Factory.StartNew(() => {return data.Price.Min();});
Task<decimal> t_max = Task.Factory.StartNew(() => {return data.Price.Max();});

Console.WriteLine(t_min.Result);
Console.WriteLine(t_max.Result);

Notice that there are no shared variables that could be subject to a race condition if more than one task accessed them. Notice also that there are no calls to Wait(), because Result implicitly waits for the task to finish before it returns the value.
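
For readers who want to run this, here is a small self-contained sketch of the same idea. The Prices array is a hypothetical stand-in for data.Price, and ResultDemo and MinAndMax are names made up for this example.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class ResultDemo
{
    // Hypothetical sample data standing in for data.Price.
    static readonly decimal[] Prices = { 19.99m, 4.50m, 120.00m, 42.10m };

    // Starts both computations, then harvests the results.
    public static Tuple<decimal, decimal> MinAndMax(decimal[] prices)
    {
        Task<decimal> t_min = Task.Factory.StartNew(() => prices.Min());
        Task<decimal> t_max = Task.Factory.StartNew(() => prices.Max());

        // Result blocks until each task has finished, so no Wait() is needed.
        return Tuple.Create(t_min.Result, t_max.Result);
    }

    static void Main()
    {
        Tuple<decimal, decimal> range = MinAndMax(Prices);
        Console.WriteLine(range.Item1);
        Console.WriteLine(range.Item2);
    }
}
```

Both tasks are started before either Result is read, so the two computations can still overlap.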

Explicitly Waiting for All tasks to complete

The above waiting scenario implies that the order in which we wait for the tasks to finish matters. If it does not matter and we simply want all tasks to complete before continuing, then semantically it is better to use WaitAll().

Here is the same code as above, but using WaitAll() instead:

Task<decimal> t_min = Task.Factory.StartNew(() => {return data.Price.Min();});
Task<decimal> t_max = Task.Factory.StartNew(() => {return data.Price.Max();});

Task.WaitAll(new Task[] {t_min, t_max});
Console.WriteLine(t_min.Result);
Console.WriteLine(t_max.Result);

Explicitly Waiting for ANY task to complete

There are scenarios, such as searches, where it makes sense to display the first response that comes back and ignore the others. In this scenario, we don't care which one is first, but we do want to wait for one to complete.

Here is a new example using WaitAny(). It takes an array of tasks as input and returns the index in that array of the task that finished first. That index can be used to get the result of the task that completed first, as shown below.

Task<string> t_msn = Task.Factory.StartNew(() => {return SearchMsn();});
Task<string> t_google = Task.Factory.StartNew(() => {return SearchGoogle();});

// Task<string>[] keeps the element type, so Result is available below
Task<string>[] tasks = new Task<string>[] {t_msn, t_google};
int firstIndex = Task.WaitAny(tasks);
Task<string> firstTask = tasks[firstIndex];
Console.WriteLine(firstTask.Result);

WaitAllOneByOne pattern

There are scenarios where you want to wait for all tasks to finish, but process the results as each one completes. Put another way: we start several tasks, we don't know in what order they will complete, and we want to process each one as soon as it finishes. This assumes that we don't need all of the results before we can start processing any of them.

This pattern is useful when:
  • Some tasks may fail - discard / retry
  • Overlap computations with result processing - aka hide latency

There is no built-in feature for this; it is a pattern that you can implement if this is your scenario. Here is one conceptual implementation.

while (tasks.Count > 0)
{
    // blocks until one of the remaining tasks completes
    int taskIndex = Task.WaitAny(tasks.ToArray());

    // reading Exception observes any failure; failed tasks are skipped
    if (tasks[taskIndex].Exception == null)
    {
        // process the result using tasks[taskIndex].Result;
    }

    tasks.RemoveAt(taskIndex);
}

// if we get here, every task has completed and has been processed or skipped. Act accordingly

Notice this is NOT a tight while loop that spins while the tasks are running. Task.WaitAny() blocks execution of the while loop until a task completes; then it loops again and starts another Task.WaitAny(), and this continues until all tasks have been processed.
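
As a runnable sketch of the pattern, the version below collects the results of the tasks that succeed and skips the one that fails. OneByOneDemo and HarvestOneByOne are hypothetical names, and the sketch assumes that none of the tasks is canceled (a canceled task has a null Exception, but reading its Result would throw).

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class OneByOneDemo
{
    // Hypothetical helper: harvests results in completion order.
    // Failed tasks are observed and skipped.
    public static List<int> HarvestOneByOne(List<Task<int>> tasks)
    {
        var results = new List<int>();
        while (tasks.Count > 0)
        {
            // Blocks until at least one remaining task has completed.
            int index = Task.WaitAny(tasks.ToArray());
            Task<int> finished = tasks[index];

            // Reading Exception marks a fault as observed.
            if (finished.Exception == null)
            {
                results.Add(finished.Result);
            }

            tasks.RemoveAt(index);
        }
        return results;
    }

    static void Main()
    {
        var tasks = new List<Task<int>>
        {
            Task.Factory.StartNew(() => 1),
            Task.Factory.StartNew<int>(() => { throw new InvalidOperationException("failed"); }),
            Task.Factory.StartNew(() => 3)
        };

        // The order of the printed values depends on which task finishes first.
        foreach (int value in HarvestOneByOne(tasks))
        {
            Console.WriteLine(value);
        }
    }
}
```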

Task Composition (One-to-One)

Sometimes we want the completion of one task to trigger the start of another task. As noted before, this can be done with a simple Wait() in the second task, but TPL also offers its own approach to this kind of waiting.

The Wait() version might look like this:

Task t1 = Task.Factory.StartNew(() => { /* code here */ });
Task t2 = Task.Factory.StartNew(() => { t1.Wait(); /* more code here */ });

The above implementation works; however, t2 actually starts BEFORE t1 has finished and then blocks inside t1.Wait(). This is probably reasonable as long as we remember to call t1.Wait() first thing in the code of t2. From an orchestration and responsibility point of view, though, t2 should not be concerned with waiting for t1. That belongs at a higher level, which also makes the code for t2 much easier to test.

To address the deficiencies noted above, we can use the ContinueWith() method, which waits to start t2 until after t1 has finished. One added benefit is that this allows .NET to optimize the scheduling such that t1 and t2 may run on the same thread, which could optimize away the wait entirely.

That code might look like this.

Task t1 = Task.Factory.StartNew(() => { /* code here */ });
Task t2 = t1.ContinueWith((antecedent) => { /* more code here */ });

Note, the parameter antecedent references the task that just finished. This can be used to check the status of the task (t1 in this case), get the result, etc.

For example to get the result from t1 the above code would be changed to look like this:

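// t1 must be a Task<TResult> for antecedent.Result to exist; int and "return 0" are placeholders
Task<int> t1 = Task.Factory.StartNew(() => { /* code here */ return 0; });
Task t2 = t1.ContinueWith((antecedent) => { var result = antecedent.Result; /* more code here */ });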
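
A continuation can also return a value of its own, in which case ContinueWith() hands back a Task<TResult>. The sketch below illustrates this with made-up names (ContinuationDemo, BuildPipeline) and a trivial computation.

```csharp
using System;
using System.Threading.Tasks;

static class ContinuationDemo
{
    // Hypothetical two-step pipeline: t1 doubles the input, t2 formats it.
    public static Task<string> BuildPipeline(int input)
    {
        Task<int> t1 = Task.Factory.StartNew(() => input * 2);

        // t2 is not scheduled until t1 has finished; antecedent is t1.
        Task<string> t2 = t1.ContinueWith(antecedent => "doubled: " + antecedent.Result);
        return t2;
    }

    static void Main()
    {
        Console.WriteLine(BuildPipeline(21).Result); // doubled: 42
    }
}
```

Because the caller only receives t2, the code inside each step knows nothing about waiting; the ordering is expressed entirely where the tasks are composed.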

Task Composition (Many-to-one)

TPL has alternatives to WaitAll() and WaitAny() as well. Instead of using WaitAll() and WaitAny(), you can use Task.Factory.ContinueWhenAll() and Task.Factory.ContinueWhenAny() respectively.

ContinueWhenAll

Using our example for WaitAll() we had:

Task<decimal> t_min = Task.Factory.StartNew(() => {return data.Price.Min();});
Task<decimal> t_max = Task.Factory.StartNew(() => {return data.Price.Max();});

Task.WaitAll(new Task[] {t_min, t_max});
Console.WriteLine(t_min.Result);
Console.WriteLine(t_max.Result);

We could convert that to use ContinueWhenAll() as follows:

Task<decimal> t_min = Task.Factory.StartNew(() => {return data.Price.Min();});
Task<decimal> t_max = Task.Factory.StartNew(() => {return data.Price.Max();});

Task[] tasks = new Task[] {t_min, t_max};

Task.Factory.ContinueWhenAll(tasks, (setOfTasks) =>
{
    Console.WriteLine(t_min.Result);
    Console.WriteLine(t_max.Result);
});
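
ContinueWhenAll() itself returns a task, so the combined work can produce a value that the caller harvests later. Here is a minimal sketch under that assumption; ContinueWhenAllDemo and Spread are hypothetical names, and the prices array stands in for data.Price.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class ContinueWhenAllDemo
{
    // Hypothetical helper: computes max - min once both tasks have finished.
    public static Task<decimal> Spread(decimal[] prices)
    {
        Task<decimal> t_min = Task.Factory.StartNew(() => prices.Min());
        Task<decimal> t_max = Task.Factory.StartNew(() => prices.Max());

        // The continuation receives the same array, in the same order.
        return Task.Factory.ContinueWhenAll<decimal, decimal>(
            new[] { t_min, t_max },
            done => done[1].Result - done[0].Result);
    }

    static void Main()
    {
        decimal[] prices = { 19.99m, 4.50m, 120.00m, 42.10m };
        Console.WriteLine(Spread(prices).Result);
    }
}
```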

ContinueWhenAny

Using our example for WaitAny() we had:

Task<string> t_msn = Task.Factory.StartNew(() => {return SearchMsn();});
Task<string> t_google = Task.Factory.StartNew(() => {return SearchGoogle();});

// Task<string>[] keeps the element type, so Result is available below
Task<string>[] tasks = new Task<string>[] {t_msn, t_google};
int firstIndex = Task.WaitAny(tasks);
Task<string> firstTask = tasks[firstIndex];
Console.WriteLine(firstTask.Result);

We can convert that to use ContinueWhenAny() as follows:

Task<string> t_msn = Task.Factory.StartNew(() => {return SearchMsn();});
Task<string> t_google = Task.Factory.StartNew(() => {return SearchGoogle();});

// Task<string>[] makes firstTask a Task<string>, so Result is available
Task<string>[] tasks = new Task<string>[] {t_msn, t_google};

Task.Factory.ContinueWhenAny(tasks, (firstTask) =>
{
    Console.WriteLine(firstTask.Result);
});

The code is similar to using WaitAll() and WaitAny(), but again a bit easier to test because the responsibility for the orchestration sits at a higher level. It also saves a few lines of code.

Reference

Content is based on the Pluralsight video Introduction to Async and Parallel Programming in .NET 4 by Dr. Joe Hummel.
