Our Blog

Azure Durable Functions - Concurrent Processing
David Madison

Azure Durable Functions - Concurrent Processing

Fan-Out pattern VS. Multi-Threading in Activity

If you are developing micro-services utilizing the Microsoft Azure Function infrastructure, then you are likely aware of a few of the constraints the architecture imposes upon using durable functions (aka orchestrations).

  • 10 Minute maximum run time of a function*
  • Must be deterministic (so that it can be replayed and produce the same result)
  • Single-threaded behavior
  • I/O (input/output) operations are not allowed.
  • Identity management is restricted to Azure Active Directory tokens (Azure AD)

These constraints help answer the complexities of running a complex process that requires stateful coordination between components in a serverless environment.

* Azure’s default maximum run time is 5 minutes.

Yet, how is concurrent processing achieved within an orchestration?

Concurrent Processing with Durable Functions

On the topic of single threaded VS multi-threaded, it is possible to parallelize tasks through the use of Activity Functions. One approach is to use a ‘Fan-Out’ pattern to run multiple functions concurrently. A second approach is use multiple threads within an activity.

We looked at both options on our last project.

Fan-Out pattern – Multiple concurrent Activities

Orchestrations do allow activities to be run in parallel using the IDurableOrchestrationContext.CallActivityWithRetryAsync method for each activity to be run in parallel, storing the task into a list of tasks and ‘await’-ing on Any or All the tasks to complete.

Fan-Out Pattern Picture


Code example:

            var dataLoadTasks = new List>();

            foreach (var blobName in blobNames)
            {
                dataLoadTasks.Add(
                    context.CallActivityWithRetryAsync(
                        nameof(ActivityLoadFile),
                        retryOptions,
                        blobName));
                }
            }
           await Task.WhenAll(dataLoadTasks).ConfigureAwait(true);

 

Limiting the number of tasks that run in parallel

One thing to be cautious of is the number of resources that may get consumed with launching multiple parallel tasks. You may need to limit the number of threads that get launched at once or increase your Azure resources.

To limit the number of parallel tasks, employ a throttle to the number of tasks that are running at one time.

Code Example:

            var dataLoadTasks = new List>();
            var numberofTasksToRunInParallel = 2;

            foreach (var blobName in blobNames)
            {
                // Fan out to a configured threshold
                var runningTasks = dataLoadTasks.Where(t => !t.IsCompleted);
                if (runningTasks.Count() >= numberofTasksToRunInParallel)
                {
                    await Task.WhenAny(runningTasks).ConfigureAwait(true);
                }

                dataLoadTasks.Add(
                    context.CallActivityWithRetryAsync(
                        nameof(ActivityLoadFile),
                        retryOptions,
                        blobName));
            }
           await Task.WhenAll(dataLoadTasks).ConfigureAwait(true);


NOTE: The number of tasks to run in parallel can be set using an app config value instead of hard-coded. This way it could be changed through the Azure Portal instead of re-deploying the function.

Pros:
  • Azure has a 10-minute maximum time an activity can run. This approach provides each activity up to 10 minutes to run. In this case the overall processing can span a longer time frame.
Cons:
  • This approach was respectively slower to process the same set of files. Likely because starting an activity has significant overhead.
  • Orchestrator code was more cluttered. To help keep the code organized, the processing that was to be run in parallel was put into a local method in the orchestration class.
     

Multi-threading in an Activity

As an alternative to spinning up parallel activities, an activity could employ the use of threading to accomplish the same goal. In this approach, a similar technique in the first example can used to execute a process in parallel.

Multi-Threading Diagram


Code Example

            var fileLoadTasks = new List>();
            var numberofTasksToRunInParallel = 2;

            foreach (var blobName in blobNames)
            {
                var runningTasks = fileLoadTasks.Where(t => !t.IsCompleted);
                if (runningTasks.Count() >= numberofTasksToRunInParallel)
                {
                    await Task.WhenAny(runningTasks).ConfigureAwait(true);
                }

                fileLoadTasks.Add(this.LoadFile(blobName));
            }

            // Wait for remaining tasks to finish
            await Task.WhenAll(fileLoadTasks).ConfigureAwait(true);


In the example above, the LoadFile routine is a private method within the activity.

Throttling the number of tasks that are run concurrently is shown above through the same technique as used in the Fan-Out example. Again, the value can be set through an app config setting instead of being hard-coded.

Pros:
  • It was respectively faster than the Fan-out pattern, likely because starting an activity has significant overhead.
  • Orchestrator was cleaner. The code that is necessary to manage launching and throttling the activities was moved to the activity.
  • More encapsulated. The orchestration doesn’t need to ‘know’ how the processes get run; it only needs to know the result status of the processing.
Cons:
  • Azure has a 10-minute maximum time an activity can run. With this approach, all the processing is within an activity and therefore, all parallel processing must be completed within the same 10 minutes.

Which approach is better?

This is a great question and the answer will likely depend on your scenario.

  • Does the overall processing need to go beyond 10 minutes, even with parallelizing the tasks?
  • Does your budget allow you to use a premium plan where the 10 minute restriction is removed?

We have used both techniques for different purposes.

Fan-Out Use

We employed the Fan-Out architecture for a use-case where we needed to pull data from a third party system based on a date range. The amount of data to be retrieved was variable from one day to the next, and respectively too large to pull all in one request. We used the Fan-Out architecture to request the data in hourly blocks concurrently to split the work up among multiple activities.

Multi-Threaded Use

In a file load use case, we found that parallelizing the functionality in the Activity through multi-threading worked better. We had a couple hundred files to load from a single source. Our first attempt was to use the Fan-Out pattern. This worked yet, seemed to take longer than expected. We then prototyped the the multi-threaded approach and found that it took 25% of time to load all the files compared to the Fan-Out.

Unexpected benefit of Mult-threaded approach

Another unexpected benefit also resulted in using the multi-threaded approach with file loading... Reduced SEH Exceptions when debugging our function app in Visual Studio.

What is an SEH exception?

An SEH Exception occurs when an external component throws an exception. From Microsoft, SEHExceptions are generated when unmanaged code throws an exception that hasn't been mapped to another .NET Framework exception.

When debugging our solution through Visual Studio, these intermittently occurred when accessing azure blobs in azure blob storage. Sometimes so often that we couldn't step through the code. After switching to the multi-threaded approach, the SEHException would still happen, yet very infrequently.

NOTE: We did attempt to isolate what specifically was causing it with out success. The issue seems to point to an issue between the blob storage libraries and Visual Studio. These exceptions only occurred when debugging the code; they never occurred if we ran with out debugging.

Previous Article Will you be a Level Five Leader?
Next Article March 17, 2003...twenty years later
Print
2057

Too Much To Do - try using a Smart Calendar!

Too Much To Do - try using a Smart Calendar! Read more

Have you ever seen the person that no matter what, always seems on top of things, on schedule and never flustered? While you might want to hurt them, the fact is you’re jealous! Our Tip this week will help you regain some control and get more organized.

Digital Marketing - yes, it's MANDATORY

Digital Marketing - yes, it's MANDATORY Read more

Everybody throws around buzz words and says we need to be doing ‘SEO” or on Social Media. Just because everybody else is doing it, why do I need to? My mother always asked me “if they jumped off a bridge, would you?” In this case, the answer is YES!

Tips for your business: Goal Setting

Tips for your business: Goal Setting Read more

Lately, I find myself thinking about goals a lot. Maybe it’s because I manage a lot of tasks and some days, though I’ve been busy all day, I’m not sure what I have accomplished. Perhaps it is because I’ve never been good at putting substance around my goals. 

Tip for your business: Understanding your WHY

Tip for your business: Understanding your WHY Read more

If you have spent much time with a toddler, you know the word why can take on annoying attributes beyond comprehension. Yet if you are in business, being able to answer the one-word question why takes on incredible importance.

Has Google forgotten you exist? Whose responsibility is that?

Has Google forgotten you exist? Whose responsibility is that? Read more

Getting your website included in organic search results requires focus and intent…it doesn’t just happen. While there are a lot of factors outside of your control, the foundational factors are right at your fingertips. You need to create quality content and you need to expand the content on your website so that Google sees you are investing in your website! In addition, if your website is not responsive, you are your own worst enemy!

You have the option to change this by making your website a priority. It doesn’t have to cost a fortune or demand every waking moment – but it does require your attention. 

We’ve shared some aspects of Google and options that can help you impact your web presence in the complete article which can be accessed here.
 

RSS
First234567810