Expanding the Cloud – The Amazon Simple Workflow Service
Today AWS launched an exciting new service for developers: the Amazon Simple Workflow Service. Amazon SWF is an orchestration service for building scalable distributed applications. Often an application consists of several different tasks to be performed in a particular sequence driven by a set of dynamic conditions. Amazon SWF makes it very easy for developers to architect and implement these tasks, run them in the cloud or on premises, and coordinate their flow. Amazon SWF manages the execution flow so that tasks are load balanced across the registered workers, inter-task dependencies are respected, concurrency is handled appropriately, and child workflows are executed.
A growing number of applications are relying on asynchronous and distributed processing, with scalability of the application as the primary motivation. By designing autonomous distributed components, developers get the flexibility to deploy and scale out parts of the application independently as load increases. The asynchronous and distributed model has the benefits of loose coupling and selective scalability, but it also creates new challenges. Application developers must coordinate multiple distributed components to get the desired results. They must deal with the increased latency and unreliability inherent in remote communication. Components may take extended periods of time to complete tasks, requests may fail and errors originating from remote systems must be handled. Today, to accomplish this, developers are forced to write complicated infrastructure that typically involves message queues and databases along with complex logic to synchronize them. All this ‘plumbing’ is extraneous to business logic and makes the application code unnecessarily complicated and hard to maintain.
Amazon SWF enables applications to be built by orchestrating tasks coordinated by a decider process. Tasks represent logical units of work and are performed by application components that can take any form, including executable code, scripts, web service calls, and human actions. Developers have full control over implementing and orchestrating tasks, without worrying about underlying complexities such as tracking their progress and keeping their state.
Developers implement workers to perform tasks. They run their workers either on cloud infrastructure, such as Amazon EC2, or on premises. Tasks can be long-running, may fail, may time out, and may complete with varying throughputs and latencies. Amazon SWF stores tasks for workers, assigns them when workers are ready, tracks their progress, and keeps their latest state, including details on their completion. To orchestrate tasks, developers write programs that get the latest state of tasks from Amazon SWF and use it to initiate subsequent tasks in an ongoing manner. Amazon SWF maintains an application’s execution state durably so that the application can be resilient to failures in individual application components.
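To make the division of labor between workers and the decider concrete, here is a minimal in-memory sketch in Python. It does not use the Amazon SWF API: the `Execution` class, the event names, and the three-step `PIPELINE` are all hypothetical stand-ins, chosen only to illustrate how a decider can read the recorded history and schedule the next task while workers simply run whatever they are handed.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical three-step pipeline; the task names are illustrative only.
PIPELINE = ["resize", "watermark", "publish"]


@dataclass
class Execution:
    """In-memory stand-in for the state a workflow service keeps for one run."""
    history: list = field(default_factory=list)    # append-only event log
    pending: deque = field(default_factory=deque)  # tasks waiting for a worker


def decide(execution, workflow_input):
    """Decider: look at the history and schedule the next task, if any."""
    completed = [e for e in execution.history if e["type"] == "TaskCompleted"]
    if len(completed) == len(PIPELINE):
        execution.history.append(
            {"type": "WorkflowCompleted", "result": completed[-1]["output"]})
        return
    name = PIPELINE[len(completed)]
    # Each task receives the previous task's output (or the workflow input).
    task_input = completed[-1]["output"] if completed else workflow_input
    execution.pending.append({"name": name, "input": task_input})
    execution.history.append(
        {"type": "TaskScheduled", "name": name, "input": task_input})


def work(execution, handlers):
    """Worker: take one pending task, run it, and record the outcome."""
    task = execution.pending.popleft()
    try:
        output = handlers[task["name"]](task["input"])
        execution.history.append(
            {"type": "TaskCompleted", "name": task["name"], "output": output})
    except Exception as exc:
        execution.history.append(
            {"type": "TaskFailed", "name": task["name"], "reason": str(exc)})


def run(workflow_input, handlers):
    """Alternate decider and worker turns until the run completes or a task fails."""
    execution = Execution()
    while True:
        last = execution.history[-1] if execution.history else None
        if last and last["type"] in ("WorkflowCompleted", "TaskFailed"):
            return execution
        if execution.pending:
            work(execution, handlers)
        else:
            decide(execution, workflow_input)


# Hypothetical task implementations that just tag the input string.
HANDLERS = {
    "resize": lambda s: s + "|resized",
    "watermark": lambda s: s + "|watermarked",
    "publish": lambda s: s + "|published",
}

result = run("photo.jpg", HANDLERS)
print(result.history[-1])
```

In this sketch the decider holds no state of its own; it derives every decision from the history, which is why the loss of a decider or worker process need not lose the progress of the run. A real decider would also decide what to do about a failed task (retry, compensate, or give up), whereas this one simply stops.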
An important feature of Amazon SWF is auditability: Amazon SWF gives visibility into the execution of each step in the application. The Management Console and APIs let you monitor all running executions of the application. The customer can zoom in on any execution to see the status of each task and its input and output data. To facilitate troubleshooting and historical analysis, Amazon SWF retains the history of executions for any number of days that the customer can specify, up to a maximum of 90 days.
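As a rough illustration of what "zooming in" on an execution could look like, the Python sketch below folds a list of history events into a per-task status view and checks whether a closed execution still falls inside its retention window. The event shape, field names, and both helper functions are assumptions made for this example and are not the Amazon SWF history format or API; only the 90-day upper bound comes from the text above, and the lower bound of zero is likewise an assumption.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 90  # upper bound stated in the text


def task_status(history):
    """Return the latest known state, input, output and error for each task."""
    status = {}
    for event in history:
        name = event.get("name")
        if name is None:
            continue  # workflow-level events carry no task name
        entry = status.setdefault(
            name, {"state": None, "input": None, "output": None, "error": None})
        if event["type"] == "TaskScheduled":
            entry["state"] = "scheduled"
            entry["input"] = event["input"]
        elif event["type"] == "TaskCompleted":
            entry["state"] = "completed"
            entry["output"] = event["output"]
        elif event["type"] == "TaskFailed":
            entry["state"] = "failed"
            entry["error"] = event["reason"]
    return status


def is_retained(closed_at, retention_days, now):
    """True if an execution closed at `closed_at` is still within retention."""
    if not 0 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention must be between 0 and 90 days")
    return now - closed_at <= timedelta(days=retention_days)


# Hypothetical history for one execution.
SAMPLE_HISTORY = [
    {"type": "TaskScheduled", "name": "resize", "input": "photo.jpg"},
    {"type": "TaskCompleted", "name": "resize", "output": "photo-small.jpg"},
    {"type": "TaskScheduled", "name": "publish", "input": "photo-small.jpg"},
    {"type": "TaskFailed", "name": "publish", "reason": "timeout"},
]

summary = task_status(SAMPLE_HISTORY)
closed = datetime(2012, 1, 1, tzinfo=timezone.utc)
for name, entry in summary.items():
    print(name, entry)
print(is_retained(closed, 90, closed + timedelta(days=30)))
```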
Amazon SWF provides a collection of very powerful building blocks that can also be used to build higher-level execution engines. Some of our early customers used Amazon SWF to implement their domain-specific languages (DSLs) for specialized business process execution. This is an area where I think the availability of Amazon SWF will drive a lot of innovation.
As part of the AWS SDK, the AWS Flow Framework helps developers create Amazon SWF-based applications quickly and easily. The Java version of the SDK includes really cool integration at the language level, making it easy for developers to automatically transform Java code into tasks, create the right dependencies, and manage the execution of the workflow. This brings to Java the power that languages with built-in distribution and concurrency, such as Erlang, offer.
For more insight into workflow execution, task coordination, task routing, task distribution, exception handling, child workflows, timers, signals, markers and much more, see the Amazon SWF detail page. For more information about the SDK, see the developer guide. As always, the AWS developer blog has additional details. At the RightScale blog, Thorsten von Eicken talks about their use of SWF.