Observations on the Importance of Cloud-based Analytics
Cloud computing is enabling amazing new innovations in both consumer and enterprise products, as it has become the new normal for organizations of all sizes. So many exciting new areas are being empowered by the cloud that it is fascinating to watch. AWS is enabling innovations in so many areas, such as healthcare, automotive, life sciences, retail, media, energy, and robotics, that it is mind-boggling and humbling.
Despite all of the amazing innovations we have already seen, we are still on Day One in the Cloud; at AWS we will continue to use our inventive powers to build new tools and services to enable even more exciting innovations by our customers that will touch every area of our lives. Many of these innovations will have a significant analytics component or may even be completely driven by it. For example, many of the Internet of Things innovations that we have seen come to life on AWS in the past years have a significant analytics component.
I have seen our customers do so many radical new things with the analytics tools that our partners and we make available that I have made a few observations I would like to share with you.
Cloud analytics are everywhere. There is almost no consumer or business area that is not impacted by cloud-enabled analytics. Often it is hidden from the consumer’s eye, as it empowers applications rather than being the end game, but analytics is becoming more prevalent. From retail recommendations to genomics-based product development, from financial risk management to start-ups measuring the effect of their new products, from digital marketing to fast processing of clinical trial data, all are taken to the next level by cloud-based analytics.
At AWS we have seen evidence of this as Amazon Redshift, our data warehouse service, has become the fastest growing cloud service in the history of the company. We even see that for many businesses Amazon Redshift is the first cloud service they ever use. Adoption is really starting to explode in 2015 as more and more businesses understand the power analytics has to empower their organizations. The integration with many of the standard analytics tools, such as Tableau, Jaspersoft, Pentaho and many others, makes Redshift extremely powerful.
Cloud enables self-service analytics. In the past, analytics within an organization was the pinnacle of old-style IT: a centralized data warehouse running on specialized hardware. In the modern enterprise this scenario is not acceptable. Analytics plays a crucial role in helping business units become more agile and move faster to respond to the needs of the business and build products customers really want. But they are still bogged down by this centralized, oversubscribed, old-style data warehouse model. Cloud-based analytics changes this completely.
A business unit can now go out and create its own data warehouse in the cloud, of a size and speed that exactly matches what it needs and is willing to pay for. It can be a small two-node data warehouse that runs during the day, a big 1,000-node data warehouse that just runs for a few hours on a Thursday afternoon, or one that runs during the night to give personnel the data they need when they come into work in the morning.
A great example of this is the work that the global business publication The Financial Times (FT) is doing with analytics. The FT is over 120 years old and has been using the cloud to run Business Intelligence (BI) workloads to completely revolutionize how it offers content to customers: it can run analytics on all its stories, personalize the paper, and give readers a more tailored reading experience. With the new BI system the company is able to run analytics across 140 stories per day, in real time, and has cut the time needed to complete analytics tasks from months to days. As part of this the FT has also expanded its BI to better target advertising to readers. By using Amazon Redshift it is able to process 120 million unique events per day and integrate its internal logs with external data sources, which is helping to create a more dynamic paper for its readers. All of this while cutting its data warehouse cost by 80%.
Cloud Analytics will enable everything to become smart. These days everything has the ability to become “smart” - a smart watch, smart clothes, a smart TV, a smart home, a smart car. However, in almost all cases this “smartness” runs in software in the cloud, not in the object or the device itself.
Whether it is the thermostat in your home, the activity tracker on your wrist, or the smart movie recommendations on your beautiful ultra HD TV, all are powered by analytics engines running in the cloud. As all the intelligence of these smart products lives in the cloud, it is spawning a new generation of devices. A good example here is the work Philips is doing to make street lighting smart with their CityTouch product.
Philips CityTouch is an intelligent light management system for city-wide street lighting. It offers connected street lighting solutions that allow entire suburbs and cities to actively control street lighting and manage the after-dark environment in real time. This allows local councils to keep certain streets well lit to accommodate high foot traffic, bring on lighting during adverse weather when ambient light dims to a dangerous level, or even turn lighting down, for example in an industrial estate where there are no people. This technology is already being used in places like Prague and in suburbs of London. CityTouch is using the cloud as the backend technology to run the system and extract business value from large amounts of data collected from sensors installed in the street lights. This data is allowing councils to better understand their cities after dark, employ more efficient light management programmes, and avoid too much light pollution, which can have an adverse effect on residents and wildlife around cities.
Cloud Analytics improves city life. Related to the above is the ability for cloud analytics to take information from the city environment to improve the living conditions for citizens around the world. A good example is the work the Urban Center for Computation and Data of the City of Chicago is doing. The City of Chicago is one of the first to deploy sensors throughout the city that will permanently measure air quality, light intensity, sound volume, heat, precipitation, wind and traffic. The data from these sensors streams into the cloud, where it is analyzed to find ways to improve the life of its citizens. The collected datasets from Chicago’s “Array of Things” will be made publicly available on the cloud for researchers to find innovative ways to analyze the data.
Many cities have already expressed interest in following Chicago’s lead to use the cloud to improve city life, and many in Europe, such as Peterborough City Council in the UK, are beginning to do the same. Peterborough City Council is making public data sets available to outsource innovation to the local community. The different data sets from the council are being mashed together: people are mapping, for example, crime data against weather patterns to help the council understand whether there are more burglaries when it is hot and how it should resource the local police force, or mapping hospital admission data against weather to identify trends and patterns. This data is being made open and available to everyone to drive innovation, thanks to the power of the cloud.
Cloud Analytics enables the Industrial Internet of Things. Often when we think about the Internet of Things (IoT) we focus on what this will mean for the consumer. But we are already seeing the rise of a different IoT - the Industrial Internet of Things. Industrial machinery is instrumented and Internet-connected to stream data into the cloud to gain usage insights, improve efficiencies and prevent outages.
Whether this is General Electric instrumenting their gas turbines, Shell dropping sensors in their oil wells, Kärcher with fleets of industrial cleaning machines, or construction sites enabled with sensors from Deconstruction, all of these send continuous data streams into the cloud for real-time analysis.
Cloud enables video analytics. For a long time video was recorded to be archived, played back and watched. With the unlimited processing power of the cloud there is a new trend arising: treating video as a data stream to be analyzed. This is being called Video Content Analysis (VCA) and it has many application areas from retail to transportation.
A common area of application is in locations where video cameras are present, such as malls and large retail stores. Video is analyzed to help stores understand traffic patterns. Analytics provide the number of customers moving through the store, their dwell times, and other statistics. This allows retailers to improve their store layouts and in-store marketing effectiveness.
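To make the kind of statistics mentioned above a little more concrete, here is a minimal sketch of how visitor counts and dwell times could be derived once a video pipeline has turned frames into tracked detections. The input format (a track ID and a timestamp in seconds per detection) and the function name `zone_statistics` are assumptions made for illustration; they do not describe any particular VCA product.

```python
def zone_statistics(detections):
    """Summarize visitor count and dwell time for one camera zone.

    `detections` is an iterable of (track_id, timestamp_seconds) pairs,
    a hypothetical output format of an upstream person tracker.
    """
    first_seen = {}
    last_seen = {}
    for track_id, ts in detections:
        # Remember the earliest and latest time each tracked person was seen.
        first_seen[track_id] = min(ts, first_seen.get(track_id, ts))
        last_seen[track_id] = max(ts, last_seen.get(track_id, ts))

    # Dwell time is the span between first and last sighting in the zone.
    dwell = {t: last_seen[t] - first_seen[t] for t in first_seen}
    visitors = len(dwell)
    mean_dwell = sum(dwell.values()) / visitors if visitors else 0.0
    return {
        "visitors": visitors,
        "dwell_seconds": dwell,
        "mean_dwell_seconds": mean_dwell,
    }


if __name__ == "__main__":
    # Made-up sample detections, purely for demonstration.
    sample = [("a", 0.0), ("b", 5.0), ("a", 30.0), ("b", 15.0), ("a", 12.0)]
    print(zone_statistics(sample))
```

In practice the hard part is the tracking itself; once detections are available, statistics like these are simple aggregations that scale out easily in the cloud.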
Another popular area is that of real-time crowd analysis at large events, such as concerts, to understand movement throughout the venue and remove bottlenecks before they occur in order to improve visitor experience. Similar applications are used by transport departments to regulate traffic, detect stalled cars on highways, detect objects on high-speed railways, and address other transport issues.
Another innovative example that has taken VCA into the consumer domain is Dropcam. Dropcam analyzes video streamed by Internet-enabled video cameras to provide their customers with alerts. Dropcam is currently the largest video producer on the Internet, ingesting more video data into the cloud than YouTube.
VCA is also becoming a crucial tool in sports management. Teams are using video analysis to process many different angles on the players. For example the many video streams recorded during a Premier League match are used by teams to improve player performance and drive specific training schemes.
In the US, video analytics is being used by MLB teams to provide augmented real-time analytics on video screens around the stadium, while the NFL is using VCA to create automatic condensed versions of American football matches, bringing the run time down by 60%-70%.
Cloud transforms health care analytics. Data analytics is quickly becoming central to analyzing health risk factors and improving patient care. With healthcare under pressure to reduce cost and speed up patient care, the cloud is playing a crucial role in helping healthcare go digital.
Cloud powers innovative solutions such as Philips HealthSuite, a platform that manages healthcare data and provides support for doctors as well as patients. The Philips HealthSuite digital platform analyzes and stores 15 PB of patient data gathered from 390 million imaging studies, medical records, and patient inputs to provide healthcare providers with actionable data, which they can use to directly impact patient care. This is reinventing healthcare for billions of people around the world. As we move through 2015 and beyond we can expect to see cloud play even more of a role in the advancement of the field of patient diagnosis and care.
Cloud enables secure analytics. With analytics enabling so many new areas, from online shopping to healthcare to home automation, it becomes paramount that the analytics data is kept secure and private. The deep integration of encryption into the storage and in the analytics engines, with users being able to bring their own encryption keys, ensures that only the users of these services have access to the data and no one else.
In Amazon Redshift, data blocks, system metadata, partial results from queries and backups are encrypted with a randomly generated key, and this set of keys is then encrypted with a master key. This encryption is standard operating practice; customers do not need to do anything. If our customers want full control over who can access their data, they can make use of their own master key to encrypt the data block keys. Customers can make use of the AWS Key Management Service to securely manage their own keys, which are stored in Hardware Security Modules to ensure that only the customer has access to the keys and that only the customer controls who has access to their data.
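The two-level key scheme described above is commonly called envelope encryption. The sketch below illustrates only the key hierarchy: each block gets its own random data key, and that data key is stored wrapped under a master key. It is not Redshift's or KMS's actual implementation, the function names are hypothetical, and the toy stream cipher (an HMAC-SHA256 keystream XORed with the data) is a stand-in chosen so the example runs with the Python standard library alone. A real system would use a vetted cipher such as AES and keep the master key in a key management service.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher for illustration only; NOT for production use.

    XORs data with an HMAC-SHA256 keystream derived from key, nonce and a
    block counter. Applying it twice with the same inputs restores the data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


def encrypt_block(master_key: bytes, plaintext: bytes) -> dict:
    """Encrypt one data block under a fresh data key, then wrap that key."""
    data_key = secrets.token_bytes(32)    # random key for this block only
    data_nonce = secrets.token_bytes(16)
    key_nonce = secrets.token_bytes(16)
    return {
        "ciphertext": _keystream_xor(data_key, data_nonce, plaintext),
        "data_nonce": data_nonce,
        # Only the wrapped (encrypted) data key is stored with the block.
        "wrapped_key": _keystream_xor(master_key, key_nonce, data_key),
        "key_nonce": key_nonce,
    }


def decrypt_block(master_key: bytes, record: dict) -> bytes:
    """Unwrap the data key with the master key, then decrypt the block."""
    data_key = _keystream_xor(
        master_key, record["key_nonce"], record["wrapped_key"]
    )
    return _keystream_xor(data_key, record["data_nonce"], record["ciphertext"])


if __name__ == "__main__":
    master_key = secrets.token_bytes(32)  # would live in a KMS/HSM in practice
    record = encrypt_block(master_key, b"example data block")
    assert decrypt_block(master_key, record) == b"example data block"
```

The benefit of this design is that whoever controls the master key controls access to all of the data, while the bulk data never has to be re-encrypted when access to the master key changes.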
Cloud enables collaborative research analytics. As Jim Gray already predicted in his Fourth Paradigm, much of the research world is shifting from computational models to data-driven sciences. We already see this in the many researchers making their datasets available for collaborative real-time analytics in the cloud. Whether these data sets come streamed from Mars or from the bottom of the oceans, the cloud is the place to ingest, store, organize, analyze and share this data.
An interesting commercial example is the connected sequencing systems from Illumina; the sequenced data is streamed directly to the cloud, where the customer has access to BaseSpace, a cloud-based marketplace for algorithms that can be used to process their data.
At AWS we are proud to power the progress that puts analytic tools in the hands of everyone. We are humbled by what our customers are already doing with our current toolset. But it is still Day One; we will continue to innovate in this space so that our customers can go on to do even greater things.