In the first part of this article, we discussed four new data analytics capabilities powered by ML and NLP technologies that large and small companies can use to enhance their ability to generate data insights. In this second part, we'll introduce a couple of powerful yet affordable new tools offering these capabilities and an agile approach that any company can adopt to quickly start leveraging modern BI and ML technologies to enhance decision-making and business performance.
Amazon QuickSight is a scalable, serverless, embeddable, machine learning-powered business intelligence (BI) service built for the cloud. QuickSight lets large and small companies easily create and publish interactive BI dashboards that include ML-powered insights. These dashboards can be accessed from any device, and seamlessly embedded into applications, portals, and websites.
QuickSight is serverless (i.e., does not require its users to provision their own servers) and can automatically scale to tens of thousands of users, without any infrastructure to manage or capacity to plan for. It is also the first BI service to offer pay-per-session pricing, where customers only pay when their users access their dashboards or reports, making it very cost-effective for both large and small scale deployments.
With QuickSight, business users can also ask questions regarding their data in plain language and receive answers in seconds.
Another important feature of this tool is SPICE (Super-fast, Parallel, In-memory Calculation Engine). With SPICE, users get fast query performance at scale. SPICE automatically replicates data for high availability, allowing thousands of users to perform fast, interactive analysis simultaneously, saving time and resources.
QuickSight also provides strong security, with encryption for data both in transit and at rest.
There are several BI tools available in the market today. In my opinion, there are two key benefits that differentiate Amazon QuickSight from all other offerings:
Companies that need a more advanced ML-powered time series forecasting service can easily integrate this BI tool with Amazon Forecast (a tool that can produce up to 50% more accurate forecasts thanks to its ability to use multiple ML algorithms and to incorporate additional related data and metadata). Companies that need to develop custom machine learning models can integrate this BI tool with Amazon SageMaker.
A modern cloud-based, ML-powered BI service is an affordable and easy-to-use tool, and large and small companies should adopt one now to leverage the power of these technologies to enhance decision-making and business performance.
While the ML-powered forecast capability embedded in Amazon QuickSight is a good option for a company that wants to start testing the benefits of ML-powered time series forecasting, companies that want to achieve the best possible results should adopt a more advanced solution like Amazon Forecast. This tool currently offers the most advanced set of capabilities and the lowest total cost of ownership.
Based on the same technology used at Amazon.com, Amazon Forecast uses machine learning (ML) to combine time series data with additional variables to build highly accurate forecasts. Amazon Forecast requires no ML experience to get started.
Users only need to provide historical data, plus any additional data that they believe may impact their forecasts.
For example, the demand for a particular color of a shirt may change with the seasons and store location. This complex relationship is hard to determine on its own, but machine learning is ideally suited to recognize it. Once a user provides their data, Amazon Forecast will automatically examine it, identify what is meaningful, and produce a forecasting model capable of making predictions that are up to 50% more accurate than looking at time series data alone.
Amazon Forecast is a fully managed service, so there are no servers to provision, and no machine learning models to build, train, or deploy. Users pay only for what they use, and there are no minimum fees and no upfront commitments.
When using an advanced tool like Amazon Forecast, companies only need to focus on identifying significant business problems that can be adequately solved with forecasting and on defining effective strategies to collect, store, and use the most appropriate data. Specialized professional service firms, like us and others, can easily help with that.
Amazon Forecast has been designed to deliver four key benefits:
Using Amazon Forecast, we've been able to increase our forecasting accuracy from 27% to 76% reducing wastage by 20% for the fresh produce category. The tool helped us optimize our under and over forecasting costs leading to stock-outs at 3% and improved gross margins. We are now expanding the model to other categories, iterating with additional related datasets, and adding newer data to Amazon Forecast to continuously improve the model accuracy. - Supratim Banerjee, Chief Transformation Officer - More Retail
Thanks to an innovative pricing strategy, Amazon Forecast is currently the fully managed forecasting service in the market with the lowest total cost of ownership (TCO). The key cost drivers are the number of hours used to train the ML model, the amount of data used to train the model, and the number of generated forecasts (i.e., one product sold at one retail location would equal one forecast). The number of generated forecasts is by far the most significant driver of the total cost.
As an example, let's consider the case of a clothing company selling 2,000 items in 50 stores. Each combination of an item and a store location equates to one time series, for a total of 100,000 time series to forecast (2,000 items x 50 stores). We'll also assume that the company uses a training dataset of about 5 GB for this task, that it generates the default set of quantiles for each forecast (i.e., P10, P50, and P90), and that a model using this amount of data takes about 20 hours to train. Finally, assuming that the company wants to generate a new set of updated rolling forecasts each week, the total weekly cost for this service would be $185.24, and the total annual cost would be $9,632.48.
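The arithmetic behind this example can be sketched in a few lines of Python. The article does not state the unit rates, so the three rates below are assumptions: they are the values that reproduce the weekly total quoted above, and they may not match current Amazon Forecast pricing. The function and constant names are hypothetical and used only for illustration.

```python
from decimal import Decimal

# ASSUMED unit rates (not given in the article). They were chosen because
# they reproduce the article's weekly total; verify against current pricing.
RATE_PER_1000_FORECASTS = Decimal("0.60")   # USD per 1,000 generated forecasts
RATE_PER_GB_STORED = Decimal("0.088")       # USD per GB of training data
RATE_PER_TRAINING_HOUR = Decimal("0.24")    # USD per hour of model training


def weekly_forecast_cost(items, stores, quantiles, data_gb, training_hours):
    """Estimate the cost of one weekly forecasting run, in USD."""
    # One time series per item/store combination.
    time_series = items * stores
    # One generated forecast per time series and quantile.
    forecasts = time_series * quantiles

    forecast_cost = Decimal(forecasts) / 1000 * RATE_PER_1000_FORECASTS
    storage_cost = Decimal(data_gb) * RATE_PER_GB_STORED
    training_cost = Decimal(training_hours) * RATE_PER_TRAINING_HOUR

    total = forecast_cost + storage_cost + training_cost
    return total.quantize(Decimal("0.01"))


# The clothing company from the example: 2,000 items, 50 stores,
# three quantiles (P10, P50, P90), 5 GB of data, 20 training hours.
weekly = weekly_forecast_cost(
    items=2000, stores=50, quantiles=3, data_gb=5, training_hours=20
)
annual = weekly * 52

print(f"weekly: ${weekly}")
print(f"annual: ${annual}")
```

Under these assumed rates, the generated forecasts account for $180.00 of the $185.24 weekly total, which is consistent with the number of forecasts being the dominant cost driver. Multiplying the weekly figure by 52 gives the annual figure of $9,632.48.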
A modern cloud-based, ML-powered time series forecasting service is a powerful yet affordable tool, and large and small companies should adopt one now to leverage the power of this technology to enhance decision-making and business performance.
Once business executives decide to enhance their organization's ability to develop and use data insights to inform business decisions and drive continuous improvements in business performance, the next question is how to do it.
The recommended approach has three steps. The first is to take a few weeks to build a data lake (assuming a properly set up one is not already available). The second is to start expanding the organization's ability to leverage BI and ML capabilities through an agile approach: a series of two-week sprints, each focused on solving a specific business problem. The third is to develop an explicit data strategy articulating which additional data the company needs to generate better insights and drive improvements in decision-making and business performance.
To start leveraging business intelligence and data analytics tools, a company needs to set up a data warehouse or a data lake to collect and store the required data. Historically, setting up and managing a data warehouse or a data lake involved many manual, complicated, time-consuming, and expensive tasks. The work required includes loading data from diverse sources, monitoring those data flows, setting up schemas and partitions, turning on encryption and managing keys, defining transformation jobs and monitoring their operation, reorganizing data into a columnar format, configuring access control settings, deduplicating redundant data, matching linked records, granting access to data sets, and auditing access over time. Managing those tasks manually can take several months of work for the initial setup phase and many hours each following month for maintenance.
A modern cloud-based data lake is a cost-effective centralized and secure repository that enables a company to govern, discover, share, and analyze all its structured and unstructured data at any scale. Data lakes help companies to break down data silos and run flexible analytics and machine learning to guide better decisions.
Creating a data lake with a modern automated system like AWS Lake Formation is as simple as defining data sources and what data access and security policies a company wants to apply, and the initial setup work is reduced from several months to just a few weeks.
AWS Lake Formation helps companies automatically collect and catalog data from databases and object storage, move the data into a new data lake, clean and classify the data using machine learning algorithms, and secure access to the company's sensitive data. After that, users can access a centralized data catalog that describes the available data sets and their appropriate usage, and they can leverage these data sets with their choice of analytics and machine learning services.
With Lake Formation companies can:
Often, the best way to learn a new skill is by practicing it, and this is especially true for developing new skills and capabilities across an entire business organization.
Instead of developing a master plan encompassing all possible use cases and benefits resulting from building new BI and ML capabilities, companies should adopt an agile approach and execute a series of two-week sprints, each focusing on a specific business problem. Business organizations should tackle more manageable issues first and address more challenging issues once they have developed more confidence and familiarity with the new tools.
Initially, problems typically addressed include things like:
More advanced challenges addressed at a later stage might include issues like:
As business executives embark on this journey, they'll start to realize that some of the data required to enhance decision making and business performance are not available. They'll also discover that some of the available data are missing some critical attributes or are not structured correctly.
As the negative impact of missing data on decision-making and continuous performance improvement becomes more apparent, the organization will start developing an explicit data strategy as part of its annual strategy and planning cycle, articulating what activities and investments are needed to enhance its data assets.
A company with an explicit data strategy indicates an organization that has mastered how to leverage data insights to enhance decision-making and business performance continuously. - Paolo Timoni, Partner - Augeo Partners
Enhancing the ability to generate data insights to improve decision-making and business performance is an essential priority for most businesses. Modern BI and data analytics tools powered by the cloud and machine learning are affordable and easy to use. For both large and small companies, the time to start leveraging these powerful new technologies is now, and Augeo Partners can help.