To be effective, Big Data analytics must deliver on the end-user experience, and not on the hype associated with artificial intelligence and machine learning.
Some might assume Big Data analytics is synonymous with machine learning (ML) or artificial intelligence (AI), but this is incorrect. Although some aspects of these technologies may be used in Big Data analytic applications, focusing only on them is sure to create confusion and inflate expectations. The hype around AI suggests that, when applied to Big Data, it automatically generates insights with little or no effort from the end user. The data analytics user experience, however, is quite different.
For example, one vendor shared a case study claiming they used an AI-focused data analytics solution to solve a production issue. In the presentation, the vendor included the formula they used to find the correlation between a production process and product outcomes. The formula was regression, which is a fine tool, but it’s misleading to claim it is equivalent to AI. The formula wasn’t even an advanced variation of regression; it was the most basic form, one of the simplest machine-learning (ML) algorithms and a feature of Microsoft Excel since its introduction in the data analysis tool pack in 2007.
That a 10-year-old function could be repackaged, relabeled, and hyped as AI indicates confusion among the flavors of cognitive computing. Partly to blame is a lack of understanding of what constitutes AI, ML, deep learning, and other variations of "cognitive computing," as well as debates about supervised versus unsupervised approaches.
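To illustrate just how ordinary the rebranded technique is, the fit in question amounts to ordinary least-squares linear regression, which can be written in a few lines of Python. This is a sketch with hypothetical process data; the vendor's actual data and model are unknown:

```python
# Ordinary least-squares linear regression -- the same kind of fit
# a spreadsheet's regression tool produces. Data below is illustrative.

def linear_regression(xs, ys):
    """Return slope m and intercept b of the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical example: reactor temperature setpoint vs. product yield.
temps = [100, 110, 120, 130, 140]
yields = [71.0, 73.1, 74.9, 77.2, 79.0]
m, b = linear_regression(temps, yields)
print(f"yield = {m:.3f} * temp + {b:.2f}")
```

A useful correlation, certainly, but nothing beyond what a process engineer could already do in a spreadsheet.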
This kind of hype stands in contrast to a bitter reality: a large percentage of collected data is never analyzed or leveraged for insights by the engineers and analysts who could use it to produce actionable information.
While many technologies, including AI, can be used in data analytics applications, the focus should not be on the technologies. Instead, the focus should be on the user experience and the acceleration of insights into process data.
The focus must be on subject matter experts (SMEs)—process engineers, data analysis experts, and others—and on their ability to find insights in their data. Regardless of the technologies used in data analytics applications, the goal always is to make the solution accessible to a user who doesn’t have extensive data science expertise.
Well-designed data analytics applications empower SMEs to produce results. As data analytics applications tap additional functionality, they should implement these innovations in the context of easy-to-use features accessible by process experts.
Data analytics applications should use available technologies to help end users succeed. If that includes ML or AI functionality, great, but data analytics applications should not be restricted to these features, or any others, for the sake of market hype. Many algorithms and innovations are available to assist users who need to find insights in data.
Algorithm functions can include digital signal processing for data cleansing, shape detection in process signals, and map-reduce models for distributed computing. Calling algorithms ML, AI, or otherwise, isn’t the point. The point is to help end users succeed faster.
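Digital signal processing for data cleansing, for instance, can be as simple as a moving average that smooths sensor noise so the underlying trend is visible. The sketch below assumes illustrative data and a window size of three; production tools would use more sophisticated filters:

```python
# Minimal data-cleansing sketch: a centered moving average attenuates
# sensor noise. Window size and signal values are illustrative.

def moving_average(signal, window=3):
    """Smooth a signal with a simple centered moving average.

    Edge points average over the samples that are available.
    """
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed

noisy = [10.0, 12.0, 9.0, 11.0, 30.0, 10.0, 11.0]  # 30.0 is a noise spike
print(moving_average(noisy))
```

Whether such a filter is labeled DSP, ML, or anything else matters far less than whether it gets the user to a clean signal quickly.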
Cognitive computing algorithms are an important part of the solution for analytics in process manufacturing and the Industrial Internet of Things (IIoT) solutions, but only a part of it. Other aspects include data wrangling, which is the required data connectivity, cleansing, and contextualization to prepare data for use. The data analytics application’s focus must include these preparatory steps so results are accelerated from data connection to insight to distribution.
Finally, data analytics applications should enable users to expand and extend analytics to whatever level is required. End users will keep expanding the use of data analytics applications and the need for specific algorithms. Therefore, data analytics applications should include extensibility to additional algorithms through features such as REST API, OData, and integration of algorithms into the user experience.
When data analytics applications allow users to focus on the problem rather than the technology, desired results can be obtained quickly, as these use case examples show.
Problem: When moving from experiments to full-scale production, the reactor yields realized in the lab could not be duplicated upon scale-up. The causes were difficult to identify using manual methods and existing data management tools such as spreadsheets.
Solution: Data analytics was used to bring together data from disparate sources associated with the lab and full-scale production. This allowed the company to quickly compare the two processes and ascertain key differences. Full-scale production parameters were adjusted to improve yields.
Problem: Plant operators could not get information quickly enough from the plant’s automation system to control pollution abatement equipment in real time. This resulted in frequent overdosing of mitigation chemicals with resulting costs.
Solution: The data analytics application’s ability to process information quickly allows for effective control of the pollution abatement system as easily as any other process variable, optimizing chemical use for maximum nitrogen oxide (NOx), sulfur oxide (SOx), and mercury reduction at the lowest mitigation chemical cost while extending equipment life.
Problem: Heavily regulated and energy-intensive production steps such as smelting were difficult to characterize using conventional spreadsheets. As a result, balancing multiple factors and making optimal tradeoffs among costs, environmental factors, and energy efficiency was almost impossible.
Solution: Data analytics was used to analyze tradeoffs among multiple variables, which made it easier to control the process based on which variable is the most critical at any given time. Adjusting the mix for evolving costs or new regulations was simplified by trying out various what-if scenarios.
Problem: Refiners had trouble predicting fouling with reactor guard beds due to the variety of conditions capable of causing problems. Data from various sensors tended to be noisy, which made it difficult to identify actual trends and take corrective measures.
Solution: Using its data cleansing and capsule technologies, the data analytics solution eliminated noise on the fly and identified critical operating conditions contributing to accelerated fouling. Benchmarks created during steady-state conditions made it easy to determine when problem situations were forming, and to perform predictive maintenance (PdM).
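The benchmark idea described above can be sketched simply: derive a baseline from steady-state operation, then flag readings that drift outside a tolerance band. The data, variable names, and the 3-sigma threshold below are assumptions for illustration, not the vendor's actual method:

```python
# Sketch of benchmark-based deviation flagging: a baseline built from
# steady-state data defines "normal"; readings outside mean +/- k*stdev
# are flagged as potential early signs of fouling. All values illustrative.
import statistics

def flag_deviations(baseline, readings, k=3.0):
    """Return indices of readings outside mean +/- k * stdev of baseline."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    return [i for i, r in enumerate(readings)
            if abs(r - mean) > k * stdev]

# Hypothetical guard-bed differential pressure readings.
steady_state = [50.1, 49.9, 50.0, 50.2, 49.8, 50.0]
new_readings = [50.1, 50.3, 52.5, 49.9]  # 52.5 suggests fouling onset
print(flag_deviations(steady_state, new_readings))
```

Flagged indices can then trigger a maintenance work order before fouling forces an unplanned shutdown.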
Problem: The pulping process for a linerboard mill mixed various wood species based on availability. Verifying that feedstock changes would not negatively affect the final product was difficult given the long time from pulping to finished product.
Solution: The data analytics application characterized final product quality based on pulp characteristics from historical data. The producer then started testing the pulp before it reached the paper machine and used this information to accurately predict critical characteristics of the end product.
With the right data analytics applications, there won’t be any ceiling regardless of the exploration requirements. Current hype and promises will mature over time, and data analytics will become an expected and stable component of the work environment. However, the end user with context and expertise has to be the priority. Incorporating AI, ML, and other technologies will enable data analytics applications to produce actionable insights from Big Data faster.