Why Is IoT Data Still So Bad?
First Analytics was among the first to realize the potential value of IoT data. In fact, we coined the phrase “The Analytics of Things”. Around that time we registered the domain www.analyticsofthings.com.
First Analytics chairman Tom Davenport followed up with several articles, such as “When will the analytics of things grow up?” Quite a few years ago we posted “Why your analytics aspirations will fail.” More recently we presented this theme at a conference. So now, about a decade into IoT analytics, we ask: why is IoT data still so bad?
The story hasn’t changed much. While the volume and types of data have exploded, the same underlying flaws remain. Whenever a client gives us a new data set to assess, it never takes us long to uncover them.
The issues can be classified into four categories, as depicted in the accompanying figure.
- Quality: missing swaths of data
Patterns of missingness are varied (time, location, device, intermittency, etc.), unpredictable, and much more common than one would expect.
- Quality: anomalous data
Numerical observations that are extreme, flat-out wrong or infeasible are common.
- Usability: poor management of history
Machine “historians” have short-term memory: why call them historians if they don’t keep history for very long? Granular data is often not retained either. It is pre-aggregated to support BI, and the source data is disposed of.
- Usability: poor data model design
Engineers and architects do not have advanced analytics in mind when designing PLCs, event recorders, and their supporting data collection systems.
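To make the two quality categories above concrete, here is a minimal sketch of the kind of first-pass check one might run on a new sensor extract. It is an illustration only, not a description of any particular tool or of our own process. The function names, the one-minute reporting interval, the feasible range of -40 to 125, and the sample readings are all hypothetical assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (start, end) pairs where two consecutive readings are
    further apart than tolerance * expected_interval."""
    ordered = sorted(timestamps)
    limit = expected_interval * tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


def find_infeasible(readings, low, high):
    """Return (timestamp, value) pairs whose value is missing (None)
    or falls outside the feasible range [low, high]."""
    return [(ts, v) for ts, v in readings if v is None or not (low <= v <= high)]


# Hypothetical sample: a sensor assumed to report once per minute.
t0 = datetime(2024, 1, 1, 0, 0)
readings = [
    (t0, 21.5),
    (t0 + timedelta(minutes=1), 21.7),
    (t0 + timedelta(minutes=2), -999.0),   # sentinel-style value, not a real reading
    (t0 + timedelta(minutes=10), 22.0),    # eight-minute jump since the last reading
    (t0 + timedelta(minutes=11), 22.1),
]

# Assumed reporting interval and assumed feasible range for this sensor.
gaps = find_gaps([ts for ts, _ in readings], timedelta(minutes=1))
bad = find_infeasible(readings, low=-40.0, high=125.0)

print("gaps:", gaps)
print("infeasible readings:", bad)
```

In practice the expected interval and the feasible range would come from the device documentation or from the engineers who know the equipment. Running checks like these per device, location, and time period helps show whether the missingness follows a pattern.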
To repeat what we said many years ago, there are indeed very valuable applications to be found in the Internet of Things, and you may be embarking on IoT initiatives with exciting new data sources. But to avoid failure you must invest the time and effort at the very outset to qualify your data. Gaps can often be addressed through a well-thought-out plan, but it will usually take time to build the right data systems and history.
Too many organizations spend the early stages of their IoT initiatives planning applications that they later learn the data cannot support. A data-first, data-driven approach is a tangible way to get your IoT initiative started and to mitigate the risk of failure before you invest too much time and money in planning.