Not There Yet: ChatGPT’s Data Analysis Tool (A Railroad Safety Data Example)
Large Language Models (LLMs) have historically struggled with numerical data, primarily because they are, at their core, language models. But OpenAI has aspirations to move into the business intelligence and analytics spaces. One way is through the Advanced Data Analysis tool.
We are continually monitoring updates to generative AI models to see how they progress. For this case, we gave the data analysis tool some very basic safety incident data from the Federal Railroad Administration. We have worked extensively with railroads to improve safety for more than a decade, so we are very familiar with this data.
We uploaded annual data for reportable injury rates and highway-rail incident rates for each of the Class I (major) railroads. We then interrogated it in a conversational manner. The text of the chat, including the graphs it produced, can be seen below.
<Start of chat session>
For basic summarization the tool is adequate, if elementary. It is thorough in that it states the steps it is going to take.
It got more interesting when we asked it to comment by railroad, and when we specifically asked it about a poor performer (KCS, which is now combined with CP; that combination happened in 2023, and the data runs only through 2022).
As for the math behind the analysis, it appears that all it has at its disposal is linear regression, and that is not good enough in this situation. Relationships are not always linear, and regression is not the best tool for understanding patterns over time. We ourselves have done better with integrated time series / regression models or vector autoregression. To its credit, it acknowledges at the very end that the model is simple.
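One way to see why a straight-line fit can fall short on annual data is to look at what it leaves behind. The sketch below is a minimal illustration under stated assumptions: the rates are invented for this example (they are not FRA figures), and `fit_line` and `lag1_autocorrelation` are hypothetical helper names. It fits a line to a series that drops quickly and then levels off, then checks whether the residuals are correlated from one year to the next.

```python
# Hypothetical annual incident rates, invented for illustration (not FRA data).
years = list(range(2013, 2023))
rates = [3.0, 2.5, 2.1, 1.8, 1.6, 1.5, 1.45, 1.4, 1.4, 1.38]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def lag1_autocorrelation(values):
    """Lag-1 autocorrelation of a series assumed to have mean zero."""
    numerator = sum(a * b for a, b in zip(values, values[1:]))
    denominator = sum(v * v for v in values)
    return numerator / denominator


intercept, slope = fit_line(years, rates)

# Residuals from a least-squares line with an intercept average to zero.
residuals = [y - (intercept + slope * x) for x, y in zip(years, rates)]
lag1 = lag1_autocorrelation(residuals)

print(f"slope per year: {slope:.3f}")
print(f"lag-1 autocorrelation of residuals: {lag1:.2f}")
```

If the line captured the pattern, the residuals would look like noise and the autocorrelation would sit near zero. In this made-up series the line overshoots at both ends and undershoots in the middle, so the residuals come in runs of the same sign. That leftover structure is the kind of thing a time series model is designed to pick up.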
Furthermore, it seems unaware of the nuances of aggregating data. You may not always want to do a simple average: averaging each railroad's rate, for example, gives a small railroad the same weight as a large one. A naïve user may fall into a trap by relying on the model to do all the steps an experienced analyst would.
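The aggregation point can be made concrete with a small sketch. Everything here is an assumption for illustration: the two railroads and their figures are invented, and the rate is expressed per 200,000 employee-hours, which we take to be the usual basis for injury rates. It compares a simple average of per-railroad rates with a pooled rate computed from total injuries and total hours.

```python
# Hypothetical figures, invented for illustration (not FRA data).
railroads = {
    "Railroad A": {"injuries": 400, "hours": 80_000_000},  # large carrier
    "Railroad B": {"injuries": 30, "hours": 3_000_000},    # small carrier
}

# Assumed rate basis: injuries per 200,000 employee-hours.
RATE_BASE = 200_000


def injury_rate(injuries, hours):
    """Injuries per RATE_BASE employee-hours."""
    return injuries / hours * RATE_BASE


# Rate for each railroad on its own.
rates = {name: injury_rate(**figures) for name, figures in railroads.items()}

# Simple average: every railroad counts equally, regardless of size.
simple_average = sum(rates.values()) / len(rates)

# Pooled rate: total injuries over total hours, so exposure sets the weight.
total_injuries = sum(f["injuries"] for f in railroads.values())
total_hours = sum(f["hours"] for f in railroads.values())
pooled_rate = injury_rate(total_injuries, total_hours)

print(f"per-railroad rates: {rates}")
print(f"simple average: {simple_average:.2f}")
print(f"pooled rate:    {pooled_rate:.2f}")
```

With these invented numbers the simple average lands well above the pooled rate, because the small railroad's higher rate counts as much as the large railroad's lower one. Neither figure is wrong in itself; they answer different questions, and an analyst needs to know which one is being reported.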
So while the commentary is good, and even potentially insightful, it may not be based on the best numerical analysis.
That said, we feel this is a technology that will be very interesting to watch, especially the commentary. And we recognize that better responses to this simple example could be obtained through better prompt engineering.
We are already working with some of our clients to integrate generative AI into their BI tools. These are small steps, proofs of concept, given the nascent and rapidly evolving nature of these technologies. Please check back here often for updates, or contact us to see how you might apply these technologies in your company.