Engineers at Tokyo Institute of Technology (Tokyo Tech) have demonstrated a simple computational approach for improving the classification performance of neural networks operating on sensor time series. The proposed technique involves feeding the recorded signal as an external forcing into an elementary non-linear dynamical system, and providing the system's temporal responses to this disturbance to the neural network alongside the original data.
In the world around us, a proliferation of sensors is taking place, promising to support the efficiency and sustainability of practically all aspects of human activity. One challenge facing engineers involved in delivering the internet-of-things to society is how to handle the flood of data resulting from such sensors. In particular, there is a need to reduce the data as much as possible at the edge, close to the sensors themselves, because streaming all data to the cloud would have an unacceptable technical, economic and environmental footprint. In response, much research is being conducted worldwide towards small-sized, highly efficient classifiers suitable for detecting particular behaviors and situations of interest while running on limited computational resources. An example application scenario is the real-time monitoring of the behavior of livestock, with the purpose of detecting subtle changes that are indicative of prodromal disease.
“An emerging approach to support the development of time series classifiers suitable for edge artificial intelligence is that of data augmentation. Basically, it is about finding creative and innovative ways of generating additional data to help get the very best performance out of neural networks that necessarily have to be quite small to meet power and size requirements. While the theory of classifiers is well established, it can be said that data augmentation is still almost in its infancy for time series. In our laboratory, for example, we have been working on a variety of techniques based on empirical considerations as well as mathematical principles,” explains Ms. Chao Li, doctoral student at the Nano Sensing Unit where the study was conducted, and joint-lead author of the study.
Usually, data augmentation is performed just before or during classifier training, and runs on powerful workstations or cloud computers. The result is that the amount of data available to train a classifier is extended along the time dimension, as would be the case if longer recordings had been made available. This is important because high-quality data of the type necessary for classifier training is precious and expensive to prepare. However, this is not the only form of data augmentation possible. “We came up with the idea of extending the data along the other dimension, that is, the number of time series, meaning the number of input dimensions. Usually, edge applications may operate on one, or at most a few sensor time series. One possibility is performing computational operations to generate more of them, which try to make as much as possible of the initial information available to the classifier in a form suitable for it to learn it efficiently. While many signal processing operations could be implemented, a particularly disruptive computation is to simulate a dynamical system, endowed with its own intrinsic activity, and try to disturb it by externally forcing it with a signal recorded from the environment,” explains Dr. Ludovico Minati, lead author of the study.
Starting from a concept previously developed and patented in the Biointerfaces unit for improving the performance of brain-interface systems, the researchers carefully considered many practical aspects of how to realize it. Targeting the classification of basic cattle behaviors using a collar-mounted accelerometer, they developed ways to filter and preprocess the kinematic signals and to inject them so that the simulated dynamical system would accept and respond to them without diverging. Then, they explored how to extract the most relevant time series from its activity, in order to supply them either to a predetermined feature extractor and multi-layer perceptron or to a convolutional neural network. “Many low-dimensional systems such as the Rössler and Lorenz systems, which have been studied for decades by physicists and control engineers, actually have a remarkable computational potential that remains largely unexplored. This study takes an unusual step towards deploying it in a concrete application scenario,” explains Prof. Mattia Frasca from the University of Catania (Italy), who provided several theoretical contributions to the Tokyo Tech researchers on the behaviors of these kinds of systems and their implementations as analog circuits.
By augmenting the data through the additional time series derived from the dynamical systems, namely one separate Rössler system per accelerometer axis, the researchers were able to increase the classification performance by an appreciable amount. “While this is truly just an initial study to propose a provocative idea and substantial future work is needed, we were also able to realize the dynamical system using a very simple analog hardware circuit and still observe an improvement thanks to exploiting its responses,” adds Dr. Ludovico Minati. “Our approach is reminiscent of reservoir computing, on which we recently conducted research using elementary transistor circuits known as the Minati-Frasca oscillators. However, it is actually different, because the dynamics are low dimensional, and a single oscillator is used instead of a network. In this sense, it may be even more suitable for low-power implementation,” adds Mr. Jim Bartels, also a doctoral student at the unit.
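To make the idea concrete, the augmentation described above (one Rössler system per accelerometer axis, each forced by the recorded signal) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the study. The function names `forced_rossler` and `augment` are hypothetical. The additive forcing on the first state variable, the gain, the integration step, the z-score normalization with clipping, and the choice to keep all three state variables are also assumptions made for illustration. The parameter values a = 0.2, b = 0.2, c = 5.7 are the textbook ones for the Rössler system.

```python
import numpy as np

def rossler_rhs(state, u, gain, a, b, c):
    """Rossler equations with an additive forcing term on dx/dt (assumed form)."""
    x, y, z = state
    return np.array([-y - z + gain * u,   # forcing enters here (assumption)
                     x + a * y,
                     b + z * (x - c)])

def forced_rossler(u, dt=0.01, gain=0.2, a=0.2, b=0.2, c=5.7,
                   state0=(1.0, 1.0, 0.0)):
    """Integrate one forced Rossler system with fixed-step RK4.

    One integration step is taken per input sample, and the input is held
    constant within the step. Returns an array of shape (len(u), 3).
    """
    state = np.array(state0, dtype=float)
    out = np.empty((len(u), 3))
    for i, ui in enumerate(u):
        k1 = rossler_rhs(state, ui, gain, a, b, c)
        k2 = rossler_rhs(state + 0.5 * dt * k1, ui, gain, a, b, c)
        k3 = rossler_rhs(state + 0.5 * dt * k2, ui, gain, a, b, c)
        k4 = rossler_rhs(state + dt * k3, ui, gain, a, b, c)
        state = state + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        out[i] = state
    return out

def augment(acc, **kwargs):
    """Append the responses of one forced Rossler system per input axis.

    acc has shape (T, n_axes). The original columns are kept unchanged and
    three response columns are appended per axis.
    """
    acc = np.asarray(acc, dtype=float)
    # Normalize and clip so the forcing stays small (assumed preprocessing).
    scaled = (acc - acc.mean(axis=0)) / (acc.std(axis=0) + 1e-12)
    scaled = np.clip(scaled, -3.0, 3.0)
    responses = [forced_rossler(scaled[:, k], **kwargs)
                 for k in range(acc.shape[1])]
    return np.hstack([acc] + responses)
```

For a three-axis recording of shape (T, 3), `augment` returns an array of shape (T, 12): the three original axes followed by three response time series per axis. A classifier would then be trained on this wider input, which corresponds to extending the data along the number of time series rather than along time.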
After the interview, the team explained that this type of exploratory research will need to be extended and developed on other datasets and settings to ascertain its general applicability to concrete cases, though these initial results are promising. “One take-home point is that this approach can be implemented with quite limited resources, either digitally or in an analog way. Our past work has in fact shown CMOS chaotic systems operating with power as low as 1 μW, which could be suitable for this usage. As optimizations of process technologies and conventional designs approach their limits, the confident exploration of radically new ideas such as this one seems necessary for continued innovation,” concludes Dr. Hiroyuki Ito, head of the unit. The methodology, results and related considerations are reported in a recent article published in the journal Chaos, Solitons and Fractals, and all of the experimental recordings have been made freely available for others to use in future work.