Deep learning models have recently achieved remarkable performance on many classification tasks. The superior performance of deep neural networks relies on large amounts of training data, which must also have a balanced class distribution for training to be effective. However, in most real-world applications, labeled data may be limited and exhibit high imbalance ratios among the classes; as a result, the learning process of most classification algorithms is adversely affected, leading to unstable predictions and low performance. Three main categories of approaches address the problem of imbalanced learning: data-level methods, algorithm-level methods, and hybrid methods that combine the two. Data generative methods are typically based on generative adversarial networks, which require significant amounts of data, while algorithm-level methods require extensive domain expertise to craft the learning objectives, making them less accessible to users without such knowledge. Moreover, the vast majority of these approaches are designed for and applied to imaging applications, fewer to time series, and extremely rarely to both. To address these issues, we introduce GENDA, a generative neighborhood-based deep autoencoder that is simple yet effective in its design and can be successfully applied to both image and time-series data. GENDA learns latent representations that rely on the neighboring embedding space of the samples. Extensive experiments conducted on a variety of widely used real datasets demonstrate the efficacy of the proposed method.
@ARTICLE{10054417,
  author={Troullinou, Eirini and Tsagkatakis, Grigorios and Losonczy, Attila and Poirazi, Panayiota and Tsakalides, Panagiotis},
  journal={IEEE Transactions on Artificial Intelligence},
  title={A Generative Neighborhood-Based Deep Autoencoder for Robust Imbalanced Classification},
  year={2024},
  volume={5},
  number={1},
  pages={80--91},
  keywords={Data models;Classification algorithms;Time series analysis;Artificial intelligence;Predictive models;Data augmentation;image data;imbalanced classification;latent space;time-series data},
  doi={10.1109/TAI.2023.3249685}
}