Yuhui Quan (School of Computer Science & Engineering, South China Univ. of Tech., Guangzhou 510006, China; Department of Mathematics, National University of Singapore, Singapore 117542), Chenglong Bao (Department of Mathematics, National University of Singapore, Singapore 117542), Hui Ji (Department of Mathematics, National University of Singapore, Singapore 117542)
Most existing dictionary learning algorithms consider a linear sparse model, which often cannot effectively characterize the nonlinear properties present in many types of visual data, e.g., dynamic texture (DT). Such nonlinear properties can be exploited by so-called kernel sparse coding. This paper proposes an equiangular kernel dictionary learning method with optimal mutual coherence to exploit the nonlinear sparsity of high-dimensional visual data. Two main issues are addressed in the proposed method: (1) coding stability for a redundant dictionary in an infinite-dimensional feature space; and (2) computational efficiency in computing the kernel matrix of high-dimensional training samples. The proposed kernel sparse coding method is applied to dynamic texture analysis, covering both local DT pattern extraction and global DT pattern characterization. The experimental results show its performance gain over existing methods.
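To make the kernel sparse coding setting above concrete, here is a minimal NumPy sketch that codes a signal against a dictionary Φ(X)A living in the kernel-induced feature space via ISTA. The RBF kernel choice, the names gaussian_kernel and kernel_sparse_code, and the ISTA solver are illustrative assumptions, not the paper's equiangular dictionary learning algorithm.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise RBF kernel between the columns of X and Y.
    d2 = (np.sum(X**2, axis=0)[:, None] + np.sum(Y**2, axis=0)[None, :]
          - 2.0 * X.T @ Y)
    return np.exp(-d2 / (2.0 * sigma**2))

def kernel_sparse_code(K, k_x, A, lam=0.1, n_iter=200):
    # ISTA for  min_c ||Phi(x) - Phi(X) A c||^2 + lam * ||c||_1,
    # expanded through the kernel so only K and k_x are needed:
    #   K   (n, n): kernel matrix of the n training samples
    #   k_x (n,)  : kernel vector k(x_i, x) of the signal to code
    #   A   (n, m): m dictionary atoms as combinations of the samples
    G = A.T @ K @ A                          # Gram matrix of the atoms
    b = A.T @ k_x                            # atom/signal correlations
    t = 1.0 / (2.0 * np.linalg.norm(G, 2))   # step from the Lipschitz constant
    c = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = c - t * 2.0 * (G @ c - b)        # gradient step on the smooth part
        c = np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)  # soft threshold
    return c
```

Note that Φ never appears explicitly: every feature-space inner product reduces to entries of K, which is also why the cost of forming the kernel matrix for high-dimensional training samples (issue (2) above) matters.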
Zihao Wang (BNRist, Department of Computer Science and Technology, RIIT, Institute of Internet Industry, Tsinghua University), Datong Zhou (Department of Mathematical Sciences, Tsinghua University), Ming Yang (Department of Computer Science and Technology, Tsinghua University), Yong Zhang (BNRist, Department of Computer Science and Technology, RIIT, Institute of Internet Industry, Tsinghua University), Chenglong Bao (Yau Mathematical Sciences Center, Tsinghua University), Hao Wu (Department of Mathematical Sciences, Tsinghua University)
Computing the distance between linguistic objects is an essential problem in natural language processing. The word mover's distance (WMD) has been successfully applied to measure document distance by synthesizing low-level word similarity within the framework of optimal transport (OT). However, due to the global transportation nature of OT, the WMD may overestimate semantic dissimilarity when documents contain unequal amounts of semantic detail. In this paper, we propose to address this overestimation issue with a novel Wasserstein-Fisher-Rao (WFR) document distance grounded in unbalanced optimal transport theory. Compared to the WMD, the WFR document distance provides a trade-off between global transportation and local truncation, which leads to a better similarity measure for documents with unequal semantic details. Moreover, an efficient pruning strategy is specifically designed for the WFR document distance to facilitate top-k queries among a large number of documents. Extensive experimental results show that the WFR document distance achieves higher accuracy than the WMD and even its supervised variant S-WMD.
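The trade-off between global transportation and local truncation can be illustrated with a generic entropic unbalanced OT solver (the scaling algorithm with KL-relaxed marginals). This is a sketch of the underlying mechanism only, not the paper's exact WFR distance or its pruning strategy; unbalanced_sinkhorn and document_distance are hypothetical names.

```python
import numpy as np

def unbalanced_sinkhorn(a, b, C, eps=0.05, tau=1.0, n_iter=500):
    # Entropic unbalanced OT with KL-relaxed marginals.  tau sets the
    # transport/truncation trade-off: tau -> inf enforces the marginals
    # exactly (WMD-like balanced OT); small tau lets mass be dropped
    # locally instead of being transported across the whole document.
    K = np.exp(-C / eps)
    u, v = np.ones_like(a), np.ones_like(b)
    rho = tau / (tau + eps)
    for _ in range(n_iter):
        u = (a / (K @ v + 1e-16)) ** rho
        v = (b / (K.T @ u + 1e-16)) ** rho
    P = u[:, None] * K * v[None, :]          # unbalanced transport plan
    return float(np.sum(P * C))              # transported cost

def document_distance(E1, w1, E2, w2, **kwargs):
    # E1, E2: (n_i, d) word embeddings; w1, w2: normalized word weights.
    C = np.linalg.norm(E1[:, None, :] - E2[None, :, :], axis=-1)
    return unbalanced_sinkhorn(w1, w2, C, **kwargs)
```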
Existing domain adaptation methods aim at learning features that generalize across domains. These methods commonly require updating the source classifier to adapt to the target domain and do not properly handle the trade-off between the source and target domains. In this work, instead of training a classifier to adapt to the target domain, we use a separable component called a data calibrator to help the fixed source classifier recover discrimination power in the target domain, while preserving the source domain's performance. When the difference between the two domains is small, the source classifier's representation is sufficient to perform well in the target domain, and our method outperforms GAN-based methods on digit datasets. Otherwise, the proposed method can leverage synthetic images generated by GANs to boost performance, achieving state-of-the-art results on digit datasets and driving-scene semantic segmentation. Our method also empirically suggests a potential connection between domain adaptation and adversarial attacks.
Code is available at https://github.com/yeshaokai/Calibrator-Domain-Adaptation
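A minimal PyTorch sketch of the calibrator idea follows: a small network perturbs target inputs while the source classifier stays frozen. The calibrator architecture, the perturbation bound eps, and the pseudo-label objective are assumptions made for illustration; the paper's actual training signal (including its GAN-based variant) may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Calibrator(nn.Module):
    # Hypothetical additive calibrator: emits a bounded perturbation that
    # is added to target images before the frozen source classifier.
    def __init__(self, ch=3, eps=0.1):
        super().__init__()
        self.eps = eps
        self.net = nn.Sequential(
            nn.Conv2d(ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, ch, 3, padding=1), nn.Tanh())

    def forward(self, x):
        return x + self.eps * self.net(x)

def train_step(f, calib, opt, x_tgt):
    # f is the fixed source classifier; opt holds only calib's parameters.
    f.eval()
    with torch.no_grad():
        pseudo = f(x_tgt).argmax(dim=1)      # assumed pseudo-labeling scheme
    loss = F.cross_entropy(f(calib(x_tgt)), pseudo)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Because only the calibrator is updated, the source classifier f is untouched and the source domain's performance is preserved by construction, matching the fixed-classifier design above.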
Linfeng Zhang (Tsinghua University; Institute for Interdisciplinary Information Core Technology), Muzhou Yu (Institute for Interdisciplinary Information Core Technology; Xi'an Jiaotong University), Tong Chen (Tsinghua University), Zuoqiang Shi (Tsinghua University), Chenglong Bao (Tsinghua University), Kaisheng Ma (Tsinghua University)
The training process is crucial for deploying networks in applications with strict requirements on both accuracy and robustness. However, most existing approaches face a dilemma: model accuracy and robustness form a trade-off, and improving one causes the other to drop. The challenge remains when we try to improve accuracy and robustness simultaneously. In this paper, we propose a novel training method that introduces auxiliary classifiers trained on corrupted samples, while clean samples are trained as usual with the primary classifier. During training, a novel distillation method named input-aware self distillation is proposed to help the primary classifier learn robust information from the auxiliary classifiers. Along with it, a new normalization method, selective batch normalization, is proposed to prevent the model from being negatively influenced by corrupted images. At the end of training, an L2-norm penalty is applied to the weights of the primary and auxiliary classifiers so that their weights become asymptotically identical. At inference, only the primary classifier is used, so no extra computation or storage is needed. Extensive experiments on CIFAR10, CIFAR100, and ImageNet show that the proposed auxiliary training yields noticeable improvements in both accuracy and robustness. On average, auxiliary training achieves a 2.21% accuracy and 21.64% robustness (measured by corruption error) improvement over traditional training methods on CIFAR100. Code has been released on GitHub.
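The interplay of the three ingredients (auxiliary classifiers on corrupted samples, self distillation, and the weight-tying L2 penalty) can be sketched as a single training loss. The weightings T, alpha, and beta and the exact distillation direction are assumptions here, and the selective batch normalization mechanism is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def auxiliary_training_loss(backbone, primary, auxiliary,
                            x_clean, x_corrupt, y,
                            T=4.0, alpha=0.5, beta=1e-3):
    # primary/auxiliary are classifier heads on a shared backbone.
    z_clean, z_corr = backbone(x_clean), backbone(x_corrupt)

    # Supervised terms: clean batch -> primary head, corrupted -> auxiliary.
    ce = (F.cross_entropy(primary(z_clean), y)
          + F.cross_entropy(auxiliary(z_corr), y))

    # Self distillation: the primary head mimics the auxiliary head's
    # softened predictions on the corrupted batch.
    distill = F.kl_div(F.log_softmax(primary(z_corr) / T, dim=1),
                       F.softmax(auxiliary(z_corr).detach() / T, dim=1),
                       reduction="batchmean") * T * T

    # L2 penalty pulling the two heads' weights together, so that only
    # the primary head needs to be kept at inference time.
    tie = sum((wp - wa).pow(2).sum()
              for wp, wa in zip(primary.parameters(),
                                auxiliary.parameters()))
    return ce + alpha * distill + beta * tie
```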
Zonghan Yang (Institute for Artificial Intelligence, Beijing National Research Center for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University), Yang Liu (Institute for Artificial Intelligence, Beijing National Research Center for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University), Chenglong Bao (Yau Mathematical Sciences Center, Tsinghua University), Zuoqiang Shi (Department of Mathematical Sciences, Tsinghua University)
Although ordinary differential equations (ODEs) provide insights for designing network architectures, their relationship with non-residual convolutional neural networks (CNNs) is still unclear. In this paper, we present a novel ODE model obtained by adding a damping term. We show that the proposed model can recover both a ResNet and a plain CNN by adjusting an interpolation coefficient; the damped ODE model therefore provides a unified framework for interpreting residual and non-residual networks. Lyapunov analysis reveals the better stability of the proposed model, which in turn yields improved robustness of the learned networks. Experiments on a number of image classification benchmarks show that the proposed model substantially improves the accuracy of ResNet and ResNeXt on inputs perturbed by both stochastic noise and adversarial attack methods. Moreover, loss-landscape analysis demonstrates the improved robustness of our method along the attack direction.
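One natural discrete reading of the damped model is the update x_{n+1} = (1 - k) x_n + F(x_n): k = 0 gives a residual (ResNet) update and k = 1 a plain non-residual one. The sketch below encodes this interpolation; the block body F and the exact form of the damping term in the paper are assumptions here.

```python
import torch
import torch.nn as nn

class DampedBlock(nn.Module):
    # Discretized damped-ODE block:  x_{n+1} = (1 - k) * x_n + F(x_n).
    # k = 0 recovers a ResNet block, k = 1 a plain CNN block, and
    # 0 < k < 1 interpolates between the two regimes.
    def __init__(self, ch, k=0.5):
        super().__init__()
        self.k = k
        self.F = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))

    def forward(self, x):
        return (1.0 - self.k) * x + self.F(x)
```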