Sort by: Year Popularity Relevance

LiteMORT: A memory efficient gradient boosting tree system on adaptive compact distributions

 - 2020

Gradient boosted decision trees (GBDT) is the leading algorithm for many commercial and academic data applications. We give a deep analysis of this algorithm, especially the histogram technique, which is a basis for the regulized distribution with compact support. We present three new modifications. 1) Share memory technique to reduce memory usag...


Snippext: Semi-supervised Opinion Mining with Augmented Data

, , ,  - 2020

Online services are interested in solutions to opinion mining, which is the problem of extracting aspects, opinions, and sentiments from text. One method to mine opinions is to leverage the recent success of pre-trained language models which can be fine-tuned to obtain high-quality extractions from reviews. However, fine-tuning language models st...


Unsupervised Anomaly Detection for X-Ray Images

, , , , , , ,  - 2020

Obtaining labels for medical (image) data requires scarce and expensive experts. Moreover, due to ambiguous symptoms, single images rarely suffice to correctly diagnose a medical condition. Instead, it often requires to take additional background information such as the patient's medical history or test results into account. Hence, instead of foc...


Exact Information Bottleneck with Invertible Neural Networks: Getting the Best of Discriminative and Generative Modeling

, , ,  - 2020

The Information Bottleneck (IB) principle offers a unified approach to many learning and prediction problems. Although optimal in an information-theoretic sense, practical applications of IB are hampered by a lack of accurate high-dimensional estimators of mutual information, its main constituent. We propose to combine IB with invertible neural n...


Unsupervised Sentiment Analysis for Code-mixed Data

,  - 2020

Code-mixing is the practice of alternating between two or more languages. Mostly observed in multilingual societies, its occurrence is increasing and therefore its importance. A major part of sentiment analysis research has been monolingual, and most of them perform poorly on code-mixed text. In this work, we introduce methods that use different ...


Reasoning on Knowledge Graphs with Debate Dynamics

, , , , ,  - 2020

We propose a novel method for automatic reasoning on knowledge graphs based on debate dynamics. The main idea is to frame the task of triple classification as a debate game between two reinforcement learning agents which extract arguments -- paths in the knowledge graph -- with the goal to promote the fact being true (thesis) or the fact being fa...


Inductive Document Network Embedding with Topic-Word Attention

, ,  - 2020

Document network embedding aims at learning representations for a structured text corpus i.e. when documents are linked to each other. Recent algorithms extend network embedding approaches by incorporating the text content associated with the nodes in their formulations. In most cases, it is hard to interpret the learned representations. Moreover...


Fine-Grained Fashion Similarity Learning by Attribute-Specific Embedding Network

, , , , , ,  - 2020

This paper strives to learn fine-grained fashion similarity. In this similarity paradigm, one should pay more attention to the similarity in terms of a specific design/attribute among fashion items, which has potential values in many fashion related applications such as fashion copyright protection. To this end, we propose an Attribute-Specific E...


Deep Optimized Multiple Description Image Coding via Scalar Quantization Learning

, , ,  - 2020

In this paper, we introduce a deep multiple description coding (MDC) framework optimized by minimizing multiple description (MD) compressive loss. First, MD multi-scale-dilated encoder network generates multiple description tensors, which are discretized by scalar quantizers, while these quantized tensors are decompressed by MD cascaded-ResBlock ...


Explainable Deep Convolutional Candlestick Learner

, , ,  - 2020

Candlesticks are graphical representations of price movements for a given period. The traders can discovery the trend of the asset by looking at the candlestick patterns. Although deep convolutional neural networks have achieved great success for recognizing the candlestick patterns, their reasoning hides inside a black box. The traders cannot ma...