Mirror operated in collaboration with local support

Computer Science

New submissions

[ total of 350 entries: 1-350 ]
[ showing up to 2000 entries per page: fewer | more ]

New submissions for Thu, 21 May 20

[1]  arXiv:2005.09634 [pdf]
Title: Automated Copper Alloy Grain Size Evaluation Using a Deep-learning CNN
Subjects: Computer Vision and Pattern Recognition (cs.CV); Materials Science (cond-mat.mtrl-sci); Machine Learning (cs.LG); Machine Learning (stat.ML)

Moog Inc. has automated the evaluation of copper (Cu) alloy grain size using a deep-learning convolutional neural network (CNN). The proof-of-concept automated image acquisition and batch-wise image processing offers the potential for significantly reduced labor, improved accuracy of grain evaluation, and decreased overall turnaround times for approving Cu alloy bar stock for use in flight critical aircraft hardware. A classification accuracy of 91.1% on individual sub-images of the Cu alloy coupons was achieved. Process development included minimizing the variation in acquired image color, brightness, and resolution to create a dataset with 12300 sub-images, and then optimizing the CNN hyperparameters on this dataset using statistical design of experiments (DoE).
Over the development of the automated Cu alloy grain size evaluation, a degree of "explainability" in the artificial intelligence (XAI) output was realized, based on the decomposition of the large raw images into many smaller dataset sub-images, through the ability to explain the CNN ensemble image output via inspection of the classification results from the individual smaller sub-images.

[2]  arXiv:2005.09635 [pdf, other]
Title: InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs
Comments: 18 pages, 18 figures, 5 tables. arXiv admin note: text overlap with arXiv:1907.10786
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Although Generative Adversarial Networks (GANs) have made significant progress in face synthesis, there lacks enough understanding of what GANs have learned in the latent representation to map a randomly sampled code to a photo-realistic face image. In this work, we propose a framework, called InterFaceGAN, to interpret the disentangled face representation learned by the state-of-the-art GAN models and thoroughly analyze the properties of the facial semantics in the latent space. We first find that GANs actually learn various semantics in some linear subspaces of the latent space when being trained to synthesize high-quality faces. After identifying the subspaces of the corresponding latent semantics, we are able to realistically manipulate the facial attributes occurring in the synthesized images without retraining the model. We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection, resulting in more precise control of the attribute manipulation. Besides manipulating gender, age, expression, and the presence of eyeglasses, we can even alter the face pose as well as fix the artifacts accidentally generated by GANs. Furthermore, we perform in-depth face identity analysis and layer-wise analysis to quantitatively evaluate the editing results. Finally, we apply our approach to real face editing by involving GAN inversion approaches as well as explicitly training additional feed-forward models based on the synthetic data established by InterFaceGAN. Extensive experimental results suggest that learning to synthesize faces spontaneously brings a disentangled and controllable face representation.

[3]  arXiv:2005.09639 [pdf]
Title: Webpage Segmentation for Extracting Images and Their Surrounding Contextual Information
Comments: arXiv admin note: substantial text overlap with arXiv:2005.02156
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)

Web images come in hand with valuable contextual information. Although this information has long been mined for various uses such as image annotation, clustering of images, inference of image semantic content, etc., insufficient attention has been given to address issues in mining this contextual information. In this paper, we propose a webpage segmentation algorithm targeting the extraction of web images and their contextual information based on their characteristics as they appear on webpages. We conducted a user study to obtain a human-labeled dataset to validate the effectiveness of our method and experiments demonstrated that our method can achieve better results compared to an existing segmentation algorithm.

[4]  arXiv:2005.09643 [pdf]
Title: An Innovative Approach to Determine Rebar Depth and Size by Comparing GPR Data with a Theoretical Database
Comments: 10 pages
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Ground penetrating radar (GPR) is an efficient technique used for rapidly recognizing embedded rebar in concrete structures. However, due to the difficulty in extracting signals from GPR data and the intrinsic coupling between the rebar depth and size showing in the data, simultaneously determining rebar depth and size is challenging. This paper proposes an innovative algorithm to address this issue. First, the hyperbola signal from the GPR data is identified by direct wave removal, signal reconstruction and separation. Subsequently, a database is developed from a series of theoretical hyperbolas and then compared with the extracted hyperbola outlines. Finally, the rebar depth and size are determined by searching for the closest counterpart in the database. The obtained results are very promising and indicate that: (1) implementing the method presented in this paper can completely remove the direct wave noise from the GPR data, and can successfully extract the outlines from the interlaced hyperbolas; and (2) the proposed method can simultaneously determine the rebar depth and size with the accuracy of 100% and 95.11%, respectively.

[5]  arXiv:2005.09645 [pdf, other]
Title: The Second Type of Uncertainty in Monte Carlo Tree Search
Comments: arXiv admin note: text overlap with arXiv:1805.09218
Subjects: Artificial Intelligence (cs.AI)

Monte Carlo Tree Search (MCTS) efficiently balances exploration and exploitation in tree search based on count-derived uncertainty. However, these local visit counts ignore a second type of uncertainty induced by the size of the subtree below an action. We first show how, due to the lack of this second uncertainty type, MCTS may completely fail in well-known sparse exploration problems, known from the reinforcement learning community. We then introduce a new algorithm, which estimates the size of the subtree below an action, and leverages this information in the UCB formula to better direct exploration. Subsequently, we generalize these ideas by showing that loops, i.e., the repeated occurrence of (approximately) the same state in the same trace, are actually a special case of subtree depth variation. Testing on a variety of tasks shows that our algorithms increase sample efficiency, especially when the planning budget per timestep is small.

[6]  arXiv:2005.09649 [pdf, other]
Title: Embeddings-Based Clustering for Target Specific Stances: The Case of a Polarized Turkey
Comments: arXiv admin note: text overlap with arXiv:1909.10213
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Computers and Society (cs.CY)

On June 24, 2018, Turkey conducted a highly consequential election in which the Turkish people elected their president and parliament in the first election under a new presidential system. During the election period, the Turkish people extensively shared their political opinions on Twitter. One aspect of polarization among the electorate was support for or opposition to the reelection of Recep Tayyip Erdo\u{g}an. In this paper, we present an unsupervised method for target-specific stance detection in a polarized setting, specifically Turkish politics, achieving 90% precision in identifying user stances, while maintaining more than 80% recall. The method involves representing users in an embedding space using Google's Convolutional Neural Network (CNN) based multilingual universal sentence encoder. The representations are then projected onto a lower dimensional space in a manner that reflects similarities and are consequently clustered. We show the effectiveness of our method in properly clustering users of divergent groups across multiple targets that include political figures, different groups, and parties. We perform our analysis on a large dataset of 108M Turkish election-related tweets along with the timeline tweets of 168k Turkish users, who authored 213M tweets. Given the resultant user stances, we are able to observe correlations between topics and compute topic polarization.

[7]  arXiv:2005.09674 [pdf, other]
Title: Tensor completion via nonconvex tensor ring rank minimization with guaranteed convergence
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In recent studies, the tensor ring (TR) rank has shown high effectiveness in tensor completion due to its ability of capturing the intrinsic structure within high-order tensors. A recently proposed TR rank minimization method is based on the convex relaxation by penalizing the weighted sum of nuclear norm of TR unfolding matrices. However, this method treats each singular value equally and neglects their physical meanings, which usually leads to suboptimal solutions in practice. In this paper, we propose to use the logdet-based function as a nonconvex smooth relaxation of the TR rank for tensor completion, which can more accurately approximate the TR rank and better promote the low-rankness of the solution. To solve the proposed nonconvex model efficiently, we develop an alternating direction method of multipliers algorithm and theoretically prove that, under some mild assumptions, our algorithm converges to a stationary point. Extensive experiments on color images, multispectral images, and color videos demonstrate that the proposed method outperforms several state-of-the-art competitors in both visual and quantitative comparison. Key words: nonconvex optimization, tensor ring rank, logdet function, tensor completion, alternating direction method of multipliers.

[8]  arXiv:2005.09679 [pdf, other]
Title: Boussinesq-Peregrine water wave models and their numerical approximation
Subjects: Numerical Analysis (math.NA)

In this paper we consider the numerical solution of Boussinesq-Peregrine type systems by the application of the Galerkin finite element method. The structure of the Boussinesq systems is explained and certain alternative nonlinear and dispersive terms are compared. A detailed study of the convergence properties of the standard Galerkin method, using various finite element spaces on unstructured triangular grids, is presented.
Along with the study of the Peregrine system, a new Boussinesq system of BBM-BBM type is derived. The new system has the same structure in its momentum equation but differs slightly in the mass conservation equation compared to the Peregrine system. Further, the finite element method applied to the new system has better convergence properties, when used for its numerical approximation.
Due to the lack of analytical formulas for solitary wave solutions for the systems under consideration, a Galerkin finite element method combined with the Petviashvili iteration is proposed for the numerical generation of accurate approximations of line solitary waves. Various numerical experiments related to the propagation of solitary and periodic waves over variable bottom topography and their interaction with the boundaries of the domains are presented. We conclude that both systems have similar accuracy when approximate long waves of small amplitude while the Galerkin finite element method is more effective when applied to BBM-BBM type systems.

[9]  arXiv:2005.09681 [pdf, other]
Title: Representation Learning with Fine-grained Patterns
Subjects: Computer Vision and Pattern Recognition (cs.CV)

With the development of computational power and techniques for data collection, deep learning demonstrates a superior performance over many existing algorithms on benchmark data sets. Many efforts have been devoted to studying the mechanism of deep learning. One of important observations is that deep learning can learn the discriminative patterns from raw materials directly in a task-dependent manner. It makes the patterns obtained by deep learning outperform hand-crafted features significantly. However, those patterns can be misled by the training task when the target task is different. In this work, we investigate a prevalent problem in real-world applications, where the training set only accesses to the supervised information from superclasses but the target task is defined on fine-grained classes. Each superclass can contain multiple fine-grained classes. In this scenario, fine-grained patterns are essential to classify examples from fine-grained classes while they can be neglected when training only with labels from superclasses. To mitigate the challenge, we propose the algorithm to explore the fine-grained patterns sufficiently without additional supervised information. Besides, our analysis indicates that the performance of learned patterns on the fine-grained classes can be theoretically guaranteed. Finally, an efficient algorithm is developed to reduce the cost of optimization. The experiments on real-world data sets verify that the propose algorithm can significantly improve the performance on the fine-grained classes with information from superclasses only.

[10]  arXiv:2005.09683 [pdf, other]
Title: Neural Collaborative Filtering vs. Matrix Factorization Revisited
Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG); Machine Learning (stat.ML)

Embedding based models have been the state of the art in collaborative filtering for over a decade. Traditionally, the dot product or higher order equivalents have been used to combine two or more embeddings, e.g., most notably in matrix factorization. In recent years, it was suggested to replace the dot product with a learned similarity e.g. using a multilayer perceptron (MLP). This approach is often referred to as neural collaborative filtering (NCF). In this work, we revisit the experiments of the NCF paper that popularized learned similarities using MLPs. First, we show that with a proper hyperparameter selection, a simple dot product substantially outperforms the proposed learned similarities. Second, while a MLP can in theory approximate any function, we show that it is non-trivial to learn a dot product with an MLP. Finally, we discuss practical issues that arise when applying MLP based similarities and show that MLPs are too costly to use for item recommendation in production environments while dot products allow to apply very efficient retrieval algorithms. We conclude that MLPs should be used with care as embedding combiner and that dot products might be a better default choice.

[11]  arXiv:2005.09704 [pdf, other]
Title: Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting
Comments: CVPR 2020 oral paper. 22 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

Recently data-driven image inpainting methods have made inspiring progress, impacting fundamental image editing tasks such as object removal and damaged image repairing. These methods are more effective than classic approaches, however, due to memory limitations they can only handle low-resolution inputs, typically smaller than 1K. Meanwhile, the resolution of photos captured with mobile devices increases up to 8K. Naive up-sampling of the low-resolution inpainted result can merely yield a large yet blurry result. Whereas, adding a high-frequency residual image onto the large blurry image can generate a sharp result, rich in details and textures. Motivated by this, we propose a Contextual Residual Aggregation (CRA) mechanism that can produce high-frequency residuals for missing contents by weighted aggregating residuals from contextual patches, thus only requiring a low-resolution prediction from the network. Since convolutional layers of the neural network only need to operate on low-resolution inputs and outputs, the cost of memory and computing power is thus well suppressed. Moreover, the need for high-resolution training datasets is alleviated. In our experiments, we train the proposed model on small images with resolutions 512x512 and perform inference on high-resolution images, achieving compelling inpainting quality. Our model can inpaint images as large as 8K with considerable hole sizes, which is intractable with previous learning-based approaches. We further elaborate on the light-weight design of the network architecture, achieving real-time performance on 2K images on a GTX 1080 Ti GPU. Codes are available at: Atlas200dk/sample-imageinpainting-HiFill.

[12]  arXiv:2005.09714 [pdf, other]
Title: The Lazarus Effect: Healing Compromised Devices in the Internet of Small Things
Comments: In Proceedings of the 15th ACM Asia Conference on Computer and Communications Security (ASIA CCS 20)
Subjects: Cryptography and Security (cs.CR)

We live in a time when billions of IoT devices are being deployed and increasingly relied upon. This makes ensuring their availability and recoverability in case of a compromise a paramount goal. The large and rapidly growing number of deployed IoT devices make manual recovery impractical, especially if the devices are dispersed over a large area. Thus, there is a need for a reliable and scalable remote recovery mechanism that works even after attackers have taken full control over devices, possibly misusing them or trying to render them useless.
To tackle this problem, we present Lazarus, a system that enables the remote recovery of compromised IoT devices. With Lazarus, an IoT administrator can remotely control the code running on IoT devices unconditionally and within a guaranteed time bound. This makes recovery possible even in case of severe corruption of the devices' software stack. We impose only minimal hardware requirements, making Lazarus applicable even for low-end constrained off-the-shelf IoT devices. We isolate Lazarus's minimal recovery trusted computing base from untrusted software both in time and by using a trusted execution environment. The temporal isolation prevents secrets from being leaked through side-channels to untrusted software. Inside the trusted execution environment, we place minimal functionality that constrains untrusted software at runtime.
We implement Lazarus on an ARM Cortex-M33-based microcontroller in a full setup with an IoT hub, device provisioning and secure update functionality. Our prototype can recover compromised embedded OSs and bare-metal applications and prevents attackers from bricking devices, for example, through flash wear out. We show this at the example of FreeRTOS, which requires no modifications but only a single additional task. Our evaluation shows negligible runtime performance impact and moderate memory requirements.

[13]  arXiv:2005.09721 [pdf, ps, other]
Title: Motif Discovery Algorithms in Static and Temporal Networks: A Survey
Subjects: Social and Information Networks (cs.SI); Combinatorics (math.CO); Physics and Society (physics.soc-ph)

Motifs are the fundamental components of complex systems. The topological structure of networks representing complex systems and the frequency and distribution of motifs in these networks are intertwined. The complexities associated with graph and subgraph isomorphism problems, as the core of frequent subgraph mining, have direct impacts on the performance of motif discovery algorithms. To cope with these complexities, researchers have adopted different strategies for candidate generation and enumeration, and frequency computation. In the past few years, there has been an increasing interest in the analysis and mining of temporal networks. These networks, in contrast to their static counterparts, change over time in the form of insertion, deletion, or substitution of edges or vertices or their attributes. In this paper, we provide a survey of motif discovery algorithms proposed in the literature for mining static and temporal networks and review the corresponding algorithms based on their adopted strategies for candidate generation and frequency computation. As we witness the generation of a large amount of network data in social media platforms, bioinformatics applications, and communication and transportation networks and the advance in distributed computing and big data technology, we also conduct a survey on the algorithms proposed to resolve the CPU-bound and I/O bound problems in mining static and temporal networks.

[14]  arXiv:2005.09723 [pdf, ps, other]
Title: High Velocity Kernel File Systems with Bento
Authors: Samantha Miller (1), Kaiyuan Zhang (1), Danyang Zhuo (2), Tom Anderson (1) ((1) University of Washington, (2) Duke University)
Comments: 14 pages, 4 figures
Subjects: Operating Systems (cs.OS)

High development velocity is critical for modern cloud systems. However, rapid development and release cycles have mostly skipped operating systems. Modifications to behavior in Linux, the most widely used server operating system in the cloud, must be done slowly to minimize risk of introducing bugs, be limited in scope, or be implemented in userspace with a potential performance penalty. We propose Bento, a framework for high velocity development of Linux kernel file systems. Bento is inspired by the recent availability of type-safe, non-garbage collected languages like Rust. It interposes a thin layer between kernel calls to the file system and file system calls back to the kernel, exposing alternative interfaces to enable kernel file systems written in safe Rust. Future work will provide support for online upgrades, userspace debugging, and composable filesystems. We evaluate Bento by using it to implement the xv6 file system and comparing against baselines written using the kernel VFS layer and FUSE. We find that the Bento filesystem achieves comparable performance to the VFS version and much better performance than the FUSE version. We also evaluate against ext4 on the macrobenchmarks and find that ext4 performs between 33% and 3.2x better than the Bento xv6 file system.

[15]  arXiv:2005.09724 [pdf, other]
Title: Scheduling Flows on a Switch to Optimize Response Times
Subjects: Data Structures and Algorithms (cs.DS)

We study the scheduling of flows on a switch with the goal of optimizing metrics related to the response time of the flows. The input to the problem is a sequence of flow requests on a switch, where the switch is represented by a bipartite graph with a capacity on each vertex (or port), and a flow request is an edge with associated demand. In each round, a subset of edges can be scheduled subject to the constraint that the total demand of the scheduled edges incident on any vertex is at most the capacity of the vertex. Previous work has essentially settled the complexity of metrics based on {\em completion time}. The objective of average or maximum {\em response time}, however, is much more challenging. To the best of our knowledge, there are no prior approximation algorithms results for these metrics in the context of flow scheduling.
We present the first approximation algorithms for flow scheduling over a switch to optimize response time based metrics. For the average response time metric, whose NP-hardness follows directly from past work, we present an offline $O(1 + O(\log(n))/c)$ approximation algorithm for unit flows, assuming that the port capacities of the switch can be increased by a factor of $1 + c$, for any given positive integer $c$. For the maximum response time metric, we first establish that it is NP-hard to achieve an approximation factor of better than 4/3 without augmenting capacity. We then present an offline algorithm that achieves {\em optimal maximum response time}, assuming the capacity of each port is increased by at most $2 d_{max} - 1$, where $d_{max}$ is the maximum demand of any flow. Both algorithms are based on linear programming relaxations. We also study the online version of flow scheduling using the lens of competitive analysis, and present preliminary results along with experiments that evaluate the performance of fast online heuristics.

[16]  arXiv:2005.09725 [pdf, other]
Title: Inverse problems with second-order Total Generalized Variation constraints
Comments: Published in 2011 as a conference proceeding. Uploaded in 2020 on arXiv to ensure availability: the original proceedings are no longer online
Journal-ref: Proceedings of the 9th International Conference on Sampling Theory and Applications (SampTA) 2011, Singapore (2011)
Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV)

Total Generalized Variation (TGV) has recently been introduced as penalty functional for modelling images with edges as well as smooth variations. It can be interpreted as a "sparse" penalization of optimal balancing from the first up to the $k$-th distributional derivative and leads to desirable results when applied to image denoising, i.e., $L^2$-fitting with TGV penalty. The present paper studies TGV of second order in the context of solving ill-posed linear inverse problems. Existence and stability for solutions of Tikhonov-functional minimization with respect to the data is shown and applied to the problem of recovering an image from blurred and noisy data.

[17]  arXiv:2005.09726 [pdf, other]
Title: Mmwave Beam Management in Urban Vehicular Networks
Journal-ref: IEEE Systems Journal, 2020
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)

Millimeter-wave (mmwave) communication represents a potential solution to capacity shortage in vehicular networks. However, effective beam alignment between senders and receivers requires accurate knowledge of the vehicles' position for fast beam steering, which is often impractical to obtain in real time. We address this problem by leveraging the traffic signals regulating vehicular mobility: as an example, we may coordinate beams with red traffic lights, as they correspond to higher vehicle densities and lower speeds. To evaluate our intuition, we propose a tractable, yet accurate, mmwave communication model accounting for both the distance and the heading of vehicles being served. Using such a model, we optimize the beam design and define a low-complexity, heuristic strategy. For increased realism, we consider as reference scenario a large-scale, real-world mobility trace of vehicles in Luxembourg. The results show that our approach closely matches the optimum and always outperforms static beam design based on road topology alone. Remarkably, it also yields better performance than solutions based on real-time mobility information.

[18]  arXiv:2005.09727 [pdf, other]
Title: Ventral-Dorsal Neural Networks: Object Detection via Selective Attention
Comments: in Proceedings of WACV. arXiv admin note: substantial text overlap with arXiv:2005.07787
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Deep Convolutional Neural Networks (CNNs) have been repeatedly proven to perform well on image classification tasks. Object detection methods, however, are still in need of significant improvements. In this paper, we propose a new framework called Ventral-Dorsal Networks (VDNets) which is inspired by the structure of the human visual system. Roughly, the visual input signal is analyzed along two separate neural streams, one in the temporal lobe and the other in the parietal lobe. The coarse functional distinction between these streams is between object recognition -- the "what" of the signal -- and extracting location related information -- the "where" of the signal. The ventral pathway from primary visual cortex, entering the temporal lobe, is dominated by "what" information, while the dorsal pathway, into the parietal lobe, is dominated by "where" information. Inspired by this structure, we propose the integration of a "Ventral Network" and a "Dorsal Network", which are complementary. Information about object identity can guide localization, and location information can guide attention to relevant image regions, improving object recognition. This new dual network framework sharpens the focus of object detection. Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches on PASCAL VOC 2007 by 8% (mAP) and PASCAL VOC 2012 by 3% (mAP). Moreover, a comparison of techniques on Yearbook images displays substantial qualitative and quantitative benefits of VDNet.

[19]  arXiv:2005.09740 [pdf]
Title: GLEAKE: Global and Local Embedding Automatic Keyphrase Extraction
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

Automated methods for granular categorization of large corpora of text documents have become increasingly more important with the rate scientific, news, medical, and web documents are growing in the last few years. Automatic keyphrase extraction (AKE) aims to automatically detect a small set of single or multi-words from within a single textual document that captures the main topics of the document. AKE plays an important role in various NLP and information retrieval tasks such as document summarization and categorization, full-text indexing, and article recommendation. Due to the lack of sufficient human-labeled data in different textual contents, supervised learning approaches are not ideal for automatic detection of keyphrases from the content of textual bodies. With the state-of-the-art advances in text embedding techniques, NLP researchers have focused on developing unsupervised methods to obtain meaningful insights from raw datasets. In this work, we introduce Global and Local Embedding Automatic Keyphrase Extractor (GLEAKE) for the task of AKE. GLEAKE utilizes single and multi-word embedding techniques to explore the syntactic and semantic aspects of the candidate phrases and then combines them into a series of embedding-based graphs. Moreover, GLEAKE applies network analysis techniques on each embedding-based graph to refine the most significant phrases as a final set of keyphrases. We demonstrate the high performance of GLEAKE by evaluating its results on five standard AKE datasets from different domains and writing styles and by showing its superiority with regards to other state-of-the-art methods.

[20]  arXiv:2005.09741 [pdf, other]
Title: Data-driven feedback stabilization of nonlinear systems: Koopman-based model predictive control
Comments: 11 pages
Subjects: Systems and Control (eess.SY)

In this work, a predictive control framework is presented for feedback stabilization of nonlinear systems. To achieve this, we integrate Koopman operator theory with Lyapunov-based model predictive control (LMPC). The main idea is to transform nonlinear dynamics from state-space to function space using Koopman eigenfunctions - for control affine systems this results in a bilinear model in the (lifted) function space. Then, a predictive controller is formulated in Koopman eigenfunction coordinates which uses an auxiliary Control Lyapunov Function (CLF) based bounded controller as a constraint to ensure stability of the Koopman system in the function space. Provided there exists a continuously differentiable inverse mapping between the original state-space and (lifted) function space, we show that the designed controller is capable of translating the feedback stabilizability of the Koopman bilinear system to the original nonlinear system. Remarkably, the feedback control design proposed in this work remains completely data-driven and does not require any explicit knowledge of the original system. Furthermore, due to the bilinear structure of the Koopman model, seeking a CLF is no longer a bottleneck for LMPC. Benchmark numerical examples demonstrate the utility of the proposed feedback control design.

[21]  arXiv:2005.09743 [pdf, ps, other]
Title: On Evaluating Weakly Supervised Action Segmentation Methods
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Action segmentation is the task of temporally segmenting every frame of an untrimmed video. Weakly supervised approaches to action segmentation, especially from transcripts have been of considerable interest to the computer vision community. In this work, we focus on two aspects of the use and evaluation of weakly supervised action segmentation approaches that are often overlooked: the performance variance over multiple training runs and the impact of selecting feature extractors for this task.To tackle the first problem, we train each method on the Breakfast dataset 5 times and provide average and standard deviation of the results. Our experiments show that the standard deviation over these repetitions is between 1 and 2.5% and significantly affects the comparison between different approaches. Furthermore, our investigation on feature extraction shows that, for the studied weakly-supervised action segmentation methods, higher-level I3D features perform worse than classical IDT features.

[22]  arXiv:2005.09745 [pdf, ps, other]
Title: Optimal Resource Allocation for Elastic and Inelastic Jobs
Subjects: Performance (cs.PF)

Modern data centers are tasked with processing heterogeneous workloads consisting of various classes of jobs. These classes differ in their arrival rates, size distributions, and job parallelizability. With respect to paralellizability, some jobs are elastic, meaning they can parallelize linearly across many servers. Other jobs are inelastic, meaning they can only run on a single server. Although job classes can differ drastically, they are typically forced to share a single cluster. When sharing a cluster among heterogeneous jobs, one must decide how to allocate servers to each job at every moment in time. In this paper, we design and analyze allocation policies which aim to minimize the mean response time across jobs, where a job's response time is the time from when it arrives until it completes.
We model this problem in a stochastic setting where each job may be elastic or inelastic. Job sizes are drawn from exponential distributions, but are unknown to the system. We show that, in the common case where elastic jobs are larger on average than inelastic jobs, the optimal allocation policy is Inelastic-First, giving inelastic jobs preemptive priority over elastic jobs. We obtain this result by introducing a novel sample path argument. We also show that there exist cases where Elastic-First (giving priority to elastic jobs) performs better than Inelastic-First. We then provide the first analysis of mean response time under both Elastic-First and Inelastic-First by leveraging recent techniques for solving high-dimensional Markov chains.

[23]  arXiv:2005.09747 [pdf]
Title: An Efficient Machine-Learning Approach for PDF Tabulation in Turbulent Combustion Closure
Journal-ref: Combustion Science and Technology, 1-20 (2019)
Subjects: Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG); Fluid Dynamics (physics.flu-dyn)

Probability density function (PDF) based turbulent combustion modelling is limited by the need to store multi-dimensional PDF tables that can take up large amounts of memory. A significant saving in storage can be achieved by using various machine-learning techniques that represent the thermo-chemical quantities of a PDF table using mathematical functions. These functions can be computationally more expensive than the existing interpolation methods used for thermo-chemical quantities. More importantly, the training time can amount to a considerable portion of the simulation time. In this work, we address these issues by introducing an adaptive training algorithm that relies on multi-layer perception (MLP) neural networks for regression and self-organizing maps (SOMs) for clustering data to tabulate using different networks. The algorithm is designed to address both the multi-dimensionality of the PDF table as well as the computational efficiency of the proposed algorithm. SOM clustering divides the PDF table into several parts based on similarities in data. Each cluster of data is trained using an MLP algorithm on simple network architectures to generate local functions for thermo-chemical quantities. The algorithm is validated for the so-called DLR-A turbulent jet diffusion flame using both RANS and LES simulations and the results of the PDF tabulation are compared to the standard linear interpolation method. The comparison yields a very good agreement between the two tabulation techniques and establishes the MLP-SOM approach as a viable method for PDF tabulation.

[24]  arXiv:2005.09748 [pdf, other]
Title: The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework
Subjects: Hardware Architecture (cs.AR)

Computers continue to diversify with respect to system designs, emerging memory technologies, and application memory demands. Unfortunately, continually adapting the conventional virtual memory framework to each possible system configuration is challenging, and often results in performance loss or requires non-trivial workarounds. To address these challenges, we propose a new virtual memory framework, the Virtual Block Interface (VBI). We design VBI based on the key idea that delegating memory management duties to hardware can reduce the overheads and software complexity associated with virtual memory. VBI introduces a set of variable-sized virtual blocks (VBs) to applications. Each VB is a contiguous region of the globally-visible VBI address space, and an application can allocate each semantically meaningful unit of information (e.g., a data structure) in a separate VB. VBI decouples access protection from memory allocation and address translation. While the OS controls which programs have access to which VBs, dedicated hardware in the memory controller manages the physical memory allocation and address translation of the VBs. This approach enables several architectural optimizations to (1) efficiently and flexibly cater to different and increasingly diverse system configurations, and (2) eliminate key inefficiencies of conventional virtual memory. We demonstrate the benefits of VBI with two important use cases: (1) reducing the overheads of address translation (for both native execution and virtual machine environments), as VBI reduces the number of translation requests and associated memory accesses; and (2) two heterogeneous main memory architectures, where VBI increases the effectiveness of managing fast memory regions. For both cases, VBI significanttly improves performance over conventional virtual memory.

[25]  arXiv:2005.09752 [pdf, other]
Title: Learning Representations using Spectral-Biased Random Walks on Graphs
Comments: Accepted at IJCNN 2020: International Joint Conference on Neural Networks
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI); Machine Learning (stat.ML)

Several state-of-the-art neural graph embedding methods are based on short random walks (stochastic processes) because of their ease of computation, simplicity in capturing complex local graph properties, scalability, and interpretibility. In this work, we are interested in studying how much a probabilistic bias in this stochastic process affects the quality of the nodes picked by the process. In particular, our biased walk, with a certain probability, favors movement towards nodes whose neighborhoods bear a structural resemblance to the current node's neighborhood. We succinctly capture this neighborhood as a probability measure based on the spectrum of the node's neighborhood subgraph represented as a normalized laplacian matrix. We propose the use of a paragraph vector model with a novel Wasserstein regularization term. We empirically evaluate our approach against several state-of-the-art node embedding techniques on a wide variety of real-world datasets and demonstrate that our proposed method significantly improves upon existing methods on both link prediction and node classification tasks.

[26]  arXiv:2005.09755 [pdf, other]
Title: Adapting a Kidney Exchange Algorithm to Align with Human Values
Journal-ref: Artificial Intelligence 283 (2020) 103261
Subjects: Artificial Intelligence (cs.AI)

The efficient and fair allocation of limited resources is a classical problem in economics and computer science. In kidney exchanges, a central market maker allocates living kidney donors to patients in need of an organ. Patients and donors in kidney exchanges are prioritized using ad-hoc weights decided on by committee and then fed into an allocation algorithm that determines who gets what--and who does not. In this paper, we provide an end-to-end methodology for estimating weights of individual participant profiles in a kidney exchange. We first elicit from human subjects a list of patient attributes they consider acceptable for the purpose of prioritizing patients (e.g., medical characteristics, lifestyle choices, and so on). Then, we ask subjects comparison queries between patient profiles and estimate weights in a principled way from their responses. We show how to use these weights in kidney exchange market clearing algorithms. We then evaluate the impact of the weights in simulations and find that the precise numerical values of the weights we computed matter little, other than the ordering of profiles that they imply. However, compared to not prioritizing patients at all, there is a significant effect, with certain classes of patients being (de)prioritized based on the human-elicited value judgments.

[27]  arXiv:2005.09758 [pdf, other]
Title: Predictor Antenna Systems: Exploiting Channel State Information for Vehicle Communications
Authors: Hao Guo
Comments: Licentiate thesis, Chalmers University of Technology
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

Vehicle communication is one of the most important use cases in the fifth generation of wireless networks (5G). The growing demand for quality of service (QoS) characterized by performance metrics, such as spectrum efficiency, peak data rate, and outage probability, is mainly limited by inaccurate prediction/estimation of channel state information (CSI) of the rapidly changing environment around moving vehicles. One way to increase the prediction horizon of CSI in order to improve the QoS is deploying predictor antennas (PAs). A PA system consists of two sets of antennas typically mounted on the roof of a vehicle, where the PAs positioned at the front of the vehicle are used to predict the CSI observed by the receive antennas (RAs) that are aligned behind the PAs. In realistic PA systems, however, the actual benefit is affected by a variety of factors, including spatial mismatch, antenna utilization, temporal correlation of scattering environment, and CSI estimation error. This thesis investigates different resource allocation schemes for the PA systems under practical constraints.

[28]  arXiv:2005.09774 [pdf, other]
Title: Weak and Semi-Contraction Theory with Application to Network Systems
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)

We develop two generalizations of contraction theory: semi-contraction and weak-contraction theory. First, using the notion of semi-norm, we propose a geometric framework for semi-contraction theory. We introduce matrix semi-measures, characterize their properties, and provide techniques to compute them. We also propose two optimization problems for the spectral abscissa of a given matrix using its semi-measures. For dynamical systems, we use the semi-measure of their Jacobian to study convergence of their trajectories to invariant subspaces. Second, we provide a comprehensive treatment of weakly-contracting systems; we prove a dichotomy for asymptotic behavior of their trajectories and show that, for their equilibrium points, local asymptotic stability implies global asymptotic stability. Third, we introduce the class of doubly-contracting systems and show that every trajectory of a doubly-contracting system converges to an equilibrium point. Finally, we apply our results to various important network systems including affine averaging and affine flow systems, continuous-time distributed primal-dual algorithms, and networks of diffusively-coupled oscillators.

[29]  arXiv:2005.09781 [pdf, other]
Title: SwarmControl: An Automated Distributed Control Framework for Self-Optimizing Drone Networks
Subjects: Networking and Internet Architecture (cs.NI)

Networks of Unmanned Aerial Vehicles (UAVs), composed of hundreds, possibly thousands of highly mobile and wirelessly connected flying drones will play a vital role in future Internet of Things (IoT) and 5G networks. However, how to control UAV networks in an automated and scalable fashion in distributed, interference-prone, and potentially adversarial environments is still an open research problem. This article introduces SwarmControl, a new software-defined control framework for UAV wireless networks based on distributed optimization principles. In essence, SwarmControl provides the Network Operator (NO) with a unified centralized abstraction of the networking and flight control functionalities. High-level control directives are then automatically decomposed and converted into distributed network control actions that are executed through programmable software-radio protocol stacks. SwarmControl (i) constructs a network control problem representation of the directives of the NO; (ii) decomposes it into a set of distributed sub-problems; and (iii) automatically generates numerical solution algorithms to be executed at individual UAVs. We present a prototype of an SDR-based, fully reconfigurable UAV network platform that implements the proposed control framework, based on which we assess the effectiveness and flexibility of SwarmControl with extensive flight experiments. Results indicate that the SwarmControl framework enables swift reconfiguration of the network control functionalities, and it can achieve an average throughput gain of 159% compared to the state-of-the-art solutions.

[30]  arXiv:2005.09784 [pdf, other]
Title: Images and Misinformation in Political Groups: Evidence from WhatsApp in India
Subjects: Social and Information Networks (cs.SI); Computers and Society (cs.CY)

WhatsApp is a key medium for the spread of news and rumors, often shared as images. We study a large collection of politically-oriented WhatsApp groups in India, focusing on the period leading up to the 2019 Indian national elections. By labeling samples of random and popular images, we find that around 13% of shared images are known misinformation and most fall into three types of images. Machine learning methods can be used to predict whether a viral image is misinformation, but are brittle to shifts in content over time.

[31]  arXiv:2005.09787 [pdf, other]
Title: Self-Updating Models with Error Remediation
Comments: 17 pages, 13 figures, published in the proceedings of the Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications II conference in the SPIE Defense + Commercial Sensing, 2020 symposium
Journal-ref: Proc. SPIE 11413, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications II, 114131W (18 May 2020)
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Many environments currently employ machine learning models for data processing and analytics that were built using a limited number of training data points. Once deployed, the models are exposed to significant amounts of previously-unseen data, not all of which is representative of the original, limited training data. However, updating these deployed models can be difficult due to logistical, bandwidth, time, hardware, and/or data sensitivity constraints. We propose a framework, Self-Updating Models with Error Remediation (SUMER), in which a deployed model updates itself as new data becomes available. SUMER uses techniques from semi-supervised learning and noise remediation to iteratively retrain a deployed model using intelligently-chosen predictions from the model as the labels for new training iterations. A key component of SUMER is the notion of error remediation as self-labeled data can be susceptible to the propagation of errors. We investigate the use of SUMER across various data sets and iterations. We find that self-updating models (SUMs) generally perform better than models that do not attempt to self-update when presented with additional previously-unseen data. This performance gap is accentuated in cases where there is only limited amounts of initial training data. We also find that the performance of SUMER is generally better than the performance of SUMs, demonstrating a benefit in applying error remediation. Consequently, SUMER can autonomously enhance the operational capabilities of existing data processing systems by intelligently updating models in dynamic environments.

[32]  arXiv:2005.09790 [pdf, other]
Title: Layer 2 Atomic Cross-Blockchain Function Calls
Comments: 8 pages, 5 figures
Subjects: Cryptography and Security (cs.CR)

The Layer 2 Atomic Cross-Blockchain Function Calls protocol allows composable programming across Ethereum blockchains. It allows for inter-contract and inter-blockchain function calls that are both synchronous and atomic: if one part fails, the whole call graph of function calls is rolled back. The only existing atomic cross-blockchain function call protocol is a Blockchain Layer 1 protocol, which requires changes to the blockchain client software to operate. Blockchain Layer 2 technologies such as the one described in this paper require no such changes. They operate on top of the infrastructure provided by the blockchain client software. This paper introduces the protocol, provides an initial safety and liveness analysis, and presents the expected overhead of using this technology when compared to using non-atomic single blockchain transactions. The protocol's atomic behaviour comes at the cost of additional transactions. On the Root Blockchain, three transactions plus one transaction for each Segment Blockchain where there are no state updates and two transactions for each Segment Blockchain where there are state updates is required.

[33]  arXiv:2005.09796 [pdf, ps, other]
Title: List Decodable Mean Estimation in Nearly Linear Time
Subjects: Data Structures and Algorithms (cs.DS)

Learning from data in the presence of outliers is a fundamental problem in statistics. Until recently, no computationally efficient algorithms were known to compute the mean of a high dimensional distribution under natural assumptions in the presence of even a small fraction of outliers. In this paper, we consider robust statistics in the presence of overwhelming outliers where the majority of the dataset is introduced adversarially. With only an $\alpha < 1/2$ fraction of "inliers" (clean data) the mean of a distribution is unidentifiable. However, in their influential work, [CSV17] introduces a polynomial time algorithm recovering the mean of distributions with bounded covariance by outputting a succinct list of $O(1/\alpha)$ candidate solutions, one of which is guaranteed to be close to the true distributional mean; a direct analog of 'List Decoding' in the theory of error correcting codes. In this work, we develop an algorithm for list decodable mean estimation in the same setting achieving up to constants the information theoretically optimal recovery, optimal sample complexity, and in nearly linear time up to polylogarithmic factors in dimension. Our conceptual innovation is to design a descent style algorithm on a nonconvex landscape, iteratively removing minima to generate a succinct list of solutions. Our runtime bottleneck is a saddle-point optimization for which we design custom primal dual solvers for generalized packing and covering SDP's under Ky-Fan norms, which may be of independent interest.

[34]  arXiv:2005.09800 [pdf, other]
Title: Fingerprinting Encrypted Voice Traffic on Smart Speakers with Deep Learning
Journal-ref: 13th ACM Conference on Security and Privacy in Wireless and Mobile Networks (WiSec '20), July 8--10, 2020, Linz (Virtual Event), Austria
Subjects: Cryptography and Security (cs.CR)

This paper investigates the privacy leakage of smart speakers under an encrypted traffic analysis attack, referred to as voice command fingerprinting. In this attack, an adversary can eavesdrop both outgoing and incoming encrypted voice traffic of a smart speaker, and infers which voice command a user says over encrypted traffic. We first built an automatic voice traffic collection tool and collected two large-scale datasets on two smart speakers, Amazon Echo and Google Home. Then, we implemented proof-of-concept attacks by leveraging deep learning. Our experimental results over the two datasets indicate disturbing privacy concerns. Specifically, compared to 1% accuracy with random guess, our attacks can correctly infer voice commands over encrypted traffic with 92.89\% accuracy on Amazon Echo. Despite variances that human voices may cause on outgoing traffic, our proof-of-concept attacks remain effective even only leveraging incoming traffic (i.e., the traffic from the server). This is because the AI-based voice services running on the server side response commands in the same voice and with a deterministic or predictable manner in text, which leaves distinguishable pattern over encrypted traffic. We also built a proof-of-concept defense to obfuscate encrypted traffic. Our results show that the defense can effectively mitigate attack accuracy on Amazon Echo to 32.18%.

[35]  arXiv:2005.09801 [pdf]
Title: FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval
Comments: 10 pages, to be published in SIGIR20 Industry Track
Subjects: Information Retrieval (cs.IR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

In this paper, we address the text and image matching in cross-modal retrieval of the fashion industry. Different from the matching in the general domain, the fashion matching is required to pay much more attention to the fine-grained information in the fashion images and texts. Pioneer approaches detect the region of interests (i.e., RoIs) from images and use the RoI embeddings as image representations. In general, RoIs tend to represent the "object-level" information in the fashion images, while fashion texts are prone to describe more detailed information, e.g. styles, attributes. RoIs are thus not fine-grained enough for fashion text and image matching. To this end, we propose FashionBERT, which leverages patches as image features. With the pre-trained BERT model as the backbone network, FashionBERT learns high level representations of texts and images. Meanwhile, we propose an adaptive loss to trade off multitask learning in the FashionBERT modeling. Two tasks (i.e., text and image matching and cross-modal retrieval) are incorporated to evaluate FashionBERT. On the public dataset, experiments demonstrate FashionBERT achieves significant improvements in performances than the baseline and state-of-the-art approaches. In practice, FashionBERT is applied in a concrete cross-modal retrieval application. We provide the detailed matching performance and inference efficiency analysis.

[36]  arXiv:2005.09803 [pdf, other]
Title: A Computational Analysis of Polarization on Indian and Pakistani Social Media
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL)

Between February 14, 2019 and March 4, 2019, a terrorist attack in Pulwama, Kashmir followed by retaliatory air strikes led to rising tensions between India and Pakistan, two nuclear-armed countries. In this work, we examine polarizing messaging on Twitter during these events, particularly focusing on the positions of Indian and Pakistani politicians. We use label propagation technique focused on hashtag cooccurences to find polarizing tweets and users. Our analysis reveals that politicians in the ruling political party in India (BJP) used polarized hashtags and called for escalation of conflict more so than politicians from other parties. Our work offers the first analysis of how escalating tensions between India and Pakistan manifest on Twitter and provides a framework for studying polarizing messages.

[37]  arXiv:2005.09806 [pdf, other]
Title: Coverage Analysis of Net Inscriptions in Coloured Petri Net Models
Comments: Technical Report
Subjects: Software Engineering (cs.SE)

High-level Petri net such as Coloured Petri Nets (CPNs) are characterised by the combination of Petri nets and a high-level programming language. In the context of CPNs and CPN Tools, the inscriptions (e.g., arc expressions and guards) are specified using Standard ML (SML). The application of simulation and state space exploration (SSE) for validating CPN models traditionally focuses on behavioural properties related to net structure, i.e., places and transitions. This means that the net inscriptions are only implicitly validated, and the extent to which these have been covered is not made explicit. The contribution of this paper is an approach that establishes a link between coverage analysis known from programming languages and net inscriptions of CPN models. Specifically, we consider Modified Condition/Decision Coverage (MC/DC) which generalises branch coverage of SML decisions. We have implemented our approach in a library for CPN Tools comprised of an annotation and instrumentation mechanism that transparently intercept and collect evaluations of Boolean conditions, and a post-processing tool that determines whether each decision is MC/DC-covered by a set of models executions (runs). We evaluate our approach on four larger public-available CPN models.

[38]  arXiv:2005.09807 [pdf, other]
Title: Neural Ordinary Differential Equation based Recurrent Neural Network Model
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Neural differential equations are a promising new member in the neural network family. They show the potential of differential equations for time series data analysis. In this paper, the strength of the ordinary differential equation (ODE) is explored with a new extension. The main goal of this work is to answer the following questions: (i)~can ODE be used to redefine the existing neural network model? (ii)~can Neural ODEs solve the irregular sampling rate challenge of existing neural network models for a continuous time series, i.e., length and dynamic nature, (iii)~how to reduce the training and evaluation time of existing Neural ODE systems? This work leverages the mathematical foundation of ODEs to redesign traditional RNNs such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). The main contribution of this paper is to illustrate the design of two new ODE-based RNN models (GRU-ODE model and LSTM-ODE) which can compute the hidden state and cell state at any point of time using an ODE solver. These models reduce the computation overhead of hidden state and cell state by a vast amount. The performance evaluation of these two new models for learning continuous time series with irregular sampling rate is then demonstrated. Experiments show that these new ODE based RNN models require less training time than Latent ODEs and conventional Neural ODEs. They can achieve higher accuracy quickly, and the design of the neural network is simpler than, previous neural ODE systems.

[39]  arXiv:2005.09810 [pdf, other]
Title: $p$-Norm Flow Diffusion for Local Graph Clustering
Comments: 29 pages, 5 figures, 3 tables
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI); Optimization and Control (math.OC); Machine Learning (stat.ML)

Local graph clustering and the closely related seed set expansion problem are primitives on graphs that are central to a wide range of analytic and learning tasks such as local clustering, community detection, nodes ranking and feature inference. Prior work on local graph clustering mostly falls into two categories with numerical and combinatorial roots respectively. In this work, we draw inspiration from both fields and propose a family of convex optimization formulations based on the idea of diffusion with p-norm network flow for $p\in (1,\infty)$. In the context of local clustering, we characterize the optimal solutions for these optimization problems and show their usefulness in finding low conductance cuts around input seed set. In particular, we achieve quadratic approximation of conductance in the case of $p=2$ similar to the Cheeger-type bounds of spectral methods, constant factor approximation when $p\rightarrow\infty$ similar to max-flow based methods, and a smooth transition for general $p$ values in between. Thus, our optimization formulation can be viewed as bridging the numerical and combinatorial approaches, and we can achieve the best of both worlds in terms of speed and noise robustness. We show that the proposed problem can be solved in strongly local running time for $p\ge 2$ and conduct empirical evaluations on both synthetic and real-world graphs to illustrate our approach compares favorably with existing methods.

[40]  arXiv:2005.09812 [pdf, other]
Title: Active Speakers in Context
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Current methods for active speak er detection focus on modeling short-term audiovisual information from a single speaker. Although this strategy can be enough for addressing single-speaker scenarios, it prevents accurate detection when the task is to identify who of many candidate speakers are talking. This paper introduces the Active Speaker Context, a novel representation that models relationships between multiple speakers over long time horizons. Our Active Speaker Context is designed to learn pairwise and temporal relations from an structured ensemble of audio-visual observations. Our experiments show that a structured feature ensemble already benefits the active speaker detection performance. Moreover, we find that the proposed Active Speaker Context improves the state-of-the-art on the AVA-ActiveSpeaker dataset achieving a mAP of 87.1%. We present ablation studies that verify that this result is a direct consequence of our long-term multi-speaker analysis.

[41]  arXiv:2005.09814 [pdf, other]
Title: Mirror Descent Policy Optimization
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

We propose deep Reinforcement Learning (RL) algorithms inspired by mirror descent, a well-known first-order trust region optimization method for solving constrained convex problems. Our approach, which we call as Mirror Descent Policy Optimization (MDPO), is based on the idea of iteratively solving a `trust-region' problem that minimizes a sum of two terms: a linearization of the objective function and a proximity term that restricts two consecutive updates to be close to each other. Following this approach we derive on-policy and off-policy variants of the MDPO algorithm and analyze their performance while emphasizing important implementation details, motivated by the existing theoretical framework. We highlight the connections between on-policy MDPO and two popular trust region RL algorithms: TRPO and PPO, and conduct a comprehensive empirical comparison of these algorithms. We then derive off-policy MDPO and compare its performance to existing approaches. Importantly, we show that the theoretical framework of MDPO can be scaled to deep RL while achieving good performance on popular benchmarks.

[42]  arXiv:2005.09816 [pdf, other]
Title: Relevant Region Prediction for Crowd Counting
Comments: accepted by Neurocomputing
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Crowd counting is a concerned and challenging task in computer vision. Existing density map based methods excessively focus on the individuals' localization which harms the crowd counting performance in highly congested scenes. In addition, the dependency between the regions of different density is also ignored. In this paper, we propose Relevant Region Prediction (RRP) for crowd counting, which consists of the Count Map and the Region Relation-Aware Module (RRAM). Each pixel in the count map represents the number of heads falling into the corresponding local area in the input image, which discards the detailed spatial information and forces the network pay more attention to counting rather than localizing individuals. Based on the Graph Convolutional Network (GCN), Region Relation-Aware Module is proposed to capture and exploit the important region dependency. The module builds a fully connected directed graph between the regions of different density where each node (region) is represented by weighted global pooled feature, and GCN is learned to map this region graph to a set of relation-aware regions representations. Experimental results on three datasets show that our method obviously outperforms other existing state-of-the-art methods.

[43]  arXiv:2005.09819 [pdf, other]
Title: Distributed Dynamic Economic Dispatch using Alternating Direction Method of Multipliers
Comments: Accepted for 2020 Applied Energy Symposium (MITAB)
Subjects: Systems and Control (eess.SY)

With the proliferation of distributed energy resources and the volume of data stored due to advancement in metering infrastructure, energy management in power system operation needs distributed computing. In this paper, we propose a fully distributed Alternating Direction Method of Multipliers (ADMM) algorithm to solve the distributed economic dispatch (ED) problem, where the optimization problem is fully decomposed between participating agents. In our proposed framework, each agent estimates the dual variable and the average of the total power mismatch of the network using dynamic average consensus, which replaces the dual updater in the traditional ADMM with a distributed alternative. Unlike other distributed ADMM, the proposed method does not rely on any specific assumption and captures the real-time demand change. The algorithm is validated successfully via case studies for IEEE 30-bus and 300-bus test systems with the penetration of solar photovoltaic.

[44]  arXiv:2005.09826 [pdf, other]
Title: User Activity Detection and Channel Estimation for Grant-Free Random Access in LEO Satellite-Enabled Internet-of-Things
Comments: 14 pages, 9 figures, accepted by Internet of Things Journal
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

With recent advances on the dense low-earth orbit (LEO) constellation, LEO satellite network has become one promising solution to providing global coverage for Internet-of-Things (IoT) services. Confronted with the sporadic transmission from randomly activated IoT devices, we consider the random access (RA) mechanism, and propose a grant-free RA (GF-RA) scheme to reduce the access delay to the mobile LEO satellites. A Bernoulli-Rician message passing with expectation maximization (BR-MP-EM) algorithm is proposed for this terrestrial-satellite GF-RA system to address the user activity detection (UAD) and channel estimation (CE) problem. This BR-MP-EM algorithm is divided into two stages. In the inner iterations, the Bernoulli messages and Rician messages are updated for the joint UAD and CE problem. Based on the output of the inner iterations, the expectation maximization (EM) method is employed in the outer iterations to update the hyper-parameters related to the channel impairments. Finally, simulation results show the UAD and CE accuracy of the proposed BR-MP-EM algorithm, as well as the robustness against the channel impairments.

[45]  arXiv:2005.09830 [pdf, other]
Title: Deep Learning for LiDAR Point Clouds in Autonomous Driving: A Review
Comments: 21 pages, submitted to IEEE Transactions on Neural Networks and Learning Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Recently, the advancement of deep learning in discriminative feature learning from 3D LiDAR data has led to rapid development in the field of autonomous driving. However, automated processing uneven, unstructured, noisy, and massive 3D point clouds is a challenging and tedious task. In this paper, we provide a systematic review of existing compelling deep learning architectures applied in LiDAR point clouds, detailing for specific tasks in autonomous driving such as segmentation, detection, and classification. Although several published research papers focus on specific topics in computer vision for autonomous vehicles, to date, no general survey on deep learning applied in LiDAR point clouds for autonomous vehicles exists. Thus, the goal of this paper is to narrow the gap in this topic. More than 140 key contributions in the recent five years are summarized in this survey, including the milestone 3D deep architectures, the remarkable deep learning applications in 3D semantic segmentation, object detection, and classification; specific datasets, evaluation metrics, and the state of the art performance. Finally, we conclude the remaining challenges and future researches.

[46]  arXiv:2005.09833 [pdf, other]
Title: Learning and Reasoning for Robot Dialog and Navigation Tasks
Comments: Accepted to SIGDIAL 2020. arXiv admin note: substantial text overlap with arXiv:1809.11074
Subjects: Artificial Intelligence (cs.AI)

Reinforcement learning and probabilistic reasoning algorithms aim at learning from interaction experiences and reasoning with probabilistic contextual knowledge respectively. In this research, we develop algorithms for robot task completions, while looking into the complementary strengths of reinforcement learning and probabilistic reasoning techniques. The robots learn from trial-and-error experiences to augment their declarative knowledge base, and the augmented knowledge can be used for speeding up the learning process in potentially different tasks. We have implemented and evaluated the developed algorithms using mobile robots conducting dialog and navigation tasks. From the results, we see that our robot's performance can be improved by both reasoning with human knowledge and learning from task-completion experience. More interestingly, the robot was able to learn from navigation tasks to improve its dialog strategies.

[47]  arXiv:2005.09834 [pdf, other]
Title: Exploring Recurrent, Memory and Attention Based Architectures for Scoring Interactional Aspects of Human-Machine Text Dialog
Subjects: Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)

An important step towards enabling English language learners to improve their conversational speaking proficiency involves automated scoring of multiple aspects of interactional competence and subsequent targeted feedback. This paper builds on previous work in this direction to investigate multiple neural architectures -- recurrent, attention and memory based -- along with feature-engineered models for the automated scoring of interactional and topic development aspects of text dialog data. We conducted experiments on a conversational database of text dialogs from human learners interacting with a cloud-based dialog system, which were triple-scored along multiple dimensions of conversational proficiency. We find that fusion of multiple architectures performs competently on our automated scoring task relative to expert inter-rater agreements, with (i) hand-engineered features passed to a support vector learner and (ii) transformer-based architectures contributing most prominently to the fusion.

[48]  arXiv:2005.09835 [pdf, ps, other]
Title: Single-step triangular splitting iteration method for a class of complex symmetric linear system
Authors: Xi'an Li, Jian Lu
Subjects: Numerical Analysis (math.NA)

In this work, a single-step triangular splitting (SSTS) iteration method is proposed for solving a class of block two-by-two real linear system which arises from complex symmetric linear system. Then, we investigate the convergence properties of this method and determine its optimal iteration parameters as well as corresponding optimal convergence factor. It is worth mentioning that the SSTS iteration method is robust and superior to PSBTS iteration method under suitable conditions. Finally, some numerical experiments are carried out to validate the theoretical results and evaluate this new method.

[49]  arXiv:2005.09836 [pdf, other]
Title: Computations and Complexities of Tarski's Fixed Points and Supermodular Games
Subjects: Computer Science and Game Theory (cs.GT); Computational Complexity (cs.CC); Theoretical Economics (econ.TH)

We consider two models of computation for Tarski's order preserving function f related to fixed points in a complete lattice: the oracle function model and the polynomial function model. In both models, we find the first polynomial time algorithm for finding a Tarski's fixed point. In addition, we provide a matching oracle bound for determining the uniqueness in the oracle function model and prove it is Co-NP hard in the polynomial function model. The existence of the pure Nash equilibrium in supermodular games is proved by Tarski's fixed point theorem. Exploring the difference between supermodular games and Tarski's fixed point, we also develop the computational results for finding one pure Nash equilibrium and determining the uniqueness of the equilibrium in supermodular games.

[50]  arXiv:2005.09837 [pdf, ps, other]
Title: Positive emotions help rank negative reviews in e-commerce
Authors: Di Weng, Jichang Zhao
Comments: Emotion lexicons are publicly available at this https URL
Subjects: Computation and Language (cs.CL); Computers and Society (cs.CY)

Negative reviews, the poor ratings in postpurchase evaluation, play an indispensable role in e-commerce, especially in shaping future sales and firm equities. However, extant studies seldom examine their potential value for sellers and producers in enhancing capabilities of providing better services and products. For those who exploited the helpfulness of reviews in the view of e-commerce keepers, the ranking approaches were developed for customers instead. To fill this gap, in terms of combining description texts and emotion polarities, the aim of the ranking method in this study is to provide the most helpful negative reviews under a certain product attribute for online sellers and producers. By applying a more reasonable evaluating procedure, experts with related backgrounds are hired to vote for the ranking approaches. Our ranking method turns out to be more reliable for ranking negative reviews for sellers and producers, demonstrating a better performance than the baselines like BM25 with a result of 8% higher. In this paper, we also enrich the previous understandings of emotions in valuing reviews. Specifically, it is surprisingly found that positive emotions are more helpful rather than negative emotions in ranking negative reviews. The unexpected strengthening from positive emotions in ranking suggests that less polarized reviews on negative experience in fact offer more rational feedbacks and thus more helpfulness to the sellers and producers. The presented ranking method could provide e-commerce practitioners with an efficient and effective way to leverage negative reviews from online consumers.

[51]  arXiv:2005.09838 [pdf, ps, other]
Title: A Parallelizable Optimization Method for Missing Internet Traffic Tensor Data
Subjects: Numerical Analysis (math.NA)

Recovery of internet network traffic data from incomplete observed data is an important issue in internet network engineering and management. In this paper, by fully combining the temporal stability and periodicity features in internet traffic data, a new separable optimization model for internet data recovery is proposed, which is based upon the t-product and the rapid discrete Fourier transform of tensors. Moreover, by using generalized inverse matrices, an easy-to-operate and effective algorithm is proposed. In theory, we prove that under suitable conditions, every accumulation point of the sequence generated by the proposed algorithm is a stationary point of the established model. Numerical simulation results carried on the widely used real-world internet network datasets, show good performance of the proposed method. In the case of moderate sampling rates, the proposed method works very well, its effect is better than that of some existing internet traffic data recovery methods in the literature. The separable structural features presented in the optimization model provide the possibility to design more efficient parallel algorithms.

[52]  arXiv:2005.09841 [pdf, other]
Title: Best Arm Identification in Spectral Bandits
Comments: To be published in International Joint Conference on Artificial Intelligence
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

We study best-arm identification with fixed confidence in bandit models with graph smoothness constraint. We provide and analyze an efficient gradient ascent algorithm to compute the sample complexity of this problem as a solution of a non-smooth max-min problem (providing in passing a simplified analysis for the unconstrained case). Building on this algorithm, we propose an asymptotically optimal strategy. We furthermore illustrate by numerical experiments both the strategy's efficiency and the impact of the smoothness constraint on the sample complexity. Best Arm Identification (BAI) is an important challenge in many applications ranging from parameter tuning to clinical trials. It is now very well understood in vanilla bandit models, but real-world problems typically involve some dependency between arms that requires more involved models. Assuming a graph structure on the arms is an elegant practical way to encompass this phenomenon, but this had been done so far only for regret minimization. Addressing BAI with graph constraints involves delicate optimization problems for which the present paper offers a solution.

[53]  arXiv:2005.09856 [pdf, other]
Title: A Novel Meta Learning Framework for Feature Selection using Data Synthesis and Fuzzy Similarity
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

This paper presents a novel meta learning framework for feature selection (FS) based on fuzzy similarity. The proposed method aims to recommend the best FS method from four candidate FS methods for any given dataset. This is achieved by firstly constructing a large training data repository using data synthesis. Six meta features that represent the characteristics of the training dataset are then extracted. The best FS method for each of the training datasets is used as the meta label. Both the meta features and the corresponding meta labels are subsequently used to train a classification model using a fuzzy similarity measure based framework. Finally the trained model is used to recommend the most suitable FS method for a given unseen dataset. This proposed method was evaluated based on eight public datasets of real-world applications. It successfully recommended the best method for five datasets and the second best method for one dataset, which outperformed any of the four individual FS methods. Besides, the proposed method is computationally efficient for algorithm selection, leading to negligible additional time for the feature selection process. Thus, the paper contributes a novel method for effectively recommending which feature selection method to use for any new given dataset.

[54]  arXiv:2005.09857 [pdf, other]
Title: Collision-free Trajectory Planning for Autonomous Surface Vehicle
Subjects: Robotics (cs.RO)

In this paper, we propose an efficient and accurate method for autonomous surface vehicles to generate a smooth and collision-free trajectory considering its dynamics constraints. We decouple the trajectory planning problem as a front-end feasible path searching and a back-end kinodynamic trajectory optimization. Firstly, we model the type of two-thrusts under-actuated surface vessel. Then we adopt a sampling-based path searching to find an asymptotic optimal path through the obstacle-surrounding environment and extract several waypoints from it. We apply a numerical optimization method in the back-end to generate the trajectory. From the perspective of security in the field voyage, we propose the sailing corridor method to guarantee the trajectory away from obstacles. Moreover, considering limited fuel ASV carrying, we design a numerical objective function which can optimize a fuel-saving trajectory. Finally, we validate and compare the proposed method in simulation environments and the results fit our expected trajectory.

[55]  arXiv:2005.09858 [pdf, other]
Title: A Comprehensive Survey of Control Strategies for Autonomous Quadrotors
Comments: 12 pages, 15 figures, 1 table
Journal-ref: Canadian Journal of Electrical and Computer Engineering, Vol.43, No.1, pages 3-16, 2019
Subjects: Systems and Control (eess.SY)

Over the past several decades there has been a constant increase in the use of Unmanned Aerial Systems (UAS). Hence, there has also been a growth in the number of control algorithms to service the many applications embodied by these vehicles. Initially UAS were made popular by the military for Reconnaissance, Intelligence, Surveillance, and Target Acquisition (RISTA) applications. Now-a-days UAS are used for everything from crop surveys to tourism. Nowhere is this more evident than with multi-rotor Unmanned Aerial Vehicle (UAV). This paper presents a survey of control methods for multi-rotor systems, namely quadrotors. In doing so, we hope to visualize a clear path to what additional capabilities might be needed in the future. In our examination, we review many of the notable research organizations and their efforts to expand the utility of multirotor aircraft. We also summarize the basic literature definitions and control strategies for autonomous quadrotors.

[56]  arXiv:2005.09863 [pdf, other]
Title: Understanding Negative Sampling in Graph Representation Learning
Comments: KDD 2020
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Graph representation learning has been extensively studied in recent years. Despite its potential in generating continuous embeddings for various networks, both the effectiveness and efficiency to infer high-quality representations toward large corpus of nodes are still challenging. Sampling is a critical point to achieve the performance goals. Prior arts usually focus on sampling positive node pairs, while the strategy for negative sampling is left insufficiently explored. To bridge the gap, we systematically analyze the role of negative sampling from the perspectives of both objective and risk, theoretically demonstrating that negative sampling is as important as positive sampling in determining the optimization objective and the resulted variance. To the best of our knowledge, we are the first to derive the theory and quantify that the negative sampling distribution should be positively but sub-linearly correlated to their positive sampling distribution. With the guidance of the theory, we propose MCNS, approximating the positive distribution with self-contrast approximation and accelerating negative sampling by Metropolis-Hastings. We evaluate our method on 5 datasets that cover extensive downstream graph learning tasks, including link prediction, node classification and personalized recommendation, on a total of 19 experimental settings. These relatively comprehensive experimental results demonstrate its robustness and superiorities.

[57]  arXiv:2005.09867 [pdf]
Title: A reinforcement learning based decision support system in textile manufacturing process
Authors: Zhenglei He (GEMTEX), Kim Phuc Tran (GEMTEX), Sébastien Thomassey (GEMTEX), Xianyi Zeng (GEMTEX), Changhai Yi
Journal-ref: 15th International Conference on Intelligent Systems and Knowledge Engineering (ISKE2020), Aug 2020, Cologne, Germany
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

This paper introduced a reinforcement learning based decision support system in textile manufacturing process. A solution optimization problem of color fading ozonation is discussed and set up as a Markov Decision Process (MDP) in terms of tuple {S, A, P, R}. Q-learning is used to train an agent in the interaction with the setup environment by accumulating the reward R. According to the application result, it is found that the proposed MDP model has well expressed the optimization problem of textile manufacturing process discussed in this paper, therefore the use of reinforcement learning to support decision making in this sector is conducted and proven that is applicable with promising prospects.

[58]  arXiv:2005.09868 [pdf, ps, other]
Title: Data-Importance Aware Radio Resource Allocation: Wireless Communication Helps Machine Learning
Comments: 5 pages, 6 figures, to be presented at IEEE Communications Letters
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

The rich mobile data and edge computing enabled wireless networks motivate to deploy artificial intelligence (AI) at network edge, known as \emph{edge AI}, which integrates wireless communication and machine learning. In communication, data bits are equally important, while in machine learning some data bits are more important. Therefore we can allocate more radio resources to the more important data and allocate less radio resources to the less important data, so as to efficiently utilize the limited radio resources. To this end, how to define "more or less important" of data is the key problem. In this article, we propose two importance criteria to differentiate data's importance based on their effects on machine learning, one for centralized edge machine learning and the other for distributed edge machine learning. Then, the corresponding radio resource allocation schemes are proposed to improve performance of machine learning. Extensive experiments are conducted for verifying the effectiveness of the proposed data-importance aware radio resource allocation schemes.

[59]  arXiv:2005.09874 [pdf]
Title: An Incremental Clustering Method for Anomaly Detection in Flight Data
Authors: Weizun Zhao (1), Lishuai Li (1), Sameer Alam (2), Yanjun Wang (3 and 4) ((1) Department of Systems Engineering and Engineering Management, City University of Hong Kong, (2) School of Mechanical & Aerospace Engineering, Nanyang Technological University, (3) College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, (4) Department of Aeronautics and Astronautics, Massachusetts Institute of Technology)
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Safety is a top priority for civil aviation. Data mining in digital Flight Data Recorder (FDR) or Quick Access Recorder (QAR) data, commonly referred as black box data on aircraft, has gained interest from researchers, airlines, and aviation regulation agencies for safety management. New anomaly detection methods based on supervised or unsupervised learning have been developed to monitor pilot operations and detect any risks from onboard digital flight data recorder data. However, all existing anomaly detection methods are offline learning - the models are trained once using historical data and used for all future predictions. In practice, new QAR data are generated by every flight and collected by airlines whenever a datalink is available. Offline methods cannot respond to new data in time. Though these offline models can be updated by being re-trained after adding new data to the original training set, it is time-consuming and computational costly to train a new model every time new data come in. To address this problem, we propose a novel incremental anomaly detection method to identify common patterns and detect outliers in flight operations from FDR data. The proposed method is based on Gaussian Mixture Model (GMM). An initial GMM cluster model is trained on historical offline data. Then, it continuously adapts to new incoming data points via an expectation-maximization (EM) algorithm. To track changes in flight operation patterns, only model parameters need to be saved, not the raw flight data. The proposed method was tested on two sets of simulation data. Comparable results were found from the proposed online method and a classic offline model. A real-world application of the proposed method is demonstrated using FDR data from daily operations of an airline. Results are presented and future challenges of using online learning scheme for flight data analytics are discussed.

[60]  arXiv:2005.09900 [pdf, ps, other]
Title: Fair Outlier Detection
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Databases (cs.DB)

An outlier detection method may be considered fair over specified sensitive attributes if the results of outlier detection are not skewed towards particular groups defined on such sensitive attributes. In this task, we consider, for the first time to our best knowledge, the task of fair outlier detection. In this work, we consider the task of fair outlier detection over multiple multi-valued sensitive attributes (e.g., gender, race, religion, nationality, marital status etc.). We propose a fair outlier detection method, FairLOF, that is inspired by the popular LOF formulation for neighborhood-based outlier detection. We outline ways in which unfairness could be induced within LOF and develop three heuristic principles to enhance fairness, which form the basis of the FairLOF method. Being a novel task, we develop an evaluation framework for fair outlier detection, and use that to benchmark FairLOF on quality and fairness of results. Through an extensive empirical evaluation over real-world datasets, we illustrate that FairLOF is able to achieve significant improvements in fairness at sometimes marginal degradations on result quality as measured against the fairness-agnostic LOF method.

[61]  arXiv:2005.09902 [pdf, other]
Title: An LSTM approach to Predict Migration based on Google Trends
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Being able to model and predict international migration as precisely as possible is crucial for policy making. Recently Google Trends data in addition to other economic and demographic data have been shown to improve the prediction quality of a gravity linear model for the one-year ahead predictions. In this work, we replace the linear model with a long short-term memory (LSTM) approach and compare it with two existing approaches: the linear gravity model and an artificial neural network (ANN) model. Our LSTM approach combined with Google Trends data outperforms both these models on various metrics in the task of predicting the one-year ahead incoming international migration to 34 OECD countries: the root mean square error has been divided by 5 on the test set and the mean average error by 4. This positive result demonstrates that machine learning techniques constitute a serious alternative over traditional approaches for studying migration mechanisms.

[62]  arXiv:2005.09904 [pdf, ps, other]
Title: BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs
Comments: 12 pages, 10 figures
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

The number of parameters in deep neural networks (DNNs) is rapidly increasing to support complicated tasks and to improve model accuracy. Correspondingly, the amount of computations and required memory footprint increase as well. Quantization is an efficient method to address such concerns by compressing DNNs such that computations can be simplified while required storage footprint is significantly reduced. Unfortunately, commercial CPUs and GPUs do not fully support quantization because only fixed data transfers (such as 32 bits) are allowed. As a result, even if weights are quantized into a few bits, CPUs and GPUs cannot access multiple quantized weights without memory bandwidth waste. Success of quantization in practice, hence, relies on an efficient computation engine design, especially for matrix multiplication that is a basic computation engine in most DNNs. In this paper, we propose a novel matrix multiplication method, called BiQGEMM, dedicated to quantized DNNs. BiQGEMM can access multiple quantized weights simultaneously in one instruction. In addition, BiQGEMM pre-computes intermediate results that are highly redundant when quantization leads to limited available computation space. Since pre-computed values are stored in lookup tables and reused, BiQGEMM achieves lower amount of overall computations. Our extensive experimental results show that BiQGEMM presents higher performance than conventional schemes when DNNs are quantized.

[63]  arXiv:2005.09907 [pdf, other]
Title: Accounting for Input Noise in Gaussian Process Parameter Retrieval
Journal-ref: IEEE Geoscience and Remote Sensing Letters ( Volume: 17 , Issue: 3 , March 2020 )
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Gaussian processes (GPs) are a class of Kernel methods that have shown to be very useful in geoscience and remote sensing applications for parameter retrieval, model inversion, and emulation. They are widely used because they are simple, flexible, and provide accurate estimates. GPs are based on a Bayesian statistical framework which provides a posterior probability function for each estimation. Therefore, besides the usual prediction (given in this case by the mean function), GPs come equipped with the possibility to obtain a predictive variance (i.e., error bars, confidence intervals) for each prediction. Unfortunately, the GP formulation usually assumes that there is no noise in the inputs, only in the observations. However, this is often not the case in earth observation problems where an accurate assessment of the measuring instrument error is typically available, and where there is huge interest in characterizing the error propagation through the processing pipeline. In this letter, we demonstrate how one can account for input noise estimates using a GP model formulation which propagates the error terms using the derivative of the predictive mean function. We analyze the resulting predictive variance term and show how they more accurately represent the model error in a temperature prediction problem from infrared sounding data.

[64]  arXiv:2005.09908 [pdf, other]
Title: Consistent and Flexible Selectivity Estimation for High-dimensional Data
Subjects: Databases (cs.DB); Machine Learning (cs.LG)

Selectivity estimation aims at estimating the number of database objects that satisfy a selection criterion. Answering this problem accurately and efficiently is essential to applications, such as density estimation, outlier detection, query optimization, and data integration. The estimation problem is especially challenging for large-scale high-dimensional data due to the curse of dimensionality, the need to make the estimator consistent (i.e., the selectivity is non-decreasing w.r.t. the threshold), and the large variance of selectivity across different queries. We propose a new deep learning-based model that learns a query dependent piece-wise linear function as the estimator. We design a novel model architecture so that the model is flexible to fit any selection criterion. To improve the accuracy for large datasets, we propose to divide the dataset into multiple disjoint partitions and build a local model on each of them. We perform experiments on real datasets and show that the proposed model guarantees the consistency and significantly outperforms state-of-the-art models in terms of both accuracy and efficiency.

[65]  arXiv:2005.09910 [pdf]
Title: Multitask Learning with Single Gradient Step Update for Task Balancing
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Multitask learning is a methodology to boost generalization performance and also reduce computational intensity and memory usage. However, learning multiple tasks simultaneously can be more difficult than learning a single task because it can cause imbalance problem among multiple tasks. To address the imbalance problem, we propose an algorithm to balance between tasks at gradient-level by applying learning method of gradient-based meta-learning to multitask learning. The proposed method trains shared layer and task specific layers separately so that the two layers of different roles in multitask network can be fitted to their own purpose. Especially, the shared layer that contains knowledge shared among tasks is trained by employing single gradient step update and inner/outer loop training procedure to mitigate the imbalance problem at gradient-level. We apply the proposed method to various multitask computer vision problems and achieve state of the art performance.

[66]  arXiv:2005.09916 [pdf, other]
Title: Fast Decoding of Codes in the Rank, Subspace, and Sum-Rank Metric
Subjects: Information Theory (cs.IT); Symbolic Computation (cs.SC)

We speed up existing decoding algorithms for three code classes in different metrics: interleaved Gabidulin codes in the rank metric, lifted interleaved Gabidulin codes in the subspace metric, and linearized Reed-Solomon codes in the sum-rank metric. The speed-ups are achieved by reducing the core of the underlying computational problems of the decoders to one common tool: computing left and right approximant bases of matrices over skew polynomial rings. To accomplish this, we describe a skew-analogue of the existing PM-Basis algorithm for matrices over usual polynomials. This captures the bulk of the work in multiplication of skew polynomials, and the complexity benefit comes from existing algorithms performing this faster than in classical quadratic complexity. The new faster algorithms for the various decoding-related computational problems are interesting in their own and have further applications, in particular parts of decoders of several other codes and foundational problems related to the remainder-evaluation of skew polynomials.

[67]  arXiv:2005.09917 [pdf, other]
Title: Rethinking Performance Estimation in Neural Architecture Search
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Neural architecture search (NAS) remains a challenging problem, which is attributed to the indispensable and time-consuming component of performance estimation (PE). In this paper, we provide a novel yet systematic rethinking of PE in a resource constrained regime, termed budgeted PE (BPE), which precisely and effectively estimates the performance of an architecture sampled from an architecture space. Since searching an optimal BPE is extremely time-consuming as it requires to train a large number of networks for evaluation, we propose a Minimum Importance Pruning (MIP) approach. Given a dataset and a BPE search space, MIP estimates the importance of hyper-parameters using random forest and subsequently prunes the minimum one from the next iteration. In this way, MIP effectively prunes less important hyper-parameters to allocate more computational resource on more important ones, thus achieving an effective exploration. By combining BPE with various search algorithms including reinforcement learning, evolution algorithm, random search, and differentiable architecture search, we achieve 1, 000x of NAS speed up with a negligible performance drop comparing to the SOTA

[68]  arXiv:2005.09925 [pdf, other]
Title: The Many Faces of Balance: Multilevel Structural Evaluation of Signed Directed Social Networks
Comments: Combined 11-page manuscript and 13-page supplementary information
Subjects: Social and Information Networks (cs.SI); Optimization and Control (math.OC); Physics and Society (physics.soc-ph)

Balance theory explains the forces behind the structure of social systems commonly modeled as static undirected signed networks. We expand the modeling to incorporate directionality of the edges and consider three levels of analysis: triads, subgroups, and the whole network. For triad-level balance, we utilize semicycles that satisfy the condition of transitivity. For subgroup-level balance, we derive measures of cohesiveness (internal solidarity) and divisiveness (external antagonism) to capture balance in subgroups using the most fitting partition of nodes into two groups. For network-level balance, we use the normalized line index which relies on the proportion of edges whose position suits balance. Through extensive computational analysis, we document frequently repeated patterns of social structure in triads, subgroups, and the whole network across a range of social settings from college students and Wikipedia users to philosophers and Bitcoin traders. We then apply our multilevel framework to examine balance in temporal and multilayer networks which demonstrates the generalizability of our approach and leads to new observations on balance with respect to time and layer dimensions. Our complementary findings on a variety of social networks highlight the need to evaluate balance at different levels for which we propose a comprehensive yet parsimonious approach.

[69]  arXiv:2005.09927 [pdf, other]
Title: Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection
Comments: 20 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)

This paper presents a novel 3D object detection framework that processes LiDAR data directly on a representation of the sensor's native range images. When operating in the range image view, one faces learning challenges, including occlusion and considerable scale variation, limiting the obtainable accuracy. To address these challenges, a range-conditioned dilated block (RCD) is proposed to dynamically adjust a continuous dilation rate as a function of the measured range, achieving scale invariance. Furthermore, soft range gating helps mitigate the effect of occlusion. An end-to-end trained box-refinement network brings additional performance improvements in occluded areas, and produces more accurate bounding box predictions. On the challenging Waymo Open Dataset, our improved range-based detector outperforms state of the art at long range detection. Our framework is superior to prior multiview, voxel-based methods over all ranges, setting a new baseline for range-based 3D detection on this large scale public dataset.

[70]  arXiv:2005.09941 [pdf, other]
Title: Non-Uniform Gaussian Blur of Hexagonal Bins in Cartesian Coordinates
Comments: 10 pages, 6 figures. Figures 1-5 made in LaTeX and look OK in preview
Subjects: Graphics (cs.GR)

In a recent application of the Bokeh Python library for visualizing physico-chemical properties of chemical entities text-mined from the scientific literature, we found ourselves facing the task of smoothing hexagonally binned data in Cartesian coordinates. To the best of our knowledge, no documentation for how to do this exist in the public domain. This short paper shows how to accomplish this in general and for Bokeh in particular. We illustrate the method with a real-world example and discuss some potential advantages of using hexagonal bins in these and similar applications.

[71]  arXiv:2005.09944 [pdf, ps, other]
Title: Time-optimal Loosely-stabilizing Leader Election in Population Protocols
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

We consider the leader election problem in population protocol models. In pragmatic settings of population protocols, self-stabilization is a highly desired feature owing to its fault resilience and the benefit of initialization freedom. However, the design of self-stabilizing leader election is possible only under a strong assumption (i.e. the knowledge of the \emph{exact} size of a network) and rich computational resources (i.e. the number of states). Loose-stabilization, introduced by Sudo et al [Theoretical Computer Science, 2012], is a promising relaxed concept of self-stabilization to address the aforementioned issue. Loose-stabilization guarantees that starting from any configuration, the network will reach a safe configuration where a single leader exists within a short time, and thereafter it will maintain the single leader for a long time, but not forever. The main contribution of the paper is a time-optimal loosely-stabilizing leader election protocol. While the shortest convergence time achieved so far in loosely-stabilizing leader election is $O(\log^3 n)$ parallel time, the proposed protocol with design parameter $\tau \ge 1$ attains $O(\tau \log n)$ parallel convergence time and $\Omega(n^{\tau})$ parallel holding time (i.e. the length of the period keeping the unique leader), both in expectation. This protocol is time-optimal in the sense of both the convergence and holding times in expectation because any loosely-stabilizing leader election protocol with the same length of the holding time is known to require $\Omega(\tau \log n)$ parallel time.

[72]  arXiv:2005.09945 [pdf, other]
Title: Early Classification of Time Series. Cost-based Optimization Criterion and Algorithms
Comments: Under review (ECML : Machine Learning journal track)
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

An increasing number of applications require to recognize the class of an incoming time series as quickly as possible without unduly compromising the accuracy of the prediction. In this paper, we put forward a new optimization criterion which takes into account both the cost of misclassification and the cost of delaying the decision. Based on this optimization criterion, we derived a family of non-myopic algorithms which try to anticipate the expected future gain in information in balance with the cost of waiting. In one class of algorithms, unsupervised-based, the expectations use the clustering of time series, while in a second class, supervised-based, time series are grouped according to the confidence level of the classifier used to label them.
Extensive experiments carried out on real data sets using a large range of delay cost functions show that the presented algorithms are able to satisfactorily solving the earliness vs. accuracy trade-off, with the supervised-based approaches faring better than the unsupervised-based ones. In addition, all these methods perform better in a wide variety of conditions than a state of the art method based on a myopic strategy which is recognized as very competitive.

[73]  arXiv:2005.09946 [pdf, ps, other]
Title: GM-CTSC at SemEval-2020 Task 1: Gaussian Mixtures Cross Temporal Similarity Clustering
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

This paper describes the system proposed for the SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection. We focused our approach on the detection problem. Given the semantics of words captured by temporal word embeddings in different time periods, we investigate the use of unsupervised methods to detect when the target word has gained or loosed senses. To this end, we defined a new algorithm based on Gaussian Mixture Models to cluster the target similarities computed over the two periods. We compared the proposed approach with a number of similarity-based thresholds. We found that, although the performance of the detection methods varies across the word embedding algorithms, the combination of Gaussian Mixture with Temporal Referencing resulted in our best system.

[74]  arXiv:2005.09959 [pdf, other]
Title: Behavioral Software Engineering: Methodological Introduction to Psychometrics
Comments: 47 pages (pp. 1-28 for the main paper, pp. 29-47 working example in the appendix), 8 figures in the main paper
Subjects: Software Engineering (cs.SE); Computers and Society (cs.CY)

Humans are what constitutes the most complex and complicated, yet fascinating, component of a software engineering endeavor. A meaningful and deep understanding of the human aspects of software engineering requires psychological constructs to be taken into account. We argue that psychology and statistics theory would facilitate the development and adoption of valid and reliable instruments to assess these constructs. In particular, to ensure high quality, the psychometric properties of measurement instruments need evaluation. In this paper, we provide an introduction to psychometric theory for the evaluation of measurement instruments (e.g., psychological tests and questionnaires) for software engineering researchers. We present guidelines that enable using existing instruments and developing new ones adequately. We conducted a field survey of the psychology literature, including journal articles, textbooks, and society standards, framed by the Standards for Educational and Psychological Testing (American Educational Research Association et al, 2014). We detail activities used when operationalizing new psychological constructs, such as item analysis, factor analysis, standardization and normalization, reliability, validity, and fairness in testing and test bias. With this paper, we hope to encourage a culture change in software engineering research towards the adoption of established methods from social science. To improve the quality of behavioral research, we believe that studies focusing on introducing and psychometrically validating measurement instruments need to be more common. Finally, we present an example of a psychometric evaluation based on our guidelines, to which we openly provide code and dataset.

[75]  arXiv:2005.09961 [pdf, other]
Title: Monte Carlo Inverse Folding
Subjects: Artificial Intelligence (cs.AI)

The RNA Inverse Folding problem comes from computational biology. The goal is to find a molecule that has a given folding. It is important for scientific fields such as bioengineering, pharmaceutical research, biochemistry, synthetic biology and RNA nanostructures. Nested Monte Carlo Search has given excellent results for this problem. We propose to adapt and evaluate different Monte Carlo Search algorithms for the RNA Inverse Folding problem.

[76]  arXiv:2005.09966 [pdf, other]
Title: SADDEL: Joint Speech Separation and Denoising Model based on Multitask Learning
Comments: The two first authors made equal contributions
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

Speech data collected in real-world scenarios often encounters two issues. First, multiple sources may exist simultaneously, and the number of sources may vary with time. Second, the existence of background noise in recording is inevitable. To handle the first issue, we refer to speech separation approaches, that separate speech from an unknown number of speakers. To address the second issue, we refer to speech denoising approaches, which remove noise components and retrieve pure speech signals. Numerous deep learning based methods for speech separation and denoising have been proposed that show promising results. However, few works attempt to address the issues simultaneously, despite speech separation and denoising tasks having similar nature. In this study, we propose a joint speech separation and denoising framework based on the multitask learning criterion to tackle the two issues simultaneously. The experimental results show that the proposed framework not only performs well on both speech separation and denoising tasks, but also outperforms related methods in most conditions.

[77]  arXiv:2005.09968 [pdf]
Title: Development of a Shape-memorable Adaptive Pin Array Fixture
Comments: Submitted to Advanced Robotics, 2020
Subjects: Robotics (cs.RO)

This paper proposes an adaptive pin-array fixture. The key idea of this research is to use the shape-memorable mechanism of pin array to fix multiple different shaped parts with common pin configuration. The clamping area consists of a matrix of passively slid-able pins that conform themselves to the contour of the target object. Vertical motion of the pins enables the fixture to encase the profile of the object. The shape memorable mechanism is realized by the combination of the rubber bush and fixing mechanism of a pin. Several physical peg-in-hole tasks is conducted to verify the feasibility of the fixture.

[78]  arXiv:2005.09971 [pdf, other]
Title: Hidden Markov Models and their Application for Predicting Failure Events
Comments: Will be published in the proceedings of ICCS 2020; @Booklet{EasyChair:3183, author = {Paul Hofmann and Zaid Tashman}, title = {Hidden Markov Models and their Application for Predicting Failure Events}, howpublished = {EasyChair Preprint no. 3183}, year = {EasyChair, 2020}}
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

We show how Markov mixed membership models (MMMM) can be used to predict the degradation of assets. We model the degradation path of individual assets, to predict overall failure rates. Instead of a separate distribution for each hidden state, we use hierarchical mixtures of distributions in the exponential family. In our approach the observation distribution of the states is a finite mixture distribution of a small set of (simpler) distributions shared across all states. Using tied-mixture observation distributions offers several advantages. The mixtures act as a regularization for typically very sparse problems, and they reduce the computational effort for the learning algorithm since there are fewer distributions to be found. Using shared mixtures enables sharing of statistical strength between the Markov states and thus transfer learning. We determine for individual assets the trade-off between the risk of failure and extended operating hours by combining a MMMM with a partially observable Markov decision process (POMDP) to dynamically optimize the policy for when and how to maintain the asset.

[79]  arXiv:2005.09973 [pdf, other]
Title: Dynamic Refinement Network for Oriented and Densely Packed Object Detection
Comments: Accepted by CVPR 2020 as Oral
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Object detection has achieved remarkable progress in the past decade. However, the detection of oriented and densely packed objects remains challenging because of following inherent reasons: (1) receptive fields of neurons are all axis-aligned and of the same shape, whereas objects are usually of diverse shapes and align along various directions; (2) detection models are typically trained with generic knowledge and may not generalize well to handle specific objects at test time; (3) the limited dataset hinders the development on this task. To resolve the first two issues, we present a dynamic refinement network that consists of two novel components, i.e., a feature selection module (FSM) and a dynamic refinement head (DRH). Our FSM enables neurons to adjust receptive fields in accordance with the shapes and orientations of target objects, whereas the DRH empowers our model to refine the prediction dynamically in an object-aware manner. To address the limited availability of related benchmarks, we collect an extensive and fully annotated dataset, namely, SKU110K-R, which is relabeled with oriented bounding boxes based on SKU110K. We perform quantitative evaluations on several publicly available benchmarks including DOTA, HRSC2016, SKU110K, and our own SKU110K-R dataset. Experimental results show that our method achieves consistent and substantial gains compared with baseline approaches. The code and dataset are available at https://github.com/Anymake/DRN_CVPR2020.

[80]  arXiv:2005.09976 [pdf]
Title: Alternative Effort-optimal Model-based Strategy for State Machine Testing of IoT Systems
Subjects: Software Engineering (cs.SE)

To effectively test parts of the Internet of Things (IoT) systems with a state machine character, Model-based Testing (MBT) approach can be taken. In MBT, a system model is created, and test cases are generated automatically from the model, and a number of current strategies exist. In this paper, we propose a novel alternative strategy that concurrently allows us to flexibly adjust the preferred length of the generated test cases, as well as to mark the states, in which the test case can start and end. Compared with an intuitive N-switch coverage-based strategy that aims at the same goals, our proposal generates a lower number of shorter test cases with fewer test step duplications.

[81]  arXiv:2005.09980 [pdf]
Title: Creative Artificial Intelligence -- Algorithms vs. humans in an incentivized writing competition
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); General Economics (econ.GN)

The release of openly available, robust text generation algorithms has spurred much public attention and debate, due to algorithm's purported ability to generate human-like text across various domains. Yet, empirical evidence using incentivized tasks to assess human behavioral reactions to such algorithms is lacking. We conducted two experiments assessing behavioral reactions to the state-of-the-art Natural Language Generation algorithm GPT-2 (Ntotal = 830). Using the identical starting lines of human poems, GPT-2 produced samples of multiple algorithmically-generated poems. From these samples, either a random poem was chosen (Human-out-of-the-loop) or the best one was selected (Human-in-the-loop) and in turn matched with a human written poem. Taking part in a new incentivized version of the Turing Test, participants failed to reliably detect the algorithmically-generated poems in the human-in-the-loop treatment, yet succeeded in the Human-out-of-the-loop treatment. Further, the results reveal a general aversion towards algorithmic poetry, independent on whether participants were informed about the algorithmic origin of the poem (Transparency) or not (Opacity). We discuss what these results convey about the performance of NLG algorithms to produce human-like text and propose methodologies to study such learning algorithms in experimental settings.

[82]  arXiv:2005.09984 [pdf, other]
Title: A Modified Fourier-Mellin Approach for Source Device Identification on Stabilized Videos
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)

To decide whether a digital video has been captured by a given device, multimedia forensic tools usually exploit characteristic noise traces left by the camera sensor on the acquired frames. This analysis requires that the noise pattern characterizing the camera and the noise pattern extracted from video frames under analysis are geometrically aligned. However, in many practical scenarios this does not occur, thus a re-alignment or synchronization has to be performed. Current solutions often require time consuming search of the realignment transformation parameters. In this paper, we propose to overcome this limitation by searching scaling and rotation parameters in the frequency domain. The proposed algorithm tested on real videos from a well-known state-of-the-art dataset shows promising results.

[83]  arXiv:2005.09992 [pdf, other]
Title: Beyond the storage capacity: data driven satisfiability transition
Comments: 5 pages, 2 figures
Subjects: Machine Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech)

Data structure has a dramatic impact on the properties of neural networks, yet its significance in the established theoretical frameworks is poorly understood. Here we compute the Vapnik-Chervonenkis entropy of a kernel machine operating on data grouped into equally labelled subsets. At variance with the unstructured scenario, entropy is non-monotonic in the size of the training set, and displays an additional critical point besides the storage capacity. Remarkably, the same behavior occurs in margin classifiers even with randomly labelled data, as is elucidated by identifying the synaptic volume encoding the transition. These findings reveal aspects of expressivity lying beyond the condensed description provided by the storage capacity, and they indicate the path towards more realistic bounds for the generalization error of neural networks.

[84]  arXiv:2005.09993 [pdf]
Title: Investigating Current State-of-The-Art Applications of Supportive Technologies for Individuals with ADHD
Authors: Fatemah Husain
Comments: 43 pages (systematic literature review)
Subjects: Human-Computer Interaction (cs.HC)

Attention Deficit Hyperactivity Disorder (ADHD) is a chronic mental and behavioral disorder that interferes with everyday activities and has three core symptoms: inattention, hyperactivity, and impulsivity. To help in reducing the effects of ADHD symptoms, there are multiple treatments, but none of them help in curing ADHD. Assistive technologies offer great opportunities in delivering treatments, especially those related to behavioral interventions, monitoring, and changing in a more flexible, acceptable and accessible way. Focusing on assistive technology for children with ADHD is very important as early support during childhood prevents the manifestation of its symptoms before entering adulthood. This systematic literature review paper investigates the available studies covering assistive technologies for children with ADHD. The contribution of this paper can help Human-Computer Interaction researchers to identify the procedures and research methods used throughout requirements, design, and evaluation phases in developing assistive technology for children with ADHD. Moreover, it provides researchers with information regarding frameworks and protocols of conducting studies on ADHD, current available solutions, and their limitations.

[85]  arXiv:2005.09996 [pdf, other]
Title: Heterogeneous Susceptibilities in Social Influence Models
Authors: Daniel K. Sewell
Journal-ref: Social Networks, 52:135-144, 2017
Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph); Methodology (stat.ME)

Network autocorrelation models are widely used to evaluate the impact of social influence on some variable of interest. This is a large class of models that parsimoniously accounts for how one's neighbors influence one's own behaviors or opinions by incorporating the network adjacency matrix into the joint distribution of the data. These models assume homogeneous susceptibility to social influence, however, which may be a strong assumption in many contexts. This paper proposes a hierarchical model that allows the influence parameter to be a function of individual attributes and/or of local network topological features. We derive an approximation of the posterior distribution in a general framework that is applicable to the Durbin, network effects, network disturbances, or network moving average autocorrelation models. The proposed approach can also be applied to investigating determinants of social influence in the context of egocentric network data. We apply our method to a data set collected via mobile phones in which we determine the effect of social influence on physical activity levels, as well as classroom data in which we investigate peer influence on student defiance. With this last data set, we also investigate the performance of the proposed egocentric network model.

[86]  arXiv:2005.09997 [pdf, other]
Title: Learning Semantic Program Embeddings with GraphInterval Neural Network
Comments: The abstract is simplified, for full abstract, please refer to the paper
Subjects: Software Engineering (cs.SE); Machine Learning (cs.LG); Programming Languages (cs.PL)

Learning distributed representations of source code has been a challenging task for machine learning models. Earlier works treated programs as text so that natural language methods can be readily applied. Unfortunately, such approaches do not capitalize on the rich structural information possessed by source code. Of late, Graph Neural Network (GNN) was proposed to learn embeddings of programs from their graph representations. Due to the homogeneous and expensive message-passing procedure, GNN can suffer from precision issues, especially when dealing with programs rendered into large graphs. In this paper, we present a new graph neural architecture, called Graph Interval Neural Network (GINN), to tackle the weaknesses of the existing GNN. Unlike the standard GNN, GINN generalizes from a curated graph representation obtained through an abstraction method designed to aid models to learn. In particular, GINN focuses exclusively on intervals for mining the feature representation of a program, furthermore, GINN operates on a hierarchy of intervals for scaling the learning to large graphs. We evaluate GINN for two popular downstream applications: variable misuse prediction and method name prediction. Results show in both cases GINN outperforms the state-of-the-art models by a comfortable margin. We have also created a neural bug detector based on GINN to catch null pointer deference bugs in Java code. While learning from the same 9,000 methods extracted from 64 projects, GINN-based bug detector significantly outperforms GNN-based bug detector on 13 unseen test projects. Next, we deploy our trained GINN-based bug detector and Facebook Infer to scan the codebase of 20 highly starred projects on GitHub. Through our manual inspection, we confirm 38 bugs out of 102 warnings raised by GINN-based bug detector compared to 34 bugs out of 129 warnings for Facebook Infer.

[87]  arXiv:2005.09998 [pdf, other]
Title: Tackling the DMN Challenges with cDMN: a Tight Integration of DMN and constraint reasoning
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB); Logic in Computer Science (cs.LO)

This paper describes an extension to the DMN standard, called cDMN. It aims to enlarge the expressivity of DMN in order to solve more complex problems, while retaining DMN's goal of being readable by domain experts. We test cDMN by solving the most complex challenges posted on the DM Community website. We compare our own cDMN solutions to the solutions that have been submitted to the website and find that our approach is competitive, both in readability and compactness. Moreover, cDMN is able to solve more challenges than any other approach.

[88]  arXiv:2005.09999 [pdf, other]
Title: Model Predictive Instantaneous Safety Metric for Evaluation of Automated Driving Systems
Comments: Accepted at IEEE Intelligent Vehicles Symposium (IV), 2020
Subjects: Robotics (cs.RO)

Vehicles with Automated Driving Systems (ADS) operate in a high-dimensional continuous system with multi-agent interactions. This continuous system features various types of traffic agents (non-homogeneous) governed by continuous-motion ordinary differential equations (differential-drive). Each agent makes decisions independently that may lead to conflicts with the subject vehicle (SV), as well as other participants (non-cooperative). A typical vehicle safety evaluation procedure that uses various safety-critical scenarios and observes resultant collisions (or near collisions), is not sufficient enough to evaluate the performance of the ADS in terms of operational safety status maintenance. In this paper, we introduce a Model Predictive Instantaneous Safety Metric (MPrISM), which determines the safety status of the SV, considering the worst-case safety scenario for a given traffic snapshot. The method then analyzes the SV's closeness to a potential collision within a certain evaluation time period. The described metric induces theoretical guarantees of safety in terms of the time to collision under standard assumptions. Through formulating the solution as a series of minimax quadratic optimization problems of a specific structure, the method is tractable for real-time safety evaluation applications. Its capabilities are demonstrated with synthesized examples and cases derived from real-world tests.

[89]  arXiv:2005.10000 [pdf, other]
Title: Continuous Multiagent Control using Collective Behavior Entropy for Large-Scale Home Energy Management
Subjects: Systems and Control (eess.SY); Artificial Intelligence (cs.AI)

With the increasing popularity of electric vehicles, distributed energy generation and storage facilities in smart grid systems, an efficient Demand-Side Management (DSM) is urgent for energy savings and peak loads reduction. Traditional DSM works focusing on optimizing the energy activities for a single household can not scale up to large-scale home energy management problems. Multi-agent Deep Reinforcement Learning (MA-DRL) shows a potential way to solve the problem of scalability, where modern homes interact together to reduce energy consumers consumption while striking a balance between energy cost and peak loads reduction. However, it is difficult to solve such an environment with the non-stationarity, and existing MA-DRL approaches cannot effectively give incentives for expected group behavior. In this paper, we propose a collective MA-DRL algorithm with continuous action space to provide fine-grained control on a large scale microgrid. To mitigate the non-stationarity of the microgrid environment, a novel predictive model is proposed to measure the collective market behavior. Besides, a collective behavior entropy is introduced to reduce the high peak loads incurred by the collective behaviors of all householders in the smart grid. Empirical results show that our approach significantly outperforms the state-of-the-art methods regarding power cost reduction and daily peak loads optimization.

[90]  arXiv:2005.10004 [pdf, other]
Title: Characterizing networks of propaganda on Twitter: a case study
Subjects: Social and Information Networks (cs.SI)

The daily exposure of social media users to propaganda and disinformation campaigns has reinvigorated the need to investigate the local and global patterns of diffusion of different (mis)information content on social media. Echo chambers and influencers are often deemed responsible of both the polarization of users in online social networks and the success of propaganda and disinformation campaigns. This article adopts a data-driven approach to investigate the structuration of communities and propaganda networks on Twitter in order to assess the correctness of these imputations. In particular, the work aims at characterizing networks of propaganda extracted from a Twitter dataset by combining the information gained by three different classification approaches, focused respectively on (i) using Tweets content to infer the "polarization" of users around a specific topic, (ii) identifying users having an active role in the diffusion of different propaganda and disinformation items, and (iii) analyzing social ties to identify topological clusters and users playing a "central" role in the network. The work identifies highly partisan community structures along political alignments; furthermore, centrality metrics proved to be very informative to detect the most active users in the network and to distinguish users playing different roles; finally, polarization and clustering structure of the retweet graphs provided useful insights about relevant properties of users exposure, interactions, and participation to different propaganda items.

[91]  arXiv:2005.10006 [pdf, other]
Title: The Hetero-functional Graph Theory Toolbox
Comments: 12 pages, 6 figures
Subjects: Computational Engineering, Finance, and Science (cs.CE)

In the 20th century, newly invented technical artifacts were connected to form large-scale complex engineering systems. Furthermore, the interactions found within these networked systems has grown in both degree as well as heterogeneity. Consequently, these already complex engineering systems have converged in what is now called systems-of-systems. The analysis, design, planning, and operation of these engineering systems from a holistic perspective has necessitated ever-more sophisticated modeling techniques. Despite significant advancements in model-based systems engineering and network science, these seemingly disparate fields have experienced similar limitations in addressing the complexity of engineering systems. Hetero-Functional Graph Theory (HFGT) has emerged as a means to address some of these limitations. This paper serves as a user guide to a recently developed Hetero-functional Graph Theory Toolbox which facilitates the computation of HFGT mathematical models. It is written in the MATLAB language and has been tested with v9.6 (R2019a). It is openly available on GitHub together with a sample input file for straightforward re-use. The paper details the syntax and semantics of the input file, the principal data structure of the toolbox, and the functions used to construct and populate this data structure. The toolbox has been fully validated against several peer-review HFGT publications.

[92]  arXiv:2005.10008 [pdf, other]
Title: Batch Decorrelation for Active Metric Learning
Comments: Accepted to IJCAI-PRICAI 2020
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

We present an active learning strategy for training parametric models of distance metrics, given triplet-based similarity assessments: object $x_i$ is more similar to object $x_j$ than to $x_k$. In contrast to prior work on class-based learning, where the fundamental goal is classification and any implicit or explicit metric is binary, we focus on {\em perceptual} metrics that express the {\em degree} of (dis)similarity between objects. We find that standard active learning approaches degrade when annotations are requested for {\em batches} of triplets at a time: our studies suggest that correlation among triplets is responsible. In this work, we propose a novel method to {\em decorrelate} batches of triplets, that jointly balances informativeness and diversity while decoupling the choice of heuristic for each criterion. Experiments indicate our method is general, adaptable, and outperforms the state-of-the-art.

[93]  arXiv:2005.10009 [pdf, ps, other]
Title: On randomized trace estimates for indefinite matrices with an application to determinants
Subjects: Numerical Analysis (math.NA)

Randomized trace estimation is a popular and well studied technique that approximates the trace of a large-scale matrix $B$ by computing the average of $x^T Bx$ for many samples of a random vector $X$. Often, $B$ is symmetric positive definite (SPD) but a number of applications give rise to indefinite $B$. Most notably, this is the case for log-determinant estimation, a task that features prominently in statistical learning, for instance in maximum likelihood estimation for Gaussian process regression. The analysis of randomized trace estimates, including tail bounds, has mostly focused on the SPD case. A notable exception is recent work by Ubaru, Chen, and Saad on trace estimates for a matrix function $f(A)$ with Rademacher random vectors. In this work, we derive new tail bounds for randomized trace estimates applied to indefinite $B$ with Rademacher or Gaussian random vectors. These bounds significantly improve existing results for indefinite $B$, reducing the the number of required samples by a factor $n$ or even more, where $n$ is the size of $A$. Even for an SPD matrix, our work improves an existing result by Roosta-Khorasani and Ascher for Rademacher vectors. This work also analyzes the combination of randomized trace estimates with the Lanczos method for approximating the trace of $f(A)$. Particular attention is paid to the matrix logarithm, which is needed for log-determinant estimation. We improve and extend an existing result, to not only cover Rademacher but also Gaussian random vectors.

[94]  arXiv:2005.10017 [pdf, other]
Title: Differential Mapping Spiking Neural Network for Sensor-Based Robot Control
Comments: 11 pages, 11 figures
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

In this work, a spiking neural network is proposed for approximating differential sensorimotor maps of robotic systems. The computed model is used as a local Jacobian-like projection that relates changes in sensor space to changes in motor space. The network consists of an input (sensory) layer and an output (motor) layer connected through plastic synapses, with interinhibtory connections at the output layer. Spiking neurons are modeled as Izhikevich neurons with synapses' learning rule based on spike-timing-dependent plasticity. Feedback data from proprioceptive and exteroceptive sensors are encoded and fed into the input layer through a motor babbling process. As the main challenge to building an efficient SNN is to tune its parameters, we present an intuitive tuning method that enables us to considerably reduce the number of neurons and the amount of data required for training. Our proposed architecture represents a biologically plausible neural controller that is capable of handling noisy sensor readings to guide robot movements in real-time. Experimental results are presented to validate the control methodology with a vision-guided robot.

[95]  arXiv:2005.10019 [pdf, other]
Title: Every Colour You Are: Stance Prediction and Turnaround in Controversial Issues
Comments: Accepted at WebSci'20
Subjects: Social and Information Networks (cs.SI); Computers and Society (cs.CY)

Web platforms have allowed political manifestation and debate for decades. Technology changes have brought new opportunities for expression, and the availability of longitudinal data of these debates entice new questions regarding who participates, and who updates their opinion. The aim of this work is to provide a methodology to measure these phenomena, and to test this methodology on a specific topic, abortion, as observed on one of the most popular micro-blogging platforms. To do so, we followed the discussion on Twitter about abortion in two Spanish-speaking countries from 2015 to 2018. Our main insights are two fold. On the one hand, people adopted new technologies to express their stances, particularly colored variations of heart emojis ([green heart] & [purple heart]) in a way that mirrored physical manifestations on abortion. On the other hand, even on issues with strong opinions, opinions can change, and these changes show differences in demographic groups. These findings imply that debate on the Web embraces new ways of stance adherence, and that changes of opinion can be measured and characterized.

[96]  arXiv:2005.10026 [pdf, ps, other]
Title: Reinforcement Learning for Variable Selection in a Branch and Bound Algorithm
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Mixed integer linear programs are commonly solved by Branch and Bound algorithms. A key factor of the efficiency of the most successful commercial solvers is their fine-tuned heuristics. In this paper, we leverage patterns in real-world instances to learn from scratch a new branching strategy optimised for a given problem and compare it with a commercial solver. We propose FMSTS, a novel Reinforcement Learning approach specifically designed for this task. The strength of our method lies in the consistency between a local value function and a global metric of interest. In addition, we provide insights for adapting known RL techniques to the Branch and Bound setting, and present a new neural network architecture inspired from the literature. To our knowledge, it is the first time Reinforcement Learning has been used to fully optimise the branching strategy. Computational experiments show that our method is appropriate and able to generalise well to new instances.

[97]  arXiv:2005.10027 [pdf, other]
Title: Open, Programmable, and Virtualized 5G Networks: State-of-the-Art and the Road Ahead
Subjects: Networking and Internet Architecture (cs.NI)

Fifth generation (5G) cellular networks will serve a wide variety of heterogeneous use cases, including mobile broadband users, ultra-low latency services, and massively dense connectivity scenarios. The resulting diverse communication requirements will demand networking with unprecedented flexibility, not currently provided by the monolithic and black-box approach of 4G cellular networks. The research community and an increasing number of standardization bodies and industry coalitions have recognized softwarization, virtualization, and disaggregation of networking functionalities as the key enablers of the needed shift to flexibility. Particularly, software-defined cellular networks are heralded as the prime technology to satisfy the new application-driven traffic requirements and to support the highly time-varying topology and interference dynamics, because of their openness through well-defined interfaces, and programmability, for swift and responsive network optimization. Leading the technological innovation in this direction, several 5G software-based projects and alliances have embraced the open source approach, making new libraries and frameworks available to the wireless community. This race to open source softwarization, however, has led to a deluge of solutions whose interoperability and interactions are often unclear. This article provides the first cohesive and exhaustive compendium of recent open source software and frameworks for 5G cellular networks, with a full stack and end-to-end perspective. We detail their capabilities and functionalities focusing on how their constituting elements fit the 5G ecosystem, and unravel the interactions among the surveyed solutions. Finally, we review platforms on which these frameworks can run, and discuss the limitations of the state-of-the-art, and feasible directions towards fully open source, programmable 5G networks.

[98]  arXiv:2005.10029 [pdf, ps, other]
Title: An FPRAS and Polynomial-Time Uniform Sampler for Tree Automata
Subjects: Data Structures and Algorithms (cs.DS)

In this work, we introduce the first fully polynomial time randomized approximation scheme (FPRAS) for counting the number of trees of size $n$ accepted by a tree automaton, where $n$ is given in unary, and the first polynomial time algorithm for sampling uniformly from this set of trees. Our results improve over the prior quasi-polynomial time randomized approximation scheme (QPRAS) and sampling algorithm of Gore, Jerrum, Kannan, Sweedyk, and Mahaney 97'. At the heart of our algorithm is a reduction to the problem of estimating the number of strings of length $n$ accepted by a succinct non-deterministic finite automaton (NFA), which is an NFA where the transitions are labeled by succinctly encoded sets of symbols, whose sizes can be exponential in the encoding length. Assuming these sets of symbols can be efficiently sampled from, and their sizes approximated, we show that there is an FPRAS and polynomial time almost uniform sampler for succinct NFAs, which may be of independent interest.
We demonstrate that, by applying our FPRAS for tree automata, we can obtain an FPRAS for many hitherto open problems in the fields of constraint satisfaction problems (CSPs), database systems, software verification, and knowledge compilation. Specifically, we obtain an FPRAS for counting solutions for CSPs that are acyclic or, more generally, that have bounded hypertree-width, which results in an FPRAS for counting the number of answers to conjunctive queries that are acyclic or which have bounded hypertree-width. Moreover, these results can be extended to unions of acyclic conjunctive queries, and to the more general class of unions of conjunctive queries with bounded hypertree-width. Finally, we also obtain FPRAS for the problems of counting the number of error threads in programs with nested call subroutines, and counting valid assignments to structured DNNF circuits.

[99]  arXiv:2005.10033 [pdf, other]
Title: Deep learning with 4D spatio-temporal data representations for OCT-based force estimation
Comments: Accepted for publication in Medical Image Analysis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Estimating the forces acting between instruments and tissue is a challenging problem for robot-assisted minimally-invasive surgery. Recently, numerous vision-based methods have been proposed to replace electro-mechanical approaches. Moreover, optical coherence tomography (OCT) and deep learning have been used for estimating forces based on deformation observed in volumetric image data. The method demonstrated the advantage of deep learning with 3D volumetric data over 2D depth images for force estimation. In this work, we extend the problem of deep learning-based force estimation to 4D spatio-temporal data with streams of 3D OCT volumes. For this purpose, we design and evaluate several methods extending spatio-temporal deep learning to 4D which is largely unexplored so far. Furthermore, we provide an in-depth analysis of multi-dimensional image data representations for force estimation, comparing our 4D approach to previous, lower-dimensional methods. Also, we analyze the effect of temporal information and we study the prediction of short-term future force values, which could facilitate safety features. For our 4D force estimation architectures, we find that efficient decoupling of spatial and temporal processing is advantageous. We show that using 4D spatio-temporal data outperforms all previously used data representations with a mean absolute error of 10.7mN. We find that temporal information is valuable for force estimation and we demonstrate the feasibility of force prediction.

[100]  arXiv:2005.10036 [pdf, other]
Title: Uncertainty Quantification Using Neural Networks for Molecular Property Prediction
Subjects: Machine Learning (cs.LG); Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)

Uncertainty quantification (UQ) is an important component of molecular property prediction, particularly for drug discovery applications where model predictions direct experimental design and where unanticipated imprecision wastes valuable time and resources. The need for UQ is especially acute for neural models, which are becoming increasingly standard yet are challenging to interpret. While several approaches to UQ have been proposed in the literature, there is no clear consensus on the comparative performance of these models. In this paper, we study this question in the context of regression tasks. We systematically evaluate several methods on five benchmark datasets using multiple complementary performance metrics. Our experiments show that none of the methods we tested is unequivocally superior to all others, and none produces a particularly reliable ranking of errors across multiple datasets. While we believe these results show that existing UQ methods are not sufficient for all common use-cases and demonstrate the benefits of further research, we conclude with a practical recommendation as to which existing techniques seem to perform well relative to others.

[101]  arXiv:2005.10038 [pdf, ps, other]
Title: Coopetition Against an Amazon
Subjects: Computer Science and Game Theory (cs.GT); Theoretical Economics (econ.TH)

This paper studies cooperative data-sharing between competitors vying to predict a consumer's tastes. We design optimal data-sharing schemes both for when they compete only with each other, and for when they additionally compete with an Amazon---a company with more, better data. In both cases we show that participants benefit from such coopetition. We then apply the insights from our optimal schemes to more general settings.

[102]  arXiv:2005.10039 [pdf, other]
Title: The Effects of Randomness on the Stability of Node Embeddings
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI); Machine Learning (stat.ML)

We systematically evaluate the (in-)stability of state-of-the-art node embedding algorithms due to randomness, i.e., the random variation of their outcomes given identical algorithms and graphs. We apply five node embeddings algorithms---HOPE, LINE, node2vec, SDNE, and GraphSAGE---to synthetic and empirical graphs and assess their stability under randomness with respect to (i) the geometry of embedding spaces as well as (ii) their performance in downstream tasks. We find significant instabilities in the geometry of embedding spaces independent of the centrality of a node. In the evaluation of downstream tasks, we find that the accuracy of node classification seems to be unaffected by random seeding while the actual classification of nodes can vary significantly. This suggests that instability effects need to be taken into account when working with node embeddings. Our work is relevant for researchers and engineers interested in the effectiveness, reliability, and reproducibility of node embedding approaches.

[103]  arXiv:2005.10043 [pdf, other]
Title: Leveraging Graph to Improve Abstractive Multi-Document Summarization
Comments: Accepted by ACL2020
Subjects: Computation and Language (cs.CL)

Graphs that capture relations between textual units have great benefits for detecting salient information from multiple documents and generating overall coherent summaries. In this paper, we develop a neural abstractive multi-document summarization (MDS) model which can leverage well-known graph representations of documents such as similarity graph and discourse graph, to more effectively process multiple input documents and produce abstractive summaries. Our model utilizes graphs to encode documents in order to capture cross-document relations, which is crucial to summarizing long documents. Our model can also take advantage of graphs to guide the summary generation process, which is beneficial for generating coherent and concise summaries. Furthermore, pre-trained language models can be easily combined with our model, which further improve the summarization performance significantly. Empirical results on the WikiSum and MultiNews dataset show that the proposed architecture brings substantial improvements over several strong baselines.

[104]  arXiv:2005.10044 [pdf, other]
Title: Benchmarking of a software stack for autonomous racing against a professional human race driver
Comments: Accepted at 2020 Fifteenth International Conference on Ecological Vehicles and Renewable Energies (EVER)
Subjects: Human-Computer Interaction (cs.HC); Robotics (cs.RO)

The way to full autonomy of public road vehicles requires the step-by-step replacement of the human driver, with the ultimate goal of replacing the driver completely. Eventually, the driving software has to be able to handle all situations that occur on its own, even emergency situations. These particular situations require extreme combined braking and steering actions at the limits of handling to avoid an accident or to diminish its consequences. An average human driver is not trained to handle such extreme and rarely occurring situations and therefore often fails to do so. However, professional race drivers are trained to drive a vehicle utilizing the maximum amount of possible tire forces. These abilities are of high interest for the development of autonomous driving software. Here, we compare a professional race driver and our software stack developed for autonomous racing with data analysis techniques established in motorsports. The goal of this research is to derive indications for further improvement of the performance of our software and to identify areas where it still fails to meet the performance level of the human race driver. Our results are used to extend our software's capabilities and also to incorporate our findings into the research and development of public road autonomous vehicles.

[105]  arXiv:2005.10048 [pdf, other]
Title: Enhancing Word Embeddings with Knowledge Extracted from Lexical Resources
Comments: Accepted to ACL 2020 SRW
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

In this work, we present an effective method for semantic specialization of word vector representations. To this end, we use traditional word embeddings and apply specialization methods to better capture semantic relations between words. In our approach, we leverage external knowledge from rich lexical resources such as BabelNet. We also show that our proposed post-specialization method based on an adversarial neural network with the Wasserstein distance allows to gain improvements over state-of-the-art methods on two tasks: word similarity and dialog state tracking.

[106]  arXiv:2005.10050 [pdf, other]
Title: Risk of Training Diagnostic Algorithms on Data with Demographic Bias
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

One of the critical challenges in machine learning applications is to have fair predictions. There are numerous recent examples in various domains that convincingly show that algorithms trained with biased datasets can easily lead to erroneous or discriminatory conclusions. This is even more crucial in clinical applications where the predictive algorithms are designed mainly based on a limited or given set of medical images and demographic variables such as age, sex and race are not taken into account. In this work, we conduct a survey of the MICCAI 2018 proceedings to investigate the common practice in medical image analysis applications. Surprisingly, we found that papers focusing on diagnosis rarely describe the demographics of the datasets used, and the diagnosis is purely based on images. In order to highlight the importance of considering the demographics in diagnosis tasks, we used a publicly available dataset of skin lesions. We then demonstrate that a classifier with an overall area under the curve (AUC) of 0.83 has variable performance between 0.76 and 0.91 on subgroups based on age and sex, even though the training set was relatively balanced. Moreover, we show that it is possible to learn unbiased features by explicitly using demographic variables in an adversarial training setup, which leads to balanced scores per subgroups. Finally, we discuss the implications of these results and provide recommendations for further research.

[107]  arXiv:2005.10053 [pdf, other]
Title: Map Generation from Large Scale Incomplete and Inaccurate Data Labels
Comments: This paper is accepted by KDD 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Accurately and globally mapping human infrastructure is an important and challenging task with applications in routing, regulation compliance monitoring, and natural disaster response management etc.. In this paper we present progress in developing an algorithmic pipeline and distributed compute system that automates the process of map creation using high resolution aerial images. Unlike previous studies, most of which use datasets that are available only in a few cities across the world, we utilizes publicly available imagery and map data, both of which cover the contiguous United States (CONUS). We approach the technical challenge of inaccurate and incomplete training data adopting state-of-the-art convolutional neural network architectures such as the U-Net and the CycleGAN to incrementally generate maps with increasingly more accurate and more complete labels of man-made infrastructure such as roads and houses. Since scaling the mapping task to CONUS calls for parallelization, we then adopted an asynchronous distributed stochastic parallel gradient descent training scheme to distribute the computational workload onto a cluster of GPUs with nearly linear speed-up.

[108]  arXiv:2005.10054 [pdf, ps, other]
Title: A New Lower Bound for Deterministic Truthful Scheduling
Comments: 15 pages
Subjects: Computer Science and Game Theory (cs.GT)

We study the problem of truthfully scheduling $m$ tasks to $n$ selfish unrelated machines, under the objective of makespan minimization, as was introduced in the seminal work of Nisan and Ronen [STOC'99]. Closing the current gap of $[2.618,n]$ on the approximation ratio of deterministic truthful mechanisms is a notorious open problem in the field of algorithmic mechanism design. We provide the first such improvement in more than a decade, since the lower bounds of $2.414$ (for $n=3$) and $2.618$ (for $n\to\infty$) by Christodoulou et al. [SODA'07] and Koutsoupias and Vidali [MFCS'07], respectively. More specifically, we show that the currently best lower bound of $2.618$ can be achieved even for just $n=4$ machines; for $n=5$ we already get the first improvement, namely $2.711$; and allowing the number of machines to grow arbitrarily large we can get a lower bound of $2.755$.

[109]  arXiv:2005.10061 [pdf, other]
Title: PeopleTraffic: a common framework for harmonizing privacy and epidemic risks
Authors: Ruggero Caravita
Subjects: Computers and Society (cs.CY); Physics and Society (physics.soc-ph)

PeopleTraffic is a proposed initiative to develop a real-time, open-data population density mapping tool open to public institutions, private companies and the civil society, providing a common framework for infection spreading prevention. The system is based on a real-time people' locations gathering and mapping system from available 2G, 3G and 4G mobile networks operators, enforcing privacy-by-design through the adoption of an innovative data anonymizing algorithm inspired by quantum information de-localizing processes. Besides being originally targeted to help balancing social distancing regulations during the Phase-2 of the COVID-19 pandemics, PeopleTraffic would be beneficial for any infection spreading prevention event, e.g. supporting policy-makers in strategic decision-making.

[110]  arXiv:2005.10062 [pdf, ps, other]
Title: Safeguarding MIMO Communications with Reconfigurable Metasurfaces and Artificial Noise
Comments: 6 pages, 3 figures, submitted to IEEE GLOBECOM 2020
Subjects: Information Theory (cs.IT)

Wireless communications empowered by Reconfigurable Intelligent (meta)Surfaces (RISs) are recently gaining remarkable research attention due to the increased system design flexibility offered by RISs for diverse functionalities. In this paper, we consider a Multiple Input Multiple Output (MIMO) physical layer security system including one legitimate passive and one eavesdropping passive RIS with the former being transparent to the eavesdropper and the latter's presence being unknown at the legitimate link. We first focus on the eavesdropping subsystem and present a joint design of the eavesdropper's combining vector and the reflection coefficients of the eavesdropping RIS. Then, focusing on secrecy rate maximization, we propose a transmission scheme that jointly designs the legitimate precoding vector and Artificial Noise (AN) covariance matrix, as well as the reflection coefficients of the legitimate RIS. Our simulation results reveal that, in the absence of a legitimate RIS, AN and precoding are incapable of offering nonzero secrecy rates even for eavesdropping RISs with small numbers of unit elements. However, when a L-element legitimate RIS is deployed, confidential communication can be safeguarded against cases with even more than a 3L-element eavesdropping RIS.

[111]  arXiv:2005.10063 [pdf, other]
Title: A Survey of Software Foundations in Open Source
Comments: 15 pages, 3 figures, 6 tables, extended version of paper published at ICSE-SEIS'18
Subjects: Software Engineering (cs.SE)

A number of software foundations have been created as legal instruments to better articulate the structure, collaboration and financial model of Open Source Software (OSS) projects. Some examples are the Apache, Linux, or Mozilla foundations. However, the mission and support provided by these foundations largely differ among them. In this paper we perform a study on the role of foundations in OSS development. We analyze the nature, activities, role and governance of 101 software foundations and then go deeper on the 27 having as concrete goal the development and evolution of specific open source projects (and not just generic actions to promote the free software movement or similar). Our results reveal the existence of a significant number of foundations with the sole purpose of promoting the free software movement and/or that limit themselves to core legal aspects but do not play any role in the day-to-day operations of the project (e.g., umbrella organizations for a large variety of projects). Therefore, while useful, foundations do not remove the need for specific projects to develop their own specific governance, contribution and development policies. A website to help projects to choose the foundation that best fits their needs is also available.

[112]  arXiv:2005.10067 [pdf, other]
Title: Personalized Chatbot Trustworthiness Ratings
Comments: 11 pages
Subjects: Social and Information Networks (cs.SI); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)

Conversation agents, commonly referred to as chatbots, are increasingly deployed in many domains to allow people to have a natural interaction while trying to solve a specific problem. Given their widespread use, it is important to provide their users with methods and tools to increase users awareness of various properties of the chatbots, including non-functional properties that users may consider important in order to trust a specific chatbot. For example, users may want to use chatbots that are not biased, that do not use abusive language, that do not leak information to other users, and that respond in a style which is appropriate for the user's cognitive level.
In this paper, we address the setting where a chatbot cannot be modified, its training data cannot be accessed, and yet a neutral party wants to assess and communicate its trustworthiness to a user, tailored to the user's priorities over the various trust issues. Such a rating can help users choose among alternative chatbots, developers test their systems, business leaders price their offering, and regulators set policies. We envision a personalized rating methodology for chatbots that relies on separate rating modules for each issue, and users' detected priority orderings among the relevant trust issues, to generate an aggregate personalized rating for the trustworthiness of a chatbot. The method is independent of the specific trust issues and is parametric to the aggregation procedure, thereby allowing for seamless generalization. We illustrate its general use, integrate it with a live chatbot, and evaluate it on four dialog datasets and representative user profiles, validated with user surveys.

[113]  arXiv:2005.10068 [pdf, ps, other]
Title: Cell-Free Massive MIMO with Underlaid D2D Communications and Low Resolution ADCs
Comments: 30 pages, 7 figures, 1 table, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Information Theory (cs.IT)

In this article, we investigate uplink transmission of a cell-free massive multiple-input multiple-output (CF-mMIMO) system, underlaid with device-to-device (D2D) communications, and assuming that access points (APs) are equipped with low resolution analog-to-digital converters (ADCs). D2D user equipments (DUEs) are assumed to communicate in the same time-frequency resources as the CF-mMIMO user equipments (CFUEs). We derive closed-form expressions for achievable rates of both types of users, with perfect and imperfect channel state information. A set of orthogonal pilot sequences is reused among all the users to enable channel estimation. Then, greedy and graph coloring-based algorithms are employed to reduce pilot contamination. Furthermore, in order to control interference and improve the performance, two power control strategies are considered; the former aims at maximizing CFUEs' sum spectral efficiency (SE) subject to quality of service constraints on DUEs, while the latter maximizes weighted product of CFUEs' and DUEs' signal-to-interference-plus-noise-ratios (SINRs). For both the optimization problems, a solution based on geometric programming (GP) is developed. Finally, numerical results are provided to highlight the system performance and to show the improvements granted by the use of the proposed pilot assignment algorithms and power allocation solution, compared with the random pilot assignment and full power transmission case.

[114]  arXiv:2005.10070 [pdf, other]
Title: A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal
Comments: Camera-ready version for ACL 2020
Subjects: Computation and Language (cs.CL)

Multi-document summarization (MDS) aims to compress the content in large document collections into short summaries and has important applications in story clustering for newsfeeds, presentation of search results, and timeline generation. However, there is a lack of datasets that realistically address such use cases at a scale large enough for training supervised models for this task. This work presents a new dataset for MDS that is large both in the total number of document clusters and in the size of individual clusters. We build this dataset by leveraging the Wikipedia Current Events Portal (WCEP), which provides concise and neutral human-written summaries of news events, with links to external source articles. We also automatically extend these source articles by looking for related articles in the Common Crawl archive. We provide a quantitative analysis of the dataset and empirical results for several state-of-the-art MDS techniques.

[115]  arXiv:2005.10079 [pdf, other]
Title: Towards a Generalized Approach to Nonlocal Elasticity via Fractional-Order Mechanics
Comments: 36 pages, 15 figures
Subjects: Numerical Analysis (math.NA); Computational Engineering, Finance, and Science (cs.CE); Dynamical Systems (math.DS)

This study presents a fractional-order continuum mechanics approach that allows combining selected characteristics of nonlocal elasticity, typical of classical integral and gradient formulations, under a single frame-invariant framework. The resulting generalized theory is capable of capturing both stiffening and softening effects and it is not subject to the inconsistencies often observed under selected external loads and boundary conditions. The governing equations of a 1D continuum are derived by continualization of the Lagrangian of a 1D lattice subject to long-range interactions. This approach is particularly well suited to highlight the connection between the fractional-order operators and the microscopic properties of the medium. The approach is also extended to derive, by means of variational principles, the governing equations of a 3D continuum in strong form. The positive definite potential energy, characteristic of our fractional formulation, always ensures well-posed governing equations. This aspect, combined with the differ-integral nature of fractional-order operators, guarantees both stability and the ability to capture dispersion without requiring additional inertia gradient terms. The proposed formulation is applied to the static and free vibration analyses of either Timoshenko beams or Mindlin plates. Numerical results, obtained by a fractional-order finite element method, show that the fractional-order formulation is able to model both stiffening and softening response in these slender structures. The numerical results provide the foundation to critically analyze the physical significance of the different fractional model parameters as well as their effect on the response of the structural elements.

[116]  arXiv:2005.10082 [pdf]
Title: What country, university or research institute, performed the best on COVID-19? Bibliometric analysis of scientific literature
Subjects: Digital Libraries (cs.DL); Computers and Society (cs.CY)

In this article, we conduct data mining to discover the countries, universities and companies, produced or collaborated the most research on Covid-19 since the pandemic started. We present some interesting findings, but despite analysing all available records on COVID-19 from the Web of Science Core Collection, we failed to reach any significant conclusions on how the world responded to the COVID-19 pandemic. Therefore, we increased our analysis to include all available data records on pandemics and epidemics from 1900 to 2020. We discover some interesting results on countries, universities and companies, that produced collaborated most the most in research on pandemic and epidemics. Then we compared the results with the analysing on COVID-19 data records. This has created some interesting findings that are explained and graphically visualised in the article.

[117]  arXiv:2005.10083 [pdf]
Title: Securing Digital Systems via Split-Chip Obfuscation
Subjects: Cryptography and Security (cs.CR)

Security is an important facet of integrated circuit design for many applications. IP privacy and Trojan insertion are growing threats as circuit fabrication in advanced nodes almost inevitably relies on untrusted foundries. A proposed solution is Split-Chip Obfuscation that uses a combination trusted and untrusted IC fabrication scheme. By utilizing two CMOS processes, a system is endowed with the stronger security guaranties of a trusted legacy node while also leveraging the performance and density of an advanced untrusted node. Critical to the effectiveness of Split-Chip Obfuscation is finding an optimum partitioning of a system between the two ICs. In this paper, we develop a design flow for the Split-Chip Obfuscation scheme, defining the essential system metrics and creating a tool to rapidly assess the large design space. We demonstrate the concept of such a tool and show its application on an example SoC.

[118]  arXiv:2005.10084 [pdf, other]
Title: Context-Aware Learning to Rank with Self-Attention
Subjects: Information Retrieval (cs.IR)

In learning to rank, one is interested in optimising the global ordering of a list of items according to their utility for users. Popular approaches learn a scoring function that scores items individually (i.e. without the context of other items in the list) by optimising a pointwise, pairwise or listwise loss. The list is then sorted in the descending order of the scores. Possible interactions between items present in the same list are taken into account in the training phase at the loss level. However, during inference, items are scored individually, and possible interactions between them are not considered. In this paper, we propose a context-aware neural network model that learns item scores by applying a self-attention mechanism. The relevance of a given item is thus determined in the context of all other items present in the list, both in training and in inference. We empirically demonstrate significant performance gains of self-attention based neural architecture over Multi-Layer Perceptron baselines. This effect is consistent across popular pointwise, pairwise and listwise losses on datasets with both implicit and explicit relevance feedback. Finally, we report new state-of-the-art results on MSLR-WEB30K, the learning to rank benchmark.

[119]  arXiv:2005.10085 [pdf, other]
Title: DisCoveR: Accurate & Efficient Discovery of Declarative Process Models
Comments: Author's original version
Subjects: Machine Learning (cs.LG); Formal Languages and Automata Theory (cs.FL); Software Engineering (cs.SE); Machine Learning (stat.ML)

Declarative process modeling formalisms - which capture high-level process constraints - have seen growing interest, especially for modeling flexible processes. This paper presents DisCoveR, an extremely efficient and accurate declarative miner for learning Dynamic Condition Response (DCR) Graphs from event logs. We precisely formalize the algorithm, describe a highly efficient bit vector implementation and rigorously evaluate performance against two other declarative miners, representing the state-of-the-art in Declare and DCR Graphs mining. DisCoveR outperforms each of these w.r.t. a binary classification task, achieving an average accuracy of 96.2% in the Process Discovery Contest 2019. Due to its linear time complexity, DisCoveR also achieves run-times 1-2 orders of magnitude below its declarative counterparts. Finally, we show how the miner has been integrated in a state-of-the-art declarative process modeling framework as a model recommendation tool, discuss how discovery can play an integral part of the modeling task and report on how the integration has improved the modeling experience of end-users.

[120]  arXiv:2005.10086 [pdf, other]
Title: Classifying Suspicious Content in Tor Darknet
Comments: To be published on the JNIC 2020 Conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)

One of the tasks of law enforcement agencies is to find evidence of criminal activity in the Darknet. However, visiting thousands of domains to locate visual information containing illegal acts manually requires a considerable amount of time and resources. Furthermore, the background of the images can pose a challenge when performing classification. To solve this problem, in this paper, we explore the automatic classification Tor Darknet images using Semantic Attention Keypoint Filtering, a strategy that filters non-significant features at a pixel level that do not belong to the object of interest, by combining saliency maps with Bag of Visual Words (BoVW). We evaluated SAKF on a custom Tor image dataset against CNN features: MobileNet v1 and Resnet50, and BoVW using dense SIFT descriptors, achieving a result of 87.98% accuracy and outperforming all other approaches.

[121]  arXiv:2005.10090 [pdf, other]
Title: Perceptual Hashing applied to Tor domains recognition
Comments: To be published on the JNIC 2020 Conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)

The Tor darknet hosts different types of illegal content, which are monitored by cybersecurity agencies. However, manually classifying Tor content can be slow and error-prone. To support this task, we introduce Frequency-Dominant Neighborhood Structure (F-DNS), a new perceptual hashing method for automatically classifying domains by their screenshots. First, we evaluated F-DNS using images subject to various content preserving operations. We compared them with their original images, achieving better correlation coefficients than other state-of-the-art methods, especially in the case of rotation. Then, we applied F-DNS to categorize Tor domains using the Darknet Usage Service Images-2K (DUSI-2K), a dataset with screenshots of active Tor service domains. Finally, we measured the performance of F-DNS against an image classification approach and a state-of-the-art hashing method. Our proposal obtained 98.75% accuracy in Tor images, surpassing all other methods compared.

[122]  arXiv:2005.10091 [pdf, other]
Title: Label Efficient Visual Abstractions for Autonomous Driving
Comments: First two authors contributed equally, listed in alphabetical order
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

It is well known that semantic segmentation can be used as an effective intermediate representation for learning driving policies. However, the task of street scene semantic segmentation requires expensive annotations. Furthermore, segmentation algorithms are often trained irrespective of the actual driving task, using auxiliary image-space loss functions which are not guaranteed to maximize driving metrics such as safety or distance traveled per intervention. In this work, we seek to quantify the impact of reducing segmentation annotation costs on learned behavior cloning agents. We analyze several segmentation-based intermediate representations. We use these visual abstractions to systematically study the trade-off between annotation efficiency and driving performance, i.e., the types of classes labeled, the number of image samples used to learn the visual abstraction model, and their granularity (e.g., object masks vs. 2D bounding boxes). Our analysis uncovers several practical insights into how segmentation-based visual abstractions can be exploited in a more label efficient manner. Surprisingly, we find that state-of-the-art driving performance can be achieved with orders of magnitude reduction in annotation cost. Beyond label efficiency, we find several additional training benefits when leveraging visual abstractions, such as a significant reduction in the variance of the learned policy when compared to state-of-the-art end-to-end driving models.

[123]  arXiv:2005.10094 [pdf]
Title: Acceptance of e-procurement in organisations
Comments: E-procurement and Structural Equation Modeling (SEM)
Subjects: Other Computer Science (cs.OH)

This research is concerned with the development of a realistic model for e-procurement adoption by organisations and groups observing the Rules of Islamic Sharia (RIS). This model is intended to be based on the behavioural control, subjective norms, and the recognition of the benefits and risks of e procurement adoption. The developed model,(E-PAM), combined and extended two existing models previously used for information technology adoption. Central to the design of the E-PAM is the principle that a realistic model should consider all relevant psychological, social, cultural, demography, and religious factors. .

[124]  arXiv:2005.10095 [pdf, ps, other]
Title: The K-Centre Problem for Necklaces
Subjects: Data Structures and Algorithms (cs.DS)

In graph theory, the objective of the k-centre problem is to find a set of $k$ vertices for which the largest distance of any vertex to its closest vertex in the $k$-set is minimised. In this paper, we introduce the $k$-centre problem for sets of necklaces, i.e. the equivalence classes of words under the cyclic shift. This can be seen as the k-centre problem on the complete weighted graph where every necklace is represented by a vertex, and each edge has a weight given by the overlap distance between any pair of necklaces. Similar to the graph case, the goal is to choose $k$ necklaces such that the distance from any word in the language and its nearest centre is minimised. However, in a case of k-centre problem for languages the size of associated graph maybe exponential in relation to the description of the language, i.e., the length of the words l and the size of the alphabet q. We derive several approximation algorithms for the $k$-centre problem on necklaces, with logarithmic approximation factor in the context of l and k, and within a constant factor for a more restricted case.

[125]  arXiv:2005.10098 [pdf, other]
Title: Classification of Industrial Control Systems screenshots using Transfer Learning
Comments: To be published on the JNIC 2020 Conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Industrial Control Systems depend heavily on security and monitoring protocols. Several tools are available for this purpose, which scout vulnerabilities and take screenshots from various control panels for later analysis. However, they do not adequately classify images into specific control groups, which can difficult operations performed by manual operators. In order to solve this problem, we use transfer learning with five CNN architectures, pre-trained on Imagenet, to determine which one best classifies screenshots obtained from Industrial Controls Systems. Using 337 manually labeled images, we train these architectures and study their performance both in accuracy and CPU and GPU time. We find out that MobilenetV1 is the best architecture based on its 97,95% of F1-Score, and its speed on CPU with 0.47 seconds per image. In systems where time is critical and GPU is available, VGG16 is preferable because it takes 0.04 seconds to process images, but dropping performance to 87,67%.

[126]  arXiv:2005.10101 [pdf, ps, other]
Title: A Unifying Approximate Potential for Weighted Congestion Games
Subjects: Computer Science and Game Theory (cs.GT)

We provide a unifying, black-box tool for establishing existence of approximate equilibria in weighted congestion games and, at the same time, bounding their Price of Stability. Our framework can handle resources with general costs--including, in particular, decreasing ones--and is formulated in terms of a set of parameters which are determined via elementary analytic properties of the cost functions.
We demonstrate the power of our tool by applying it to recover the recent result of Caragiannis and Fanelli [ICALP'19] for polynomial congestion games; improve upon the bounds for fair cost sharing games by Chen and Roughgarden [Theory Comput. Syst., 2009]; and derive new bounds for nondecreasing concave costs. An interesting feature of our framework is that it can be readily applied to mixtures of different families of cost functions; for example, we provide bounds for games whose resources are conical combinations of polynomial and concave costs.
In the core of our analysis lies the use of a unifying approximate potential function which is simple and general enough to be applicable to arbitrary congestion games, but at the same time powerful enough to produce state-of-the-art bounds across a range of different cost functions.

[127]  arXiv:2005.10103 [pdf, other]
Title: BeepTrace: Blockchain-enabled Privacy-preserving Contact Tracing for COVID-19 Pandemic and Beyond
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)

The outbreak of COVID-19 pandemic has exposed an urgent need for effective contact tracing solutions through mobile phone applications to prevent the infection from spreading further. However, due to the nature of contact tracing, public concern on privacy issues has been a bottleneck to the existing solutions, which is significantly affecting the uptake of contact tracing applications across the globe. In this paper, we present a blockchain-enabled privacy-preserving contact tracing scheme: BeepTrace, where we propose to adopt blockchain bridging the user/patient and the authorized solvers to desensitize the user ID and location information. Compared with recently proposed contract tracing solutions, our approach shows higher security and privacy with the additional advantages of being battery friendly and globally accessible. Results show viability in terms of the required resource at both server and mobile phone perspectives. Through breaking the privacy concerns of the public, the proposed BeepTrace solution can provide a timely framework for authorities, companies, software developers and researchers to fast develop and deploy effective digital contact tracing applications, to conquer COVID-19 pandemic soon. Meanwhile, the open initiative of BeepTrace allows worldwide collaborations, integrate existing tracing and positioning solutions with the help of blockchain technology.

[128]  arXiv:2005.10107 [pdf, other]
Title: Examining the State-of-the-Art in News Timeline Summarization
Comments: Camera-ready version for ACL 2020
Subjects: Computation and Language (cs.CL)

Previous work on automatic news timeline summarization (TLS) leaves an unclear picture about how this task can generally be approached and how well it is currently solved. This is mostly due to the focus on individual subtasks, such as date selection and date summarization, and to the previous lack of appropriate evaluation metrics for the full TLS task. In this paper, we compare different TLS strategies using appropriate evaluation frameworks, and propose a simple and effective combination of methods that improves over the state-of-the-art on all tested benchmarks. For a more robust evaluation, we also present a new TLS dataset, which is larger and spans longer time periods than previous datasets.

[129]  arXiv:2005.10110 [pdf, other]
Title: M2GRL: A Multi-task Multi-view Graph Representation Learning Framework for Web-scale Recommender Systems
Comments: Accepted by KDD 2020 ads track as an oral paper
Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG)

Combining graph representation learning with multi-view data (side information) for recommendation is a trend in industry. Most existing methods can be categorized as \emph{multi-view representation fusion}; they first build one graph and then integrate multi-view data into a single compact representation for each node in the graph. However, these methods are raising concerns in both engineering and algorithm aspects: 1) multi-view data are abundant and informative in industry and may exceed the capacity of one single vector, and 2) inductive bias may be introduced as multi-view data are often from different distributions. In this paper, we use a \emph{multi-view representation alignment} approach to address this issue. Particularly, we propose a multi-task multi-view graph representation learning framework (M2GRL) to learn node representations from multi-view graphs for web-scale recommender systems. M2GRL constructs one graph for each single-view data, learns multiple separate representations from multiple graphs, and performs alignment to model cross-view relations. M2GRL chooses a multi-task learning paradigm to learn intra-view representations and cross-view relations jointly. Besides, M2GRL applies homoscedastic uncertainty to adaptively tune the loss weights of tasks during training. We deploy M2GRL at Taobao and train it on 57 billion examples. According to offline metrics and online A/B tests, M2GRL significantly outperforms other state-of-the-art algorithms. Further exploration on diversity recommendation in Taobao shows the effectiveness of utilizing multiple representations produced by \method{}, which we argue is a promising direction for various industrial recommendation tasks of different focus.

[130]  arXiv:2005.10111 [pdf, other]
Title: The Effectiveness of Discretization in Forecasting: An Empirical Study on Neural Time Series Models
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Time series modeling techniques based on deep learning have seen many advancements in recent years, especially in data-abundant settings and with the central aim of learning global models that can extract patterns across multiple time series. While the crucial importance of appropriate data pre-processing and scaling has often been noted in prior work, most studies focus on improving model architectures. In this paper we empirically investigate the effect of data input and output transformations on the predictive performance of several neural forecasting architectures. In particular, we investigate the effectiveness of several forms of data binning, i.e. converting real-valued time series into categorical ones, when combined with feed-forward, recurrent neural networks, and convolution-based sequence models. In many non-forecasting applications where these models have been very successful, the model inputs and outputs are categorical (e.g. words from a fixed vocabulary in natural language processing applications or quantized pixel color intensities in computer vision). For forecasting applications, where the time series are typically real-valued, various ad-hoc data transformations have been proposed, but have not been systematically compared. To remedy this, we evaluate the forecasting accuracy of instances of the aforementioned model classes when combined with different types of data scaling and binning. We find that binning almost always improves performance (compared to using normalized real-valued inputs), but that the particular type of binning chosen is of lesser importance.

[131]  arXiv:2005.10114 [pdf, other]
Title: Network On Network for Tabular Data Classification in Real-world Applications
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Tabular data is the most common data format adopted by our customers ranging from retail, finance to E-commerce, and tabular data classification plays an essential role to their businesses. In this paper, we present Network On Network (NON), a practical tabular data classification model based on deep neural network to provide accurate predictions. Various deep methods have been proposed and promising progress has been made. However, most of them use operations like neural network and factorization machines to fuse the embeddings of different features directly, and linearly combine the outputs of those operations to get the final prediction. As a result, the intra-field information and the non-linear interactions between those operations (e.g. neural network and factorization machines) are ignored. Intra-field information is the information that features inside each field belong to the same field. NON is proposed to take full advantage of intra-field information and non-linear interactions. It consists of three components: field-wise network at the bottom to capture the intra-field information, across field network in the middle to choose suitable operations data-drivenly, and operation fusion network on the top to fuse outputs of the chosen operations deeply. Extensive experiments on six real-world datasets demonstrate NON can outperform the state-of-the-art models significantly. Furthermore, both qualitative and quantitative study of the features in the embedding space show NON can capture intra-field information effectively.

[132]  arXiv:2005.10126 [pdf]
Title: State Complexity of Reversible Watson-Crick Automata
Subjects: Formal Languages and Automata Theory (cs.FL)

Reversible Watson-Crick automata introduced by Chatterjee et.al. is a reversible variant of an Watson-Crick automata. It has already been shown that the addition of DNA properties to reversible automata significantly increases the computational power of the model. In this paper, we analyze the state complexity of Reversible Watson-Crick automata with respect to non-deterministic finite automata. We show that Reversible Watson-Crick automata in spite of being reversible in nature enjoy state complexity advantage over non deterministic finite automata. The result is interesting because conversion from non deterministic to deterministic automata results in exponential blow up of the number of states and classically increase in number of heads of the automata cannot compensate for non-determinism in deterministic and reversible models.

[133]  arXiv:2005.10127 [pdf]
Title: Multi-head Watson-Crick quantum finite automata
Comments: arXiv admin note: substantial text overlap with arXiv:1507.05282, arXiv:1607.00811
Subjects: Formal Languages and Automata Theory (cs.FL)

Watson-Crick quantum finite automata were introduced by Ganguly et.al. by combining properties of DNA and Quantum automata. In this paper we introduce a multi-head version of the above automaton. We further show that the multi-head variant is computationally more powerful than one-way multi-head reversible finite automata. In fact we also show that the multi-head variant accepts a language which is not accepted by any one-way multi-head deterministic finite automata.

[134]  arXiv:2005.10131 [pdf, other]
Title: Combining the Causal Judgments of Experts with Possibly Different Focus Areas
Comments: Appear in the Proceedings of the Sixteenth International Conference on Principles of Knowledge Representation and Reasoning (KR2018}, 2018
Subjects: Artificial Intelligence (cs.AI)

In many real-world settings, a decision-maker must combine information provided by different experts in order to decide on an effective policy. Alrajeh, Chockler, and Halpern [2018] showed how to combine causal models that are compatible in the sense that, for variables that appear in both models, the experts agree on the causal structure. In this work we show how causal models can be combined in cases where the experts might disagree on the causal structure for variables that appear in both models due to having different focus areas. We provide a new formal definition of compatibility of models in this setting and show how compatible models can be combined. We also consider the complexity of determining whether models are compatible. We believe that the notions defined in this work are of direct relevance to many practical decision making scenarios that come up in natural, social, and medical science settings.

[135]  arXiv:2005.10137 [pdf, ps, other]
Title: Some Common Mistakes in the Teaching and Textbooks of Modal Logic
Authors: Xuefeng Wen
Journal-ref: This is an English translation of the paper published in Studies in Logic (Chinese), 11(4A), 69-86, 2018
Subjects: Logic in Computer Science (cs.LO); Logic (math.LO)

We discuss four common mistakes in the teaching and textbooks of modal logic. The first one is missing the axiom $\Diamond\varphi\leftrightarrow\neg\Box\neg\varphi$, when choosing $\Diamond$ as the primitive modal operator, misunderstanding that $\Box$ and $\Diamond$ are symmetric. The second one is forgetting to make the set of formulas for filtration closed under subformulas, when proving the finite model property through filtration, neglecting that $\Box\varphi$ and $\Diamond\varphi$ may be abbreviations of formulas. The third one is giving wrong definitions of canonical relations in minimal canonical models that are unmatched with the primitive modal operators. The final one is misunderstanding the rule of necessitation, without knowing its distinction from the rule of modus ponens. To better understand the rule of necessitation, we summarize six ways of defining deductive consequence in modal logic: omitted definition, classical definition, ternary definition, reduced definition, bounded definition, and deflationary definition, and show that the last three definitions are equivalent to each other.

[136]  arXiv:2005.10141 [pdf, ps, other]
Title: Rational Consensus
Comments: Appears in Proceedings of the 35th Annual ACM Symposium on Principles of Distributed Computing, 2016
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computer Science and Game Theory (cs.GT)

We provide a game-theoretic analysis of consensus, assuming that processes are controlled by rational agents and may fail by crashing. We consider agents that \emph{care only about consensus}: that is, (a) an agent's utility depends only on the consensus value achieved (and not, for example, on the number of messages the agent sends) and (b) agents strictly prefer reaching consensus to not reaching consensus. We show that, under these assumptions, there is no \emph{ex post Nash Equilibrium}, even with only one failure. Roughly speaking, this means that there must always exist a \emph{failure pattern} (a description of who fails, when they fail, and which agents they do not send messages to in the round that they fail) and initial preferences for which an agent can gain by deviating. On the other hand, if we assume that there is a distribution $\pi$ on the failure patterns and initial preferences, then under minimal assumptions on $\pi$, there is a Nash equilibrium that tolerates $f$ failures (i.e., $\pi$ puts probability 1 on there being at most $f$ failures) if $f+1 < n$ (where $n$ is the total number of agents). Moreover, we show that a slight extension of the Nash equilibrium strategy is also a \emph{sequential} equilibrium (under the same assumptions about the distribution $\pi$).

[137]  arXiv:2005.10142 [pdf, other]
Title: Enabling RAN Slicing Through Carrier Aggregation in mmWave Cellular Networks
Comments: 8 pages, 8 figures. Proc. of the 18th Mediterranean Communication and Computer Networking Conference (MedComNet 2020), Arona, Italy, 2020
Subjects: Networking and Internet Architecture (cs.NI)

The ever increasing number of connected devices and of new and heterogeneous mobile use cases implies that 5G cellular systems will face demanding technical challenges. For example, Ultra-Reliable Low-Latency Communication (URLLC) and enhanced Mobile Broadband (eMBB) scenarios present orthogonal Quality of Service (QoS) requirements that 5G aims to satisfy with a unified Radio Access Network (RAN) design. Network slicing and mmWave communications have been identified as possible enablers for 5G. They provide, respectively, the necessary scalability and flexibility to adapt the network to each specific use case environment, and low latency and multi-gigabit-per-second wireless links, which tap into a vast, currently unused portion of the spectrum. The optimization and integration of these technologies is still an open research challenge, which requires innovations at different layers of the protocol stack. This paper proposes to combine them in a RAN slicing framework for mmWaves, based on carrier aggregation. Notably, we introduce MilliSlice, a cross-carrier scheduling policy that exploits the diversity of the carriers and maximizes their utilization, thus simultaneously guaranteeing high throughput for the eMBB slices and low latency and high reliability for the URLLC flows.

[138]  arXiv:2005.10148 [pdf, other]
Title: NR V2X Communications at Millimeter Waves: An End-to-End Performance Evaluation
Comments: 6 pages, 7 figures. Submitted to IEEE Globecom 2020
Subjects: Networking and Internet Architecture (cs.NI)

3GPP NR V2X represents the new 3GPP standard for next-generation vehicular systems which, among other innovations, supports vehicle-to-vehicle (V2V) operations in the millimeter wave (mmWave) spectrum to address the communication requirements of future intelligent automotive networks. While mmWaves will enable massive data rates and low latency, the propagation characteristics at very high frequencies become very challenging, thereby calling for accurate performance evaluations as a means to properly assess the performance of such systems. Along these lines, in this paper MilliCar, the new ns-3 module based on the latest NR V2X specifications, is used to provide an end-to-end performance evaluation of mmWave V2V networks. We investigate the impact of different propagation scenarios and system parameters, including the inter-vehicle distance, the adopted frame numerology, and the modulation and coding scheme, and provide guidelines towards the most promising V2V deployment configurations.

[139]  arXiv:2005.10149 [pdf, ps, other]
Title: Discriminative Dictionary Design for Action Classification in Still Images and Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In this paper, we address the problem of action recognition from still images and videos. Traditional local features such as SIFT, STIP etc. invariably pose two potential problems: 1) they are not evenly distributed in different entities of a given category and 2) many of such features are not exclusive of the visual concept the entities represent. In order to generate a dictionary taking the aforementioned issues into account, we propose a novel discriminative method for identifying robust and category specific local features which maximize the class separability to a greater extent. Specifically, we pose the selection of potent local descriptors as filtering based feature selection problem which ranks the local features per category based on a novel measure of distinctiveness. The underlying visual entities are subsequently represented based on the learned dictionary and this stage is followed by action classification using the random forest model followed by label propagation refinement. The framework is validated on the action recognition datasets based on still images (Stanford-40) as well as videos (UCF-50) and exhibits superior performances than the representative methods from the literature.

[140]  arXiv:2005.10150 [pdf, other]
Title: GCN-Based User Representation Learning for Unifying Robust Recommendation and Fraudster Detection
Subjects: Social and Information Networks (cs.SI); Information Retrieval (cs.IR)

In recent years, recommender system has become an indispensable function in all e-commerce platforms. The review rating data for a recommender system typically comes from open platforms, which may attract a group of malicious users to deliberately insert fake feedback in an attempt to bias the recommender system to their favour. The presence of such attacks may violate modeling assumptions that high-quality data is always available and these data truly reflect users' interests and preferences. Therefore, it is of great practical significance to construct a robust recommender system that is able to generate stable recommendations even in the presence of shilling attacks. In this paper, we propose GraphRfi - a GCN-based user representation learning framework to perform robust recommendation and fraudster detection in a unified way. In its end-to-end learning process, the probability of a user being identified as a fraudster in the fraudster detection component automatically determines the contribution of this user's rating data in the recommendation component; while the prediction error outputted in the recommendation component acts as an important feature in the fraudster detection component. Thus, these two components can mutually enhance each other. Extensive experiments have been conducted and the experimental results show the superiority of our GraphRfi in the two tasks - robust rating prediction and fraudster detection. Furthermore, the proposed GraphRfi is validated to be more robust to the various types of shilling attacks over the state-of-the-art recommender systems.

[141]  arXiv:2005.10157 [pdf, other]
Title: Generating Question Titles for Stack Overflow from Mined Code Snippets
Subjects: Software Engineering (cs.SE)

Stack Overflow has been heavily used by software developers as a popular way to seek programming-related information from peers via the internet. The Stack Overflow community recommends users to provide the related code snippet when they are creating a question to help others better understand it and offer their help. Previous studies have shown that} a significant number of these questions are of low-quality and not attractive to other potential experts in Stack Overflow. These poorly asked questions are less likely to receive useful answers and hinder the overall knowledge generation and sharing process. Considering one of the reasons for introducing low-quality questions in SO is that many developers may not be able to clarify and summarize the key problems behind their presented code snippets due to their lack of knowledge and terminology related to the problem, and/or their poor writing skills, in this study we propose an approach to assist developers in writing high-quality questions by automatically generating question titles for a code snippet using a deep sequence-to-sequence learning approach. Our approach is fully data-driven and uses an attention mechanism to perform better content selection, a copy mechanism to handle the rare-words problem and a coverage mechanism to eliminate word repetition problem. We evaluate our approach on Stack Overflow datasets over a variety of programming languages (e.g., Python, Java, Javascript, C# and SQL) and our experimental results show that our approach significantly outperforms several state-of-the-art baselines in both automatic and human evaluation. We have released our code and datasets to facilitate other researchers to verify their ideas and inspire the follow-up work.

[142]  arXiv:2005.10161 [pdf, other]
Title: User Attention and Behaviour in Virtual Reality Art Encounter
Subjects: Human-Computer Interaction (cs.HC); Multiagent Systems (cs.MA); Multimedia (cs.MM)

With the proliferation of consumer virtual reality (VR) headsets and creative tools, content creators have started to experiment with new forms of interactive audience experience using immersive media. Understanding user attention and behaviours in virtual environment can greatly inform creative processes in VR. We developed an abstract VR painting and an experimentation system to study audience encounters through eye gaze and movement tracking. The data from a user experiment with 35 participants reveal a range of user activity patterns in art exploration. Deep learning models are used to study the connections between behavioural data and audience background. New integrated methods to visualise user attention as part of the artwork are also developed as a feedback loop to the content creator.

[143]  arXiv:2005.10169 [pdf]
Title: An Empirical Study of User Support Tools in Open Source Software
Comments: IEEE 15th Internatinal Conference on Control and Automation (ICCA), Edinburgh, Scotland, pp, 964-968, July 2019
Subjects: Software Engineering (cs.SE)

End users positive response is essential for the success of any software. This is true for both commercial and Open Source Software (OSS). OSS is popular not only because of its availability, which is usually free but due to the user support it provides, generally through public platforms. The study model of this research establishes a relationship between OSS user support and available support tools. To conduct this research, we used a dataset of 100 OSS projects in different categories and examined five user support tools provided by different OSS projects. The results show that project trackers, user mailing lists, and updated versions have a significant role in gaining user support. However, we were unable to find a significant association between user support and documentation, as well as between user support and the troubleshooting guidelines provided by OSS projects.

[144]  arXiv:2005.10174 [pdf, other]
Title: Monte Carlo Estimators for the Schatten p-norm of Symmetric Positive Semidefinite Matrices
Comments: 21 pages, 10 figures, 1 table
Subjects: Numerical Analysis (math.NA)

We present numerical methods for computing the Schatten $p$-norm of positive semi-definite matrices. Our motivation stems from uncertainty quantification and optimal experimental design for inverse problems, where the Schatten $p$-norm defines a design criterion known as the P-optimal criterion. Computing the Schatten $p$-norm of high-dimensional matrices is computationally expensive. We propose a matrix-free method to estimate the Schatten $p$-norm using a Monte Carlo estimator and derive convergence results and error estimates for the estimator. To efficiently compute the Schatten $p$-norm for non-integer and large values of $p$, we use an estimator using a Chebyshev polynomial approximation and extend our convergence and error analysis to this setting as well. We demonstrate the performance of our proposed estimators on several test matrices and through an application to optimal experimental design of a model inverse problem.

[145]  arXiv:2005.10175 [pdf, other]
Title: Finite-sample Analysis of Greedy-GQ with Linear Function Approximation under Markovian Noise
Comments: UAI 2020
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Greedy-GQ is an off-policy two timescale algorithm for optimal control in reinforcement learning. This paper develops the first finite-sample analysis for the Greedy-GQ algorithm with linear function approximation under Markovian noise. Our finite-sample analysis provides theoretical justification for choosing stepsizes for this two timescale algorithm for faster convergence in practice, and suggests a trade-off between the convergence rate and the quality of the obtained policy. Our paper extends the finite-sample analyses of two timescale reinforcement learning algorithms from policy evaluation to optimal control, which is of more practical interest. Specifically, in contrast to existing finite-sample analyses for two timescale methods, e.g., GTD, GTD2 and TDC, where their objective functions are convex, the objective function of the Greedy-GQ algorithm is non-convex. Moreover, the Greedy-GQ algorithm is also not a linear two-timescale stochastic approximation algorithm. Our techniques in this paper provide a general framework for finite-sample analysis of non-convex value-based reinforcement learning algorithms for optimal control.

[146]  arXiv:2005.10176 [pdf, ps, other]
Title: Representation of Developer Expertise in Open Source Software
Subjects: Software Engineering (cs.SE); Machine Learning (cs.LG)

With tens of millions of projects and developers, the OSS ecosystem is both vibrant and intimidating. On one hand, it hosts the source code for the most critical infrastructures and has the most brilliant developers as contributors, while on the other hand, poor quality or even malicious software, and novice developers abound. External contributions are critical to OSS projects, but the chances their contributions are accepted or even considered depend on the trust between maintainers and contributors. Such trust is built over repeated interactions and coding platforms provide signals of project or developer quality via measures of activity (commits), and social relationships (followers/stars) to facilitate trust. These signals, however, do not represent the specific expertise of a developer. We, therefore, aim to address this gap by defining the skill space for APIs, developers, and projects that reflects what developers know (and projects need) more precisely than could be obtained via aggregate activity counts, and more generally than pointing to individual files developers have changed in the past. Specifically, we use the World of Code infrastructure to extract the complete set of APIs in the files changed by all open source developers. We use that data to represent APIs, developers, and projects in the skill space, and evaluate if the alignment measures in the skill space can predict whether or not the developers use new APIs, join new projects, or get their pull requests accepted. We also check if the developers' representation in the skill space aligns with their self-reported expertise. Our results suggest that the proposed embedding in the skill space achieves our aims and may serve not only as a signal to increase trust (and efficiency) of open source ecosystems, but may also allow more detailed investigations of other phenomena related to developer proficiency and learning.

[147]  arXiv:2005.10180 [pdf, other]
Title: Combining Experts' Causal Judgments
Comments: A preliminary version of the paper appeared in \emph{Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)}, 2018}
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Consider a policymaker who wants to decide which intervention to perform in order to change a currently undesirable situation. The policymaker has at her disposal a team of experts, each with their own understanding of the causal dependencies between different factors contributing to the outcome. The policymaker has varying degrees of confidence in the experts' opinions. She wants to combine their opinions in order to decide on the most effective intervention. We formally define the notion of an effective intervention, and then consider how experts' causal judgments can be combined in order to determine the most effective intervention. We define a notion of two causal models being \emph{compatible}, and show how compatible causal models can be merged. We then use it as the basis for combining experts' causal judgments. We also provide a definition of decomposition for causal models to cater for cases when models are incompatible. We illustrate our approach on a number of real-life examples.

[148]  arXiv:2005.10182 [pdf, other]
Title: The Iteration Number of Colour Refinement
Comments: 22 pages, 3 figures, full version of a paper accepted at ICALP 2020
Subjects: Discrete Mathematics (cs.DM); Computational Complexity (cs.CC); Logic in Computer Science (cs.LO); Combinatorics (math.CO)

The Colour Refinement procedure and its generalisation to higher dimensions, the Weisfeiler-Leman algorithm, are central subroutines in approaches to the graph isomorphism problem. In an iterative fashion, Colour Refinement computes a colouring of the vertices of its input graph.
A trivial upper bound on the iteration number of Colour Refinement on graphs of order n is n-1. We show that this bound is tight. More precisely, we prove via explicit constructions that there are infinitely many graphs G on which Colour Refinement takes |G|-1 iterations to stabilise. Modifying the infinite families that we present, we show that for every natural number n >= 10, there are graphs on n vertices on which Colour Refinement requires at least n-2 iterations to reach stabilisation.

[149]  arXiv:2005.10187 [pdf, ps, other]
Title: Do we need a Contact Tracing App?
Subjects: Computers and Society (cs.CY); Networking and Internet Architecture (cs.NI)

The goal of this paper is to shed some light on the usefulness of a contact tracing smartphone app for the containment of the COVID-19 pandemic. We review the basics of contact tracing during the spread of a virus, we contextualize the numbers to the case of COVID-19 and we analyse the state of the art for proximity detection using Bluetooth Low Energy. Our contribution is to assess if there are scientific evidences that suggest a contact tracing app, using present technologies, could be effective to slow down the spread of the virus. Our conclusion is that such evidences are lacking, and we should re-think the introduction of such a privacy-invasive measure.

[150]  arXiv:2005.10190 [pdf, other]
Title: Feature Purification: How Adversarial Training Performs Robust Deep Learning
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC); Machine Learning (stat.ML)

Despite the great empirical success of adversarial training to defend deep learning models against adversarial perturbations, so far, it still remains rather unclear what the principles are behind the existence of adversarial perturbations, and what adversarial training does to the neural network to remove them.
In this paper, we present a principle that we call "feature purification", where we show the existence of adversarial examples are due to the accumulation of certain "dense mixtures" in the hidden weights during the training process of a neural network; and more importantly, one of the goals of adversarial training is to remove such mixtures to "purify" hidden weights. We present both experiments on the CIFAR-10 dataset to illustrate this principle, and a Theoretical Result proving that for certain natural classification tasks, training a two-layer neural network with ReLU activation using randomly initialized gradient descent indeed satisfies this principle.
Technically, we give, to the best of our knowledge, the first result proving that the following two can hold simultaneously for training a neural network with ReLU activation. (1) Training over the original data is indeed non-robust to small adversarial perturbations of some radius. (2) Adversarial training, even with an empirical perturbation algorithm such as FGM, can in fact be provably robust against ANY perturbations of the same radius. Finally, we also prove a complexity lower bound, showing that low complexity models such as linear classifiers, low-degree polynomials, or even the neural tangent kernel for this network, CANNOT defend against perturbations of this same radius, no matter what algorithms are used to train them.

[151]  arXiv:2005.10191 [pdf, other]
Title: A Clarified Typology of Core-Periphery Structure in Networks
Comments: 21 pages, 6 figures, 1 table
Subjects: Social and Information Networks (cs.SI); Computers and Society (cs.CY); Physics and Society (physics.soc-ph)

Core-periphery structure, the arrangement of a network into a dense core and sparse periphery, is a versatile descriptor of various social, biological, and technological networks. In practice, different core-periphery algorithms are often applied interchangeably, despite the fact that they can yield inconsistent descriptions of core-periphery structure. For example, two of the most widely used algorithms, the k-cores decomposition and the classic two-block model of Borgatti and Everett, extract fundamentally different structures: the former partitions a network into a binary hub-and-spoke layout, while the latter divides it into a layered hierarchy. We introduce a core-periphery typology to clarify these differences, along with Bayesian stochastic block modeling techniques to classify networks in accordance with this typology. Empirically, we find a rich diversity of core-periphery structure among networks. Through a detailed case study, we demonstrate the importance of acknowledging this diversity and situating networks within the core-periphery typology when conducting domain-specific analyses.

[152]  arXiv:2005.10192 [pdf, other]
Title: A simple extrapolated predictor for overcoming the starting and tracking issues in the arc-length method for nonlinear structural mechanics
Comments: 19 pages
Subjects: Computational Engineering, Finance, and Science (cs.CE); Numerical Analysis (math.NA)

This paper presents a simplified implementation of the arc-length method for computing the equilibrium paths of nonlinear structural mechanics problems. In the proposed technique, the predictor is computed by extrapolating the solutions from two previously converged load steps. The extrapolation is a linear combination of the previous solutions; therefore, it is simple and inexpensive. Additionally, the proposed extrapolated predictor also serves as a means for identifying the forward movement along the equilibrium path without the need for any sophisticated techniques commonly employed for explicit tracking. The ability of the proposed technique to successfully compute complex equilibrium paths in static structural mechanics problems is demonstrated using five numerical examples involving truss and beam models.

[153]  arXiv:2005.10195 [pdf, other]
Title: Learning natural locomotion behaviors for humanoid robots using human knowledge
Subjects: Robotics (cs.RO)

This paper presents a new learning framework that leverages the knowledge from imitation learning, deep reinforcement learning, and control theories to achieve human-style locomotion that is natural, dynamic, and robust for humanoids. We proposed novel approaches to introduce human bias, i.e. motion capture data and a special Multi-Expert network structure. We used the Multi-Expert network structure to smoothly blend behavioral features, and used the augmented reward design for the task and imitation rewards. Our reward design is composable, tunable, and explainable by using fundamental concepts from conventional humanoid control. We rigorously validated and benchmarked the learning framework which consistently produced robust locomotion behaviors in various test scenarios. Further, we demonstrated the capability of learning robust and versatile policies in the presence of disturbances, such as terrain irregularities and external pushes.

[154]  arXiv:2005.10199 [pdf, ps, other]
Title: Localization & Mitigation of Cascading Failures in Power Systems, Part I: Spectral Representation & Tree Partition
Subjects: Systems and Control (eess.SY)

Cascading failures in power systems propagate non-locally, making the control of outages extremely difficult. In this work, we propose a new framework that offers strong analytical guarantees on both the localization and mitigation of cascading failures in power systems. The key component of this framework leverages the concept of tree partition, which characterizes regions of a power network inside which line failures are automatically localized. In Part I of this paper we establish a mathematical theory that underlies all the performance guarantees of tree partition, as well as its failure localization properties. This theory consists of a set of tools developed using the Laplacian matrix of the transmission network and reveals a novel perspective that precisely captures the Kirchhoff's Law in terms of topological structures. Our results show that the distribution of different families of subtrees of the transmission network plays a critical role on the patterns of power redistribution, and motivates tree partitioning of the network as a strategy to eliminate long-distance propagation of disturbances. These results are used in Parts II and III of this paper to design strategies to localize and mitigate line failures.

[155]  arXiv:2005.10200 [pdf, other]
Title: BERTweet: A pre-trained language model for English Tweets
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

We present BERTweet, the first public large-scale pre-trained language model for English Tweets. Our BERTweet is trained using the RoBERTa pre-training procedure (Liu et al., 2019), with the same model configuration as BERT-base (Devlin et al., 2019). Experiments show that BERTweet outperforms strong baselines RoBERTa-base and XLM-R-base (Conneau et al., 2020), producing better performance results than the previous state-of-the-art models on three Tweet NLP tasks: Part-of-speech tagging, Named-entity recognition and text classification. We release BERTweet to facilitate future research and downstream applications on Tweet data. Our BERTweet is available at: https://github.com/VinAIResearch/BERTweet

[156]  arXiv:2005.10203 [pdf, other]
Title: Graph Structure Learning for Robust Graph Neural Networks
Comments: Accepted by KDD 2020
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Social and Information Networks (cs.SI); Machine Learning (stat.ML)

Graph Neural Networks (GNNs) are powerful tools in representation learning for graphs. However, recent studies show that GNNs are vulnerable to carefully-crafted perturbations, called adversarial attacks. Adversarial attacks can easily fool GNNs in making predictions for downstream tasks. The vulnerability to adversarial attacks has raised increasing concerns for applying GNNs in safety-critical applications. Therefore, developing robust algorithms to defend adversarial attacks is of great significance. A natural idea to defend adversarial attacks is to clean the perturbed graph. It is evident that real-world graphs share some intrinsic properties. For example, many real-world graphs are low-rank and sparse, and the features of two adjacent nodes tend to be similar. In fact, we find that adversarial attacks are likely to violate these graph properties. Therefore, in this paper, we explore these properties to defend adversarial attacks on graphs. In particular, we propose a general framework Pro-GNN, which can jointly learn a structural graph and a robust graph neural network model from the perturbed graph guided by these properties. Extensive experiments on real-world graphs demonstrate that the proposed framework achieves significantly better performance compared with the state-of-the-art defense methods, even when the graph is heavily perturbed. We release the implementation of Pro-GNN to our DeepRobust repository for adversarial attacks and defenses (footnote: https://github.com/DSE-MSU/DeepRobust). The specific experimental settings to reproduce our results can be found in https://github.com/ChandlerBang/Pro-GNN.

[157]  arXiv:2005.10206 [pdf, other]
Title: Numerical simulations for full history recursive multilevel Picard approximations for systems of high-dimensional partial differential equations
Comments: 21 pages
Subjects: Numerical Analysis (math.NA)

One of the most challenging issues in applied mathematics is to develop and analyze algorithms which are able to approximately compute solutions of high-dimensional nonlinear partial differential equations (PDEs). In particular, it is very hard to develop approximation algorithms which do not suffer under the curse of dimensionality in the sense that the number of computational operations needed by the algorithm to compute an approximation of accuracy $\epsilon > 0$ grows at most polynomially in both the reciprocal $1/\epsilon$ of the required accuracy and the dimension $d \in \mathbb{N}$ of the PDE. Recently, a new approximation method, the so-called full history recursive multilevel Picard (MLP) approximation method, has been introduced and, until today, this approximation scheme is the only approximation method in the scientific literature which has been proven to overcome the curse of dimensionality in the numerical approximation of semilinear PDEs with general time horizons. It is a key contribution of this article to extend the MLP approximation method to systems of semilinear PDEs and to numerically test it on several example PDEs. More specifically, we apply the proposed MLP approximation method in the case of Allen-Cahn PDEs, Sine-Gordon-type PDEs, systems of coupled semilinear heat PDEs, and semilinear Black-Scholes PDEs in up to 1000 dimensions. The presented numerical simulation results suggest in the case of each of these example PDEs that the proposed MLP approximation method produces very accurate results in short runtimes and, in particular, the presented numerical simulation results indicate that the proposed MLP approximation scheme significantly outperforms certain deep learning based approximation methods for high-dimensional semilinear PDEs.

[158]  arXiv:2005.10211 [pdf, other]
Title: Anomaly Detection in Video Games
Comments: 4 pages, 3 figures, submitted to IEEE CONFERENCE ON GAMES (COG), Dataset this https URL , Code and pre-trained models this https URL
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

With the aim of designing automated tools that assist in the video game quality assurance process, we frame the problem of identifying bugs in video games as an anomaly detection (AD) problem. We develop State-State Siamese Networks (S3N) as an efficient deep metric learning approach to AD in this context and explore how it may be used as part of an automated testing tool. Finally, we show by empirical evaluation on a series of Atari games, that S3N is able to learn a meaningful embedding, and consequently is able to identify various common types of video game bugs.

[159]  arXiv:2005.10213 [pdf, other]
Title: Applying the Transformer to Character-level Transduction
Subjects: Computation and Language (cs.CL)

The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks. The model offers other benefits as well: It trains faster and has fewer parameters. Yet for character-level transduction tasks, e.g. morphological inflection generation and historical text normalization, few shows success on outperforming recurrent models with the transformer. In an empirical study, we uncover that, in contrast to recurrent sequence-to-sequence models, the batch size plays a crucial role in the performance of the transformer on character-level tasks, and we show that with a large enough batch size, the transformer does indeed outperform recurrent models. We also introduce a simple technique to handle feature-guided character-level transduction that further improves performance. With these insights, we achieve state-of-the-art performance on morphological inflection and historical text normalization. We also show that the transformer outperforms a strong baseline on two other character-level transduction tasks: grapheme-to-phoneme conversion and transliteration. Code is available at https://github.com/shijie-wu/neural-transducer.

[160]  arXiv:2005.10215 [pdf, ps, other]
Title: Unveiling the Importance of SIC in NOMA Systems: Part I -- State of the Art and Recent Findings
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

The key idea of non-orthogonal multiple access (NOMA) is to serve multiple users simultaneously at the same time and frequency, which can result in excessive multiple-access interference. As a crucial component of NOMA systems, successive interference cancelation (SIC) is key to combating this multiple-access interference, and is focused on in this letter, where an overview of SIC decoding order selection schemes is provided. In particular, selecting the SIC decoding order based on the users' channel state information (CSI) and the users' quality of service (QoS), respectively, is discussed. The limitations of these two approaches are illustrated, and then a recently proposed scheme, termed hybrid SIC, which dynamically adapts the SIC decoding order is presented and shown to achieve a surprising performance improvement that cannot be realized by the conventional SIC decoding order selection schemes individually.

[161]  arXiv:2005.10217 [pdf, ps, other]
Title: Unveiling the Importance of SIC in NOMA Systems: Part II: New Results and Future Directions
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

In most existing works on non-orthogonal multiple access (NOMA), the decoding order of successive interference cancellation (SIC) is prefixed and based on either the users' channel conditions or their quality of service (QoS) requirements. A recent work on NOMA assisted semi-grant-free transmission showed that the use of a more sophisticated hybrid SIC scheme can yield significant performance improvements. This letter illustrates how the concept of hybrid SIC can be generalized and applied to different NOMA applications. We first use NOMA assisted mobile edge computing (MEC) as an example to illustrate the benefits of hybrid SIC, where new results for delay and energy minimization are presented. Then, future directions for generalizing hybrid SIC with adaptive decoding order selection as well as its promising applications are discussed.

[162]  arXiv:2005.10219 [pdf, other]
Title: BlaBla: Linguistic Feature Extraction for Clinical Analysis in Multiple Languages
Comments: 5 pages. 1 figure. Under review
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

We introduce BlaBla, an open-source Python library for extracting linguistic features with proven clinical relevance to neurological and psychiatric diseases across many languages. BlaBla is a unifying framework for accelerating and simplifying clinical linguistic research. The library is built on state-of-the-art NLP frameworks and supports multithreaded/GPU-enabled feature extraction via both native Python calls and a command line interface. We describe BlaBla's architecture and clinical validation of its features across 12 diseases. We further demonstrate the application of BlaBla to a task visualizing and classifying language disorders in three languages on real clinical data from the AphasiaBank dataset. We make the codebase freely available to researchers with the hope of providing a consistent, well-validated foundation for the next generation of clinical linguistic research.

[163]  arXiv:2005.10220 [pdf, other]
Title: Reducing Overlearning through Disentangled Representations by Suppressing Unknown Tasks
Comments: Added appendix with additional results
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Existing deep learning approaches for learning visual features tend to overlearn and extract more information than what is required for the task at hand. From a privacy preservation perspective, the input visual information is not protected from the model; enabling the model to become more intelligent than it is trained to be. Current approaches for suppressing additional task learning assume the presence of ground truth labels for the tasks to be suppressed during training time. In this research, we propose a three-fold novel contribution: (i) a model-agnostic solution for reducing model overlearning by suppressing all the unknown tasks, (ii) a novel metric to measure the trust score of a trained deep learning model, and (iii) a simulated benchmark dataset, PreserveTask, having five different fundamental image classification tasks to study the generalization nature of models. In the first set of experiments, we learn disentangled representations and suppress overlearning of five popular deep learning models: VGG16, VGG19, Inception-v1, MobileNet, and DenseNet on PreserverTask dataset. Additionally, we show results of our framework on color-MNIST dataset and practical applications of face attribute preservation in Diversity in Faces (DiF) and IMDB-Wiki dataset.

[164]  arXiv:2005.10222 [pdf, other]
Title: Compute-Bound and Low-Bandwidth Distributed 3D Graph-SLAM
Subjects: Computer Vision and Pattern Recognition (cs.CV)

This article describes a new approach for distributed 3D SLAM map building. The key contribution of this article is the creation of a distributed graph-SLAM map-building architecture responsive to bandwidth and computational needs of the robotic platform. Responsiveness is afforded by the integration of a 3D point cloud to plane cloud compression algorithm that approximates dense 3D point cloud using local planar patches. Compute bound platforms may restrict the computational duration of the compression algorithm and low-bandwidth platforms can restrict the size of the compression result. The backbone of the approach is an ultra-fast adaptive 3D compression algorithm that transforms swaths of 3D planar surface data into planar patches attributed with image textures. Our approach uses DVO SLAM, a leading algorithm for 3D mapping, and extends it by computationally isolating map integration tasks from local Guidance, Navigation, and Control tasks and includes an addition of a network protocol to share the compressed plane clouds. The joint effect of these contributions allows agents with 3D sensing capabilities to calculate and communicate compressed map information commensurate with their onboard computational resources and communication channel capacities. This opens SLAM mapping to new categories of robotic platforms that may have computational and memory limits that prohibit other SLAM solutions.

[165]  arXiv:2005.10224 [pdf, other]
Title: The Random Feature Model for Input-Output Maps between Banach Spaces
Comments: 29 pages, 9 figures
Subjects: Numerical Analysis (math.NA); Machine Learning (cs.LG); Computational Physics (physics.comp-ph); Machine Learning (stat.ML)

Well known to the machine learning community, the random feature model, originally introduced by Rahimi and Recht in 2008, is a parametric approximation to kernel interpolation or regression methods. It is typically used to approximate functions mapping a finite-dimensional input space to the real line. In this paper, we instead propose a methodology for use of the random feature model as a data-driven surrogate for operators that map an input Banach space to an output Banach space. Although the methodology is quite general, we consider operators defined by partial differential equations (PDEs); here, the inputs and outputs are themselves functions, with the input parameters being functions required to specify the problem, such as initial data or coefficients, and the outputs being solutions of the problem. Upon discretization, the model inherits several desirable attributes from this infinite-dimensional, function space viewpoint, including mesh-invariant approximation error with respect to the true PDE solution map and the capability to be trained at one mesh resolution and then deployed at different mesh resolutions. We view the random feature model as a non-intrusive data-driven emulator, provide a mathematical framework for its interpretation, and demonstrate its ability to efficiently and accurately approximate the nonlinear parameter-to-solution maps of two prototypical PDEs arising in physical science and engineering applications: viscous Burgers' equation and a variable coefficient elliptic equation.

[166]  arXiv:2005.10228 [pdf, other]
Title: Sparsity-based audio declipping methods: overview, new algorithms, and large-scale evaluation
Authors: Clément Gaultier (PANAMA), Srđan Kitić, Rémi Gribonval (PANAMA, DANTE), Nancy Bertin (PANAMA)
Comments: arXiv admin note: text overlap with arXiv:1711.11259
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)

Recent advances in audio declipping have substantially improved the state of the art in certain saturation regimes. Yet, practitioners need guidelines to choose a method, and while existing benchmarks have been instrumental in advancing the field, larger-scale experiments are needed to guide such choices. First, we show that the saturation levels in existing small-scale benchmarks are moderate and call for benchmarks with more perceptually significant saturation levels. We then propose a general algorithmic framework for declipping that covers existing and new combinations of flavors of state-of-the-art techniques exploiting time-frequency sparsity: synthesis vs analysis sparsity, with plain or structured sparsity. Finally, we systematically compare these combinations and state-of-the-art methods. Using a large-scale numerical benchmarks and a smaller scale formal listening test, we provide guidelines for various saturation levels, both for speech and various musical "genres" from the RWC database. The code is made publicly available for the purpose of reproducible research and benchmarking.

[167]  arXiv:2005.10229 [pdf, other]
Title: Intra- and Inter-Action Understanding via Temporal Action Parsing
Comments: CVPR 2020 Poster; Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Current methods for action recognition primarily rely on deep convolutional networks to derive feature embeddings of visual and motion features. While these methods have demonstrated remarkable performance on standard benchmarks, we are still in need of a better understanding as to how the videos, in particular their internal structures, relate to high-level semantics, which may lead to benefits in multiple aspects, e.g. interpretable predictions and even new methods that can take the recognition performances to a next level. Towards this goal, we construct TAPOS, a new dataset developed on sport videos with manual annotations of sub-actions, and conduct a study on temporal action parsing on top. Our study shows that a sport activity usually consists of multiple sub-actions and that the awareness of such temporal structures is beneficial to action recognition. We also investigate a number of temporal parsing methods, and thereon devise an improved method that is capable of mining sub-actions from training data without knowing the labels of them. On the constructed TAPOS, the proposed method is shown to reveal intra-action information, i.e. how action instances are made of sub-actions, and inter-action information, i.e. one specific sub-action may commonly appear in various actions.

[168]  arXiv:2005.10232 [pdf, other]
Title: Sentence level estimation of psycholinguistic norms using joint multidimensional annotations
Comments: 5 pages, 4 figures, submitted to Interspeech 2020
Subjects: Computation and Language (cs.CL)

Psycholinguistic normatives represent various affective and mental constructs using numeric scores and are used in a variety of applications in natural language processing. They are commonly used at the sentence level, the scores of which are estimated by extrapolating word level scores using simple aggregation strategies, which may not always be optimal. In this work, we present a novel approach to estimate the psycholinguistic norms at sentence level. We apply a multidimensional annotation fusion model on annotations at the word level to estimate a parameter which captures relationships between different norms. We then use this parameter at sentence level to estimate the norms. We evaluate our approach by predicting sentence level scores for various normative dimensions and compare with standard word aggregation schemes.

[169]  arXiv:2005.10242 [pdf, other]
Title: Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Contrastive representation learning has been outstandingly successful in practice. In this work, we identify two key properties related to the contrastive loss: (1) alignment (closeness) of features from positive pairs, and (2) uniformity of the induced distribution of the (normalized) features on the hypersphere. We prove that, asymptotically, the contrastive loss optimizes these properties, and analyze their positive effects on downstream tasks. Empirically, we introduce an optimizable metric to quantify each property. Extensive experiments on standard vision and language datasets confirm the strong agreement between both metrics and downstream task performance. Remarkably, directly optimizing for these two metrics leads to representations with comparable or better performance at downstream tasks than contrastive learning.
Project Page: https://ssnl.github.io/hypersphere
Code: https://github.com/SsnL/align_uniform

[170]  arXiv:2005.10243 [pdf, other]
Title: What makes for good views for contrastive learning
Comments: submitted to ECCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Contrastive learning between multiple views of the data has recently achieved state of the art performance in the field of self-supervised representation learning. Despite its success, the influence of different view choices has been less studied. In this paper, we use empirical analysis to better understand the importance of view selection, and argue that we should reduce the mutual information (MI) between views while keeping task-relevant information intact. To verify this hypothesis, we devise unsupervised and semi-supervised frameworks that learn effective views by aiming to reduce their MI. We also consider data augmentation as a way to reduce MI, and show that increasing data augmentation indeed leads to decreasing MI and improves downstream classification accuracy. As a by-product, we also achieve a new state-of-the-art accuracy on unsupervised pre-training for ImageNet classification ($73\%$ top-1 linear readoff with a ResNet-50). In addition, transferring our models to PASCAL VOC object detection and COCO instance segmentation consistently outperforms supervised pre-training. Code:this http URL

Cross-lists for Thu, 21 May 20

[171]  arXiv:1706.00342 (cross-list from math.OC) [pdf, ps, other]
Title: On the stable recovery of deep structured linear networks under sparsity constraints
Authors: Francois Malgouyres (IMT)
Journal-ref: Mathematical and Scientific Machine Learning, Jul 2020, Princeton, United States
Subjects: Optimization and Control (math.OC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Statistics Theory (math.ST)

We consider a deep structured linear network under sparsity constraints. We study sharp conditions guaranteeing the stability of the optimal parameters defining the network. More precisely, we provide sharp conditions on the network architecture and the sample under which the error on the parameters defining the network scales linearly with the reconstruction error (i.e. the risk). Therefore, under these conditions, the weights obtained with a successful algorithms are well defined and only depend on the architecture of the network and the sample. The features in the latent spaces are stably defined. The stability property is required in order to interpret the features defined in the latent spaces. It can also lead to a guarantee on the statistical risk. This is what motivates this study. The analysis is based on the recently proposed Tensorial Lifting. The particularity of this paper is to consider a sparsity prior. This leads to a better stability constant. As an illustration, we detail the analysis and provide sharp stability guarantees for convolutional linear network under sparsity prior. In this analysis, we distinguish the role of the network architecture and the sample input. This highlights the requirements on the data in connection to parameter stability.

[172]  arXiv:2003.01113 (cross-list from eess.IV) [pdf, other]
Title: Warwick Electron Microscopy Datasets
Authors: Jeffrey M. Ede
Comments: 16 pages, 4 figures, 2 tables
Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG)

Large, carefully partitioned datasets are essential to train neural networks and standardize performance benchmarks. As a result, we have set up new repositories to make our electron microscopy datasets available to the wider community. There are three main datasets containing 19769 scanning transmission electron micrographs, 17266 transmission electron micrographs, and 98340 simulated exit wavefunctions, and multiple variants of each dataset for different applications. To visualize image datasets, we have trained variational autoencoders to parameterize 64-dimensional multivariate normal distributions, which we cluster in two dimensions by t-distributed stochastic neighbor embedding. In addition, we have improved dataset visualization with variational autoencoders by introducing encoding normalization and regularization, adding an image gradient loss, and extending t-distributed stochastic neighbor embedding to account for encoded standard deviations. Our datasets, source code, pretrained models, and interactive visualization tools are openly available at https://github.com/Jeffrey-Ede/datasets.

[173]  arXiv:2005.08052 (cross-list from q-bio.PE) [pdf, other]
Title: Controlling epidemic diseases based only on social distancing level
Comments: 11 pages, 3 figures
Subjects: Populations and Evolution (q-bio.PE); Systems and Control (eess.SY); Physics and Society (physics.soc-ph)

The World Health Organization (WHO) made the assessment that COVID-19 can be characterized as a pandemic on March 11, 2020. To the COVID-19 outbreak, there is no vaccination and no treatment. The only way to control the COVID-19 outbreak is sustained physical distancing. In this work, a simple control law was proposed to keep the infected individuals during the COVID-19 outbreak below the desired number. The proposed control law keeps the value of infected individuals controlled only adjusting the social distancing level. The stability analysis of the proposed control law is done and the uncertainties in the parameters were considered. A version of the proposed controller to daily update was developed. This is a very simple approach to the developed control law and can be calculated in a spreadsheet. In the end, numerical simulations were done to show the behavior of the number of infected individuals during an epidemic disease when the proposed control law is used to adjust the social distancing level.

[174]  arXiv:2005.09638 (cross-list from physics.comp-ph) [pdf, other]
Title: Physics-informed Neural Networks for Solving Inverse Problems of Nonlinear Biot's Equations: Batch Training
Comments: arXiv admin note: text overlap with arXiv:2002.08235
Subjects: Computational Physics (physics.comp-ph); Machine Learning (cs.LG); Numerical Analysis (math.NA)

In biomedical engineering, earthquake prediction, and underground energy harvesting, it is crucial to indirectly estimate the physical properties of porous media since the direct measurement of those are usually impractical/prohibitive. Here we apply the physics-informed neural networks to solve the inverse problem with regard to the nonlinear Biot's equations. Specifically, we consider batch training and explore the effect of different batch sizes. The results show that training with small batch sizes, i.e., a few examples per batch, provides better approximations (lower percentage error) of the physical parameters than using large batches or the full batch. The increased accuracy of the physical parameters, comes at the cost of longer training time. Specifically, we find the size should not be too small since a very small batch size requires a very long training time without a corresponding improvement in estimation accuracy. We find that a batch size of 8 or 32 is a good compromise, which is also robust to additive noise in the data. The learning rate also plays an important role and should be used as a hyperparameter.

[175]  arXiv:2005.09669 (cross-list from math.ST) [pdf, other]
Title: Exponential ergodicity of mirror-Langevin diffusions
Comments: 27 pages, 10 figures
Subjects: Statistics Theory (math.ST); Machine Learning (cs.LG)

Motivated by the problem of sampling from ill-conditioned log-concave distributions, we give a clean non-asymptotic convergence analysis of mirror-Langevin diffusions as introduced in Zhang et al. (2020). As a special case of this framework, we propose a class of diffusions called Newton-Langevin diffusions and prove that they converge to stationarity exponentially fast with a rate which not only is dimension-free, but also has no dependence on the target distribution. We give an application of this result to the problem of sampling from the uniform distribution on a convex body using a strategy inspired by interior-point methods. Our general approach follows the recent trend of linking sampling and optimization, and in particular, it yields new results on the convergence of the vanilla Langevin diffusion in Wasserstein distance.

[176]  arXiv:2005.09684 (cross-list from eess.AS) [pdf, other]
Title: Exploring Transformers for Large-Scale Speech Recognition
Comments: 5 pages, 1 figure
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

While recurrent neural networks still largely define state-of-the-art speech recognition systems, the Transformer network has been proven to be a competitive alternative, especially in the offline condition. Most studies with Transformers have been constrained in a relatively small scale setting, and some forms of data argumentation approaches are usually applied to combat the data sparsity issue. In this paper, we aim at understanding the behaviors of Transformers in the large-scale speech recognition setting, where we have used around 65,000 hours of training data. We investigated various aspects on scaling up Transformers, including model initialization, warmup training as well as different Layer Normalization strategies. In the streaming condition, we compared the widely used attention mask based future context lookahead approach to the Transformer-XL network. From our experiments, we show that Transformers can achieve around 6% relative word error rate (WER) reduction compared to the BLSTM baseline in the offline fashion, while in the streaming fashion, Transformer-XL is comparable to LC-BLSTM with 800 millisecond latency constraint.

[177]  arXiv:2005.09687 (cross-list from q-bio.NC) [pdf, other]
Title: Deep learning approaches for neural decoding: from CNNs to LSTMs and spikes to fMRI
Comments: 22 pages, 3 figures
Subjects: Neurons and Cognition (q-bio.NC); Machine Learning (cs.LG)

Decoding behavior, perception, or cognitive state directly from neural signals has applications in brain-computer interface research as well as implications for systems neuroscience. In the last decade, deep learning has become the state-of-the-art method in many machine learning tasks ranging from speech recognition to image segmentation. The success of deep networks in other domains has led to a new wave of applications in neuroscience. In this article, we review deep learning approaches to neural decoding. We describe the architectures used for extracting useful features from neural recording modalities ranging from spikes to EEG. Furthermore, we explore how deep learning has been leveraged to predict common outputs including movement, speech, and vision, with a focus on how pretrained deep networks can be incorporated as priors for complex decoding targets like acoustic speech or images. Deep learning has been shown to be a useful tool for improving the accuracy and flexibility of neural decoding across a wide range of tasks, and we point out areas for future scientific development.

[178]  arXiv:2005.09751 (cross-list from physics.soc-ph) [pdf, ps, other]
Title: Phase transitions and social distancing control measures for SARS-CoV-2 on small world networks
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI)

We investigate the efficacy of three social distancing controls on the spread of SARS-CoV-2 using an agent based SIR model on a small world network structure: 1) Global social distancing with a fixed probability of adherence. 2) Individually initiated social isolation when a threshold number of contacts are infected. 3) Use of personal protective equipment (PPE) to reduce viral shedding and resultant infectivity. The primary driver of total number of infections is the viral shedding rate, with probability of social distancing being the next critical factor. These results suggest that higher compliance with PPE usage and personal hygiene has the potential to decrease the number of infections and shorten epidemic duration. Individually initiated social isolation was effective when initiated in response to a single infected contact. The combination of social isolation and PPE resulted in very low levels of infection. Our model suggests that widespread application of social distancing through government control can drastically reduce viral spread; even in the absence of widely adopted social distancing protocols, high use of PPE can also dramatically reduce the viral spread while short-duration quarantine following exposure to an infected individual was less effective.

[179]  arXiv:2005.09756 (cross-list from eess.AS) [pdf, other]
Title: Improving Proper Noun Recognition in End-to-End ASR By Customization of the MWER Loss Criterion
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)

Proper nouns present a challenge for end-to-end (E2E) automatic speech recognition (ASR) systems in that a particular name may appear only rarely during training, and may have a pronunciation similar to that of a more common word. Unlike conventional ASR models, E2E systems lack an explicit pronounciation model that can be specifically trained with proper noun pronounciations and a language model that can be trained on a large text-only corpus. Past work has addressed this issue by incorporating additional training data or additional models. In this paper, we instead build on recent advances in minimum word error rate (MWER) training to develop two new loss criteria that specifically emphasize proper noun recognition. Unlike past work on this problem, this method requires no new data during training or external models during inference. We see improvements ranging from 2% to 7% relative on several relevant benchmarks.

[180]  arXiv:2005.09762 (cross-list from eess.SP) [pdf, other]
Title: Digraph Signal Processing with Generalized Boundary Conditions
Comments: 13 pages, 22 figures
Subjects: Signal Processing (eess.SP); Discrete Mathematics (cs.DM)

Signal processing on directed graphs (digraphs) is problematic, since the graph shift, and thus associated filters, are in general not diagonalizable. Furthermore, the Fourier transform in this case is now obtained from the Jordan decomposition, which may not be computable at all for large graphs. We propose a novel and general solution for this problem based on matrix perturbation theory: We design an algorithm that adds a small number of edges to a given digraph to destroy nontrivial Jordan blocks. The obtained digraph is then diagonalizable and yields, as we show, an approximate eigenbasis and Fourier transform for the original digraph. We explain why and how this construction can be viewed as generalized form of boundary conditions, a common practice in signal processing. Our experiments with random and real world graphs show that we can scale to graphs with a few thousands nodes, and obtain Fourier transforms that are close to orthogonal while still diagonalizing an intuitive notion of convolution. Our method works with adjacency and Laplacian shift and can be used as preprocessing step to enable further processing as we show with a prototypical Wiener filter application.

[181]  arXiv:2005.09766 (cross-list from physics.med-ph) [pdf, other]
Title: Self-supervised Dynamic CT Perfusion Image Denoising with Deep Neural Networks
Comments: 13 pages, 9 figures
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Dynamic computed tomography perfusion (CTP) imaging is a promising approach for acute ischemic stroke diagnosis and evaluation. Hemodynamic parametric maps of cerebral parenchyma are calculated from repeated CT scans of the first pass of iodinated contrast through the brain. It is necessary to reduce the dose of CTP for routine applications due to the high radiation exposure from the repeated scans, where image denoising is necessary to achieve a reliable diagnosis. In this paper, we proposed a self-supervised deep learning method for CTP denoising, which did not require any high-dose reference images for training. The network was trained by mapping each frame of CTP to an estimation from its adjacent frames. Because the noise in the source and target was independent, this approach could effectively remove the noise. Being free from high-dose training images granted the proposed method easier adaptation to different scanning protocols. The method was validated on both simulation and a public real dataset. The proposed method achieved improved image quality compared to conventional denoising methods. On the real data, the proposed method also had improved spatial resolution and contrast-to-noise ratio compared to supervised learning which was trained on the simulation data

[182]  arXiv:2005.09768 (cross-list from eess.AS) [pdf, ps, other]
Title: Perceptual similarity between piano notes: Simulations with a template-based perception model
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

In this paper the auditory model developed by Dau et al. [J. Acoust. Soc. Am. 102, 2892-2905 (1997)] was used to simulate the perceptual similarity between complex sounds. For this purpose, a central processor stage was developed and attached as a back-end module to the auditory model. As complex sounds, a set of recordings of one note played on seven different pianos was used, whose similarity has been recently measured by Osses et al. [J. Acoust. Soc. Am. 146, 1024-1035 (2019)] using a 3-AFC discrimination task in noise. The auditory model has several processing stages that are to a greater or lesser extent inspired by physiological aspects of the human normal-hearing system. A set of configurable parameters in each stage affects directly the sound (internal) representations that are further processed in the developed central processor. Therefore, a comprehensive review of the model parameters is given, indicating the configuration we chose. This includes an in-depth description of the auditory adaptation stage, the adaptation loops. Simulations of the similarity task were compared with (1) existing experimental data, where they had a moderate to high correlation, and with (2) simulations using an alternative but similar background noise to that of the experiments, which were used to obtain further information about how the participants' responses were weighted as a function of frequency.

[183]  arXiv:2005.09775 (cross-list from math.PR) [pdf, ps, other]
Title: Random Toeplitz Matrices: The Condition Number under High Stochastic Dependence
Subjects: Probability (math.PR); Numerical Analysis (math.NA)

In this paper, we study the condition number of a random Toeplitz matrix. Since a Toeplitz matrix is a diagonal constant matrix, its rows or columns cannot be stochastically independent. This situation does not permit us to use the classic strategy to analyze its minimum singular value when all the entries of a random matrix are stochastically independent. Using Cauchy Interlacing Theorem as a decoupling technique, we can break the stochastic dependence of the structure of the Toeplitz matrix and reduce the problem to analyze the extreme singular values of a random circulant matrix. Among our results, we show the condition number of non--symmetric random Toeplitz matrix of dimension $n$ under the existence of moment generating function of the random entries is $\kappa\left(\mathcal{T}_n\right) = \mbox{O}\left( \frac{1}{\varepsilon} n^{\rho+1/2} \left(\log n\right)^{1/2} \right)$ with probability $1-\mbox{O}\left((\varepsilon^2 + \varepsilon) n^{-2\rho} + n^{-1/2+\scriptstyle{o}(1)}\right)$ for any $\varepsilon >0$, $\rho\in(0,1/4)$. Moreover, if the random entries only have the second moment, we have $\kappa\left(\mathcal{T}_n\right) = \mbox{O}\left( \frac{1}{\varepsilon} n^{\rho+1/2} \log n\right)$ with probability $1-\mbox{O}\left((\varepsilon^2 + \varepsilon) n^{-2\rho} + \left(\log n\right)^{-1/2}\right)$. Also, Cauchy Interlacing Theorem permits to analyze the condition number of a symmetric random Toeplitz matrix. In this case, the condition number $\kappa\left( \mathcal{T}^{sym}_n\right) = \mbox{O}\left(\frac{1}{\varepsilon} n^{1.01} \left(\log n\right)^{1/2}\right)$ with probability $1- \mbox{O}\left( \varepsilon n^{-1/10} + n^{-77/300+\scriptstyle{o}(1)}\right)$, when the random entries have moment generating function. Additionally, we show that the results on the random Toeplitz matrices hold for random Hankel matrices.

[184]  arXiv:2005.09792 (cross-list from math.OC) [pdf, other]
Title: Lie algebra structure of fitness and replicator control
Comments: 29 pages, 2 figures, preprint to journal submission
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

For over fifty years, the dynamical systems perspective has had a prominent role in evolutionary biology and economics, through the lens of game theory. In particular, the study of replicator differential equations on the standard (probability) simplex, specified by fitness maps or payoff functions, has yielded insights into the temporal behavior of such systems. However behavior is influenced by context and environmental factors with a game-changing quality (i.e., fitness maps are manipulated). This paper develops a principled geometric approach to model and understand such influences by incorporating replicator dynamics into a broader control-theoretic framework. Central to our approach is the construction of a Lie algebra structure on the space of fitness maps, mapping homomorphically to the Lie algebra of replicator vector fields. This is akin to classical mechanics, where the Poisson bracket Lie algebra of functions maps to associated Hamiltonian vector fields. We show, extending the work of Svirezhev in 1972, that a trajectory of a replicator vector field is the base integral curve of a solution to a Hamiltonian system defined on the cotangent bundle of the simplex. Further, we exploit the Lie algebraic structure of fitness maps to determine controllability properties of a class of replicator systems.

[185]  arXiv:2005.09824 (cross-list from eess.AS) [pdf, other]
Title: PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR
Comments: Submtted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

We present PyChain, a fully parallelized PyTorch implementation of end-to-end lattice-free maximum mutual information (LF-MMI) training for the so-called \emph{chain models} in the Kaldi automatic speech recognition (ASR) toolkit. Unlike other PyTorch and Kaldi based ASR toolkits, PyChain is designed to be as flexible and light-weight as possible so that it can be easily plugged into new ASR projects, or other existing PyTorch-based ASR tools, as exemplified respectively by a new project PyChain-example, and Espresso, an existing end-to-end ASR toolkit. PyChain's efficiency and flexibility is demonstrated through such novel features as full GPU training on numerator/denominator graphs, and support for unequal length sequences. Experiments on the WSJ dataset show that with simple neural networks and commonly used machine learning techniques, PyChain can achieve competitive results that are comparable to Kaldi and better than other end-to-end ASR systems.

[186]  arXiv:2005.09829 (cross-list from eess.IV) [pdf, other]
Title: Attention-based network for low-light image enhancement
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

The captured images under low light conditions often suffer insufficient brightness and notorious noise. Hence, low-light image enhancement is a key challenging task in computer vision. A variety of methods have been proposed for this task, but these methods often failed in an extreme low-light environment and amplified the underlying noise in the input image. To address such a difficult problem, this paper presents a novel attention-based neural network to generate high-quality enhanced low-light images from the raw sensor data. Specifically, we first employ attention strategy (i.e. channel attention and spatial attention modules) to suppress undesired chromatic aberration and noise. The channel attention module guides the network to refine redundant colour features. The spatial attention module focuses on denoising by taking advantage of the non-local correlation in the image. Furthermore, we propose a new pooling layer, called inverted shuffle layer, which adaptively selects useful information from previous features. Extensive experiments demonstrate the superiority of the proposed network in terms of suppressing the chromatic aberration and noise artifacts in enhancement, especially when the low-light image has severe noise.

[187]  arXiv:2005.09843 (cross-list from eess.AS) [pdf, ps, other]
Title: Jointly optimal denoising, dereverberation, and source separation
Comments: Submitted to IEEE/ACM Trans. Audio, Speech, and Language Processing on 12 Feb 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

This paper proposes methods that can optimize a Convolutional BeamFormer (CBF) for performing denoising, dereverberation, and source separation (DN+DR+SS) at the same time. Conventionally, cascade configuration composed of a Weighted Prediction Error minimization (WPE) dereverberation filter followed by a Minimum Variance Distortionless Response (MVDR) beamformer has been used as the state-of-the-art frontend of far-field speech recognition, however, overall optimality of this approach is not guaranteed. In the blind signal processing area, an approach for jointly optimizing dereverberation and source separation (DR+SS) has been proposed, however, this approach requires huge computing cost, and has not been extended for application to DN+DR+SS. To overcome the above limitations, this paper develops new approaches for optimizing DN+DR+SS in a computationally much more efficient way. To this end, we introduce two different techniques for factorizing a CBF into WPE filters and beamformers, one based on extension of the conventional joint optimization approach proposed for DR+SS and the other based on a novel factorization technique, and derive methods optimizing them for DN+DR+SS based on the maximum likelihood estimation using a neural network-supported steering vector estimation. Experiments using noisy reverberant sound mixtures show that the proposed optimization approaches greatly improve the performance of the speech enhancement in comparison with the conventional cascade configuration in terms of the signal distortion measures and ASR performance. It is also shown that the proposed approaches can greatly reduce the computing cost with improved estimation accuracy in comparison with the conventional joint optimization approach.

[188]  arXiv:2005.09850 (cross-list from physics.soc-ph) [pdf, other]
Title: Spatial Heterogeneity Can Lead to Substantial Local Variations in COVID-19 Timing and Severity
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI); Populations and Evolution (q-bio.PE)

Standard epidemiological models for COVID-19 employ variants of compartment (SIR) models at local scales, implicitly assuming spatially uniform local mixing. Here, we examine the effect of employing more geographically detailed diffusion models based on known spatial features of interpersonal networks, most particularly the presence of a long-tailed but monotone decline in the probability of interaction with distance, on disease diffusion. Based on simulations of unrestricted COVID-19 diffusion in 19 U.S cities, we conclude that heterogeneity in population distribution can have large impacts on local pandemic timing and severity, even when aggregate behavior at larger scales mirrors a classic SIR-like pattern. Impacts observed include severe local outbreaks with long lag time relative to the aggregate infection curve, and the presence of numerous areas whose disease trajectories correlate poorly with those of neighboring areas. A simple catchment model for hospital demand illustrates potential implications for health care utilization, with substantial disparities in the timing and extremity of impacts even without distancing interventions. Likewise, analysis of social exposure to others who are morbid or deceased shows considerable variation in how the epidemic can appear to individuals on the ground, potentially affecting risk assessment and compliance with mitigation measures. These results demonstrate the potential for spatial network structure to generate highly non-uniform diffusion behavior even at the scale of cities, and suggest the importance of incorporating such structure when designing models to inform healthcare planning, predict community outcomes, or identify potential disparities.

[189]  arXiv:2005.09862 (cross-list from eess.AS) [pdf, other]
Title: A Further Study of Unsupervised Pre-training for Transformer Based Speech Recognition
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

Building a good speech recognition system usually requires large amounts of transcribed data, which is expensive to collect. To tackle this problem, many unsupervised pre-training methods have been proposed. Among these methods, Masked Predictive Coding achieved significant improvements on various speech recognition datasets with BERT-like Masked Reconstruction loss and Transformer backbone. However, many aspects of MPC have not been fully investigated. In this paper, we conduct a further study on MPC and focus on three important aspects: the effect of pre-training data speaking style, its extension on streaming model, and how to better transfer learned knowledge from pre-training stage to downstream tasks. Experiments reveled that pre-training data with matching speaking style is more useful on downstream recognition tasks. A unified training objective with APC and MPC provided 8.46% relative error reduction on streaming model trained on HKUST. Also, the combination of target data adaption and layer-wise discriminative training helped the knowledge transfer of MPC, which achieved 3.99% relative error reduction on AISHELL over a strong baseline.

[190]  arXiv:2005.09870 (cross-list from eess.IV) [pdf, other]
Title: Fast Wavelet Decomposition of Linear Operators through Product-Convolution Expansions
Authors: Paul Escande (I2M, CNRS), Pierre Weiss (CNRS)
Subjects: Image and Video Processing (eess.IV); Signal Processing (eess.SP); Numerical Analysis (math.NA)

Wavelet decompositions of integral operators have proven their efficiency in reducing computing times for many problems, ranging from the simulation of waves or fluids to the resolution of inverse problems in imaging. Unfortunately, computing the decomposition is itself a hard problem which is oftentimes out of reach for large scale problems. The objective of this work is to design fast decomposition algorithms based on another representation called product-convolution expansion. This decomposition can be evaluated efficiently assuming that a few impulse responses of the operator are available, but it is usually less efficient than the wavelet decomposition when incorporated in iterative methods. The proposed decomposition algorithms, run in quasi-linear time and we provide some numerical experiments to assess its performance for an imaging problem involving space varying blurs.

[191]  arXiv:2005.09873 (cross-list from eess.AS) [pdf, other]
Title: Consistent ICA: Determined BSS meets spectrogram consistency
Authors: Kohei Yatabe
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)

Multichannel audio blind source separation (BSS) in the determined situation (the number of microphones is equal to that of the sources), or determined BSS, is performed by multichannel linear filtering in the time-frequency domain to handle the convolutive mixing process. Ordinarily, the filter treats each frequency independently, which causes the well-known permutation problem, i.e., the problem of how to align the frequency-wise filters so that each separated component is correctly assigned to the corresponding sources. In this paper, it is shown that the general property of the time-frequency-domain representation called spectrogram consistency can be an assistant for solving the permutation problem.

[192]  arXiv:2005.09876 (cross-list from stat.ML) [pdf, other]
Title: The Inverse G-Wishart Distribution and Variational Message Passing
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

Message passing on a factor graph is a powerful paradigm for the coding of approximate inference algorithms for arbitrarily graphical large models. The notion of a factor graph fragment allows for compartmentalization of algebra and computer code. We show that the Inverse G-Wishart family of distributions enables fundamental variational message passing factor graph fragments to be expressed elegantly and succinctly. Such fragments arise in models for which approximate inference concerning covariance matrix or variance parameters is made, and are ubiquitous in contemporary statistics and machine learning.

[193]  arXiv:2005.09890 (cross-list from q-bio.QM) [pdf]
Title: Interactive exploration of population scale pharmacoepidemiology datasets
Subjects: Quantitative Methods (q-bio.QM); Computer Vision and Pattern Recognition (cs.CV)

Population-scale drug prescription data linked with adverse drug reaction (ADR) data supports the fitting of models large enough to detect drug use and ADR patterns that are not detectable using traditional methods on smaller datasets. However, detecting ADR patterns in large datasets requires tools for scalable data processing, machine learning for data analysis, and interactive visualization. To our knowledge no existing pharmacoepidemiology tool supports all three requirements. We have therefore created a tool for interactive exploration of patterns in prescription datasets with millions of samples. We use Spark to preprocess the data for machine learning and for analyses using SQL queries. We have implemented models in Keras and the scikit-learn framework. The model results are visualized and interpreted using live Python coding in Jupyter. We apply our tool to explore a 384 million prescription data set from the Norwegian Prescription Database combined with a 62 million prescriptions for elders that were hospitalized. We preprocess the data in two minutes, train models in seconds, and plot the results in milliseconds. Our results show the power of combining computational power, short computation times, and ease of use for analysis of population scale pharmacoepidemiology datasets. The code is open source and available at: https://github.com/uit-hdl/norpd_prescription_analyses

[194]  arXiv:2005.09903 (cross-list from stat.ML) [pdf, other]
Title: ReLU Code Space: A Basis for Rating Network Quality Besides Accuracy
Comments: in ICLR 2020 Workshop on Neural Architecture Search (NAS 2020)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

We propose a new metric space of ReLU activation codes equipped with a truncated Hamming distance which establishes an isometry between its elements and polyhedral bodies in the input space which have recently been shown to be strongly related to safety, robustness, and confidence. This isometry allows the efficient computation of adjacency relations between the polyhedral bodies. Experiments on MNIST and CIFAR-10 indicate that information besides accuracy might be stored in the code space.

[195]  arXiv:2005.09913 (cross-list from eess.AS) [pdf, other]
Title: Statistical and Neural Network Based Speech Activity Detection in Non-Stationary Acoustic Environments
Comments: Submitted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Speech activity detection (SAD), which often rests on the fact that the noise is "more" stationary than speech, is particularly challenging in non-stationary environments, because the time variance of the acoustic scene makes it difficult to discriminate speech from noise. We propose two approaches to SAD, where one is based on statistical signal processing, while the other utilizes neural networks. The former employes sophisticated signal processing to track the noise and speech energies and is meant to support the case for a resource efficient, unsupervised signal processing approach. The latter introduces a recurrent network layer that operates on short segments of the input speech to do temporal smoothing in the presence of non-stationary noise. The systems are tested on the Fearless Steps challenge, which consists of the transmission data from the Apollo-11 space mission. The statistical SAD achieves comparable detection performance to earlier proposed neural network based SADs, while the neural network based approach leads to a decision cost function of 1.07% on the evaluation set of the 2020 Fearless Steps Challenge, which sets a new state of the art.

[196]  arXiv:2005.09921 (cross-list from eess.AS) [pdf, other]
Title: End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

End-to-end speaker diarization for an unknown number of speakers is addressed in this paper. Recently proposed end-to-end speaker diarization outperformed conventional clustering-based speaker diarization, but it has one drawback: it is less flexible in terms of the number of speakers. This paper proposes a method for encoder-decoder based attractor calculation (EDA), which first generates a flexible number of attractors from a speech embedding sequence. Then, the generated multiple attractors are multiplied by the speech embedding sequence to produce the same number of speaker activities. The speech embedding sequence is extracted using the conventional self-attentive end-to-end neural speaker diarization (SA-EEND) network. In a two-speaker condition, our method achieved a 2.69 % diarization error rate (DER) on simulated mixtures and a 8.07 % DER on the two-speaker subset of CALLHOME, while vanilla SA-EEND attained 4.56 % and 9.54 %, respectively. In unknown numbers of speakers conditions, our method attained a 15.29 % DER on CALLHOME, while the x-vector-based clustering method achieved a 19.43 % DER.

[197]  arXiv:2005.09922 (cross-list from math.CO) [pdf, ps, other]
Title: On Limit Constants in Last Passage Percolation in Transitive Tournaments
Authors: Kunal Dutta
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM); Probability (math.PR)

We investigate the \emph{last passage percolation} problem on transitive tournaments, in the case when the edge weights are independent Bernoulli random variables. Given a transitive tournament on $n$ nodes with random weights on its edges, the last passage percolation problem seeks to find the weight $X_n$ of the heaviest path, where the weight of a path is the sum of the weights on its edges. We give a recurrence relation and use it to obtain a (bivariate) generating function for the probability generating function of $X_n$. This also gives exact combinatorial expressions for $\mathbb{E}[X_n]$, which was stated as an open problem by Yuster [\emph{Disc. Appl. Math.}, 2017]. We further determine scaling constants in the limit laws for $X_n$. Define $\beta_{tr}(p) := \lim_{n\to \infty} \frac{\mathbb{E}[X_n]}{n-1}$. Using singularity analysis, we show \[ \beta_{tr}(p) = \left(\sum_{n\geq 1}(1-p)^{{n\choose 2}}\right)^{-1}. \] In particular, $\beta_{tr}(0.5) = \left(\sum_{n\geq 1} 2^{-{n\choose 2}}\right)^{-1} = 0.60914971106...$. This settles the question of determining the value of $\beta_{tr}(0.5)$, initiated by Yuster. $\beta_{tr}(p)$ is also the limiting value in the strong law of large numbers for $X_n$, given by Foss, Martin, and Schmidt [\emph{Ann. Appl. Probab.}, 2014]. We also derive the scaling constants in the functional central limit theorem for $X_n$ proved by Foss et al.

[198]  arXiv:2005.09923 (cross-list from stat.ML) [pdf, other]
Title: Tessellated Wasserstein Auto-Encoders
Authors: Kuo Gai, Shihua Zhang
Comments: 15 pages, 8 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC)

Non-adversarial generative models such as variational auto-encoder (VAE), Wasserstein auto-encoders with maximum mean discrepancy (WAE-MMD), sliced-Wasserstein auto-encoder (SWAE) are relatively easy to train and have less mode collapse compared to Wasserstein auto-encoder with generative adversarial network (WAE-GAN). However, they are not very accurate in approximating the target distribution in the latent space because they don't have a discriminator to detect the minor difference between real and fake. To this end, we develop a novel non-adversarial framework called Tessellated Wasserstein Auto-encoders (TWAE) to tessellate the support of the target distribution into a given number of regions by the centroidal Voronoi tessellation (CVT) technique and design batches of data according to the tessellation instead of random shuffling for accurate computation of discrepancy. Theoretically, we demonstrate that the error of estimate to the discrepancy decreases when the numbers of samples $n$ and regions $m$ of the tessellation become larger with rates of $\mathcal{O}(\frac{1}{\sqrt{n}})$ and $\mathcal{O}(\frac{1}{\sqrt{m}})$, respectively. Given fixed $n$ and $m$, a necessary condition for the upper bound of measurement error to be minimized is that the tessellation is the one determined by CVT. TWAE is very flexible to different non-adversarial metrics and can substantially enhance their generative performance in terms of Fr\'{e}chet inception distance (FID) compared to VAE, WAE-MMD, SWAE. Moreover, numerical results indeed demonstrate that TWAE is competitive to the adversarial model WAE-GAN, demonstrating its powerful generative ability.

[199]  arXiv:2005.09940 (cross-list from eess.AS) [pdf, other]
Title: Relative Positional Encoding for Speech Recognition and Direct Translation
Comments: Submitted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

Transformer models are powerful sequence-to-sequence architectures that are capable of directly mapping speech inputs to transcriptions or translations. However, the mechanism for modeling positions in this model was tailored for text modeling, and thus is less ideal for acoustic inputs. In this work, we adapt the relative position encoding scheme to the Speech Transformer, where the key addition is relative distance between input states in the self-attention network. As a result, the network can better adapt to the variable distributions present in speech data. Our experiments show that our resulting model achieves the best recognition result on the Switchboard benchmark in the non-augmentation condition, and the best published result in the MuST-C speech translation benchmark. We also show that this model is able to better utilize synthetic data than the Transformer, and adapts better to variable sentence segmentation quality for speech translation.

[200]  arXiv:2005.09942 (cross-list from physics.ao-ph) [pdf, other]
Title: Estimating volcanic ash emissions using retrieved satellite ash columns and inverse ash transport modelling
Comments: 17 pages, 11 figures. First public draft
Subjects: Atmospheric and Oceanic Physics (physics.ao-ph); Distributed, Parallel, and Cluster Computing (cs.DC)

This paper describes the inversion procedure being used operationally at the Norwegian Meteorological Institute for estimating ash emission rates from retrieved satellite ash column amounts and a priori knowledge.
The overall procedure consists of five stages:
(1) generate a priori emission estimates;
(2) run forward simulations with unit emissions;
(3) collocate/match observations with emission simulations;
(4) build system of linear equations; and
(5) solve overdetermined system.
We go through the mathematical foundations for the inversion procedure, performance for synthetic cases, and performance for real-world cases. The novelties of this paper includes pruning of the linear system of equations used in the inversion and inclusion of observations of ash cloud top altitude.
The source code used in this work is freely available under an open source license, and is possible to use for other similar applications.

[201]  arXiv:2005.09952 (cross-list from math.AP) [pdf, other]
Title: Nodal solutions of weighted indefinite problems
Comments: 19 pages, 29 figures
Subjects: Analysis of PDEs (math.AP); Numerical Analysis (math.NA)

This paper analyzes the structure of the set of nodal solutions of a class of one-dimensional superlinear indefinite boundary values problems with an indefinite weight functions in front of the spectral parameter. Quite astonishingly, the associated high order eigenvalues might not be concave as it is the lowest one. As a consequence, in many circumstances the nodal solutions can bifurcate from three or even four bifurcation points from the trivial solution. This paper combines analytical and numerical tools. The analysis carried over on it is a paradigm of how mathematical analysis aids the numerical study of a problem, whereas simultaneously the numerical study confirms and illuminate the analysis.

[202]  arXiv:2005.09964 (cross-list from eess.IV) [pdf, other]
Title: Iterative Network for Image Super-Resolution
Comments: 12 pages, 14 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Single image super-resolution (SISR), as a traditional ill-conditioned inverse problem, has been greatly revitalized by the recent development of convolutional neural networks (CNN). These CNN-based methods generally map a low-resolution image to its corresponding high-resolution version with sophisticated network structures and loss functions, showing impressive performances. This paper proposes a substantially different approach relying on the iterative optimization on HR space with an iterative super-resolution network (ISRN). We first analyze the observation model of image SR problem, inspiring a feasible solution by mimicking and fusing each iteration in a more general and efficient manner. Considering the drawbacks of batch normalization, we propose a feature normalization (FNorm) method to regulate the features in network. Furthermore, a novel block with F-Norm is developed to improve the network representation, termed as FNB. Residual-in-residual structure is proposed to form a very deep network, which groups FNBs with a long skip connection for better information delivery and stabling the training phase. Extensive experimental results on testing benchmarks with bicubic (BI) degradation show our ISRN can not only recover more structural information, but also achieve competitive or better PSNR/SSIM results with much fewer parameters compared to other works. Besides BI, we simulate the real-world degradation with blur-downscale (BD) and downscalenoise (DN). ISRN and its extension ISRN+ both achieve better performance than others with BD and DN degradation models.

[203]  arXiv:2005.09969 (cross-list from eess.SP) [pdf, other]
Title: Federated Deep Learning Framework For Hybrid Beamforming in mm-Wave Massive MIMO
Comments: four pages four figures
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Machine Learning (cs.LG)

Machine learning (ML) for wireless communications requires the training of a global model with a large dataset collected from the users. However, the transmission of a whole dataset between the users and the base station (BS) is computationally prohibitive. In this work, we introduce a federated learning (FL) based framework where the model training is performed at the BS by collecting only the gradients from the users. In particular, we design a convolutional neural network (CNN), whose input is the channel data and it yields the analog beamformers at the output. We have evaluated the performance of the proposed framework via numerical simulations and shown that FL is more tolerant than ML to the imperfections and corruptions in the channel data as well as having less complexity.

[204]  arXiv:2005.09978 (cross-list from eess.IV) [pdf, other]
Title: AutoML Segmentation for 3D Medical Image Data: Contribution to the MSD Challenge 2018
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Fueled by recent advances in machine learning, there has been tremendous progress in the field of semantic segmentation for the medical image computing community. However, developed algorithms are often optimized and validated by hand based on one task only. In combination with small datasets, interpreting the generalizability of the results is often difficult. The Medical Segmentation Decathlon challenge addresses this problem, and aims to facilitate development of generalizable 3D semantic segmentation algorithms that require no manual parametrization. Such an algorithm was developed and is presented in this paper. It consists of a 3D convolutional neural network with encoder-decoder architecture employing residual-connections, skip-connections and multi-level generation of predictions. It works on anisotropic voxel-geometries and has anisotropic depth, i.e., the number of downsampling steps is a task-specific parameter. These depths are automatically inferred for each task prior to training. By combining this flexible architecture with on-the-fly data augmentation and little-to-no pre-- or postprocessing, promising results could be achieved. The code developed for this challenge will be available online after the final deadline at: https://github.com/ORippler/MSD_2018

[205]  arXiv:2005.09979 (cross-list from math.CO) [pdf, other]
Title: Improved bounds for some facially constrained colorings
Authors: Kenny Štorgel
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM)

A facial-parity edge-coloring of a $2$-edge-connected plane graph is a facially-proper edge-coloring in which every face is incident with zero or an odd number of edges of each color. A facial-parity vertex-coloring of a $2$-connected plane graph is a facially-proper vertex-coloring in which every face is incident with zero or an odd number of vertices of each color. Czap and Jendro\v{l} (in Facially-constrained colorings of plane graphs: A survey, Discrete Math. 340 (2017), 2691--2703), conjectured that $10$ colors suffice in both colorings. We present an infinite family of counterexamples to both conjectures.
A facial $(P_{k}, P_{\ell})$-WORM coloring of a plane graph $G$ is a coloring of the vertices such that $G$ contains no rainbow facial $k$-path and no monochromatic facial $\ell$-path. Czap, Jendro\v{l} and Valiska (in WORM colorings of planar graphs, Discuss. Math. Graph Theory 37 (2017), 353--368), proved that for any integer $n\ge 12$ there exists a connected plane graph on $n$ vertices, with maximum degree at least $6$, having no facial $(P_{3},P_{3})$-WORM coloring. They also asked if there exists a graph with maximum degree $4$ having the same property. We prove that for any integer $n\ge 18$, there exists a connected plane graph, with maximum degree $4$, with no facial $(P_{3},P_{3})$-WORM coloring.

[206]  arXiv:2005.09986 (cross-list from eess.AS) [pdf, other]
Title: Evaluating Features and Metrics for High-Quality Simulation of Early Vocal Learning of Vowels
Comments: Submitted to INTERSPEECH 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

The way infants use auditory cues to learn to speak despite the acoustic mismatch of their vocal apparatus is a hot topic of scientific debate. The simulation of early vocal learning using articulatory speech synthesis offers a way towards gaining a deeper understanding of this process. One of the crucial parameters in these simulations is the choice of features and a metric to evaluate the acoustic error between the synthesised sound and the reference target. We contribute with evaluating the performance of a set of 40 feature-metric combinations for the task of optimising the production of static vowels with a high-quality articulatory synthesiser. Towards this end we assess the usability of formant error and the projection of the feature-metric error surface in the normalised F1-F2 formant space. We show that this approach can be used to evaluate the impact of features and metrics and also to offer insight to perceptual results.

[207]  arXiv:2005.10005 (cross-list from math.OC) [pdf, other]
Title: Iterative Domain Optimization
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG)

In this paper we study a new approach in optimization that aims to search a large domain D where a given function takes large, small or specific values via an iterative optimization algorithm based on the gradient. We show that the objective function used is not directly optimizable, however, we use a trick to approximate this objective by another one at each iteration to optimize it. Then we explore a use case of this algorithm in machine learning to find domains where the models output large and small values with respect of some constraints. Experiments demonstrate the efficiency of this algorithm on five cases with models trained on the titanic dataset.

[208]  arXiv:2005.10007 (cross-list from physics.soc-ph) [pdf, other]
Title: Empowering Urban Governance through Urban Science: Multi-scale Dynamics of Urban Systems Worldwide
Comments: 25 pages, 9 figures, 4 tables
Subjects: Physics and Society (physics.soc-ph); Computers and Society (cs.CY); Multiagent Systems (cs.MA)

The current science of cities can provide a useful foundation for future urban policies, provided that these proposals have been validated by correct observations of the diversity of situations in the world. However, international comparisons of the evolution of cities often produce uncertain results because national territorial frameworks are not always in strict correspondence with the dynamics of urban systems. We propose to provide various compositions of systems of cities to better take into account the dynamic networking of cities that go beyond regional and national territorial boundaries. Different models conceived for explaining city size and urban growth distributions enable to establish a correspondence between urban trajectories when observed at the level of cities and systems of cities. We test the validity and representativeness of several dynamic models of complex urban systems and their variations across regions of the world, at the macroscopic scale of systems of cities. The originality of the approach is in considering spatial interaction and evolutionary path dependence as major features in the general behavior of urban entities. The models studied include diverse and complementary processes, such as economic exchanges, diffusion of innovations and physical network flows. Complex systems' dynamics is in principle unpredictable, but contextualizing it regarding demographic, income and resource components may help in minimizing the forecasting errors.

[209]  arXiv:2005.10015 (cross-list from math.CT) [pdf, other]
Title: Comprehension and quotient structures in the language of 2-categories
Subjects: Category Theory (math.CT); Logic in Computer Science (cs.LO)

Lawvere observed in his celebrated work on hyperdoctrines that the set-theoretic schema of comprehension can be elegantly expressed in the functorial language of categorical logic, as a comprehension structure on the functor $p:\mathscr{E}\to\mathscr{B}$ defining the hyperdoctrine.
In this paper, we formulate and study a strictly ordered hierarchy of three notions of comprehension structure on a given functor $p:\mathscr{E}\to\mathscr{B}$, which we call (i) comprehension structure, (ii) comprehension structure with section, and (iii) comprehension structure with image.
Our approach is 2-categorical and we thus formulate the three levels of comprehension structure on a general morphism $p:\mathrm{\mathbf{E}}\to\mathrm{\mathbf{B}}$ in a 2-category $\mathscr{K}$.
This conceptual point of view on comprehension structures enables us to revisit the work by Fumex, Ghani and Johann on the duality between comprehension structures and quotient structures on a given functor $p:\mathscr{E}\to\mathscr{B}$.
In particular, we show how to lift the comprehension and quotient structures on a functor $p:\mathscr{E}\to\mathscr{B}$ to the categories of algebras or coalgebras associated to functors $F_{\mathscr{E}}:\mathscr{E}\to\mathscr{E}$ and $F_{\mathscr{B}}:\mathscr{B}\to\mathscr{B}$ of interest, in order to interpret reasoning by induction and coinduction in the traditional language of categorical logic, formulated in an appropriate 2-categorical way.

[210]  arXiv:2005.10018 (cross-list from math.ST) [pdf, ps, other]
Title: Revisiting Concentration of Missing Mass
Authors: Maciej Skorski
Subjects: Statistics Theory (math.ST); Machine Learning (cs.LG); Probability (math.PR); Machine Learning (stat.ML)

We revisit the problem of \emph{missing mass concentration}, deriving Bernstein-type bounds which improves upon the state of art. We also point out a mistake in the recent related result (UAI'15).

[211]  arXiv:2005.10034 (cross-list from eess.IV) [pdf, other]
Title: Data Consistent CT Reconstruction from Insufficient Data with Learned Prior Images
Comments: 10 pages, 9 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Image reconstruction from insufficient data is common in computed tomography (CT), e.g., image reconstruction from truncated data, limited-angle data and sparse-view data. Deep learning has achieved impressive results in this field. However, the robustness of deep learning methods is still a concern for clinical applications due to the following two challenges: a) With limited access to sufficient training data, a learned deep learning model may not generalize well to unseen data; b) Deep learning models are sensitive to noise. Therefore, the quality of images processed by neural networks only may be inadequate. In this work, we investigate the robustness of deep learning in CT image reconstruction by showing false negative and false positive lesion cases. Since learning-based images with incorrect structures are likely not consistent with measured projection data, we propose a data consistent reconstruction (DCR) method to improve their image quality, which combines the advantages of compressed sensing and deep learning: First, a prior image is generated by deep learning. Afterwards, unmeasured projection data are inpainted by forward projection of the prior image. Finally, iterative reconstruction with reweighted total variation regularization is applied, integrating data consistency for measured data and learned prior information for missing data. The efficacy of the proposed method is demonstrated in cone-beam CT with truncated data, limited-angle data and sparse-view data, respectively. For example, for truncated data, DCR achieves a mean root-mean-square error of 24 HU and a mean structure similarity index of 0.999 inside the field-of-view for different patients in the noisy case, while the state-of-the-art U-Net method achieves 55 HU and 0.995 respectively for these two metrics.

[212]  arXiv:2005.10040 (cross-list from stat.ML) [pdf, other]
Title: Informative Path Planning for Anomaly Detection in Environment Exploration and Monitoring
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Robotics (cs.RO)

An unmanned autonomous vehicle (UAV) is sent on a mission to explore and reconstruct an unknown environment from a series of measurements collected by Bayesian optimization. The success of the mission is judged by the UAV's ability to faithfully reconstruct any anomalous feature present in the environment (e.g., extreme topographic depressions or abnormal chemical concentrations). We show that the criteria commonly used for determining which locations the UAV should visit are ill-suited for this task. We introduce a number of novel criteria that guide the UAV towards regions of strong anomalies by leveraging previously collected information in a mathematically elegant and computationally tractable manner. We demonstrate superiority of the proposed approach in several applications.

[213]  arXiv:2005.10045 (cross-list from eess.IV) [pdf]
Title: Sparse data to structured imageset transformation
Authors: Baris Kanber
Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG)

Machine learning problems involving sparse datasets may benefit from the use of convolutional neural networks if the numbers of samples and features are very large. Such datasets are increasingly more frequently encountered in a variety of different domains. We convert such datasets to imagesets while attempting to give each image structure that is amenable for use with convolutional neural networks. Experimental results on two publicly available, sparse datasets show that the approach can boost classification performance compared to other methods, which may be attributed to the formation of visually distinguishable shapes on the resultant images.

[214]  arXiv:2005.10049 (cross-list from eess.AS) [pdf, ps, other]
Title: Early Stage LM Integration Using Local and Global Log-Linear Combination
Comments: Submitted to Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)

Sequence-to-sequence models with an implicit alignment mechanism (e.g. attention) are closing the performance gap towards traditional hybrid hidden Markov models (HMM) for the task of automatic speech recognition. One important factor to improve word error rate in both cases is the use of an external language model (LM) trained on large text-only corpora. Language model integration is straightforward with the clear separation of acoustic model and language model in classical HMM-based modeling. In contrast, multiple integration schemes have been proposed for attention models. In this work, we present a novel method for language model integration into implicit-alignment based sequence-to-sequence models. Log-linear model combination of acoustic and language model is performed with a per-token renormalization. This allows us to compute the full normalization term efficiently both in training and in testing. This is compared to a global renormalization scheme which is equivalent to applying shallow fusion in training. The proposed methods show good improvements over standard model combination (shallow fusion) on our state-of-the-art Librispeech system. Furthermore, the improvements are persistent even if the LM is exchanged for a more powerful one after training.

[215]  arXiv:2005.10052 (cross-list from eess.IV) [pdf, other]
Title: Lung Segmentation from Chest X-rays using Variational Data Imputation
Comments: Source code, training data and the trained models are available here: this https URL
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)

Pulmonary opacification is the inflammation in the lungs caused by many respiratory ailments, including the novel corona virus disease 2019 (COVID-19). Chest X-rays (CXRs) with such opacifications render regions of lungs imperceptible, making it difficult to perform automated image analysis on them. In this work, we focus on segmenting lungs from such abnormal CXRs as part of a pipeline aimed at automated risk scoring of COVID-19 from CXRs. We treat the high opacity regions as missing data and present a modified CNN-based image segmentation network that utilizes a deep generative model for data imputation. We train this model on normal CXRs with extensive data augmentation and demonstrate the usefulness of this model to extend to cases with extreme abnormalities.

[216]  arXiv:2005.10058 (cross-list from math.LO) [pdf, ps, other]
Title: On embedding Lambek calculus into commutative categorial grammars
Authors: Sergey Slavnov
Subjects: Logic (math.LO); Computation and Language (cs.CL); Logic in Computer Science (cs.LO)

Abstract categorial grammars (ACG), as well as some other, closely related systems, are based on the ordinary, commutative implicational linear logic and linear $\lambda$-calculus in contrast to the better known "noncommutative" Lambek grammars and their variations. ACG seem attractive in many ways, not the least of which is the simplicity of the underlying logic. Yet it is known that ACG and their relatives behave poorly in modeling many natural language phenomena (such as, for example, coordination) compared to "noncommutative" formalisms. Therefore different solutions have been proposed in order to enrich ACG with noncommutative constructions.
Tensor grammars of this work are another example of "commutative" grammars, based on the classical, rather than intuitionistic linear logic. They can be seen as a surface representation of ACG in the sense that derivations of ACG translate to derivations of tensor grammars and this translation is isomorphic on the level of string languages. An advantage of this representation, as it seems to us, is that the syntax becomes extremely simple and a direct geometric meaning is transparent.
We address the problem of encoding noncommutative operations in our setting. This turns out possible after enriching the system with new unary operators. The resulting system allows representing both ACG and Lambek grammars as conservative fragments, while the formalism remains, as it seems to us, rather simple and intuitive.

[217]  arXiv:2005.10076 (cross-list from math.OC) [pdf, other]
Title: Data-driven learning of robust nonlocal physics from high-fidelity synthetic data
Comments: 32 pages, 10 figures, 3 tables
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG)

A key challenge to nonlocal models is the analytical complexity of deriving them from first principles, and frequently their use is justified a posteriori. In this work we extract nonlocal models from data, circumventing these challenges and providing data-driven justification for the resulting model form. Extracting provably robust data-driven surrogates is a major challenge for machine learning (ML) approaches, due to nonlinearities and lack of convexity. Our scheme allows extraction of provably invertible nonlocal models whose kernels may be partially negative. To achieve this, based on established nonlocal theory, we embed in our algorithm sufficient conditions on the non-positive part of the kernel that guarantee well-posedness of the learnt operator. These conditions are imposed as inequality constraints and ensure that models are robust, even in small-data regimes. We demonstrate this workflow for a range of applications, including reproduction of manufactured nonlocal kernels; numerical homogenization of Darcy flow associated with a heterogeneous periodic microstructure; nonlocal approximation to high-order local transport phenomena; and approximation of globally supported fractional diffusion operators by truncated kernels.

[218]  arXiv:2005.10080 (cross-list from math.LO) [pdf, ps, other]
Title: Proving P!=NP in first-order PA
Authors: Rupert McCallum
Subjects: Logic (math.LO); Computational Complexity (cs.CC)

We show that it is provable in PA that there is an arithmetically definable sequence $\{\phi_{n}:n \in \omega\}$ of $\Pi^{0}_{2}$-sentences, such that
- PRA+$\{\phi_{n}:n \in \omega\}$ is $\Pi^{0}_{2}$-sound and $\Pi^{0}_{1}$-complete
- the length of $\phi_{n}$ is bounded above by a polynomial function of $n$ with positive leading coefficient
- PRA+$\phi_{n+1}$ always proves 1-consistency of PRA+$\phi_{n}$.
One has that the growth in logical strength is in some sense "as fast as possible", manifested in the fact that the total general recursive functions whose totality is asserted by the true $\Pi^{0}_{2}$-sentences in the sequence are cofinal growth-rate-wise in the set of all total general recursive functions. We then develop an argument which makes use of a sequence of sentences constructed by an application of the diagonal lemma, which are generalisations in a broad sense of Hugh Woodin's "Tower of Hanoi" construction as outlined in his essay "Tower of Hanoi" in Chapter 18 of the anthology "Truth in Mathematics". The argument establishes the result that it is provable in PA that $P \neq NP$.

[219]  arXiv:2005.10087 (cross-list from stat.ML) [pdf, ps, other]
Title: Riemannian geometry for Compound Gaussian distributions: application to recursive change detection
Comments: 13 pages
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

A new Riemannian geometry for the Compound Gaussian distribution is proposed. In particular, the Fisher information metric is obtained, along with corresponding geodesics and distance function. This new geometry is applied on a change detection problem on Multivariate Image Times Series: a recursive approach based on Riemannian optimization is developed. As shown on simulated data, it allows to reach optimal performance while being computationally more efficient.

[220]  arXiv:2005.10089 (cross-list from eess.AS) [pdf, other]
Title: Investigation of Large-Margin Softmax in Neural Language Modeling
Comments: submitted to INTERSPEECH2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

To encourage intra-class compactness and inter-class separability among trainable feature vectors, large-margin softmax methods are developed and widely applied in the face recognition community. The introduction of the large-margin concept into the softmax is reported to have good properties such as enhanced discriminative power, less overfitting and well-defined geometric intuitions. Nowadays, language modeling is commonly approached with neural networks using softmax and cross entropy. In this work, we are curious to see if introducing large-margins to neural language models would improve the perplexity and consequently word error rate in automatic speech recognition. Specifically, we first implement and test various types of conventional margins following the previous works in face recognition. To address the distribution of natural language data, we then compare different strategies for word vector norm-scaling. After that, we apply the best norm-scaling setup in combination with various margins and conduct neural language models rescoring experiments in automatic speech recognition. We find that although perplexity is slightly deteriorated, neural language models with large-margin softmax can yield word error rate similar to that of the standard softmax baseline. Finally, expected margins are analyzed through visualization of word vectors, showing that the syntactic and semantic relationships are also preserved.

[221]  arXiv:2005.10099 (cross-list from stat.ML) [pdf, other]
Title: Nonparametric Score Estimators
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

Estimating the score, i.e., the gradient of log density function, from a set of samples generated by an unknown distribution is a fundamental task in inference and learning of probabilistic models that involve flexible yet intractable densities. Kernel estimators based on Stein's methods or score matching have shown promise, however their theoretical properties and relationships have not been fully-understood. We provide a unifying view of these estimators under the framework of regularized nonparametric regression. It allows us to analyse existing estimators and construct new ones with desirable properties by choosing different hypothesis spaces and regularizers. A unified convergence analysis is provided for such estimators. Finally, we propose score estimators based on iterative regularization that enjoy computational benefits from curl-free kernels and fast convergence.

[222]  arXiv:2005.10113 (cross-list from eess.AS) [pdf, other]
Title: A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition
Comments: 4 pages, 2 figures
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

End-to-end models are gaining wider attention in the field of automatic speech recognition (ASR). One of their advantages is the simplicity of building that directly recognizes the speech frame sequence into the text label sequence by neural networks. According to the driving end in the recognition process, end-to-end ASR models could be categorized into two types: label-synchronous and frame-synchronous, each of which has unique model behaviour and characteristic. In this work, we make a detailed comparison on a representative label-synchronous model (transformer) and a soft frame-synchronous model (continuous integrate-and-fire (CIF) based model). The results on three public dataset and a large-scale dataset with 12000 hours of training data show that the two types of models have respective advantages that are consistent with their synchronous mode.

[223]  arXiv:2005.10125 (cross-list from stat.AP) [pdf, other]
Title: Modelling Grocery Retail Topic Distributions: Evaluation, Interpretability and Stability
Comments: 20 pages, 9 figures
Subjects: Applications (stat.AP); Computation and Language (cs.CL); Methodology (stat.ME)

Understanding the shopping motivations behind market baskets has high commercial value in the grocery retail industry. Analyzing shopping transactions demands techniques that can cope with the volume and dimensionality of grocery transactional data while keeping interpretable outcomes. Latent Dirichlet Allocation (LDA) provides a suitable framework to process grocery transactions and to discover a broad representation of customers' shopping motivations. However, summarizing the posterior distribution of an LDA model is challenging, while individual LDA draws may not be coherent and cannot capture topic uncertainty. Moreover, the evaluation of LDA models is dominated by model-fit measures which may not adequately capture the qualitative aspects such as interpretability and stability of topics.
In this paper, we introduce clustering methodology that post-processes posterior LDA draws to summarise the entire posterior distribution and identify semantic modes represented as recurrent topics. Our approach is an alternative to standard label-switching techniques and provides a single posterior summary set of topics, as well as associated measures of uncertainty. Furthermore, we establish a more holistic definition for model evaluation, which assesses topic models based not only on their likelihood but also on their coherence, distinctiveness and stability. By means of a survey, we set thresholds for the interpretation of topic coherence and topic similarity in the domain of grocery retail data. We demonstrate that the selection of recurrent topics through our clustering methodology not only improves model likelihood but also outperforms the qualitative aspects of LDA such as interpretability and stability. We illustrate our methods on an example from a large UK supermarket chain.

[224]  arXiv:2005.10227 (cross-list from q-bio.OT) [pdf, other]
Title: Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios
Comments: 14 pages, 5 figures, 2 tables, Tool Available at: this http URL
Subjects: Other Quantitative Biology (q-bio.OT); Machine Learning (cs.LG)

COVID-19 pandemics has challenged emergency response systems worldwide, with widespread reports of essential services breakdown and collapse of health care structure. A critical element involves essential workforce management since current protocols recommend release from duty for symptomatic individuals, including essential personnel. Testing capacity is also problematic in several countries, where diagnosis demand outnumbers available local testing capacity. This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients and how they can be used to predict qRT-PCR test results. Methods: A Naive-Bayes model for machine learning is proposed for handling different scarcity scenarios, including managing symptomatic essential workforce and absence of diagnostic tests. Hemogram result data was used to predict qRT-PCR results in situations where the latter was not performed, or results are not yet available. Adjusts in assumed prior probabilities allow fine-tuning of the model, according to actual prediction context. Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity. Data assessment can be performed in an individual or simultaneous basis, according to desired outcome. Based on hemogram data and background scarcity context, resource distribution is significantly optimized when model-based patient selection is observed, compared to random choice. The model can help manage testing deficiency and other critical circumstances. Machine learning models can be derived from widely available, quick, and inexpensive exam data in order to predict qRT-PCR results used in COVID-19 diagnosis. These models can be used to assist strategic decision-making in resource scarcity scenarios, including personnel shortage, lack of medical resources, and testing insufficiency.

Replacements for Thu, 21 May 20

[225]  arXiv:1406.0879 (replaced) [src]
Title: Computing rank of finite algebraic structures with limited nondeterminism
Comments: Lemma 2.1 incorrectly claims composition of nondeterministic functions
Subjects: Computational Complexity (cs.CC)
[226]  arXiv:1508.07603 (replaced) [pdf, ps, other]
Title: Line-of-Sight Pursuit in Monotone and Scallop Polygons
Comments: 42 pages, 22 figures
Subjects: Computational Geometry (cs.CG)
[227]  arXiv:1604.08900 (replaced) [pdf, ps, other]
Title: Solving periodic semilinear stiff PDEs in 1D, 2D and 3D with exponential integrators
Subjects: Numerical Analysis (math.NA)
[228]  arXiv:1606.00075 (replaced) [pdf]
Title: Applications of Probabilistic Programming (Master's thesis, 2015)
Authors: Yura N Perov
Comments: Supervisor: Frank Wood. The thesis was prepared in the Department of Engineering Science at the University of Oxford
Subjects: Artificial Intelligence (cs.AI)
[229]  arXiv:1609.02000 (replaced) [pdf]
Title: Cybernetic Cities: Designing and controlling adaptive and robust urban systems
Comments: To be published in "Handbook on Complexity and Cities"
Subjects: Adaptation and Self-Organizing Systems (nlin.AO); Computers and Society (cs.CY); Physics and Society (physics.soc-ph)
[230]  arXiv:1701.05324 (replaced) [pdf, other]
Title: A Characterisation of Open Bisimilarity using an Intuitionistic Modal Logic
Subjects: Logic in Computer Science (cs.LO)
[231]  arXiv:1706.00342 (replaced) [pdf, ps, other]
Title: On the stable recovery of deep structured linear networks under sparsity constraints
Authors: Francois Malgouyres (IMT)
Journal-ref: Mathematical and Scientific Machine Learning, Jul 2020, Princeton, United States
Subjects: Optimization and Control (math.OC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Statistics Theory (math.ST)
[232]  arXiv:1801.09348 (replaced) [pdf, other]
Title: Strong Approximation of Stochastic Allen-Cahn Equation with White Noise
Journal-ref: IMA J. Numer. Anal. (40) 2020, no. 2, 1074--1093
Subjects: Numerical Analysis (math.NA); Probability (math.PR)
[233]  arXiv:1804.04713 (replaced) [pdf, other]
Title: New method of bandlimited extrapolation
Authors: Vishal Vaibhav
Subjects: Numerical Analysis (math.NA)
[234]  arXiv:1804.08547 (replaced) [pdf, other]
Title: Entropy bounds for grammar compression
Subjects: Data Structures and Algorithms (cs.DS)
[235]  arXiv:1805.02069 (replaced) [pdf, ps, other]
Title: Free Higher Groups in Homotopy Type Theory
Comments: v1: 19 pages, published version; v2: fix typo
Journal-ref: LICS '18: Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, 2018
Subjects: Logic in Computer Science (cs.LO)
[236]  arXiv:1805.07698 (replaced) [pdf, ps, other]
Title: Density-Adaptive Kernel based Efficient Reranking Approaches for Person Reidentification
Comments: 39 pages, 18 figures and 12 tables. This paper is an extended version of our preliminary work on ICPR 2018
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[237]  arXiv:1807.10848 (replaced) [pdf, other]
Title: Two Disjoint 5-Holes in Point Sets
Subjects: Combinatorics (math.CO); Computational Geometry (cs.CG)
[238]  arXiv:1808.02233 (replaced) [pdf]
Title: Robust Pricing with Refunds
Subjects: Computer Science and Game Theory (cs.GT); Theoretical Economics (econ.TH)
[239]  arXiv:1810.07382 (replaced) [pdf, other]
Title: Analysis of Railway Accidents' Narratives Using Deep Learning
Comments: accepted in IEEE International Conference on Machine Learning and Applications (IEEE ICMLA)
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
[240]  arXiv:1810.12806 (replaced) [pdf, ps, other]
Title: Computing Approximate Equilibria in Weighted Congestion Games via Best-Responses
Subjects: Computer Science and Game Theory (cs.GT)
[241]  arXiv:1811.01908 (replaced) [pdf, ps, other]
Title: Fast Non-Bayesian Poisson Factorization for Implicit-Feedback Recommendations
Authors: David Cortes
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[242]  arXiv:1811.04251 (replaced) [pdf, other]
Title: Formal Limitations on the Measurement of Mutual Information
Subjects: Information Theory (cs.IT); Machine Learning (cs.LG); Machine Learning (stat.ML)
[243]  arXiv:1811.06193 (replaced) [pdf, other]
Title: From Videos to URLs: A Multi-Browser Guide To Extract User's Behavior with Optical Character Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[244]  arXiv:1812.08664 (replaced) [pdf, other]
Title: Near-Linear Time Approximation Schemes for Clustering in Doubling Metrics
Subjects: Data Structures and Algorithms (cs.DS); Computational Geometry (cs.CG)
[245]  arXiv:1901.03155 (replaced) [pdf, ps, other]
Title: Entropy Bounds for Grammar-Based Tree Compressors
Comments: A short version of this paper appeared in the IEEE Proceedings of ISIT 2019
Subjects: Data Structures and Algorithms (cs.DS); Information Theory (cs.IT)
[246]  arXiv:1901.05049 (replaced) [pdf, other]
Title: AI Pipeline -- bringing AI to you. End-to-end integration of data, algorithms and deployment tools
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
[247]  arXiv:1901.11164 (replaced) [pdf, other]
Title: Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition
Journal-ref: 2019 International Conference on Artificial Neural Networks (ICANN)
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[248]  arXiv:1902.04347 (replaced) [pdf, ps, other]
Title: A Multilevel Monte Carlo Asymptotic-Preserving Particle Method for Kinetic Equations in the Diffusion Limit
Comments: 20 pages, 7 figures, published in Monte Carlo and Quasi-Monte Carlo Methods 2018, correction of minor typographical errors
Journal-ref: Monte Carlo and Quasi-Monte Carlo Methods 2018, 383--402 (2020)
Subjects: Numerical Analysis (math.NA)
[249]  arXiv:1904.00726 (replaced) [pdf, other]
Title: Unsupervised Multimodal Hashing for Cross-modal retrieval
Authors: Jun Yu, Xiao-Jun Wu
Comments: The paper is under consideration at IEEE Transaction on Big Data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[250]  arXiv:1904.03264 (replaced) [pdf, other]
Title: Attack-Resilient Supervisory Control of Discrete-Event Systems
Subjects: Formal Languages and Automata Theory (cs.FL); Systems and Control (eess.SY)
[251]  arXiv:1904.06998 (replaced) [pdf, other]
Title: An optimal polynomial approximation of Brownian motion
Comments: 27 pages, 8 figures
Journal-ref: SIAM Journal on Numerical Analysis, vol. 58, no. 3, pp. 1393-1421, 2020
Subjects: Numerical Analysis (math.NA); Probability (math.PR)
[252]  arXiv:1904.08067 (replaced) [pdf, other]
Title: Text Classification Algorithms: A Survey
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (stat.ML)
[253]  arXiv:1905.04197 (replaced) [pdf, other]
Title: T-Net: Nested encoder-decoder architecture for the main vessel segmentation in coronary angiography
Comments: Neural Networks, Accepted
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[254]  arXiv:1905.07444 (replaced) [pdf, other]
Title: Percival: Making In-Browser Perceptual Ad Blocking Practical With Deep Learning
Authors: Zain ul abi Din (1), Panagiotis Tigas (2), Samuel T. King (1 and 5), Benjamin Livshits (3 and 4) ((1) UC Davis, (2) Oxford University, (3) Brave Software, (4) Imperial College London, (5) Bouncer Technologies)
Comments: 13 Pages
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Machine Learning (stat.ML)
[255]  arXiv:1905.08389 (replaced) [pdf, other]
Title: Time-varying Autoregression with Low Rank Tensors
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[256]  arXiv:1905.10517 (replaced) [pdf, other]
Title: Transferable Cost-Aware Security Policy Implementation for Malware Detection Using Deep Reinforcement Learning
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
[257]  arXiv:1906.01715 (replaced) [pdf, other]
Title: $hp$-Version discontinuous Galerkin methods on essentially arbitrarily-shaped elements
Subjects: Numerical Analysis (math.NA)
[258]  arXiv:1907.00516 (replaced) [pdf, other]
Title: Learning to Blindly Assess Image Quality in the Laboratory and Wild
Comments: Accepted by ICIP2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[259]  arXiv:1907.06950 (replaced) [src]
Title: Abstract categorial grammars with island constraints and effective decidability
Authors: Sergey Slavnov
Comments: This was a premature attempt, sorry
Subjects: Logic (math.LO); Computation and Language (cs.CL)
[260]  arXiv:1907.07792 (replaced) [pdf, other]
Title: GRIP++: Enhanced Graph-based Interaction-aware Trajectory Prediction for Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[261]  arXiv:1907.12146 (replaced) [pdf, other]
Title: On Norm-Based Estimations for Domains of Attraction in Nonlinear Time-Delay Systems
Comments: 33 pages, 8 figures, "This is a pre-print of an article published in 'Nonlinear Dynamics'. The final authenticated version is available online at this https URL"
Subjects: Systems and Control (eess.SY); Chaotic Dynamics (nlin.CD)
[262]  arXiv:1908.10678 (replaced) [pdf]
Title: How Good is Artificial Intelligence at Automatically Answering Consumer Questions Related to Alzheimer's Disease?
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
[263]  arXiv:1908.11825 (replaced) [pdf, ps, other]
Title: The Communication Complexity of Set Intersection and Multiple Equality Testing
Comments: 44 pages
Journal-ref: Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms. 2020, 1715-1732
Subjects: Information Theory (cs.IT); Computational Complexity (cs.CC)
[264]  arXiv:1909.00367 (replaced) [pdf, other]
Title: Gaussian mixture model decomposition of multivariate signals
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[265]  arXiv:1909.00841 (replaced) [pdf, other]
Title: Approximate Query Service on Autonomous IoT Cameras
Subjects: Databases (cs.DB)
[266]  arXiv:1910.04980 (replaced) [pdf, other]
Title: Conversational Transfer Learning for Emotion Recognition
Comments: Information Fusion
Subjects: Computation and Language (cs.CL)
[267]  arXiv:1910.12047 (replaced) [pdf, ps, other]
Title: Comparison of Deep Reinforcement Learning and Model Predictive Control for Adaptive Cruise Control
Subjects: Systems and Control (eess.SY)
[268]  arXiv:1910.12923 (replaced) [pdf, ps, other]
Title: Ensemble Kalman Sampler: mean-field limit and convergence analysis
Authors: Zhiyan Ding, Qin Li
Subjects: Numerical Analysis (math.NA)
[269]  arXiv:1911.00660 (replaced) [pdf, ps, other]
Title: Security of Facial Forensics Models Against Adversarial Attacks
Comments: Accepted by ICIP 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[270]  arXiv:1911.02134 (replaced) [pdf, other]
Title: Asynchronous Online Federated Learning for Edge Devices
Comments: 11 pages
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[271]  arXiv:1911.02410 (replaced) [pdf, other]
Title: DISROPT: a Python Framework for Distributed Optimization
Subjects: Optimization and Control (math.OC); Mathematical Software (cs.MS)
[272]  arXiv:1912.01108 (replaced) [pdf, other]
Title: Automated Dependency Plots
Comments: Accepted to Uncertainty in Artificial Intelligence (UAI 2020)
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[273]  arXiv:1912.03800 (replaced) [pdf, other]
Title: Sequential Estimation of Network Cascades
Comments: 5 pages, 2 figures
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Signal Processing (eess.SP)
[274]  arXiv:1912.04154 (replaced) [pdf, ps, other]
Title: Butterfly-Net2: Simplified Butterfly-Net and Fourier Transform Initialization
Subjects: Machine Learning (cs.LG); Numerical Analysis (math.NA); Machine Learning (stat.ML)
[275]  arXiv:1912.05823 (replaced) [pdf, other]
Title: Smart Contract Repair
Comments: 32 pages. ACM Transactions on Software Engineering and Methodology (TOSEM), 2020
Subjects: Software Engineering (cs.SE); Cryptography and Security (cs.CR)
[276]  arXiv:1912.08577 (replaced) [pdf, other]
Title: A Cross-Modal Image Fusion Theory Guided by Human Visual Characteristics
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Machine Learning (cs.LG)
[277]  arXiv:1912.11264 (replaced) [pdf, other]
Title: Deep Manifold Embedding for Hyperspectral Image Classification
Comments: Submitted to ISPRS
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[278]  arXiv:2001.02753 (replaced) [pdf, other]
Title: Locating conical degeneracies in the spectra of parametric self-adjoint matrices
Comments: 17 pages, 6 figures; updated references
Subjects: Spectral Theory (math.SP); Numerical Analysis (math.NA)
[279]  arXiv:2001.04180 (replaced) [pdf, other]
Title: 'I Just Want to Hack Myself to Not Get Distracted': Evaluating Design Interventions for Self-Control on Facebook
Comments: 10 pages (excluding references), 6 figures. To appear in the Proceedings of CHI '20 CHI Conference on Human Factors in Computing Systems, April 25--30, 2020, Honolulu, HI, USA
Subjects: Human-Computer Interaction (cs.HC)
[280]  arXiv:2001.09957 (replaced) [pdf, other]
Title: Reinforcement Learning-based Autoscaling of Workflows in the Cloud: A Survey
Comments: 36 pages, 12 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Machine Learning (stat.ML)
[281]  arXiv:2002.01401 (replaced) [pdf, other]
Title: Numerical methods for nonlocal and fractional models
Comments: Revised/Improved version. 126 pages, 18 figures, review paper
Subjects: Numerical Analysis (math.NA); Analysis of PDEs (math.AP); Optimization and Control (math.OC)
[282]  arXiv:2002.06734 (replaced) [pdf, other]
Title: Automatic Frame Selection using CNN in Ultrasound Elastography
Comments: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)
Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[283]  arXiv:2002.07089 (replaced) [pdf, other]
Title: 4D Semantic Cardiac Magnetic Resonance Image Synthesis on XCAT Anatomical Model
Comments: Accepted to MIDL 2020
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[284]  arXiv:2002.07491 (replaced) [pdf, other]
Title: The Time-Triggered Wireless Architecture
Comments: Accepted at ECRTS 2020
Subjects: Networking and Internet Architecture (cs.NI)
[285]  arXiv:2002.07761 (replaced) [pdf, other]
Title: Fixed-Parameter Tractability of the Weighted Edge Clique Partition Problem
Subjects: Data Structures and Algorithms (cs.DS)
[286]  arXiv:2002.07950 (replaced) [pdf]
Title: Survey on Individual Differences in Visualization
Subjects: Human-Computer Interaction (cs.HC)
[287]  arXiv:2002.09436 (replaced) [pdf, other]
Title: Likelihood-free inference of experimental Neutrino Oscillations using Neural Spline Flows
Comments: 10 pages, 3 figures
Subjects: High Energy Physics - Phenomenology (hep-ph); Machine Learning (cs.LG); High Energy Physics - Experiment (hep-ex)
[288]  arXiv:2003.01310 (replaced) [pdf, other]
Title: Performance Optimization for Edge-Cloud Serverless Platforms via Dynamic Task Placement
Comments: 10 pages, 6 figures, 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Networking and Internet Architecture (cs.NI)
[289]  arXiv:2003.03118 (replaced) [pdf, other]
Title: Evolved Neuromorphic Control for High Speed Divergence-based Landings of MAVs
Comments: 8 pages, 5 figures, submitted to IEEE Robotics and Automation Letters, revision including additional real-world experiments and improved visualization
Subjects: Robotics (cs.RO); Neural and Evolutionary Computing (cs.NE)
[290]  arXiv:2003.03125 (replaced) [pdf, other]
Title: Weight Priors for Learning Identity Relations
Comments: Proceedings of KR2ML @ NeurIPS 2019, Vancouver, Canada
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[291]  arXiv:2003.04222 (replaced) [pdf, other]
Title: Sparse and Cosparse Audio Dequantization Using Convex Optimization
Subjects: Signal Processing (eess.SP); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[292]  arXiv:2003.06971 (replaced) [pdf, ps, other]
Title: Online Algorithms for Dynamic Matching Markets in Energy Distribution Systems
Subjects: Systems and Control (eess.SY); Multiagent Systems (cs.MA)
[293]  arXiv:2003.07435 (replaced) [pdf, other]
Title: Vyper: A Security Comparison with Solidity Based on Common Vulnerabilities
Subjects: Cryptography and Security (cs.CR)
[294]  arXiv:2003.08423 (replaced) [pdf, other]
Title: Volumetric parcellation of the right ventricle for regional geometric and functional assessment
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[295]  arXiv:2003.08741 (replaced) [pdf]
Title: A Convolutional Neural Network-based Patent Image Retrieval Method for Design Ideation
Comments: 11 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
[296]  arXiv:2003.12443 (replaced) [pdf]
Title: A Computer-Aided Diagnosis System Using Artificial Intelligence for Hip Fractures -Multi-Institutional Joint Development Research-
Comments: 9 pages, 4 tables, 7 figures. / author's homepage : this https URL
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Tissues and Organs (q-bio.TO)
[297]  arXiv:2003.13503 (replaced) [pdf]
Title: An IoT Healthcare Framework for Diagnosis of Breast Cancer using Hybrid Transfer Learning
Comments: 24 pages, 11 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
[298]  arXiv:2004.01496 (replaced) [pdf, other]
Title: Company classification using machine learning
Comments: 16 pages, 6 figures, 1 table
Subjects: Statistical Finance (q-fin.ST); Machine Learning (cs.LG)
[299]  arXiv:2004.02135 (replaced) [pdf, other]
Title: A Discriminator Improves Unconditional Text Generation without Updating the Generator
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[300]  arXiv:2004.02863 (replaced) [pdf, other]
Title: Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs
Comments: Submitted to INTERSPEECH 2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[301]  arXiv:2004.03064 (replaced) [pdf, other]
Title: MGGR: MultiModal-Guided Gaze Redirection with Coarse-to-Fine Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[302]  arXiv:2004.03516 (replaced) [pdf, other]
Title: Divergent modes of online collective attention to the COVID-19 pandemic are associated with future caseload variance
Comments: 12 + 4 pages, 11 + 4 figures, code + data + figures will soon be available at this http URL
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI)
[303]  arXiv:2004.04696 (replaced) [pdf, other]
Title: BLEURT: Learning Robust Metrics for Text Generation
Comments: Accepted at ACL 2020
Subjects: Computation and Language (cs.CL)
[304]  arXiv:2004.05619 (replaced) [pdf, other]
Title: Relations among Open-loop Control Ability, Control Strategy Space and Closed-loop Performance for Linear Discrte-time Systems
Authors: Mingwang Zhao
Comments: 21 pages, 3 figures,1 table
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
[305]  arXiv:2004.06169 (replaced) [pdf]
Title: Using Reports of Own and Others' Symptoms and Diagnosis on Social Media to Predict COVID-19 Case Counts: Observational Infoveillance Study in Mainland China
Authors: Cuihua Shen (1), Anfan Chen (2), Chen Luo (3), Jingwen Zhang (1), Bo Feng (1), Wang Liao (1) ((1) University of California, Davis, (2) University of Science and Technology of China, (3) Tsinghua University)
Comments: In this version we updated data to March 31, 2020 and conducted new analyses
Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph); Populations and Evolution (q-bio.PE)
[306]  arXiv:2004.07127 (replaced) [pdf, other]
Title: Dissecting Energy Consumption of NB-IoT Devices Empirically
Comments: 18 pages, 25 figures, IEEE journal format, all Figures recreated for better readability, new section with results summary
Subjects: Networking and Internet Architecture (cs.NI)
[307]  arXiv:2004.07180 (replaced) [pdf, other]
Title: SPECTER: Document-level Representation Learning using Citation-informed Transformers
Comments: ACL 2020
Subjects: Computation and Language (cs.CL)
[308]  arXiv:2004.07657 (replaced) [pdf, other]
Title: Old is Gold: Redefining the Adversarially Learned One-Class Classifier Training Paradigm
Comments: Accepted at the Conference on Computer Vision and Pattern Recognition CVPR 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[309]  arXiv:2004.07780 (replaced) [pdf, other]
Title: Shortcut Learning in Deep Neural Networks
Comments: perspective article
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)
[310]  arXiv:2004.08799 (replaced) [pdf, other]
Title: On the Unusual Effectiveness of Type-aware Mutations for Testing SMT Solvers
Subjects: Software Engineering (cs.SE); Programming Languages (cs.PL)
[311]  arXiv:2004.09990 (replaced) [pdf]
Title: A Philosophy of Data
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[312]  arXiv:2004.10521 (replaced) [pdf, ps, other]
Title: Efficient adjustment sets in causal graphical models with hidden variables
Comments: Fixed typo in the definition of the preorder in page 7 (Z and Z' were backwards)
Subjects: Statistics Theory (math.ST); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
[313]  arXiv:2004.11757 (replaced) [pdf, other]
Title: Ultra Fast Structure-aware Deep Lane Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[314]  arXiv:2004.11988 (replaced) [pdf, other]
Title: Towards Model Checking Real-World Software-Defined Networks (version with appendix)
Subjects: Networking and Internet Architecture (cs.NI)
[315]  arXiv:2004.12224 (replaced) [pdf, ps, other]
Title: A $(1-e^{-1}-\varepsilon)$-Approximation for the Monotone Submodular Multiple Knapsack Problem
Subjects: Data Structures and Algorithms (cs.DS)
[316]  arXiv:2004.12240 (replaced) [pdf]
Title: A Smartphone enabled Approach to Manage COVID-19 Lockdown and Economic Crisis
Subjects: Social and Information Networks (cs.SI); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
[317]  arXiv:2004.12485 (replaced) [pdf, other]
Title: Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning
Comments: added the statistics of top-100 compounds used logP metric with scaled components added values of the initial reactants to the box plots some values in tables are recalculated due to the inconsistent environments on different machines. corresponding benchmarks were rerun with the requirements on github. no significant changes in the results. corrected figures in the Appendix
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
[318]  arXiv:2005.00214 (replaced) [pdf, other]
Title: The AVA-Kinetics Localized Human Actions Video Dataset
Comments: 8 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[319]  arXiv:2005.01223 (replaced) [pdf, other]
Title: Complexity of Sparse Polynomial Solving 2: Renormalization
Comments: 84 pages. Fixed a typo on p.62
Subjects: Numerical Analysis (math.NA)
[320]  arXiv:2005.01897 (replaced) [pdf, other]
Title: Fine-grained Financial Opinion Mining: A Survey and Research Agenda
Subjects: Computation and Language (cs.CL); Social and Information Networks (cs.SI)
[321]  arXiv:2005.02367 (replaced) [pdf, other]
Title: CODA-19: Reliably Annotating Research Aspects on 10,000+ CORD-19 Abstracts Using a Non-Expert Crowd
Comments: CODA-19: COVID-19 Research Aspect Dataset: this https URL
Subjects: Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)
[322]  arXiv:2005.02690 (replaced) [pdf, other]
Title: Dual-Sampling Attention Network for Diagnosis of COVID-19 from Community Acquired Pneumonia
Comments: accepted by IEEE Transactions on Medical Imaging, 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[323]  arXiv:2005.03123 (replaced) [pdf, ps, other]
Title: Rigid Matrices From Rectangular PCPs
Comments: 36 pages, 3 figures
Subjects: Computational Complexity (cs.CC)
[324]  arXiv:2005.03372 (replaced) [pdf, other]
Title: Vid2Curve: Simultaneous Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video
Comments: Accepted by SIGGRAPH 2020
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[325]  arXiv:2005.03433 (replaced) [pdf, other]
Title: Dirichlet spectral-Galerkin approximation method for the simply supported vibrating plate eigenvalues
Authors: Isaac Harris
Subjects: Numerical Analysis (math.NA)
[326]  arXiv:2005.04511 (replaced) [pdf, other]
Title: Finding Universal Grammatical Relations in Multilingual BERT
Comments: To appear in ACL 2020; Farsi typo corrected
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[327]  arXiv:2005.04774 (replaced) [pdf, other]
Title: PageRank and The K-Means Clustering Algorithm
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI); Machine Learning (stat.ML)
[328]  arXiv:2005.05535 (replaced) [pdf, other]
Title: DeepFaceLab: A simple, flexible and extensible face swapping framework
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[329]  arXiv:2005.05635 (replaced) [pdf, other]
Title: SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis
Comments: Accepted by ACL2020
Subjects: Computation and Language (cs.CL)
[330]  arXiv:2005.06111 (replaced) [pdf, other]
Title: RISE Video Dataset: Recognizing Industrial Smoke Emissions
Comments: Technical report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[331]  arXiv:2005.06338 (replaced) [pdf, other]
Title: Neural Architecture Search for Gliomas Segmentation on Multimodal Magnetic Resonance Imaging
Authors: Feifan Wang
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[332]  arXiv:2005.06444 (replaced) [pdf, other]
Title: Pika parsing: parsing in reverse solves the left recursion and error recovery problems
Comments: Submitted to ACM
Subjects: Programming Languages (cs.PL)
[333]  arXiv:2005.06786 (replaced) [pdf, other]
Title: Scheduling Dimension Reduction of LPV Models -- A Deep Neural Network Approach
Comments: Accepted to American Control Conference (ACC) 2020, Denver
Subjects: Systems and Control (eess.SY)
[334]  arXiv:2005.08125 (replaced) [pdf, other]
Title: Social Interaction Layers in Complex Networks for the Dynamical Epidemic Modeling of COVID-19 in Brazil
Comments: 16 pages, 7 figures, 2 tables
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI); Populations and Evolution (q-bio.PE)
[335]  arXiv:2005.08379 (replaced) [src]
Title: Towards Characterizing the COVID-19 Awareness on Twitter
Comments: Figure 1 is incorrect. Will be updated in the revision
Subjects: Social and Information Networks (cs.SI)
[336]  arXiv:2005.08768 (replaced) [pdf, other]
Title: Adapting JPEG XS gains and priorities to tasks and contents
Comments: CLIC at CVPR 2020
Subjects: Image and Video Processing (eess.IV); Neural and Evolutionary Computing (cs.NE)
[337]  arXiv:2005.08925 (replaced) [pdf, other]
Title: Portrait Shadow Manipulation
Comments: (updated version); SIGGRAPH 2020;Project webpage: this https URL Video: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[338]  arXiv:2005.08944 (replaced) [src]
Title: Saving the Sonorine: Audio Recovery Using Image Processing and Computer Vision
Authors: Kai Ji (Kevin) Feng, Adam Finkelstein
Comments: Removing a co-author. The co-author did not contribute to the preparation of the manuscript, only background information and advice
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Audio and Speech Processing (eess.AS)
[339]  arXiv:2005.08952 (replaced) [pdf, other]
Title: Visions, Values, and Videos: Revisiting Envisionings in Service of UbiComp Design for the Home
Comments: DIS'20, July 6-10, 2020, Eindhoven, Netherlands
Subjects: Human-Computer Interaction (cs.HC); Computers and Society (cs.CY)
[340]  arXiv:2005.09030 (replaced) [pdf, ps, other]
Title: Effective Learning of a GMRF Mixture Model
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[341]  arXiv:2005.09067 (replaced) [pdf, other]
Title: Question-Driven Summarization of Answers to Consumer Health Questions
Subjects: Computation and Language (cs.CL)
[342]  arXiv:2005.09174 (replaced) [pdf, other]
Title: Weibo-COV: A Large-Scale COVID-19 Social Media Dataset from Webio
Subjects: Social and Information Networks (cs.SI)
[343]  arXiv:2005.09233 (replaced) [pdf, other]
Title: SEMDOT: Smooth-Edged Material Distribution for Optimizing Topology Algorithm
Comments: Submitted to Advances in Engineering Software
Subjects: Computational Engineering, Finance, and Science (cs.CE)
[344]  arXiv:2005.09303 (replaced) [pdf]
Title: Visual GUI testing in practice: An extended industrial case study
Subjects: Software Engineering (cs.SE)
[345]  arXiv:2005.09329 (replaced) [pdf, other]
Title: Localizing Firearm Carriers by Identifying Human-Object Pairs
Comments: 5 pages, accepted in IEEE ICIP 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[346]  arXiv:2005.09413 (replaced) [pdf, ps, other]
Title: The Privacy ZEBRA: Zero Evidence Biometric Recognition Assessment
Comments: submitted to Interspeech 2020
Subjects: Cryptography and Security (cs.CR); Audio and Speech Processing (eess.AS)
[347]  arXiv:2005.09453 (replaced) [pdf, other]
Title: Experience Augmentation: Boosting and Accelerating Off-Policy Multi-Agent Reinforcement Learning
Comments: 10 pages, 4 figures, 4 tables
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
[348]  arXiv:2005.09489 (replaced) [pdf, other]
Title: On the Separability Problem of String Constraints
Subjects: Formal Languages and Automata Theory (cs.FL); Logic in Computer Science (cs.LO)
[349]  arXiv:2005.09521 (replaced) [pdf, other]
Title: Efficient Process-to-Node Mapping Algorithms for Stencil Computations
Comments: 18 pages, 9 Figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[350]  arXiv:2005.09617 (replaced) [pdf, other]
Title: Unlocking New York City Crime Insights using Relational Database Embeddings
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
[ total of 350 entries: 1-350 ]
[ showing up to 2000 entries per page: fewer | more ]