# Electrical Engineering and Systems Science

## New submissions


### New submissions for Fri, 27 Mar 20

[1]
Title: Asymptotic Security of Control Systems by Covert Reaction: Repeated Signaling Game with Undisclosed Belief
Subjects: Systems and Control (eess.SY)

This study investigates the relationship between resilience of control systems to attacks and the information available to malicious attackers. Specifically, it is shown that control systems are guaranteed to be secure in an asymptotic manner by rendering reactions against potentially harmful actions covert. The behaviors of the attacker and the defender are analyzed through a repeated signaling game with an undisclosed belief under covert reactions. In the typical setting of signaling games, reactions conducted by the defender are supposed to be public information and the measurability enables the attacker to accurately trace transitions of the defender's belief on existence of a malicious attacker. In contrast, the belief in the game considered in this paper is undisclosed and hence common equilibrium concepts can no longer be employed for the analysis. To surmount this difficulty, a novel framework for decision of reasonable strategies of the players in the game is introduced. Based on the presented framework, it is revealed that any reasonable strategy chosen by a rational malicious attacker converges to the benign behavior as long as the reactions performed by the defender are unobservable to the attacker. The result provides an explicit relationship between resilience and information, which indicates the importance of covertness of reactions for designing secure control systems.

[2]
Title: Learning to Correct Overexposed and Underexposed Photos
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Capturing photographs with wrong exposures remains a major source of errors in camera-based imaging. Exposure problems are categorized as either: (i) overexposed, where the camera exposure was too long, resulting in bright and washed-out image regions, or (ii) underexposed, where the exposure was too short, resulting in dark regions. Both under- and overexposure greatly reduce the contrast and visual appeal of an image. Prior work mainly focuses on underexposed images or general image enhancement. In contrast, our proposed method targets both over- and underexposure errors in photographs. We formulate the exposure correction problem as two main sub-problems: (i) color enhancement and (ii) detail enhancement. Accordingly, we propose a coarse-to-fine deep neural network (DNN) model, trainable in an end-to-end manner, that addresses each sub-problem separately. A key aspect of our solution is a new dataset of over 24,000 images exhibiting a range of exposure values with a corresponding properly exposed image. Our method achieves results on par with existing state-of-the-art methods on underexposed images and yields significant improvements for images suffering from overexposure errors.

[3]
Title: COVID-19 Image Data Collection
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)

This paper describes the initial COVID-19 open image data collection. It was created by assembling medical images from websites and publications and currently contains 123 frontal view X-rays.

[4]
Title: Cooperative Hypothesis Testing by Two Observers with Asymmetric Information
Comments: Journal Paper to be published
Subjects: Systems and Control (eess.SY)

In this paper, we consider the binary hypothesis testing problem with two observers. There are two possible states of nature (or hypotheses). Observations are collected by two observers. The observations are statistically related to the true state of nature. Given the observations, the objective of both observers is to find out what is the true state of nature. We present four different approaches to address the problem. In the first (centralized) approach, the observations collected by both observers are sent to a central coordinator where hypothesis testing is performed. In the second approach, each observer performs hypothesis testing based on locally collected observations. Then they exchange binary information to arrive at a consensus. In the third approach, each observer constructs an aggregated probability space based on the observations collected by it and the decision it receives from the alternate observer and performs hypothesis testing in the new probability space. In this approach also they exchange binary information to arrive at consensus. In the fourth approach, if observations collected by the observers are independent conditioned on the hypothesis we show the construction of the aggregated sample space can be skipped. In this case, the observers exchange real-valued information to achieve consensus. Given the same fixed number of samples, n, n sufficiently large, for the centralized (first) and decentralized (second) approaches, it has been shown that if the observations collected by the observers are independent conditioned on the hypothesis, then the minimum probability that the two observers agree and are wrong in the decentralized approach is upper bounded by the minimum probability of error achieved in the centralized approach.

[5]
Title: Covid-19: Automatic detection from X-Ray images utilizing Transfer Learning with Convolutional Neural Networks
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Medical Physics (physics.med-ph)

In this study, a dataset of X-Ray images from patients with common pneumonia, Covid-19, and normal incidents was utilized for the automatic detection of the Coronavirus. The aim of the study is to evaluate the performance of state-of-the-art Convolutional Neural Network architectures proposed over recent years for medical image classification. Specifically, the procedure called transfer learning was adopted. With transfer learning, the detection of various abnormalities in small medical image datasets is an achievable target, often yielding remarkable results. The dataset utilized in this experiment is a collection of 1427 X-Ray images. 224 images with confirmed Covid-19, 700 images with confirmed common pneumonia, and 504 images of normal conditions are included. The data was collected from the available X-Ray images on public medical repositories. With transfer learning, an overall accuracy of 97.82% in the detection of Covid-19 is achieved.

[6]
Title: Data-Driven Model Invalidation for Unknown Lipschitz Continuous Systems via Abstraction
Comments: Accepted for Publication in American Control Conference (ACC) 2020
Subjects: Systems and Control (eess.SY)

In this paper, we consider the data-driven model invalidation problem for Lipschitz continuous systems, where instead of given mathematical models, only prior noisy sampled data of the systems are available. We show that this data-driven model invalidation problem can be solved using a tractable feasibility check. Our proposed approach consists of two main components: (i) a data-driven abstraction part that uses the noisy sampled data to over-approximate the unknown Lipschitz continuous dynamics with upper and lower functions, and (ii) an optimization-based model invalidation component that determines the incompatibility of the data-driven abstraction with a newly observed length-T output trajectory. Finally, we discuss several methods to reduce the computational complexity of the algorithm and demonstrate their effectiveness with a simulation example of swarm intent identification.

[7]
Title: Energy Efficiency Maximization in Millimeter Wave Hybrid MIMO Systems for 5G and Beyond
Comments: 2020 IEEE International Conference on Communications and Networking (ComNet)
Subjects: Signal Processing (eess.SP)

At millimeter wave (mmWave) frequencies, the higher cost and power consumption of hardware components in multiple-input multiple-output (MIMO) systems do not allow beamforming entirely at the baseband with a separate radio frequency (RF) chain for each antenna. In such scenarios, to enable spatial multiplexing, hybrid beamforming, which uses phase shifters to connect a smaller number of RF chains to a large number of antennas, is a cost-effective and energy-saving alternative. This paper describes our research on fully adaptive transceivers that adapt their behaviour on a frame-by-frame basis, so that a mmWave hybrid MIMO system always operates in the most energy-efficient manner. An exhaustive-search-based brute-force approach is computationally intensive, so we study fractional programming as a low-cost alternative for solving the energy efficiency maximization problem. The performance results indicate that the resulting mmWave hybrid MIMO transceiver achieves significantly improved energy efficiency compared to the baseline cases involving analogue-only or digital-only signal processing solutions, and shows performance trade-offs with the brute-force approach.

[8]
Title: Recursive Star-Identification Algorithm using an Adaptive SVD-based Angular Velocity Estimator
Comments: 15 pages, 11 figures, 6 tables
Subjects: Signal Processing (eess.SP)

This paper describes an algorithm obtained by merging a recursive star identification algorithm with a recently developed adaptive SVD-based estimator of the angular velocity vector (QuateRA). In a recursive algorithm, the more accurate the angular velocity estimate, the quicker and more robust to noise the resultant recursive algorithm is. Hence, combining these two techniques produces an algorithm capable of handling a variety of dynamics scenarios. The speed and robustness of the algorithm are highlighted in a selection of simulated scenarios. First, a speed comparison is made with the state-of-the-art lost-in-space star identification algorithm, Pyramid. This test shows that in the best case the algorithm is on average an order of magnitude faster than Pyramid. Next, the recursive algorithm is validated for a variety of dynamic cases including a ground-based "Stellar Compass" scenario, a satellite in geosynchronous orbit, a satellite during a re-orientation maneuver, and a satellite undergoing non-pure-spin dynamics.

[9]
Title: Order Effects of Measurements in Multi-Agent Hypothesis Testing
Comments: Journal Paper to be published
Subjects: Systems and Control (eess.SY); Multiagent Systems (cs.MA)

All propositions from the set of events for an agent in a multi-agent system might not be simultaneously verifiable. In this paper, we revisit the concepts of \textit{event-state-operation structure} and \textit{relationship of incompatibility} from literature and use them as a tool to study the algebraic structure of the set of events. We present an example from multi-agent hypothesis testing where the set of events does not form a Boolean algebra but forms an ortholattice. A possible construction of a 'noncommutative probability space', accounting for \textit{incompatible events} (events which cannot be simultaneously verified) is discussed. As a possible decision-making problem in such a probability space, we consider the binary hypothesis testing problem. We present two approaches to this decision-making problem. In the first approach, we represent the available data as coming from measurements modeled via projection valued measures (PVM) and retrieve the results of the underlying detection problem solved using classical probability models. In the second approach, we represent the measurements using positive operator valued measures (POVM). We prove that the minimum probability of error achieved in the second approach is the same as in the first approach.

[10]
Title: Event-Driven Receding Horizon Control For On-line Distributed Persistent Monitoring on Graphs
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)

This paper considers the optimal multi-agent persistent monitoring problem defined on a set of nodes (targets) interconnected according to a fixed graph topology (PMG). The objective is to minimize a measure of mean overall node state uncertainty evaluated over a finite time interval via controlling the motion of the team of agents. A class of threshold-based parametric controllers has been proposed in a prior work as a distributed on-line solution to this PMG problem. However, this approach involves a lengthy and computationally intensive parameter tuning process, which can still result in low performing solutions. Recent works have focused on appending a centralized off-line stage to the aforementioned parameter tuning process so as to improve its performance. However, this comes at the cost of sacrificing the on-line distributed nature of the original solution while also increasing the associated computational cost. Moreover, such parametric control approaches are slow to react to compensate for possible state perturbations. Motivated by these challenges, this paper proposes a computationally cheap novel event-driven receding horizon control (ED-RHC) approach as a distributed on-line solution to the PMG problem. In particular, the discrete-event nature of the PMG systems is exploited in this work to determine locally (i.e., both temporally and spatially) optimum trajectory decisions for each agent to make at different discrete event times on its trajectory. Numerical results obtained from this ED-RHC method show significant improvements compared to state of the art distributed on-line parametric control solutions.

[11]
Title: Comments on A New Parity Check Stopping Criterion for Turbo Decoding
Subjects: Signal Processing (eess.SP)

A parity-check stopping (PCS) criterion for turbo decoding is proposed in [1], which shows its superiority over the Sign Change Ratio (SCR), Sign Difference Ratio (SDR), Cross Entropy (CE), and improved CE-based (Yu) stopping criteria. However, another well-known simple stopping criterion, the Hard-Decision-Aided (HDA) criterion, is not compared in [1]. In this letter, we show through analysis that, with the max-log-MAP algorithm, PCS is equivalent to HDA, while simulations demonstrate that, with the log-MAP algorithm, PCS has nearly the same performance as HDA.
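For readers unfamiliar with the HDA criterion mentioned above, the following is a minimal sketch of the general idea: iterative decoding stops once the hard decisions on the information bits no longer change between consecutive iterations. The function names, the use of log-likelihood ratios (LLRs) as input, and the sign convention (non-negative LLR maps to bit 1) are assumptions made for illustration and are not taken from the letter.

```python
def hard_decisions(llrs):
    """Map LLRs to bits (assumed convention: LLR >= 0 -> 1, else 0)."""
    return [1 if llr >= 0 else 0 for llr in llrs]


def hda_should_stop(prev_llrs, curr_llrs):
    """HDA-style check: stop when hard decisions agree across two iterations."""
    return hard_decisions(prev_llrs) == hard_decisions(curr_llrs)


# Hypothetical LLRs from two consecutive decoder iterations.
stable = hda_should_stop([0.5, -1.2, 3.0], [2.0, -4.0, 0.1])
changed = hda_should_stop([0.5, -1.2], [-0.5, -1.2])
```

Here `stable` is true because every sign is unchanged, while `changed` is false because the first decision flips.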

[12]
Title: Range-Doppler Sidelobe Suppression for Pulsed Radar Based on Golay Complementary Codes
Subjects: Signal Processing (eess.SP)

To relieve the interference caused by range-Doppler sidelobes in pulsed radars, we propose a new method to construct Doppler resilient complementary waveforms based on Golay codes. We design both the transmit pulse train and the receive pulse weights, so that the similarity between the pulse weights and a given window function is maximized and the constraints on Doppler null points and energy are met. That is summarized as a two-way partitioning problem, and then solved by semidefinite programming and randomization techniques. The novel waveform thus obtained has its range sidelobe outright suppressed in multiple and flexibly-adjustable Doppler zones, and performs well in Doppler sidelobe suppression, Doppler resolution and SNR. It shows great promise in detecting slightly-moving weak targets with the existence of dense interference.
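As background for the Golay codes used above, the sketch below illustrates the defining property of a Golay complementary pair: the aperiodic autocorrelations of the two sequences sum to zero at every nonzero lag. It uses the standard concatenation recursion and covers only the zero-Doppler case. It does not implement the Doppler-resilient pulse-train and weight design proposed in the paper, and all function names are illustrative.

```python
def golay_pair(m):
    """Return a binary Golay complementary pair of length 2**m.

    Uses the standard recursion a' = a|b, b' = a|(-b), starting from [1], [1].
    """
    a, b = [1], [1]
    for _ in range(m):
        a, b = a + b, a + [-x for x in b]
    return a, b


def acorr(x, k):
    """Aperiodic autocorrelation of sequence x at lag k >= 0."""
    return sum(x[i] * x[i + k] for i in range(len(x) - k))


def sidelobe_sum(a, b):
    """Sum of the two autocorrelations at lags 0 .. len(a) - 1."""
    return [acorr(a, k) + acorr(b, k) for k in range(len(a))]


a, b = golay_pair(3)
sums = sidelobe_sum(a, b)  # peak of 2 * 8 = 16 at lag 0, zeros elsewhere
```

With a Doppler shift between pulses the two autocorrelations no longer cancel exactly, which is the effect that Doppler-resilient designs aim to counter.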

[13]
Title: A Sequential Subspace Method for Millimeter Wave MIMO Channel Estimation
Subjects: Signal Processing (eess.SP)

Data transmission over the mmWave in fifth-generation wireless networks aims to support very high speed wireless communications. A substantial increase in spectrum efficiency for mmWave transmission can be achieved by using advanced hybrid precoding, for which accurate channel state information is the key. Rather than estimating the entire channel matrix, directly estimating subspace information, which contains fewer parameters, does have enough information to design transceivers. However, the large channel use overhead and associated computational complexity in the existing channel subspace estimation techniques are major obstacles to deploy the subspace approach for channel estimation. In this paper, we propose a sequential two-stage subspace estimation method that can resolve the overhead issues and provide accurate subspace information. Utilizing a sequential method enables us to avoid manipulating the entire high-dimensional training signal, which greatly reduces the complexity. Specifically, in the first stage, the proposed method samples the columns of channel matrix to estimate its column subspace. Then, based on the obtained column subspace, it optimizes the training signals to estimate the row subspace. For a channel with $N_r$ receive antennas and $N_t$ transmit antennas, our analysis shows that the proposed technique only requires $O(N_t)$ channel uses, while providing a guarantee of subspace estimation accuracy. By theoretical analysis, it is shown that the similarity between the estimated subspace and the true subspace is linearly related to the signal-to-noise ratio (SNR), i.e., $O(\text{SNR})$, at high SNR, while quadratically related to the SNR, i.e., $O(\text{SNR}^2)$, at low SNR. Simulation results show that the proposed sequential subspace method can provide improved subspace accuracy, normalized mean squared error, and spectrum efficiency over existing methods.

[14]
Title: An Online Learning Methodology for Performance Modeling of Graphics Processors
Journal-ref: U. Gupta et al., "An Online Learning Methodology for Performance Modeling of Graphics Processors," in IEEE Transactions on Computers, vol. 67, no. 12, pp. 1677-1691, 1 Dec. 2018
Subjects: Systems and Control (eess.SY); Signal Processing (eess.SP)

Approximately 18 percent of the 3.2 million smartphone applications rely on integrated graphics processing units (GPUs) to achieve competitive performance. Graphics performance, typically measured in frames per second, is a strong function of the GPU frequency, which in turn has a significant impact on mobile processor power consumption. Consequently, dynamic power management algorithms have to assess the performance sensitivity to the frequency accurately to choose the operating frequency of the GPU effectively. Since the impact of GPU frequency on performance varies rapidly over time, there is a need for online performance models that can adapt to varying workloads. This paper presents a light-weight adaptive runtime performance model that predicts the frame processing time of graphics workloads at runtime without a priori characterization. We employ this model to estimate the frame time sensitivity to the GPU frequency, i.e., the partial derivative of the frame time with respect to the GPU frequency. The proposed model does not rely on any parameter learned offline. Our experiments on commercial platforms with common GPU benchmarks show that the mean absolute percentage errors in frame time and frame time sensitivity prediction are 4.2 and 6.7 percent, respectively.

[15]
Title: Non-parallel Voice Conversion System with WaveNet Vocoder and Collapsed Speech Suppression
Comments: 13 pages, 13 figures, 1 table, accepted to publish in IEEE Access
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

In this paper, we integrate a simple non-parallel voice conversion (VC) system with a WaveNet (WN) vocoder and a proposed collapsed speech suppression technique. The effectiveness of WN as a vocoder for generating high-fidelity speech waveforms on the basis of acoustic features has been confirmed in recent works. However, when combining the WN vocoder with a VC system, the distorted acoustic features, acoustic and temporal mismatches, and exposure bias usually lead to significant speech quality degradation, making WN generate some very noisy speech segments called collapsed speech. To tackle the problem, we take conventional-vocoder-generated speech as the reference speech to derive a linear predictive coding distribution constraint (LPCDC) to avoid the collapsed speech problem. Furthermore, to mitigate the negative effects introduced by the LPCDC, we propose a collapsed speech segment detector (CSSD) to ensure that the LPCDC is only applied to the problematic segments to limit the loss of quality to short periods. Objective and subjective evaluations are conducted, and the experimental results confirm the effectiveness of the proposed method, which further improves the speech quality of our previous non-parallel VC system submitted to Voice Conversion Challenge 2018.

[16]
Title: Sub-pixel detection in hyperspectral imaging with elliptically contoured $t$-distributed background
Subjects: Signal Processing (eess.SP)

Detection of a target with known spectral signature when this target may occupy only a fraction of the pixel is an important issue in hyperspectral imaging. We recently derived the generalized likelihood ratio test (GLRT) for such sub-pixel targets, either for the so-called replacement model where the presence of a target induces a decrease of the background power, due to the sum of abundances equal to one, or for a mixed model which alleviates some of the limitations of the replacement model. In both cases, the background was assumed to be Gaussian distributed. The aim of this short communication is to extend these detectors to the broader class of elliptically contoured distributions, more precisely matrix-variate $t$-distributions with unknown mean and covariance matrix. We show that the generalized likelihood ratio tests in the $t$-distributed case coincide with their Gaussian counterparts, which confers the latter an increased generality for application. The performance as well as the robustness of these detectors are evaluated through numerical simulations.

[17]
Title: Mitigating Fiber Nonlinearities by Short-length Probabilistic Shaping
Journal-ref: Optical Fiber Conference (OFC) 2020
Subjects: Signal Processing (eess.SP)

We show that short-length probabilistic shaping reduces nonlinear interference in optical fiber transmission. SNR improvements of up to 0.8 dB are obtained. The shaping gain vanishes when interleaving is employed and not undone before transmission.

[18]
Title: Iterative learning control in prosumer-based microgrids with hierarchical control
Comments: accepted for IFAC World Congress 2020
Subjects: Systems and Control (eess.SY); Adaptation and Self-Organizing Systems (nlin.AO)

Power systems are subject to fundamental changes due to the increasing infeed of renewable energy sources. Taking the accompanying decentralization of power generation into account, the concept of prosumer-based microgrids gives the opportunity to rethink structuring and operation of power systems from scratch. In a prosumer-based microgrid, each power grid node can feed energy into the grid and draw energy from the grid. The concept allows for spatial aggregation such that also an interaction between microgrids can be represented as a prosumer-based microgrid. The contribution of this work is threefold: (i) we propose a decentralized hierarchical control approach in a network including different time scales, (ii) we use iterative learning control to compensate periodic demand patterns and save lower layer control energy and (iii) we assure asymptotic stability and monotonic convergence in the iteration domain for the linearized dynamics and validate the performance by simulating the nonlinear dynamics.

[19]
Title: On-Line Permissive Supervisory Control of Discrete Event Systems for scLTL Specifications
Journal-ref: in IEEE Control Systems Letters, vol. 4, no. 3, pp. 530-535, July 2020
Subjects: Systems and Control (eess.SY); Formal Languages and Automata Theory (cs.FL)

We propose an on-line supervisory control scheme for discrete event systems (DESs), where a control specification is described by a fragment of linear temporal logic. On the product automaton of the DES and an acceptor for the specification, we define a ranking function that returns the minimum number of steps required to reach an accepting state from each state. In addition, we introduce a permissiveness function that indicates a time-varying permissive level. At each step during the on-line control scheme, the supervisor refers to the permissiveness function as well as the ranking function in order to guarantee the control specification while handling the tradeoff between its permissiveness and acceptance of the specification. The proposed scheme is demonstrated in a surveillance problem for a mobile robot.
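The ranking function described above (the minimum number of steps needed to reach an accepting state from each state) can be sketched as a backward breadth-first search over the product automaton. This is a simplified illustration that assumes a plain directed graph given as a dictionary of successor lists. It ignores event labels, controllability, and the permissiveness function, and the state names are hypothetical.

```python
from collections import deque


def ranking(transitions, accepting):
    """Return {state: minimum number of steps to an accepting state}.

    transitions: dict mapping each state to an iterable of successor states.
    States that cannot reach any accepting state are left out of the result.
    """
    # Build the reversed graph so the search can run backward from the goals.
    preds = {}
    for s, succs in transitions.items():
        for t in succs:
            preds.setdefault(t, set()).add(s)

    rank = {s: 0 for s in accepting}
    queue = deque(accepting)
    while queue:
        t = queue.popleft()
        for s in preds.get(t, ()):
            if s not in rank:  # first visit is the shortest distance in BFS
                rank[s] = rank[t] + 1
                queue.append(s)
    return rank


# Hypothetical product automaton: q2 only loops on itself and never accepts.
graph = {"q0": ["q1", "q2"], "q1": ["q3"], "q2": ["q2"], "q3": []}
ranks = ranking(graph, ["q3"])
```

In this example `q2` receives no rank, which a supervisor could treat as a state to avoid.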

[20]
Title: Payload-agnostic Decoupling and Hybrid Vibration Isolation Control for a Maglev Platform with Redundant Actuation
Comments: This is a preprint which has been submitted to Mechanical Systems and Signal Processing
Subjects: Systems and Control (eess.SY)

Payload-specific vibration control may be suitable for a particular task but lacks the generality and transferability required to adapt to various payloads. Self-decoupling and robust vibration control are the crucial problems in achieving payload-agnostic vibration control. However, some problems remain unsolved.
In this article, we present a maglev vibration isolation platform (MVIP), which aims to attenuate vibration in payload-agnostic tasks under a dynamic environment. Since efforts to suppress disturbances inevitably encounter coupling problems, we analyze their causes and propose effective solutions.
To achieve payload-agnostic vibration control, we propose a new control strategy, which is the main contribution of this article. It consists of a self-construct radial basis function neural network inversion (SRBFNNI) decoupling scheme and hybrid adaptive feed-forward internal model control (HAFIMC). The former enables the MVIP to create a self inverse model with little prior knowledge and achieve self-decoupling. For the unique structure of the MVIP, the vibration control problem is stated and addressed by the proposed HAFIMC, which uses its adaptive part to deal with periodic disturbances and its internal model part to deal with stability.

[21]
Title: Weakly-supervised 3D coronary artery reconstruction from two-view angiographic images
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

The reconstruction of three-dimensional models of coronary arteries is of great significance for the localization, evaluation and diagnosis of stenosis and plaque in the arteries, as well as for the assisted navigation of interventional surgery. In clinical practice, physicians use a few angles of coronary angiography to capture arterial images, so it is of great practical value to perform 3D reconstruction directly from coronary angiography images. However, this is a very difficult computer vision task due to the complex shape of coronary blood vessels, as well as the lack of datasets and key-point labels. With the rise of deep learning, more and more work is being done to reconstruct 3D models of human organs from medical images using deep neural networks. We propose an adversarial and generative way to reconstruct three-dimensional coronary artery models from two different views of angiographic images of coronary arteries. With 3D fully supervised learning and 2D weakly supervised learning schemes, we obtained reconstruction accuracies that outperform state-of-the-art techniques.

[22]
Title: Coronary Artery Segmentation in Angiographic Videos Using A 3D-2D CE-Net
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Coronary angiography is an indispensable assistive technique for cardiac interventional surgery. Segmentation and extraction of blood vessels from coronary angiography videos are essential prerequisites for physicians to locate, assess and diagnose plaques and stenosis in blood vessels. This article proposes a new video segmentation framework that can extract the clearest and most comprehensive coronary angiography images from a video sequence, thereby helping physicians to better observe the condition of blood vessels. This framework combines a 3D convolutional layer to extract spatial-temporal information from a video sequence and a 2D CE-Net to accomplish the segmentation task of an image sequence. The input is a few continuous frames of angiographic video, and the output is a segmentation mask. The framework yields good segmentation results despite the poor quality of coronary angiography video sequences.

[23]
Title: Speech Quality Factors for Traditional and Neural-Based Low Bit Rate Vocoders
Comments: 6 pages, 11 figures, conference
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

This study compares the performances of different algorithms for coding speech at low bit rates. In addition to widely deployed traditional vocoders, a selection of recently developed generative-model-based coders at different bit rates are contrasted. Performance analysis of the coded speech is evaluated for different quality aspects: accuracy of pitch periods estimation, the word error rates for automatic speech recognition, and the influence of speaker gender and coding delays. A number of performance metrics of speech samples taken from a publicly available database were compared with subjective scores. Results from subjective quality assessment do not correlate well with existing full reference speech quality metrics. The results provide valuable insights into aspects of the speech signal that will be used to develop a novel metric to accurately predict speech quality from generative-model-based coders.

[24]
Title: Stability Analysis of Droop-Controlled Inverter-Based Power Grids via Timescale Separation
Subjects: Systems and Control (eess.SY)

We consider the problem of stability analysis for distribution grids with droop-controlled inverters and dynamic distribution power lines. The inverters are modeled as voltage sources with controllable frequency and amplitude. This problem is very challenging for large networks, as numerical simulations and detailed eigenvalue analysis are impractical. Motivated by the above limitations, we present in this paper a systematic and computationally efficient framework for stability analysis of inverter-based distribution grids. To design our framework, we use tools from singular perturbation and Lyapunov theories. Interestingly, we show that stability of the fast dynamics of the power grid depends only on the voltage droop gains of the inverters, while stability of the slow dynamics depends on both voltage and frequency droop gains. Finally, by leveraging these timescale separation properties, we derive sufficient conditions on the frequency and voltage droop gains of the inverters that warrant stability of the full system. We illustrate our theoretical results through a numerical example on the IEEE 13-bus distribution grid.

[25]
Title: Bounded state Estimation over Finite-State Channels: Relating Topological Entropy and Zero-Error Capacity
Subjects: Systems and Control (eess.SY); Information Theory (cs.IT)

We investigate bounded state estimation of linear systems over finite-state erasure and additive noise channels in which the noise is governed by a finite-state machine without any statistical structure. Upper and lower bounds on their zero-error capacities are derived, revealing a connection with the topological entropy of the channel dynamics. Some examples are introduced and separate capacity bounds based on their specific features are derived and compared with bounds from topological entropy. Necessary and sufficient conditions for linear state estimation with bounded errors via such channels are then obtained, by extending previous results for nonstochastic memoryless channels to finite-state channels. These estimation conditions bring together the topological entropies of the linear system and the discrete channel.

[26]
Title: Rigorous State Evolution Analysis for Approximate Message Passing with Side Information
Subjects: Signal Processing (eess.SP); Machine Learning (stat.ML)

A common goal in many research areas is to reconstruct an unknown signal x from noisy linear measurements. Approximate message passing (AMP) is a class of low-complexity algorithms that can be used for efficiently solving such high-dimensional regression tasks. Often, side information (SI) is available during reconstruction. For this reason, a novel algorithmic framework that incorporates SI into AMP, referred to as approximate message passing with side information (AMP-SI), has recently been introduced. In this work, we provide rigorous performance guarantees for AMP-SI when there are statistical dependencies between the signal and SI pairs and the entries of the measurement matrix are independent and identically distributed Gaussian. The AMP-SI performance is shown to be provably tracked by a scalar iteration referred to as state evolution (SE). Moreover, we provide numerical examples that demonstrate empirically that the SE can predict the AMP-SI mean square error accurately.
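
For orientation, the scalar recursion usually called state evolution for standard AMP (without side information) can be written as below. This is a sketch under stated assumptions, not the recursion from this paper: the symbols $\delta$ (ratio of measurements to signal length), $\sigma^2$ (measurement noise variance), $\eta_t$ (the denoiser at iteration $t$) and $Z \sim \mathcal{N}(0,1)$ independent of the signal $X$ are notation introduced here for illustration. In an SI-aware variant the denoiser would presumably also take the side information as an argument; the abstract does not give that form.

```latex
\tau_{t+1}^2 \;=\; \sigma^2 + \frac{1}{\delta}\,
  \mathbb{E}\!\left[\bigl(\eta_t(X + \tau_t Z) - X\bigr)^2\right],
\qquad
\tau_0^2 \;=\; \sigma^2 + \frac{1}{\delta}\,\mathbb{E}\!\left[X^2\right]
```

The expectation term is the mean square error of the denoiser at effective noise level $\tau_t$, which is why a scalar recursion of this kind can be used to predict the per-iteration mean square error.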

[27]
Title: Hybrid Precoding For Millimeter Wave MIMO Systems: A Matrix Factorization Approach
Subjects: Signal Processing (eess.SP)

This paper investigates the hybrid precoding design for millimeter wave (mmWave) multiple-input multiple-output (MIMO) systems with finite-alphabet inputs. The precoding problem is a joint optimization of analog and digital precoders, and we treat it as a matrix factorization problem with power and constant modulus constraints. Our work presents three main contributions: First, we present a sufficient condition and a necessary condition for hybrid precoding schemes to realize unconstrained optimal precoders exactly when the number of data streams $N_s$ satisfies $N_s = \min\{\mathrm{rank}(\mathbf{H}), N_{rf}\}$, where $\mathbf{H}$ represents the channel matrix and $N_{rf}$ is the number of radio frequency (RF) chains. Second, we show that the coupled power constraint in our matrix factorization problem can be removed without loss of optimality. Third, we propose a Broyden-Fletcher-Goldfarb-Shanno (BFGS)-based algorithm to solve our matrix factorization problem using gradient and Hessian information. Several numerical results are provided to show that our proposed algorithm outperforms existing hybrid precoding algorithms.

[28]
Title: In defence of metric learning for speaker recognition
Comments: The code can be found at
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

The objective of this paper is 'open-set' speaker recognition of unseen speakers, where ideal embeddings should be able to condense information into a compact utterance-level representation that has small intra-class (same speaker) and large inter-class (different speakers) distance.
A popular belief in speaker recognition is that networks trained with classification objectives outperform metric learning methods. In this paper, we present an extensive evaluation of most recent loss functions for speaker recognition on the VoxCeleb dataset. We demonstrate that even the vanilla triplet loss shows competitive performance compared to classification-based losses, and those trained with our angular metric learning objective outperform state-of-the-art methods.

[29]
Title: Partially Observed Discrete-Time Risk-Sensitive Mean Field Games
Comments: 29 pages. arXiv admin note: substantial text overlap with arXiv:1705.02036, arXiv:1808.03929
Subjects: Systems and Control (eess.SY)

In this paper, we consider discrete-time partially observed mean-field games with the risk-sensitive optimality criterion. We introduce risk-sensitive behaviour for each agent via an exponential utility function. In the game model, each agent is weakly coupled with the rest of the population through its individual cost and state dynamics via the empirical distribution of states. We establish the mean-field equilibrium in the infinite-population limit by converting the underlying partially observed stochastic control problem to a fully observed one on the belief space and applying the dynamic programming principle. Then, we show that the mean-field equilibrium policy, when adopted by each agent, forms an approximate Nash equilibrium for games with sufficiently many agents. We first consider a finite-horizon cost function and then discuss the extension of the result to the infinite-horizon cost in the next-to-last section of the paper.

[30]
Title: Supervisory model predictive control for PV battery and heat pump system with phase change slurry thermal storage
Subjects: Systems and Control (eess.SY)

We present the design, implementation and experimental validation of a supervisory predictive control approach for an electrical heating system featuring a phase change slurry as heat storage and transfer medium. The controller optimizes the energy flows that are used as set points for the heat generation and energy distribution components. The optimization handles the thermal and electrical subsystems simultaneously and is able to switch between different objectives. We show that the control can be implemented on low-cost embedded hardware and validate it with an experimental test bed comprising an installation of the complete heating system, including all hydraulic and all electrical components. Experimental results demonstrate the feasibility of both a heat pump heating system with a phase change slurry and the optimal control approach. The main control objectives, i.e., thermal comfort and maximum self-consumption of solar energy, can be met. In addition, the system and its controller provide a load shifting potential.

[31]
Title: Multi-Lead ECG Classification via an Information-Based Attention Convolutional Neural Network
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Machine Learning (cs.LG); Machine Learning (stat.ML)

Objective: A novel structure based on a channel-wise attention mechanism is presented in this paper. With the proposed structure embedded, an efficient classification model that accepts multi-lead electrocardiogram (ECG) as input is constructed. Methods: One-dimensional convolutional neural networks (CNN) have proven to be effective in pervasive classification tasks, enabling the automatic extraction of features while classifying targets. We implement residual connections and design a structure that can learn the weights from the information contained in different channels of the input feature map during the training process. An indicator named mean square deviation is introduced to monitor the performance of a particular model segment in the classification task on two of the five ECG classes. The data in the MIT-BIH arrhythmia database are used and a series of control experiments is conducted. Results: Utilizing both leads of the ECG signals as input to the neural network classifier can achieve better classification results than using single-channel inputs in different application scenarios. Models embedded with the channel-wise attention structure always achieve better scores on sensitivity and precision than the plain ResNet models. The proposed model exceeds the performance of most of the state-of-the-art models in ventricular ectopic beats (VEB) classification, and achieves competitive scores for supraventricular ectopic beats (SVEB). Conclusion: Adopting more ECG leads as input can increase the dimensions of the input feature maps, helping to improve both the performance and generalization of the network model. Significance: Due to its end-to-end characteristics and its intrinsic extensibility to multi-lead heart disease diagnosis, the proposed model can be used for real-time tracking of ECG waveforms on Holter or wearable devices.

[32]
Title: Experimental evaluation of beamforming on UAVs in cellular systems
Subjects: Signal Processing (eess.SP)

The use of beamforming in Unmanned Aerial Vehicles (UAVs) has the potential to significantly improve the air-to-ground link quality. This paper presents the outcome of an experimental trial of such a UAV-based beamforming system over live cellular networks. A testbed with directional antennas has been built for the experiments. It is shown that beamforming can extend the signal coverage due to antenna gain, as well as spatially reduce interference, leading to higher signal quality. Moreover, it has a positive impact on the mobility performance of a flying UAV by reducing handover occurrences. We also discuss the situations in which beamforming should translate into an uplink throughput gain.

[33]
Title: Adaptive machine learning strategies for network calibration of IoT smart air quality monitoring devices
Comments: Submitted to Pattern Recognition Letters
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Machine Learning (stat.ML)

Air Quality Multi-sensor Systems (AQMS) are IoT devices based on low-cost chemical microsensor arrays that have recently been shown capable of providing relatively accurate quantitative estimates of air pollutants. Their availability makes it possible to deploy pervasive Air Quality Monitoring (AQM) networks that will solve the geographical sparseness issue that affects the current network of AQ Regulatory Monitoring Systems (AQRMS). Unfortunately, their accuracy has been shown to be limited in long-term field deployments due to the negative influence of several technological issues, including sensor poisoning or ageing, non-target gas interference, lack of fabrication repeatability, etc. Seasonal changes in the probability distribution of priors, observables and hidden context variables (i.e., non-observable interferents) challenge field data-driven calibration models, whose short- to mid-term performance has recently come to the attention of urban authorities and monitoring agencies. In this work, we address this non-stationary framework with adaptive learning strategies in order to prolong the validity of multisensor calibration models, enabling continuous learning. The influence of relevant parameters in different network and node-to-node recalibration scenarios is analyzed. The results are hence useful for pervasive deployments aimed at permanent high-resolution AQ mapping in urban scenarios, as well as for the use of AQMS as AQRMS backup systems providing data when AQRMS data are unavailable due to faults or scheduled maintenance.

[34]
Title: TRACER: A Framework for Facilitating Accurate and Interpretable Analytics for High Stakes Applications
Comments: A version of this preprint will appear in ACM SIGMOD 2020
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Applications (stat.AP); Machine Learning (stat.ML)

In high stakes applications such as healthcare and finance analytics, the interpretability of predictive models is required and necessary for domain practitioners to trust the predictions. Traditional machine learning models, e.g., logistic regression (LR), are easy to interpret in nature. However, many of these models aggregate time-series data without considering the temporal correlations and variations. Therefore, their performance cannot match up to recurrent neural network (RNN) based models, which are nonetheless difficult to interpret. In this paper, we propose a general framework TRACER to facilitate accurate and interpretable predictions, with a novel model TITV devised for healthcare analytics and other high stakes applications such as financial investment and risk management. Different from LR and other existing RNN-based models, TITV is designed to capture both the time-invariant and the time-variant feature importance using a feature-wise transformation subnetwork and a self-attention subnetwork, for the feature influence shared over the entire time series and the time-related importance respectively. Healthcare analytics is adopted as a driving use case, and we note that the proposed TRACER is also applicable to other domains, e.g., fintech. We evaluate the accuracy of TRACER extensively in two real-world hospital datasets, and our doctors/clinicians further validate the interpretability of TRACER in both the patient level and the feature level. Besides, TRACER is also validated in a high stakes financial application and a critical temperature forecasting application. The experimental results confirm that TRACER facilitates both accurate and interpretable analytics for high stakes applications.

[35]
Title: Effects of number of digits in large-scale multilateration
Journal-ref: Precision Engineering, Elsevier, 2020, 64, pp.1-6
Subjects: Signal Processing (eess.SP)

For many years, multilateration has been used in precision engineering, notably in machine tool and coordinate measuring machine calibration. This technique requires, first, laser trackers or tracking interferometers, and second, nonlinear optimization algorithms to determine point coordinates. Previous research has shown the influence of the experimental configuration on measurement precision in multilateration. However, the impact of floating-point precision in computations on large-scale multilateration precision has not been addressed. In this work, the effects of numerical errors (rounding and cancellation effects) due to floating-point precision (number of digits) were studied. In order to evaluate these effects in large-scale multilateration, a multilateration measurement system was simulated. This protocol is illustrated with a case study where large distances ($\le$20 m) between pairs of target points were simulated. The results show that the use of multi-precision libraries is recommended to control the propagation of uncertainties during the multilateration computation.
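
The effect of the number of digits on a distance computation can be illustrated with a minimal sketch. It is not the simulation protocol of the paper: the function name `distance`, the coordinates and the chosen precisions are assumptions made for illustration, and Python's standard `decimal` module stands in for a multi-precision library.

```python
from decimal import Decimal, localcontext

def distance(p, q, digits):
    """Euclidean distance between points p and q (sequences of decimal strings),
    with every arithmetic operation rounded to `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits  # working precision for this computation only
        total = Decimal(0)
        for a, b in zip(p, q):
            diff = Decimal(a) - Decimal(b)  # the subtraction is rounded here
            total += diff * diff
        return total.sqrt()

# Hypothetical pair of targets about 20 m apart (coordinates in metres).
p = ("20.000000123", "0", "0")
q = ("0.000000001", "0", "0")

low = distance(p, q, 7)    # roughly single-precision digit count
high = distance(p, q, 30)  # multi-precision
print(low, high)
```

With 7 digits the sub-micrometre part of the 20 m distance is rounded away in the very first subtraction, while the 30-digit computation retains it.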

[36]
Title: Photonic convolutional neural networks using integrated diffractive optics
Subjects: Signal Processing (eess.SP); Optics (physics.optics)

With recent rapid advances in photonic integrated circuits, it has been demonstrated that programmable photonic chips can be used to implement artificial neural networks. Convolutional neural networks (CNN) are a class of deep learning methods that have been highly successful in applications such as image classification and speech processing. We present an architecture to implement a photonic CNN using the Fourier transform property of integrated star couplers. We show, in computer simulation, high accuracy image classification using the MNIST dataset. We also model component imperfections in photonic CNN and show that the performance degradation can be recovered in a programmable chip. Our proposed architecture provides a large reduction in physical footprint compared to current implementations as it utilizes the natural advantages of optics and hence offers a scalable pathway towards integrated photonic deep learning processors.

[37]
Title: Real-Time Video Content Popularity Detection Based on Mean Change Point Analysis
Subjects: Signal Processing (eess.SP)

Video content is responsible for more than 70% of the global IP traffic. Consequently, it is important for content delivery infrastructures to rapidly detect and respond to changes in content popularity dynamics. In this paper, we propose the employment of on-line change point (CP) analysis to implement real-time, autonomous and low-complexity video content popularity detection. Our proposal, denoted as real-time change point detector (RCPD), estimates the existence, the number and the direction of changes on the average number of video visits by combining: (i) off-line and on-line CP detection algorithms; (ii) an improved time-series segmentation heuristic for the reliable detection of multiple CPs; and (iii) two algorithms for the identification of the direction of changes. The proposed detector is validated against synthetic data, as well as a large database of real YouTube video visits. It is demonstrated that the RCPD can accurately identify changes in the average content popularity and the direction of change. In particular, the success rate of the RCPD over synthetic data is shown to exceed 94% for medium and large changes in content popularity. Additionally, the dynamic time warping distance between the actual and the estimated changes has been found to range between 20 samples on average, over synthetic data, to 52 samples, in real data. The rapid responsiveness of the RCPD is instrumental in the deployment of real-time, lightweight load balancing solutions, as shown in a real example.
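
As a rough illustration of on-line detection of a change in the mean together with its direction, the sketch below uses a textbook two-sided CUSUM test. This is a swapped-in technique chosen for brevity, not the RCPD algorithm; the function name `cusum_mean_change` and the parameters `target_mean`, `drift` and `threshold` are hypothetical.

```python
def cusum_mean_change(samples, target_mean, drift, threshold):
    """Two-sided CUSUM: return (index, direction) of the first detected
    change in the mean, or None if no change is detected.

    drift     -- slack subtracted at each step to ignore small fluctuations
    threshold -- detection level for the cumulative sums
    """
    pos = 0.0  # accumulates evidence for an upward shift
    neg = 0.0  # accumulates evidence for a downward shift
    for i, x in enumerate(samples):
        pos = max(0.0, pos + (x - target_mean - drift))
        neg = max(0.0, neg + (target_mean - x - drift))
        if pos > threshold:
            return i, "up"
        if neg > threshold:
            return i, "down"
    return None

# Hypothetical visit counts: the average jumps from 10 to 15 at index 20.
visits = [10] * 20 + [15] * 20
print(cusum_mean_change(visits, target_mean=10, drift=1, threshold=10))
```

Each sample is processed once with constant memory, which is the property that makes this family of tests attractive for real-time use.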

[38]
Title: Intelligent Reflecting Surface Assisted Beam Index-Modulation for Millimeter Wave Communication
Subjects: Signal Processing (eess.SP)

Millimeter wave communication is eminently suitable for high-rate wireless systems, which may be beneficially amalgamated with intelligent reflecting surfaces (IRS), relying on beam-index modulation. Explicitly, we propose three different architectures based on IRSs for beam-index modulation in millimeter wave communication, which circumvent the line-of-sight blockage of millimeter wave frequencies. We conceive both the optimal maximum likelihood detector and a low-complexity compressed sensing detector for the proposed schemes. Finally, the schemes conceived are evaluated through extensive simulations, which are compared to our analytically obtained bounds.

### Cross-lists for Fri, 27 Mar 20

[39]  arXiv:2003.11562 (cross-list from cs.CL) [pdf, other]
Title: Finnish Language Modeling with Deep Transformer Models
Authors: Abhilash Jain
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)

Transformers have recently taken center stage in language modeling after LSTMs were considered the dominant model architecture for a long time. In this project, we investigate the performance of two Transformer architectures, BERT and Transformer-XL, for the language modeling task. We use a sub-word model setting with the Finnish language and compare it to the previous state-of-the-art (SOTA) LSTM model. BERT achieves a pseudo-perplexity score of 14.5, which is the first such measure achieved as far as we know. Transformer-XL improves the perplexity score to 73.58, which is 27% better than the LSTM model.
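
For readers unfamiliar with the metric, the sketch below computes perplexity from per-token log-probabilities and a relative improvement between two perplexity values. The function names and the example numbers are illustrative assumptions, not values or code from the paper; the pseudo-perplexity reported for BERT is a different quantity and is not covered here.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def relative_improvement(baseline_ppl, new_ppl):
    """Fractional reduction in perplexity relative to a baseline (lower is better)."""
    return (baseline_ppl - new_ppl) / baseline_ppl

# A model that assigns probability 1/4 to every token has perplexity 4.
print(perplexity([math.log(0.25)] * 8))
# Hypothetical numbers: going from perplexity 100 to 73 is a 27% improvement.
print(relative_improvement(100.0, 73.0))
```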

[40]  arXiv:2003.11566 (cross-list from cs.LG) [pdf, other]
Title: Interval Neural Networks: Uncertainty Scores
Comments: LO and CH contributed equally
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Machine Learning (stat.ML)

We propose a fast, non-Bayesian method for producing uncertainty scores in the output of pre-trained deep neural networks (DNNs) using a data-driven interval propagating network. This interval neural network (INN) has interval valued parameters and propagates its input using interval arithmetic. The INN produces sensible lower and upper bounds encompassing the ground truth. We provide theoretical justification for the validity of these bounds. Furthermore, its asymmetric uncertainty scores offer additional, directional information beyond what Gaussian-based, symmetric variance estimation can provide. We find that noise in the data is adequately captured by the intervals produced with our method. In numerical experiments on an image reconstruction task, we demonstrate the practical utility of INNs as a proxy for the prediction error in comparison to two state-of-the-art uncertainty quantification methods. In summary, INNs produce fast, theoretically justified uncertainty scores for DNNs that are easy to interpret, come with added information and pose as improved error proxies - features that may prove useful in advancing the usability of DNNs especially in sensitive applications such as health care.
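
The idea of propagating an input through interval-valued parameters with interval arithmetic can be sketched for a single neuron. This is a minimal illustration under assumptions, not the INN architecture or its training procedure: intervals are plain `(lower, upper)` tuples, and all function names and numbers are hypothetical.

```python
def interval_add(a, b):
    """[a0, a1] + [b0, b1] = [a0 + b0, a1 + b1]."""
    return (a[0] + b[0], a[1] + b[1])

def interval_mul(a, b):
    """Product of two intervals: min and max over the four endpoint products."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))

def interval_relu(a):
    """ReLU is monotone, so it can be applied to both endpoints."""
    return (max(0.0, a[0]), max(0.0, a[1]))

def interval_neuron(inputs, weights, bias):
    """ReLU(sum_i w_i * x_i + b) where inputs, weights and bias are intervals."""
    acc = bias
    for x, w in zip(inputs, weights):
        acc = interval_add(acc, interval_mul(x, w))
    return interval_relu(acc)

inputs = [(1.0, 2.0), (-1.0, 1.0)]     # interval-valued inputs
weights = [(0.5, 1.0), (-2.0, -1.0)]   # interval-valued weights
bias = (0.0, 0.5)
print(interval_neuron(inputs, weights, bias))  # (0.0, 4.5)
```

The output interval is guaranteed to contain the neuron's value for every choice of inputs and parameters inside their intervals, and its two sides need not be symmetric around any point estimate.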

[41]  arXiv:2003.11658 (cross-list from physics.med-ph) [pdf, other]
Title: Artificial Intelligence in Quantitative Ultrasound Imaging: A Review
Subjects: Medical Physics (physics.med-ph); Image and Video Processing (eess.IV)

Quantitative ultrasound (QUS) imaging is a reliable, fast and inexpensive technique to extract physically descriptive parameters for assessing pathologies. Despite its safety and efficacy, QUS suffers from several major drawbacks: poor imaging quality and inter- and intra-observer variability, which hamper the reproducibility of measurements. Therefore, there is a great need to develop automatic methods to improve the imaging quality and aid in measurements in QUS. In recent years, there has been an increasing interest in artificial intelligence (AI) applications in ultrasound imaging. However, no prior work has been found that surveys the use of AI in QUS. The purpose of this paper is to review recent research into AI applications in QUS. This review first introduces the AI workflow, and then discusses the various AI applications in QUS. Finally, challenges and future potential AI applications in QUS are discussed.

[42]  arXiv:2003.11720 (cross-list from cs.IT) [pdf, ps, other]
Title: Generalized Wireless-Powered Communications: When to Activate Wireless Power Transfer?
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

Wireless-powered communication network (WPCN) is a key technology to power energy-limited massive devices, such as on-board wireless sensors in autonomous vehicles, for Internet-of-Things (IoT) applications. Conventional WPCNs rely only on dedicated downlink wireless power transfer (WPT), which is practically inefficient due to the significant energy loss in wireless signal propagation. Meanwhile, ambient energy harvesting is highly appealing as devices can scavenge energy from various existing energy sources (e.g., solar energy and cellular signals). Unfortunately, the randomness of the availability of these energy sources cannot guarantee stable communication services. Motivated by the above, we consider a generalized WPCN where the devices can not only harvest energy from a dedicated multiple-antenna power station (PS), but can also exploit stored energy stemming from ambient energy harvesting. Since the dedicated WPT consumes system resources, if the stored energy is sufficient, WPT may not be needed to maximize the weighted sum rate (WSR). To analytically characterize this phenomenon, we derive the condition for WPT activation and reveal how it is affected by the different system parameters. Subsequently, we further derive the optimal resource allocation policy for the cases that WPT is activated and deactivated, respectively. In particular, it is found that when WPT is activated, the optimal energy beamforming at the PS does not depend on the devices' stored energy, which is shown to lead to a new unfairness issue. Simulation results verify our theoretical findings and demonstrate the effectiveness of the proposed optimal resource allocation.

[43]  arXiv:2003.11774 (cross-list from cs.CV) [pdf, other]
Title: Image Generation Via Minimizing Fréchet Distance in Discriminator Feature Space
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

For a given image generation problem, the intrinsic image manifold is often low dimensional. We use the intuition that it is much better to train the GAN generator by minimizing the distributional distance between real and generated images in a low-dimensional feature space representing such a manifold than in the original pixel space. We use the feature space of the GAN discriminator for such a representation. For distributional distance, we employ one of two choices: the Fréchet distance or direct optimal transport (OT); these respectively lead us to two new GAN methods: Fréchet-GAN and OT-GAN. The idea of employing the Fréchet distance comes from the success of the Fréchet Inception Distance as a solid evaluation metric in image generation. Fréchet-GAN is attractive in several ways. We propose an efficient, numerically stable approach to calculate the Fréchet distance and its gradient. The Fréchet distance estimation requires significantly less computation time than OT; this allows Fréchet-GAN to use a much larger mini-batch size in training than OT-GAN. More importantly, we conduct experiments on a number of benchmark datasets and show that Fréchet-GAN (in particular) and OT-GAN have significantly better image generation capabilities than the existing representative primal and dual GAN approaches based on the Wasserstein distance.
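
For reference, when two feature distributions are modeled as Gaussians, as is done in the Fréchet Inception Distance, the squared Fréchet distance has the closed form below. The symbols $\mu_r, \Sigma_r$ (mean and covariance of real-image features) and $\mu_g, \Sigma_g$ (generated-image features) are notation introduced here; the abstract does not state which estimator or assumptions the paper itself uses.

```latex
d^2\bigl(\mathcal{N}(\mu_r, \Sigma_r),\, \mathcal{N}(\mu_g, \Sigma_g)\bigr)
  \;=\; \lVert \mu_r - \mu_g \rVert_2^2
  \;+\; \operatorname{Tr}\!\Bigl(\Sigma_r + \Sigma_g - 2\,\bigl(\Sigma_r \Sigma_g\bigr)^{1/2}\Bigr)
```

Only first and second moments of a mini-batch are needed, whereas optimal transport requires solving an assignment problem between samples; this is consistent with the lower computation time the abstract reports.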

[44]  arXiv:2003.11797 (cross-list from cs.CV) [pdf]
Title: Neural encoding and interpretation for high-level visual cortices based on fMRI using image caption features
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Neurons and Cognition (q-bio.NC)

On the basis of functional magnetic resonance imaging (fMRI), researchers design visual encoding models to predict human neuronal activity in response to presented image stimuli and to analyze the inner mechanisms of the human visual cortices. Deep network models, built from hierarchical processing layers, learn features of data for a specific task from a large dataset. They provide powerful hierarchical representations of data and have brought about breakthroughs in visual encoding, while revealing a hierarchical structural similarity with the manner of information processing in the human visual cortices. However, previous studies almost always used image features from deep network models pre-trained on a classification task to construct visual encoding models. Besides the deep network structure, the task and the corresponding large dataset are also important for deep network models, but have been neglected by previous studies. Because image classification is a relatively fundamental task, it is difficult to guide deep network models to master high-level semantic representations of data, which limits encoding performance for high-level visual cortices. In this study, we introduced a higher-level vision task, the image caption (IC) task, and proposed a visual encoding model based on IC features (ICFVEM) to encode voxels of high-level visual cortices. Experiments demonstrated that ICFVEM obtained better encoding performance than previous deep network models pre-trained on a classification task. In addition, voxels were interpreted by visualizing semantic words to explore their detailed characteristics, and a comparative analysis implied that high-level visual cortices exhibit a correlative representation of image content.

[45]  arXiv:2003.11815 (cross-list from physics.flu-dyn) [pdf, other]
Title: Fluid Dynamics-Based Distance Estimation Algorithm for Macroscale Molecular Communication
Comments: Submitted to IEEE Transactions on Nanobioscience on 13th of March 2020, 17 pages, 6 figures
Subjects: Fluid Dynamics (physics.flu-dyn); Signal Processing (eess.SP)

Many species, from single-cell bacteria to advanced animals, use molecular communication (MC) to share information with each other via chemical signals. Although MC is mostly studied at the microscale, new practical applications are emerging at the macroscale. It is essential to derive an estimation method for channel parameters such as distance for practical macroscale MC systems, which include a sprayer emitting molecules as a transmitter (TX) and a sensor as the receiver (RX). In this paper, a novel approach based on fluid dynamics is proposed for the derivation of the distance estimation in practical MC systems. According to this approach, transmitted molecules are considered as moving droplets in the MC channel. With this approach, we propose the Fluid Dynamics-Based Distance Estimation (FDDE) algorithm, which predicts the propagation distance of the transmitted droplets by updating the diameter of evaporating droplets at each time step. The FDDE algorithm is validated by experimental data. The results reveal that the distance can be estimated by the fluid dynamics approach, which introduces novel parameters such as the volume fraction of droplets in a mixture of air and liquid droplets and the beamwidth of the TX. Furthermore, the effect of evaporation is shown with the numerical results.

[46]  arXiv:2003.11816 (cross-list from cs.CV) [pdf, other]
Title: Do Deep Minds Think Alike? Selective Adversarial Attacks for Fine-Grained Manipulation of Multiple Deep Neural Networks
Comments: 9 pages, submitted to ICML 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Optimization and Control (math.OC); Machine Learning (stat.ML)

Recent works have demonstrated the existence of {\it adversarial examples} targeting a single machine learning system. In this paper we ask a simple but fundamental question of "selective fooling": given {\it multiple} machine learning systems assigned to solve the same classification problem and taking the same input signal, is it possible to construct a perturbation to the input signal that manipulates the outputs of these {\it multiple} machine learning systems {\it simultaneously} in arbitrary pre-defined ways? For example, is it possible to selectively fool a set of "enemy" machine learning systems without fooling the other "friend" machine learning systems? The answer to this question depends on the extent to which these different machine learning systems "think alike". We formulate the problem of "selective fooling" as a novel optimization problem, and report on a series of experiments on the MNIST dataset. Our preliminary findings from these experiments show that it is in fact very easy to selectively manipulate multiple MNIST classifiers simultaneously, even when the classifiers are identical in their architectures, training algorithms and training datasets except for random initialization during training. This suggests that two nominally equivalent machine learning systems do not in fact "think alike" at all, and opens the possibility for many novel applications and a deeper understanding of the working principles of deep neural networks.

[47]  arXiv:2003.11883 (cross-list from cs.CV) [pdf, other]
Title: DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Neural Architecture Search (NAS) has shown great potential in automatically designing scalable network architectures for dense image predictions. However, existing NAS algorithms usually compromise by restricting the search space and searching on a proxy task to keep the computational demands achievable. To allow the widest possible range of network architectures and avoid the gap between target and proxy datasets, we propose a Densely Connected NAS (DCNAS) framework, which directly searches for the optimal network structures for the multi-scale representations of visual information over a large-scale target dataset. Specifically, by connecting cells with each other using learnable weights, we introduce a densely connected search space to cover an abundance of mainstream network designs. Moreover, by combining both path-level and channel-level sampling strategies, we design a fusion module to reduce the memory consumption of the ample search space. We demonstrate that the architecture obtained from our DCNAS algorithm achieves state-of-the-art performance on public semantic image segmentation benchmarks, including 83.6% on Cityscapes and 86.9% on PASCAL VOC 2012 (track w/o additional data). We also retain leading performance when evaluating the architecture on the more challenging ADE20K and Pascal Context datasets.

[48]  arXiv:2003.11951 (cross-list from math.OC) [pdf, ps, other]
Title: On the Complexity and Approximability of Optimal Sensor Selection and Attack for Kalman Filtering
Subjects: Optimization and Control (math.OC); Computational Complexity (cs.CC); Systems and Control (eess.SY)

Given a linear dynamical system affected by stochastic noise, we consider the problem of selecting an optimal set of sensors (at design-time) to minimize the trace of the steady-state a priori or a posteriori error covariance of the Kalman filter, subject to certain selection budget constraints. We show the fundamental result that there is no polynomial-time constant-factor approximation algorithm for this problem. This contrasts with other classes of sensor selection problems studied in the literature, which typically pursue constant-factor approximations by leveraging greedy algorithms and submodularity (or supermodularity) of the cost function. Here, we provide a specific example showing that greedy algorithms can perform arbitrarily poorly for the problem of design-time sensor selection for Kalman filtering. We then study the problem of attacking (i.e., removing) a set of installed sensors, under predefined attack budget constraints, to maximize the trace of the steady-state a priori or a posteriori error covariance of the Kalman filter. Again, we show that there is no polynomial-time constant-factor approximation algorithm for this problem, and show specifically that greedy algorithms can perform arbitrarily poorly.
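
To make the selection problem concrete, here is a minimal sketch of greedy sensor selection on a toy scalar system, with an exhaustive search as a baseline. The model, the function names (`steady_state_variance`, `greedy_select`, `brute_force_select`), the unit sensor costs, and the example numbers are all assumptions made for illustration; none of them come from the paper. In this scalar toy case greedy happens to match the exhaustive optimum, so it does not reproduce the paper's counterexample, which concerns general systems.

```python
from itertools import combinations


def steady_state_variance(a, q, sensors, iters=1000):
    """A priori error variance of a scalar Kalman filter after `iters` Riccati steps.

    Assumed model: x[k+1] = a*x[k] + w with Var(w) = q; each sensor (c, r)
    measures y = c*x + v with independent noise of variance r.
    Assumes |a| < 1 so the recursion converges even with no sensors.
    """
    info = sum(c * c / r for c, r in sensors)  # total measurement information
    p = q
    for _ in range(iters):
        p_post = p / (1.0 + p * info)  # measurement update (information form)
        p = a * a * p_post + q         # time update
    return p


def greedy_select(a, q, candidates, budget):
    """Pick up to `budget` sensors, each time adding the one that lowers the variance most."""
    chosen = []
    remaining = list(candidates)
    while remaining and len(chosen) < budget:
        best = min(remaining, key=lambda s: steady_state_variance(a, q, chosen + [s]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


def brute_force_select(a, q, candidates, budget):
    """Exhaustive baseline: evaluate every subset of exactly `budget` sensors."""
    return min(
        (list(subset) for subset in combinations(candidates, budget)),
        key=lambda s: steady_state_variance(a, q, s),
    )


if __name__ == "__main__":
    # Hypothetical sensors as (gain c, noise variance r) pairs.
    candidates = [(1.0, 1.0), (0.5, 0.2), (2.0, 10.0), (1.0, 0.5)]
    a, q = 0.9, 1.0
    for name, select in (("greedy", greedy_select), ("brute force", brute_force_select)):
        picked = select(a, q, candidates, budget=2)
        print(name, sorted(picked), steady_state_variance(a, q, picked))
```

The exhaustive baseline grows combinatorially with the number of candidates, which is why greedy heuristics are attractive in practice; the abstract's point is that, for general systems, no polynomial-time method can guarantee a constant-factor approximation.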

[49]  arXiv:2003.11959 (cross-list from cs.RO) [pdf]
Title: Pedestrian Models for Autonomous Driving Part II: high level models of human behaviour
Comments: Submitted to IEEE Transactions on Intelligent Transportation Systems
Subjects: Robotics (cs.RO); Human-Computer Interaction (cs.HC); Systems and Control (eess.SY)

Autonomous vehicles (AVs) must share space with human pedestrians, both in on-road cases such as cars at pedestrian crossings and off-road cases such as delivery vehicles navigating through crowds on high streets. Unlike static and kinematic obstacles, pedestrians are active agents with complex, interactive motions. Planning AV actions in the presence of pedestrians thus requires modelling of their probable future behaviour as well as the detection and tracking that enable such modelling. This narrative review article is Part II of a pair which together survey the current technology stack involved in this process, organising recent research into a hierarchical taxonomy ranging from low-level image detection to high-level psychological models, from the perspective of an AV designer. This self-contained Part II covers the higher levels of this stack, consisting of models of pedestrian behaviour, from prediction of individual pedestrians' likely destinations and paths to game-theoretic models of interactions between pedestrians and autonomous vehicles. This survey clearly shows that, although there are good models for optimal walking behaviour, high-level psychological and social modelling of pedestrian behaviour remains an open research question that requires many conceptual issues to be clarified by the community. At these levels, early work has been done on descriptive and qualitative models of behaviour, but much work is still needed to translate them into quantitative algorithms for practical AV control.

[50]  arXiv:2003.11994 (cross-list from physics.optics) [pdf]
Title: Single-Shot 3D Widefield Fluorescence Imaging with a Computational Miniature Mesoscope
Subjects: Optics (physics.optics); Image and Video Processing (eess.IV)

Fluorescence imaging is indispensable to biology and neuroscience. The need for large-scale imaging in freely behaving animals has further driven the development of miniaturized microscopes (miniscopes). However, conventional microscopes and miniscopes are inherently constrained by their limited space-bandwidth product, shallow depth-of-field, and inability to resolve 3D distributed emitters. Here, we present a Computational Miniature Mesoscope (CM$^2$) that overcomes these bottlenecks and enables single-shot 3D imaging across an 8 $\times$ 7-mm$^2$ field-of-view and 2.5-mm depth-of-field, achieving 7-$\mu$m lateral and 250-$\mu$m axial resolution. Notably, the CM$^2$ has a compact lightweight design that integrates a microlens array for imaging and an LED array for excitation in a single platform. Its expanded imaging capability is enabled by computational imaging that augments the optics with algorithms. We experimentally validate the mesoscopic 3D imaging capability on volumetrically distributed fluorescent beads and fibers. We further quantify the effects of bulk scattering and background fluorescence in phantom experiments.

[51]  arXiv:2003.12040 (cross-list from cs.CV) [pdf, other]
Title: Pseudo-Labeling for Small Lesion Detection on Diabetic Retinopathy Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Diabetic retinopathy (DR) is a primary cause of blindness in working-age people worldwide. About 3 to 4 million people with diabetes become blind because of DR every year. Diagnosis of DR through color fundus images is a common approach to mitigating this problem. However, DR diagnosis is a difficult and time-consuming task, which requires experienced clinicians to identify the presence and significance of many small features on high-resolution images. Convolutional neural networks (CNNs) have recently proved to be a promising approach for automatic biomedical image analysis. In this work, we investigate lesion detection on DR fundus images with CNN-based object detection methods. Lesion detection on fundus images faces two unique challenges. The first is that our dataset is not fully labeled, i.e., only a subset of all lesion instances is marked. Not only do these unlabeled lesion instances fail to contribute to the training of the model, but they are also mistakenly counted as false negatives, pushing the model in the wrong direction. The second challenge is that the lesion instances are usually very small, making them difficult for standard object detectors to find. To address the first challenge, we introduce an iterative training algorithm for the semi-supervised method of pseudo-labeling, in which a considerable number of unlabeled lesion instances can be discovered to boost the performance of the lesion detector. For the small-target problem, we extend both the input size and the depth of the feature pyramid network (FPN) to produce a large CNN feature map, which can preserve the detail of small lesions and thus enhance the effectiveness of the lesion detector. The experimental results show that our proposed methods significantly outperform the baselines.

[52]  arXiv:2003.12063 (cross-list from cs.CV) [pdf, other]
Title: Memory Enhanced Global-Local Aggregation for Video Object Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

How do humans recognize an object in a piece of video? Due to the deteriorated quality of a single frame, it may be hard for people to identify an occluded object in that frame using only the information within one image. We argue that there are two important cues for humans to recognize objects in videos: the global semantic information and the local localization information. Recently, many methods have adopted self-attention mechanisms to enhance the features in the key frame with either global semantic information or local localization information. In this paper we introduce the memory enhanced global-local aggregation (MEGA) network, which is among the first attempts to take full account of both global and local information. Furthermore, empowered by a novel and carefully designed Long Range Memory (LRM) module, our proposed MEGA enables the key frame to access much more content than any previous method. Enhanced by these two sources of information, our method achieves state-of-the-art performance on the ImageNet VID dataset. Code is available at https://github.com/Scalsol/mega.pytorch.

### Replacements for Fri, 27 Mar 20

[53]  arXiv:1901.08460 (replaced) [pdf, other]
Title: Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs
Comments: To appear in the proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Systems and Control (eess.SY); Machine Learning (stat.ML)
[54]  arXiv:1907.10554 (replaced) [pdf]
Title: Development of a Real-time Indoor Location System using Bluetooth Low Energy Technology and Deep Learning to Facilitate Clinical Applications
Comments: 20 pages, 6 figures, submitted to Physics in Medicine & Biology
Subjects: Signal Processing (eess.SP); Computers and Society (cs.CY)
[55]  arXiv:1908.04284 (replaced) [pdf, other]
Title: Personal VAD: Speaker-Conditioned Voice Activity Detection
Comments: To appear in Speaker Odyssey 2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Machine Learning (stat.ML)
[56]  arXiv:1909.10091 (replaced) [pdf, other]
Title: Flying batteries: In-flight battery switching to increase multirotor flight time
Comments: UPDATE: The paper has been accepted to ICRA-2020. The newest version is a post-peer-review version; Paper submitted to RA-L with ICRA-2020 on 2019-09-10. Paper info: 7 pages (6 content + 1 references), 8 figures, 2 tables
Subjects: Systems and Control (eess.SY); Robotics (cs.RO)
[57]  arXiv:1910.10187 (replaced) [pdf, other]
Title: Fast and Automatic Periacetabular Osteotomy Fragment Pose Estimation Using Intraoperatively Implanted Fiducials and Single-View Fluoroscopy
Comments: Revised article to address reviewer comments. Under review for Physics in Medicine and Biology. Supplementary video at
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[58]  arXiv:1911.00420 (replaced) [pdf, other]
Subjects: Signal Processing (eess.SP); Applications (stat.AP)
[59]  arXiv:1911.03315 (replaced) [pdf, other]
Title: Online learning-based Model Predictive Control with Gaussian Process Models and Stability Guarantees
Comments: 20 pages, 12 figures, 3 tables, 1 algorithm, revision submitted to International Journal of Robust and Nonlinear Control
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG)
[60]  arXiv:1911.07349 (replaced) [pdf, other]
Title: Putting visual object recognition in context
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[61]  arXiv:1911.09887 (replaced) [pdf, other]
Title: UAV-enabled Secure Communication with Finite Blocklength
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)
[62]  arXiv:1911.11251 (replaced) [pdf, other]
Title: Hexagonal Image Processing in the Context of Machine Learning: Conception of a Biologically Inspired Hexagonal Deep Learning Framework
Comments: Accepted for: 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA)
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Machine Learning (stat.ML)
[63]  arXiv:1912.05622 (replaced) [pdf, other]
Title: CARP: Compression through Adaptive Recursive Partitioning for Multi-dimensional Images
Subjects: Image and Video Processing (eess.IV); Applications (stat.AP)
[64]  arXiv:1912.12023 (replaced) [src]
Title: Monaural Speech Enhancement Using Deep Multi-Branch Residual Network with 1-D Causal Dilated Convolutions
Comments: Major revisions made to correct an error
Subjects: Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[65]  arXiv:2001.04669 (replaced) [pdf, other]
Title: Reinforcement Learning of Control Policy for Linear Temporal Logic Specifications Using Limit-Deterministic Generalized Büchi Automata
Comments: 7 pages, 6 figures; an extended version of a manuscript accepted to IEEE L-CSS
Subjects: Systems and Control (eess.SY); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Logic in Computer Science (cs.LO)
[66]  arXiv:2002.00451 (replaced) [pdf, other]
Title: Fair Allocation Based Soft Load Shedding
Comments: Accepted to Intelligent Systems Conference (IntelliSys) 2020
Subjects: Signal Processing (eess.SP); Optimization and Control (math.OC)
[67]  arXiv:2002.11936 (replaced) [pdf, other]
Title: Weak Supervision in Convolutional Neural Network for Semantic Segmentation of Diffuse Lung Diseases Using Partially Annotated Dataset
Comments: Accepted at SPIE Medical Imaging 2020: Computer-Aided Diagnosis
Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[68]  arXiv:2003.05962 (replaced) [pdf, other]
Title: Tube-based Robust Model Predictive Control for a Distributed Parameter System Modeled as a Polytopic LPV (extended version)
Comments: 8 Pages, American Control Conference, 2020
Journal-ref: American Control Conference, 2020
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)
[69]  arXiv:2003.06211 (replaced) [pdf]
Title: High-Accuracy Facial Depth Models derived from 3D Synthetic Data
Subjects: Image and Video Processing (eess.IV)
[70]  arXiv:2003.06268 (replaced) [pdf, other]
Title: Data Set Description: Identifying the Physics Behind an Electric Motor -- Data-Driven Learning of the Electrical Behavior (Part II)
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG); Signal Processing (eess.SP)
[71]  arXiv:2003.07273 (replaced) [pdf, other]
Title: Data Set Description: Identifying the Physics Behind an Electric Motor -- Data-Driven Learning of the Electrical Behavior (Part I)
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG); Signal Processing (eess.SP); Optimization and Control (math.OC)
[72]  arXiv:2003.07937 (replaced) [pdf, ps, other]
Title: Finite-time Identification of Stable Linear Systems: Optimality of the Least-Squares Estimator
Subjects: Statistics Theory (math.ST); Machine Learning (cs.LG); Systems and Control (eess.SY); Machine Learning (stat.ML)
[73]  arXiv:2003.08413 (replaced) [pdf, other]
Title: Oral-3D: Reconstructing the 3D Bone Structure of Oral Cavity from 2D Panoramic X-ray
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[74]  arXiv:2003.09677 (replaced) [pdf, ps, other]
Title: UAV-Assisted Secure Communications in Terrestrial Cognitive Radio Networks: Joint Power Control and 3D Trajectory Optimization
Subjects: Signal Processing (eess.SP)
[75]  arXiv:2003.10778 (replaced) [pdf, other]
Title: PanNuke Dataset Extension, Insights and Baselines