# Program

## Monday, September 9

### 19:00 - 20:00

#### MoA: Claude Shannon

### 20:30 - 22:15

#### MoB: Welcome Reception

## Tuesday, September 10

### 08:40 - 09:40

#### TuA: Tuesday's Plenary

While statistics and information theory have a lengthy shared history, the modern era of massive data sets leads to various new challenges. In this talk, we discuss several problem domains which have witnessed fruitful interactions between the two fields, including large-scale social network analysis, nonparametric regression and sparsity in high dimensions, and oracle complexity of convex optimization. Such areas provide opportunities for information-theoretic tools to be brought to bear in interesting and non-orthodox ways.

### 09:50 - 11:10

#### TuB1: Polar Coding 1

*Polar Codes with Dynamic Frozen Symbols and Their Decoding by Directed Search*- A novel construction of polar codes with dynamic frozen symbols is proposed. The proposed codes are subcodes of extended BCH codes, which ensures a sufficiently high minimum distance. Furthermore, a decoding algorithm is proposed which employs estimates of the not-yet-processed bit-channel error probabilities to perform a directed search in the code tree, thus reducing the total number of iterations.
*A Lower Bound on Achievable Rates by Polar Codes with Mismatch Polar Decoding*- In this paper we show that mismatched polar codes over B-DMCs can achieve positive rates of at least $I(W, V)$ whenever $I(W, V) > 0$, where $W$ denotes the communication channel, $V$ the mismatched channel used in the code design, and $I(W, V) = \displaystyle\sum_{y}\sum_{x\in\{0,1\}}\frac{1}{2}W(y|x) \log{\frac{V(y|x)}{\frac{1}{2}V(y|0)+\frac{1}{2}V(y|1)}}.$
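
For concreteness, the rate $I(W, V)$ defined above is straightforward to evaluate numerically. The sketch below (not from the paper; the two binary symmetric channels and the choice of base-2 logarithms, giving rates in bits, are illustrative assumptions) computes it directly from the definition:

```python
import math

def bsc(p):
    """Transition probabilities W[(y, x)] of a binary symmetric channel."""
    return {(0, 0): 1 - p, (1, 0): p, (0, 1): p, (1, 1): 1 - p}

def mismatched_info(W, V):
    """I(W, V) = sum_y sum_x (1/2) W(y|x) log2( V(y|x) / ((1/2)V(y|0) + (1/2)V(y|1)) )."""
    total = 0.0
    for y in (0, 1):
        vbar = 0.5 * V[(y, 0)] + 0.5 * V[(y, 1)]  # mismatched output distribution
        for x in (0, 1):
            total += 0.5 * W[(y, x)] * math.log2(V[(y, x)] / vbar)
    return total

W = bsc(0.05)  # true communication channel
V = bsc(0.10)  # mismatched channel used in the code design
matched = mismatched_info(W, W)     # equals I(W) = 1 - h(0.05)
mismatched = mismatched_info(W, V)  # smaller, but still positive
```

In the matched case $V = W$ the quantity reduces to the symmetric mutual information $I(W)$, consistent with the theorem's claim that positive rates are achievable whenever $I(W, V) > 0$.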
*Scaling Exponent of List Decoders with Applications to Polar Codes*- Motivated by the significant performance gains which polar codes experience when they are decoded with successive cancellation list decoders, we study how the scaling exponent changes as a function of the list size $L$. In particular, we fix the block error probability $P_e$ and we analyze the tradeoff between the blocklength $N$ and the back-off from capacity $C - R$ using scaling laws. By means of a Divide and Intersect procedure, we provide a lower bound on the error probability under MAP decoding with list size $L$ for any binary-input memoryless output-symmetric channel and for any class of linear codes such that their minimum distance is unbounded as the blocklength grows large. We show that, although list decoding can significantly improve the involved constants, the scaling exponent itself, i.e., the speed at which capacity is approached, stays unaffected. This result applies in particular to polar codes, since their minimum distance tends to infinity as $N$ increases. Some considerations are also pointed out for the genie-aided successive cancellation decoder when transmission takes place over the binary erasure channel.
*Polar Coding for Fading Channels*- A polar coding scheme for fading channels is proposed in this paper. More specifically, the focus is on the Gaussian fading channel with a BPSK modulation, where the equivalent channel is modeled as a binary symmetric channel with varying cross-over probabilities. To deal with variable channel states, a coding scheme of hierarchically utilizing polar codes is proposed. In particular, by observing the polarization of different binary symmetric channels over different fading blocks, each channel use corresponding to a different polarization is modeled as a binary erasure channel such that polar codes could be adopted to encode over blocks. It is shown that the proposed coding scheme, without instantaneous channel state information at the transmitter, achieves the capacity of the corresponding fading binary symmetric channel.

#### TuB2: Networks: Compression and Coding

*On Symmetric Multiple Description Coding*- We derive a single-letter lower bound on the minimum sum rate of multiple description coding with symmetric distortion constraints. For the binary uniform source with the Hamming distortion measure, this lower bound can be evaluated with the aid of a certain minimax theorem. A similar minimax theorem is established in the quadratic Gaussian setting, which is further leveraged to analyze the special case where the minimum sum rate subject to two levels of distortion constraints (with the second level imposed on the complete set of descriptions) is attained; in particular, we determine the minimum achievable distortions at the intermediate levels.
*Characterising Correlation via Entropy Functions*- Characterising the capacity region for a network can be extremely difficult. Even with independent sources, determining the capacity region can be as hard as the open problem of characterising all information inequalities. The majority of computable outer bounds in the literature are relaxations of the Linear Programming bound which involves entropy functions of random variables related to the sources and link messages. When sources are not independent, the problem is even more complicated. Extension of Linear Programming bounds to networks with correlated sources is largely open. Source dependence is usually specified via a joint probability distribution, and one of the main challenges in extending linear program bounds is the difficulty (or impossibility) of characterising arbitrary dependencies via entropy functions. This paper tackles the problem by answering the question of how well entropy functions can characterise correlation among sources. We show that by using carefully chosen auxiliary random variables, the characterisation can be fairly "accurate".
*Erasure/List Exponents for Slepian-Wolf Decoding*- We analyze random coding error exponents associated with erasure/list Slepian-Wolf decoding using two different methods and then compare the resulting bounds. The first method follows the well known techniques of Gallager and Forney and the second method is based on a technique of distance enumeration, or more generally, type class enumeration, which is rooted in the statistical mechanics of a disordered system that is related to the random energy model (REM). The second method is guaranteed to yield exponent functions which are at least as tight as those of the first method, and it is demonstrated that for certain combinations of coding rates and thresholds, the bounds of the second method are strictly tighter than those of the first method, by an arbitrarily large factor. In fact, the second method may even yield an infinite exponent at regions where the first method gives finite values. We also discuss the option of variable-rate Slepian-Wolf encoding and demonstrate how it can improve on the resulting exponents.
*Operational Extremality of Gaussianity in Network Compression, Communication, and Coding*- Among other extremal properties, Gaussian sources are hardest to compress and communicate over. We review the main results of Shomorony and Avestimehr, and Asnani, Shomorony, Avestimehr, and Weissman, exhibiting the generality in which such extremal properties hold in compression, communication and coding over networks. These properties are established via operational arguments, bypassing elusive characterizations of fundamental performance limits: schemes tailored for the Gaussian case are harnessed for constructions of schemes that provably do essentially as well under any other source of the same covariance. The talk will highlight the main ideas behind these constructions and how the results, which were established for memoryless sources and channels, carry over to the presence of memory.

#### TuB3: Secrecy, Privacy and Cryptography

*Spectrum Bandit Optimization*- We consider the problem of allocating radio channels to links in a wireless network. Links interact through interference, modelled as a conflict graph (i.e., two interfering links cannot be simultaneously active on the same channel). We aim at identifying the channel allocation maximizing the total network throughput over a finite time horizon. Should we know the average radio conditions on each channel and on each link, an optimal allocation would be obtained by solving an Integer Linear Program (ILP). When radio conditions are unknown a priori, we look for a sequential channel allocation policy that converges to the optimal allocation while minimizing, along the way, the throughput loss, or regret, due to the need for exploring suboptimal allocations. We formulate this problem as a generic linear bandit problem, and analyze it in a stochastic setting where radio conditions are driven by an i.i.d. stochastic process, and in an adversarial setting where radio conditions can evolve arbitrarily. We provide, in both settings, algorithms whose regret upper bounds outperform those of existing algorithms.
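
The sequential-allocation idea can be illustrated with a stand-alone toy. The paper exploits the linear bandit structure of the problem; the sketch below deliberately ignores that structure and simply runs plain UCB1 over a small set of hypothetical candidate allocations with Bernoulli throughputs (all names and parameters are illustrative assumptions, not the authors' algorithm):

```python
import math
import random

def ucb_allocate(means, horizon, rng):
    """Toy UCB1 over a finite set of candidate channel allocations, each
    treated as one independent arm.  means[a] is the unknown average
    throughput of allocation a; rewards are Bernoulli draws."""
    n = len(means)
    counts = [0] * n
    sums = [0.0] * n
    for t in range(1, horizon + 1):
        if t <= n:
            a = t - 1  # initialization: play each allocation once
        else:
            # pick the allocation with the largest upper confidence bound
            a = max(range(n), key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2.0 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < means[a] else 0.0
        counts[a] += 1
        sums[a] += reward
    return counts

rng = random.Random(1)
counts = ucb_allocate([0.2, 0.5, 0.8], horizon=2000, rng=rng)
# the best allocation (index 2) should dominate the play counts
```

Because this toy treats allocations as unrelated arms, its regret scales with the number of allocations; the linear formulation in the paper is precisely what avoids that blow-up.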
*Physical-Layer Cryptography Through Massive MIMO*- We propose the new technique of physical-layer cryptography based on using a massive MIMO channel as a key between the sender and desired receiver, which need not be secret. The goal is low-complexity encoding and decoding for the desired transmitter-receiver pair, whereas decoding by an eavesdropper is computationally prohibitive. The massive MIMO system has a channel gain matrix that is drawn i.i.d. according to a Gaussian distribution, subject to additive white Gaussian noise. The decoding complexity is analyzed by mapping the massive MIMO system to a lattice. We show that the eavesdropper's decoder for the MIMO system with M-PAM modulation is equivalent to solving standard lattice problems that are conjectured to be of exponential complexity for both classical and quantum computers. Hence, under the widely-held conjecture that standard lattice problems are hard in the worst case, the proposed encryption scheme has security that exceeds that of the most common encryption methods used today, such as RSA and Diffie-Hellman. Additionally, we show that this scheme could be used to communicate securely without a pre-shared secret key and with little computational overhead. In particular, a standard parallel channel decomposition allows the desired transmitter-receiver pair to encode and decode transmissions over the MIMO channel based on the singular value decomposition of the channel, while decoding remains computationally hard for an eavesdropper with an independent channel gain matrix, even if it knows the channel gain matrix between the desired transmitter and receiver. Thus, the massive MIMO system provides low-complexity encryption commensurate with the most sophisticated forms of application-layer encryption by exploiting the physical-layer properties of the radio channel.
*Secrecy & Rate Adaptation for Secure HARQ Protocols*- This paper is dedicated to the study of HARQ protocols under a secrecy constraint. An encoder sends information to a legitimate decoder while keeping it secret from the eavesdropper. Our objective is to provide a coding scheme that satisfies both reliability and confidentiality conditions. This problem has been investigated in the literature using a coding scheme that involves a unique secrecy parameter. The uniqueness of this parameter is sub-optimal with respect to the throughput criterion, and we propose a new coding scheme that introduces additional degrees of freedom. Our code involves Secrecy Adaptation and Rate Adaptation, and we call it the SARA-code. The first contribution is to prove that the SARA-code has small error probability and small information leakage rate. The second contribution is to show, over a numerical example, that the SARA-code improves the secrecy throughput.
*Protecting Data Against Unwanted Inferences*- We study the competing goals of utility and privacy as they arise when a provider delegates the processing of its personal information to a recipient who is better able to handle this data. We formulate our goals in terms of the inferences which can be drawn using the shared data. A whitelist describes the inferences that are desirable, i.e., providing utility. A blacklist describes the unwanted inferences which the provider wants to keep private. We formally define utility and privacy parameters using elementary information-theoretic notions and derive a bound on the region spanned by these parameters. We provide constructive schemes for achieving certain boundary points of this region. Finally, we improve the region by sharing data over aggregated time slots.

### 11:30 - 12:50

#### TuC1: Polar Coding 2

*Polar Coding for Secret-Key Generation*- Practical implementations of secret-key generation are often based on sequential strategies, which handle reliability and secrecy in two successive steps, called reconciliation and privacy amplification. In this paper, we propose an alternative scheme based on polar coding that jointly deals with reliability and secrecy. We study a binary degraded symmetric discrete memoryless source model with uniform marginals, and assume one-way rate-limited public communication between two legitimate users. Specifically, we propose secret-key capacity-achieving polar coding schemes, in which users rely on a pre-shared secret seed of negligible rate. For the model studied, we thus provide the first example of a low-complexity secret-key capacity-achieving scheme that handles vector quantization for rate-limited public communication. Furthermore, we provide examples for which no seed is required.
*On Polarization for the Linear Operator Channel*- We address the problem of reliably transmitting information through a network where the nodes perform random linear network coding and where an adversary potentially injects malicious packets into the network. A good model for such a channel is a linear operator channel, where in this work we employ a combined multiplicative and additive matrix channel. We show that this adversarial channel behaves like a subspace-based symmetric discrete memoryless channel (DMC) under subspace insertions and deletions and typically has an input alphabet with non-prime cardinality. This facilitates the recent application of channel polarization results for DMCs with arbitrary input alphabets by providing a suitable one-to-one mapping from input matrices to subspaces. As a consequence, we show that polarization for this adversarial linear operator channel can be obtained via an element-wise encoder mapping for the input matrices, which replaces the finite field summation in the channel combining step for Arıkan's classical polar codes.
*Channel Polarization with Higher Order Memory*- We introduce the design of a class of code sequences $\{\mathcal{C}_n^{(m)}, n = 1,2,\dotsc\}$ with memory level $m = 1,2,\dotsc$ based on the channel polarization idea, where $\{\mathcal{C}_n^{(1)}\}$ coincides with the polar codes presented by Arıkan in [1]. The new codes achieve the symmetric capacity of arbitrary binary-input discrete memoryless channels. We derive bounds on the polarization performance as scaled with $m$. We show that $\{\mathcal{C}_n^{(m)}\}$ offers monotonically decreasing encoding and decoding complexity with growing $m$.

#### TuC2: IT & Coding for Contemporary Video

*Some Coding and Information Theoretic Problems in Contemporary (Video) Content Delivery*- Information and coding theory have traditionally been used in point-to-point scenarios, to compute and achieve the transmission channel capacity as well as to compute and achieve the optimal compression rate vs. source distortion trade-off. In today's networks, the same (video) data is often transmitted to multiple users, simultaneously over diverse channels. The users may differ not only in the size and resolution of their displays and computing power, but may also be interested in different video scenes, with different levels of distortion, or even with different distortion measures. This paper describes several transmission and compression problems that arise in such heterogeneous network scenarios, and discusses how information and coding theory could be used to address them.
*Source Broadcasting over Erasure Channels: Distortion Bounds and Code Design*- We study a lossy source-broadcasting problem involving the transmission of a binary source over a two-receiver erasure broadcast channel. The motivation of our work stems from the problem faced by a server that wishes to broadcast content to a diverse set of users with fractional source reconstruction requirements. In this problem, the server wishes to minimize the overall network latency incurred (measured by the number of channel uses per source symbol) when faced with users of heterogeneous channel qualities, computing capabilities, content demand etc. We provide two complementary approaches to this problem. The first approach is to consider the problem from a joint source-channel coding formulation. Under this formulation, we provide both inner and outer bounds for the network latency under an erasure distortion criterion. Alternatively, the second approach employs rateless coding and formulates an optimization problem so as to find a degree distribution that minimizes the network latency. We compare both approaches with numerical simulations.
*Impact of Random and Burst Packet Losses on H.264 Scalable Video Coding*- This paper studies the impact of packet loss on H.264 scalable video coding (SVC). A Markov Chain (MC) with $2^N$ states is developed to describe the error propagation process inside a group of pictures (GOP). The characteristics of different packet-loss events are captured by the initial state vector of the MC. Based on the proposed model, the performance of different GOP structures can be evaluated under both random and burst packet losses.
*Network Coding Designs Suited for the Real World: What works, What doesn't, What's Promising*- Network coding (NC) has attracted tremendous attention from the research community due to its potential to significantly improve networks' throughput, delay, and energy performance, as well as a means to simplify protocol design and naturally provide security support. The possibilities in code design have produced a large influx of new ideas and approaches to harness the power of NC. But which of these designs are truly successful in practice, and which will not live up to their promised theoretical gains due to real-world constraints? Without attempting a comprehensive view of all practical pitfalls, this paper seeks to identify key ingredients of a successful design, critical and common limitations of most intra-session NC systems, as well as promising techniques and ideas to guide future models and research problems grounded on practical concerns.

#### TuC3: Information-Theoretic Secrecy

*Exploiting Common Randomness: a Resource for Network Secrecy*- We investigate the problem of secure communication in a simple network with three communicating parties: two distributed sources that communicate over orthogonal channels to one destination node. The cooperation between the sources is restricted to a rate-limited common random source they both observe. The communication channels are erasure channels with strictly causal channel state information of the destination available publicly. A passive adversary is present in the system, eavesdropping on any one of the channels. We design a linear scheme that ensures secrecy against the eavesdropper. By deriving an outer bound for the problem, we prove that the scheme is optimal in certain special cases.
*Secrecy in Cascade Networks*- We consider a cascade network where a sequence of nodes each send a message to their downstream neighbor to enable coordination, the first node having access to an information signal. An adversary also receives all the communication as well as additional side-information. The performance of the system is measured by a payoff function over all actions produced by the nodes as well as the adversary. The challenge is to maintain secrecy from the adversary in order to thwart his attempt to reduce the payoff. We obtain inner and outer bounds on performance, and give examples where they are tight. From these bounds, we also derive the optimal equivocation that can be achieved in this setting, as a special case.
*Secure Degrees of Freedom of MIMO X-Channels with Output Feedback and Delayed CSI*- We investigate the problem of secure transmission over a two-user multi-input multi-output (MIMO) X-channel with noiseless local feedback and delayed channel state information (CSI) available at transmitters. The transmitters are equipped with $M$ antennas each, and the receivers are equipped with $N$ antennas each. For this model, we characterize the optimal sum secure degrees of freedom (SDoF) region. We show that, in the presence of local feedback and delayed CSI, the sum SDoF region of the MIMO X-channel is the same as the SDoF region of a two-user MIMO BC with $2M$ antennas at the transmitter and $N$ antennas at each receiver. This result shows that, upon availability of feedback and delayed CSI, there is no performance loss in sum SDoF due to the distributed nature of the transmitters. Next, we show that this result also holds if only global feedback is conveyed to the transmitters. We also study the case in which only local feedback is provided to the transmitters, i.e., without CSI, and derive a lower bound on the sum SDoF for this model. Furthermore, we specialize our results to the case in which there are no security constraints. In particular, similar to the setting with security constraints, we show that the optimal sum degrees of freedom (sum DoF) region of the $(M,M,N,N)$-MIMO X-channel is the same as the DoF region of a two-user MIMO BC with $2M$ antennas at the transmitter and $N$ antennas at each receiver. We illustrate our results with some numerical examples.
*The Secrecy Capacity of a Compound MIMO Gaussian Channel*- The compound MIMO Gaussian wiretap channel is studied, where the channel to the legitimate receiver is known and the eavesdropper channel is not known to the transmitter but is known to have a bounded spectral norm (channel gain). The compound secrecy capacity is established without the degradedness assumption and the optimal signaling is identified: the compound capacity equals the worst-case channel capacity thus establishing the saddle-point property, the optimal signaling is Gaussian and on the eigenvectors of the legitimate channel and the worst-case eavesdropper is isotropic. The eigenmode power allocation somewhat resembles the standard water-filling but is not identical to it.
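
The water-filling resemblance mentioned at the end of the last abstract can be made concrete. The sketch below is the textbook water-filling allocation over fixed eigenmode gains, given here as generic background rather than the paper's (explicitly non-identical) allocation; the gains and power budget are illustrative:

```python
def waterfill(gains, power, iters=100):
    """Standard water-filling: maximize sum_i log(1 + g_i * p_i) subject to
    sum_i p_i = power.  The optimum is p_i = max(mu - 1/g_i, 0) for a water
    level mu found here by bisection."""
    lo, hi = 0.0, power + max(1.0 / g for g in gains)
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        used = sum(max(mu - 1.0 / g, 0.0) for g in gains)
        if used > power:
            hi = mu
        else:
            lo = mu
    return [max(mu - 1.0 / g, 0.0) for g in gains]

p = waterfill([2.0, 1.0, 0.1], power=1.0)
# stronger eigenmodes get more power; very weak modes may get none at all
```

With these example gains the weakest eigenmode falls below the water level and receives zero power, which is the qualitative behavior the abstract contrasts against its own allocation.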

### 14:00 - 15:00

#### TuD: Tuesday's Plenaritas

"Polar versus Spatial Coupling: The Good, the Bad, and the Ugly"

Rüdiger Urbanke

Abstract:

Once upon a time it was deemed to be difficult to construct coding schemes that allow reliable transmission close to capacity at low complexity. But these days we have several such schemes. How do they compare? I will discuss in particular polar codes and spatially coupled codes. Which wins when we consider their scaling behavior, their complexity, the achievable throughput, universality, or robustness? As we will see, much is known, but many questions are still open.

(Joint work with Hamed Hassani)

"Managing Bursty Interference with Feedback"

Suhas Diggavi

Abstract:

Feedback is beneficial for managing bursty interference in the physical layer, where the burstiness of interference is caused by the lack of coordination in the upper layers. In our previous work [1], we investigated a two-user bursty interference channel, where the presence of interference on the two interfering links is governed by a single Bernoulli random state, i.e., they are on and off simultaneously. In this work we extend the results in [1] to the case where the two interfering links can be on and off separately, governed by two different Bernoulli states. When the marginal probability distributions of the two states are the same, we recover the results in [1]. It turns out that the capacity characterization does not depend on the joint distributions of the two states.

(Joint work with I-Hsiang Wang)

### 15:10 - 16:30

#### TuE1: Spatial Coupling

*Thresholds of Spatially Coupled Systems via Lyapunov's Method*- The threshold, or saturation, phenomenon of spatially coupled systems is revisited in the light of Lyapunov's theory of dynamical systems. It is shown that an application of Lyapunov's direct method can be used to quantitatively describe the threshold phenomenon, prove convergence, and compute threshold values. This provides a general proof methodology for the various systems recently studied.
*Displacement Convexity -- A Useful Framework for the Study of Spatially Coupled Codes*- Spatial coupling has recently emerged as a powerful paradigm to construct graphical models that work well under low-complexity message-passing algorithms. Although much progress has been made on the analysis of spatially coupled models under message passing, there is still room for improvement, both in terms of simplifying existing proofs as well as in terms of proving additional properties. We introduce one further tool for the analysis, namely the concept of displacement convexity. This concept plays a crucial role in the theory of optimal transport and it is also well suited for the analysis of spatially coupled systems. In cases where the concept applies, displacement convexity allows functionals of distributions which are not convex to be represented in an alternative form, so that they are convex with respect to the new parametrization. The alternative convex structure can then often be used to prove the uniqueness of the minimizer of this functional. As a proof of concept we consider spatially coupled $(l,r)$-regular Gallager ensembles when transmission takes place over the binary erasure channel. In particular, we first show the existence of an optimal profile which minimizes the potential functional governing this system. This profile characterizes the "decoding wave" of the spatially coupled system. We then show that the potential function of the coupled system is displacement convex. Due to some translational degrees of freedom the convexity by itself falls short of establishing the uniqueness of the minimizing profile. But as we will discuss it is an important step in this direction.
*A Closed-Form Scaling Law for Convolutional LDPC Codes Over the BEC*- We propose a scaling law for the error probability of convolutional LDPC ensembles when transmission takes place over the binary erasure channel. We discuss how the parameters of the scaling law are connected to fundamental quantities appearing in the asymptotic analysis of these ensembles and we verify that the predictions of the scaling law fit well with data derived from simulations over a wide range of parameters.
*A Finite Length Performance Analysis of LDPC Codes Constructed by Connecting Spatially Coupled Chains*- The finite length performance of codes on graphs constructed by connecting spatially coupled low-density parity-check (SC-LDPC) code chains is analyzed. Successive (peeling) decoding is considered for the binary erasure channel (BEC). The evolution of the undecoded portion of the bipartite graph remaining after each iteration is analyzed as a dynamical system. It is shown that, in addition to superior iterative decoding thresholds, connected chain ensembles have better performance than single chain ensembles of the same rate and length.

#### TuE2: Ranking and Group Testing

*Aggregating Rankings with Positional Constraints*- We consider the problem of rank aggregation, where the goal is to assemble ordered lists into one consensus order. Our contributions consist of proposing a new family of distance measures that allow for incorporating practical ranking constraints into the aggregation problem formulation; showing how such distance measures arise from a generalization of Kemeny's axioms of the Kendall $\tau$ distance; and proving that special classes of the proposed distances may be computed in polynomial time.
*Iterative Similarity Inference via Message Passing in Factor Graphs for Collaborative Filtering*- In this paper, we develop a Belief Propagation (BP) algorithm for similarity computation to improve the recommendation accuracy of the neighborhood method, which is one of the most popular Collaborative Filtering (CF) recommendation algorithms. We formulate a probabilistic inference problem to compute the marginal posterior distributions of similarity variables from their joint posterior distribution given the observed ratings. However, direct computation is prohibitive in large-scale recommender systems. Therefore, we introduce an appropriately chosen factor graph to express the factorization of the joint distribution function, and utilize the BP algorithm that operates in the factor graph to exploit the factorization for efficient inference. In addition, since a high degree at a factor node incurs an exponential increase in computational complexity, we also propose a complexity-reduction technique. The overall complexity of the proposed BP algorithm on a factor graph is linear in the number of variables, which ensures scalability. Finally, through experiments on the MovieLens dataset, we show the superior prediction accuracy of the proposed BP-based similarity computation algorithm for recommendation.
*Stochastic Threshold Group Testing*- We formulate and analyze a stochastic threshold group testing problem motivated by biological applications. Here a set of $n$ items contains a subset of $d \ll n$ defective items. Subsets (pools) of the $n$ items are tested. The test outcomes are negative if the number of defectives in a pool is no larger than $l$; positive if the pool contains more than $u$ defectives, and stochastic (negative/positive with some probability) if the number of defectives in the pool is in the interval $[l, u]$. The goal of a stochastic threshold group testing scheme is to identify the set of $d$ defective items via a "small" number of such tests with high probability. In the regime that $l = o(d)$ we present schemes that are computationally feasible to design and implement, and require near-optimal number of tests. Our schemes are robust to a variety of models for probabilistic threshold group testing.
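
The test model itself is easy to simulate. The toy sketch below uses hypothetical parameters and resolves the stochastic regime with a fair coin flip, which is an illustrative choice rather than the paper's model (the paper allows a general probability, and the boundary case $k = l$ is taken as negative here):

```python
import random

def threshold_test(pool, defectives, l, u, rng):
    """Outcome of one stochastic threshold group test on a pool (a set of
    item indices): 0 (negative) if it holds at most l defectives, 1
    (positive) if it holds more than u, and a fair coin flip otherwise."""
    k = len(pool & defectives)  # number of defective items in the pool
    if k <= l:
        return 0
    if k > u:
        return 1
    return rng.randint(0, 1)    # stochastic regime: l < k <= u

rng = random.Random(0)
defectives = {3, 7, 11}
clean = threshold_test({0, 1, 2}, defectives, l=1, u=2, rng=rng)      # 0 defectives
loaded = threshold_test({3, 7, 11, 4}, defectives, l=1, u=2, rng=rng) # 3 defectives
fuzzy = threshold_test({3, 7, 0}, defectives, l=1, u=2, rng=rng)      # 2 defectives
```

A decoding scheme then has to identify the defective set from many such noisy pool outcomes, which is what the abstract's test-efficient constructions address.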
*Optimal Binary Measurement Matrices for Compressed Sensing*- We explicitly construct binary measurement matrices with good sparse approximation guarantees. Specifically, our measurement matrices have an order-optimal number of measurements and an $\ell_1/\ell_1$ approximation guarantee. Our construction uses the progressive edge growth technique. We apply coding theoretic results and rely on a recent connection of compressed sensing to LP relaxation for channel decoding.

#### TuE3: Multi-terminal Communication

*Fundamental Limits of Distributed Caching in D2D Wireless Networks*- We consider a wireless Device-to-Device (D2D) network where communication is restricted to be single-hop, users make arbitrary requests from a finite library of possible files and user devices cache information in the form of carefully designed sets of packets from all files in the library. We consider the combined effect of coding in the delivery phase, achieving "coded multicast gain", and of spatial reuse due to local short-range D2D communication. Somewhat counterintuitively, we show that the coded multicast gain and the spatial reuse gain do not cumulate, in terms of the throughput scaling laws. In particular, the spatial reuse gain shown in our previous work on uncoded random caching and the coded multicast gain shown in this paper yield the same scaling-law behavior, but no further scaling-law gain can be achieved by using both coded caching and D2D spatial reuse.
*Analog Index Coding over Block-Fading MISO Broadcast Channels with Feedback*- We define an "analog" index coding problem where a transmitter wishes to send a set of analog sources to be reconstructed with a desired distortion level at $K$ receivers, each of which is characterized by a set of desired sources and by its own side information. The transmission channel is a multi-input single-output (MISO) broadcast channel with independent fading, perfect channel state information at the receiver and (strictly causal) feedback at the transmitter. An outer bound on the rate-distortion region is derived, revealing an interesting tradeoff between the communication rate (source samples per channel use) and the exponential decay of the distortion with SNR in dB. We focus on the high-SNR and low-distortion regime, in which the proposed outer bound is shown to be tight for some cases of particular interest.
*Equivalence of Inner Regions for Broadcast Channel Coding*- The aim of this paper is to investigate the relationship between inner regions for broadcast channel codes transmitting common and private messages. One of the regions is known as the Marton inner region, another region was derived by Gel'fand and Pinsker, and yet another is obtained from the saturation property and collision-resistance property. Although at first glance the Marton inner region appears to be larger than the other regions, they are in fact equivalent.
*A General Framework for Statistically Characterizing the Dynamics of MIMO Channels*- In multiple-input multiple-output (MIMO) systems the communication channel can be split into $s$ parallel single-input single-output eigenchannels. The channel gains associated with these eigenchannels depend on the magnitude of the eigenvalues of the complex random matrix that characterizes the MIMO channel. We present a general framework for the characterization of the dynamics of MIMO channels. In addition, we provide new analytical results for the distribution of the largest eigenvalue of two correlated Wishart matrices, which enable a direct evaluation of different system performance metrics such as the probability of two consecutive outages separated by a given time window, the level crossing rate and the average fade duration.
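The eigenchannel decomposition referred to in the last abstract comes from the singular value decomposition of the channel matrix; a minimal numpy illustration (i.i.d. Rayleigh fading assumed purely for the example):

```python
import numpy as np

rng = np.random.default_rng(1)
nt, nr = 4, 4
# Rayleigh-fading MIMO channel: i.i.d. circularly-symmetric complex Gaussian entries
H = (rng.normal(size=(nr, nt)) + 1j * rng.normal(size=(nr, nt))) / np.sqrt(2)

# The SVD splits the channel into parallel SISO eigenchannels with gains s_i
U, s, Vh = np.linalg.svd(H)

# The eigenchannel power gains s_i^2 are the eigenvalues of H H^H (a Wishart matrix)
eigs = np.linalg.eigvalsh(H @ H.conj().T)
print(np.sort(s**2))
print(np.sort(eigs))  # same values, up to numerical precision
```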

### 16:50 - 18:10

#### TuF1: LDPC Codes

*Design of Masking Matrix for QC-LDPC Codes*- The masking matrix plays an important role in constructing new classes of regular and irregular quasi-cyclic low density parity check (QC-LDPC) codes. By coupling two identical graphs in a special way, we present a new structure of the masking matrix, whose Tanner graph can be seen as a protograph. From this perspective, we propose a Gaussian Approximation algorithm for protograph-based LDPC codes to analyze the asymptotic performance of this class of codes. It is shown that, although the proposed structure of the masking matrix has almost the same convergence threshold as the conventional one produced randomly by the progressive edge growth (PEG) algorithm, the former converges faster than the latter. Simulation results illustrate that the QC-LDPC code constructed by the proposed structure of the masking matrix achieves a coding gain of about 0.2 dB over the conventional one.
*Nonbinary LDPC-Coded Differential Modulation: Performance and Decoding Algorithm*- This paper is concerned with the design and performance of nonbinary LDPC-coded differential modulation systems. A low-complexity joint detection/decoding method for non-coherent demodulation is proposed, in which the hard-message-passing strategy is used for a joint factor graph. It combines trellis-based differential detection aided by channel prediction and the reliability-based decoding of nonbinary LDPC codes introduced in [1]. The Max-Log-MAP algorithm with soft-in hard-out is used for the differential detection. Simulation results show that the proposed method can offer good performance at greatly reduced complexity.
*Design of Non-Binary Quasi-Cyclic LDPC codes by ACE optimization*- An algorithm for constructing Tanner graphs of non-binary irregular quasi-cyclic LDPC codes is introduced. It employs a new method for selection of edge labels allowing control over the code's non-binary ACE spectrum and resulting in low error-floor. The efficiency of the algorithm is demonstrated by generating good codes of short to moderate length over small fields, outperforming codes generated by the known methods.
*Improved decoding for binary source coding with coded side information*- This paper presents a new iterative decoding algorithm for the source coding with coded side information problem. Side information (SI) is compressed to an index by a many-to-one (quantization) function. Instead of using the reconstruction corresponding to the quantization index as a single representative SI word to aid the main decoder, one can modify it by projecting an intermediate estimate of the source word onto the Voronoi cell associated with the SI index. The hope is that the projection brings the representative SI word closer to the source word, and thus accelerates iterative decoding. Simulations using LDPC syndrome coding in the main branch and trellis-coded quantization in the SI branch show that for a fixed number of decoder iterations, this method indeed increases the number of correctly decoded source words. In fact, the decoding threshold is shifted, which may be attributed to a partial compensation of the suboptimality of the quantizer.
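As background to the QC-LDPC constructions discussed above, a quasi-cyclic parity-check matrix is assembled from circulant permutation blocks, with a base (masking) matrix selecting which blocks are zeroed. A toy sketch with arbitrary shift values, not the codes from any of these papers:

```python
import numpy as np

def cpm(shift, L):
    """L x L circulant permutation matrix: the identity cyclically shifted by `shift`."""
    return np.roll(np.eye(L, dtype=int), shift, axis=1)

L = 5
# Base (exponent) matrix: entry >= 0 -> CPM with that shift, -1 -> masked (all-zero) block
B = np.array([[0,  1,  2, -1],
              [2, -1,  0,  1]])

H = np.block([[cpm(e, L) if e >= 0 else np.zeros((L, L), dtype=int)
               for e in row] for row in B])
print(H.shape)        # (2L, 4L): two block-rows, four block-columns
print(H.sum(axis=1))  # row weight = number of unmasked blocks in that block-row
```

Masking more entries of `B` thins the Tanner graph, which is exactly the degree-distribution control the first abstract is about.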

#### TuF2: Fundamental Limits of Learning and Inference

*An Impossibility Result for High Dimensional Supervised Learning*- We study high-dimensional asymptotic performance limits of binary supervised classification problems where the class conditional densities are Gaussian with unknown means and covariances and the number of signal dimensions scales faster than the number of labeled training samples. We show that the Bayes error, namely the minimum attainable error probability with complete distributional knowledge and equally likely classes, can be arbitrarily close to zero and yet the limiting minimax error probability of every supervised learning algorithm is no better than a random coin toss. In contrast to related studies where the classification difficulty (Bayes error) is made to vanish, we hold it constant when taking high-dimensional limits. In contrast to VC-dimension based minimax lower bounds that consider the worst case error probability over all distributions that have a fixed Bayes error, our worst case is over the family of Gaussian distributions with constant Bayes error. We also show that a nontrivial asymptotic minimax error probability can only be attained for parametric subsets of zero measure (in a suitable measure space). These results expose the fundamental importance of prior knowledge and suggest that unless we impose strong structural constraints, such as sparsity, on the parametric space, supervised learning may be ineffective in high dimensional small sample settings.
*Information-theoretic Limits on the Classification of Gaussian Mixtures: Classification on the Grassmann Manifold*- Motivated by applications in high-dimensional signal processing, we derive fundamental limits on the performance of compressive linear classifiers. By analogy with Shannon theory, we define the classification capacity, which quantifies the maximum number of classes that can be discriminated with low probability of error, and the diversity-discrimination tradeoff, which quantifies the tradeoff between the number of classes and the probability of classification error. For classification of Gaussian mixture models, we identify a duality between classification and communications over non-coherent multiple-antenna channels. This duality allows us to characterize the classification capacity and diversity-discrimination tradeoff using existing results from multiple-antenna communication. We also identify the easiest possible classification problems, which correspond to low-dimensional subspaces drawn from an appropriate Grassmann manifold.
*Asymptotically Minimax Regret by Bayes Mixtures for Non-exponential Families*- We study the problems of data compression, gambling and prediction of a sequence $x^n = x_1 x_2 \dotsc x_n$ from an alphabet $\mathcal{X}$, in terms of regret with respect to various families of probability distributions. It is known that the regret of the Bayes mixture with respect to a general exponential family asymptotically achieves the minimax value when variants of Jeffreys prior are used, under the condition that the maximum likelihood estimate is in the interior of the parameter space. We discuss a modification of Jeffreys prior which has measure outside the given family of densities, to achieve minimax regret with respect to non-exponential type families, e.g.\ curved exponential families and mixture families. These results also provide a characterization of Rissanen's stochastic complexity for those classes.
*Asymptotically equivalent sequences of matrices and relative entropy*- In this paper we prove an asymptotic formula that has recently been used as a numerical integration method to approximate the relative entropy (or Kullback-Leibler distance) between two probability density functions with bounded support in terms of functions of Hermitian Toeplitz matrices. To prove this asymptotic formula we use Gray's concept of asymptotically equivalent sequences of matrices.
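For orientation, the classical asymptotic minimax regret that the Jeffreys-mixture results above refer to, for a smooth $k$-parameter family $\{p_\theta\}$ with Fisher information $I(\theta)$ under standard regularity conditions (a well-known formula, stated here only as context), is

```latex
\min_q \max_{x^n} \left( \log \frac{1}{q(x^n)} - \min_{\theta} \log \frac{1}{p_\theta(x^n)} \right)
  = \frac{k}{2} \log \frac{n}{2\pi} + \log \int \sqrt{\det I(\theta)}\, d\theta + o(1).
```

The non-exponential families studied in the third abstract are precisely cases where the standard derivation of this formula needs modification.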

#### TuF3: Multiple Access Channels

*Achievable Rates for Intermittent Multi-Access Communication*- We formulate a model for intermittent multi-access communication for two users that captures the bursty transmission of the codeword symbols for each user and the possible asynchronism between the receiver and the transmitters as well as between the transmitters themselves. By making different assumptions for the intermittent process, we specialize the system to a random access system with or without collisions. For each model, we characterize the performance of the system in terms of achievable rate regions. The intermittency of the system comes with a significant cost in our achievable schemes.
*Gaussian Many-Access Channels: Definition and Symmetric Capacity*- This paper studies communication networks with a very large number of users simultaneously communicating with an access point. A new notion of many-access channel (MnAC) is introduced, which is the same as a multiaccess channel except that the number of users increases unboundedly with the coding block length. Unlike the conventional multiaccess channel with a fixed number of users, the joint typicality decoding technique is not directly applicable to establish the achievability of the capacity. It is shown that, as long as the number of users grows sublinearly with the coding block length, random coding with Feinstein's threshold decoding is sufficient to achieve the symmetric capacity of the Gaussian MnAC.
*MAC Capacity Under Distributed Scheduling of Multiple Users and Linear Decorrelation*- Consider the problem of a multiple-antenna Multiple-Access Channel in the limit of a large number of users. Clearly, in practical scenarios, only a small subset of the users can be scheduled to utilize the channel simultaneously. Thus, a problem of user selection arises. Since solutions which collect Channel State Information (CSI) from all users and decide on the best subset to transmit in each slot do not scale when the number of users is large, distributed algorithms for user selection are advantageous. In this paper, we suggest distributed user selection algorithms which select a group of users to transmit without coordinating between all users and without all users sending CSI to the base station. These threshold-based algorithms are analyzed, and their expected capacity in the limit of a large number of users is investigated. It is shown that for a large number of users a distributed algorithm can achieve the same scaling laws as the optimal centralized scheme.
*On the Weakest Resource for Coordination in AV-MACs with Conferencing Encoders*- If the senders and the receiver of an Arbitrarily Varying Multiple-Access Channel (AV-MAC) have access to the outputs of discrete correlated memoryless sources, the same rate region is achievable as if common randomness were available. This reduces the necessary amount of cooperation in an AV-MAC considerably. Moreover, to transmit blocklength-$n$ words, no more than order $\log n$ source outputs are required.
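The distributed threshold-based selection idea in the third abstract can be sketched in a few lines: each user compares only its own gain to a common threshold, with no CSI exchange. All parameters here are illustrative; with i.i.d. unit-mean exponential gains, the threshold $\ln(n/k)$ leaves roughly $k$ users above it:

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 10000, 10                  # number of users, target number of contenders

# Threshold chosen so that P(g > t) = k/n for Exp(1) channel gains
t = np.log(n / k)

gains = rng.exponential(size=n)   # each user observes only its own gain
contenders = np.flatnonzero(gains > t)  # users that decide, locally, to transmit
print(len(contenders))            # ~ Binomial(n, k/n), i.e. about k users
```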

### 20:45 - 22:00

#### TuG: Guided Tour of Seville's Alcázar Palace

## Wednesday, September 11

### 08:40 - 09:40

#### WeA: Wednesday's Plenary

Biology has been transformed by new technologies that provide detailed descriptions of the molecular changes that occur in diseases. However, it is difficult to use these data to reveal new therapeutic insights for several reasons. Despite their power, each of these methods still only captures a small fraction of the cellular response. Moreover, when different assays are applied to the same problem, they provide apparently conflicting answers. I will show that network modeling reveals the underlying consistency of the data by identifying small, functionally coherent pathways linking the disparate observations. We have used these methods to analyze how oncogenic mutations alter signaling and transcription and to prioritize experiments aimed at discovering therapeutic targets.

### 09:50 - 11:10

#### WeB1: Real Number Codes

*A new design criterion for spherically-shaped division algebra-based space-time codes*- This work considers normalized inverse determinant sums as a tool for analyzing the performance of division algebra based space-time codes for multiple antenna wireless systems. A general union bound based code design criterion is obtained as a main result. In our previous work, the behavior of inverse determinant sums was analyzed using point counting techniques for Lie groups; it was shown that the asymptotic growth exponents of these sums correctly describe the diversity-multiplexing gain trade-off of the space-time code for some multiplexing gain ranges. This paper focuses on the constant terms of the inverse determinant sums, which capture the coding gain behavior. Pursuing the Lie group approach, a tighter asymptotic bound is derived, allowing us to compute the constant terms for several classes of space-time codes appearing in the literature. The resulting design criterion suggests that the performance of division algebra based codes depends on several fundamental algebraic invariants of the underlying algebra.
*Projections, Dissections and Bandwidth Expansion Mappings*- We address the problem of constructing explicit mappings from a $k$-dimensional continuous alphabet source to an $n$-dimensional Gaussian channel. The source is assumed to be uniformly distributed on the unit cube $[0,1)^k$. The scheme considered is based on a family of piecewise linear mappings and its performance is shown to be related to specific projected lattices of $\mathbb{Z}^n$. We study sufficient conditions for the mean squared error of such mappings to scale optimally with the signal-to-noise ratio of the channel and present an explicit construction for the case $k = n − 1$. However, in some other cases our scheme requires the source to be uniformly distributed over a fundamental region of a specific lattice, which may not be congruent to $[0,1)^k$. A dissection technique is presented in order to overcome the source support mismatch and the MSE degradation of such a transformation is analyzed. An example construction of a $2:n$ expansion mapping using the dissection technique is presented and is shown to exhibit optimal scaling of the MSE with the channel SNR.
*Robust error correction for real-valued signals via message-passing decoding and spatial coupling*- We revisit the error correction scheme of real-valued signals when the codeword is corrupted by gross errors on a fraction of entries and a small noise on all the entries. Combining the recent developments of approximate message passing and the spatially-coupled measurement matrix in compressed sensing we show that the error correction and its robustness towards noise can be enhanced considerably. We discuss the performance in the large signal limit using previous results on state evolution, as well as for finite size signals through numerical simulations. Even for relatively small sizes, the approach proposed here outperforms convex-relaxation-based decoders.
*Packing Tubes on Tori: An Efficient Method for Low SNR Analog Error Correction*- In this study we introduce a new class of bandwidth-expansion source-channel codes for analog error correction that can work in any even dimensional space and show superior performance in the low SNR region. Our codes are constructed as geodesics on flat tori as previous studies have done, but we use a tube radius construct derived from the global circumradius function. We prove that geodesics on flat tori lead to constant generalized curvature curves and exploit that property to give a single-argument form for the circumradius function. We optimize encoder parameters by minimizing the global radius of curvature subject to harmonic frequency structure which leads to closed twisted tubes with maximal tube radius on the tori. We exploit the isometry of the flat torus with the hyperrectangle to derive simple closed-form decoders based on torus projections that come within 2 dB of the maximum likelihood decoder.
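The geodesics-on-flat-tori idea in the last abstract amounts to winding a line around a torus embedded in Euclidean space. A minimal $1{:}4$ bandwidth-expansion sketch with arbitrary winding numbers and a brute-force grid decoder (an illustration of the general technique, not the paper's optimized construction):

```python
import numpy as np

A, B = 1, 3  # harmonic winding numbers (illustrative choice)

def encode(s):
    """Map s in [0,1) to a geodesic on a flat torus in R^4 (unit-norm codeword)."""
    u, v = 2 * np.pi * A * s, 2 * np.pi * B * s
    return np.array([np.cos(u), np.sin(u), np.cos(v), np.sin(v)]) / np.sqrt(2)

def decode(y, grid=10000):
    """Minimum-distance decoding by brute-force search along the curve."""
    cand = np.linspace(0, 1, grid, endpoint=False)
    pts = np.stack([encode(c) for c in cand])
    return cand[np.argmin(np.linalg.norm(pts - y, axis=1))]

s = 0.31416
y = encode(s)                # noiseless sanity check
print(abs(decode(y) - s))    # bounded by the grid resolution
```

Larger winding numbers stretch the curve (more noise protection per unit of source error) but bring distinct folds of the curve closer together, which is the tube-packing trade-off the paper optimizes.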

#### WeB2: Inference, Information and Complexity

*Reconstruction in the Labeled Stochastic Block Model*- The labeled stochastic block model is a random graph model representing networks with community structure and interactions of multiple types. In its simplest form, it consists of two communities of approximately equal size, and the edges are drawn and labeled at random with probability depending on whether their two endpoints belong to the same community or not. It has been conjectured in [1] that this model exhibits a phase transition: reconstruction (i.e. identification of a partition positively correlated with the ``true partition'' into the underlying communities) would be feasible if and only if a model parameter exceeds a threshold. We prove one half of this conjecture, i.e., reconstruction is impossible when below the threshold. In the converse direction, we introduce a suitably weighted graph. We show that when above the threshold by a specific constant, reconstruction is achieved by (1) minimum bisection, and (2) a spectral method combined with removal of nodes of high degree.
*Information-Preserving Markov Aggregation*- We present a sufficient condition for a non-injective function of a Markov chain to be a second-order Markov chain with the same entropy rate as the original chain. This permits an information-preserving state space reduction by merging states or, equivalently, lossless compression of a Markov source on a sample-by-sample basis. The cardinality of the reduced state space is bounded from below by the node degrees of the transition graph associated with the original Markov chain. We also present an algorithm listing all possible information-preserving state space reductions, for a given transition graph. We illustrate our results by applying the algorithm to a bi-gram letter model of an English text.
*On the Noise Sensitivity and Mutual Information of (Nested-) Canalizing Boolean Functions*- We investigate the mutual information of Boolean functions with noisy inputs. To this end, we derive a relation between the noise sensitivity and the mutual information. Further, we apply Fourier analysis to give upper bounds on the noise sensitivity and lower bounds on the mutual information for canalizing and nested canalizing functions. From these bounds we conjecture the optimality of these classes of functions.
*Coupled Neural Associative Memories*- We propose a novel architecture to design a neural associative memory that is capable of learning a large number of patterns and recalling them later in the presence of noise. It is based on dividing the neurons into local clusters and parallel planes, an architecture that is similar to the visual cortex of the macaque brain. The common features of our proposed model with those of spatially-coupled codes enable us to show that the performance of such networks in eliminating noise is drastically better than the previous approaches while maintaining the ability of learning an exponentially large number of patterns. Previous work either failed in providing good performance during the recall phase or in offering large pattern retrieval (storage) capacities. We also present computational experiments that lend additional support to the theoretical analysis.
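Noise sensitivity, as used in the third abstract, is $NS_\varepsilon(f) = \Pr[f(x) \neq f(x \oplus e)]$ with $x$ uniform and each bit of $e$ flipped independently with probability $\varepsilon$; for small $n$ it can be computed exactly by enumeration. A sketch with an example canalizing function (the function is illustrative, not taken from the paper):

```python
from itertools import product

def noise_sensitivity(f, n, eps):
    """Exact NS_eps(f) = Pr[f(x) != f(x xor e)], x uniform on {0,1}^n, e ~ Bern(eps)^n."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        for e in product((0, 1), repeat=n):
            p = 1.0
            for b in e:                      # probability of this flip pattern
                p *= eps if b else (1 - eps)
            y = tuple(xi ^ ei for xi, ei in zip(x, e))
            total += p * (f(x) != f(y))
    return total / 2**n                      # average over uniform x

f = lambda x: x[0] | (x[1] & x[2])  # canalizing in x1: x1 = 1 forces the output to 1
print(noise_sensitivity(f, 3, 0.0))  # 0.0: no noise, no disagreement
print(noise_sensitivity(f, 3, 0.5))  # 2*p*(1-p) with p = Pr[f=1] = 5/8, i.e. 15/32
```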

#### WeB3: Network Coding

*Linear Network Coding for Multiple Groupcast Sessions: An Interference Alignment Approach*- We consider the problem of linear network coding over communication networks, representable by directed acyclic graphs, with multiple groupcast sessions: the network comprises multiple destination nodes, each desiring messages from multiple sources. We adopt an interference alignment perspective, providing new insights into designing practical network coding schemes as well as the impact of network topology on the complexity of the alignment scheme. In particular, we show that under certain (polynomial-time checkable) constraints on networks with $K$ sources, it is possible to achieve a rate of $1/(L+d+1)$ per source using linear network coding coupled with interference alignment, where each destination receives messages from $L$ sources ($0 < L < K$), and $d$ is a parameter, solely dependent on the network topology, that satisfies $0 \leq d < K - L$.
*On the capacity of $ms/3t$ and $3s/nt$ sum-networks*- We consider directed acyclic networks where each terminal requires the sum of all the sources. Such a class of networks has been termed \textit{sum-networks} in the literature; a sum-network with $m$ sources and $n$ terminals is termed an $ms/nt$ sum-network. There has been previous work on the capacity of sum-networks; specifically, it has been shown that the capacity of a $3s/3t$ sum-network is either $0$, $2/3$ or $\geq 1$. In this paper, we consider some generalizations of $3s/3t$ sum-networks, namely $ms/3t$ and $3s/nt$ sum-networks, where $m,n \geq 3$. For $ms/3t$ and $3s/nt$ sum-networks, where $m,n \geq 3$, if the min-cut between each source and each terminal is at least $1$, the capacity is known to be at least $2/3$. In this paper, we show that there exist $ms/3t$ and $3s/nt$ sum-networks whose capacities lie between $2/3$ and $1$. Specifically, we show that for any positive integer $k \geq 2$, there exists an $ms/3t$ sum-network (and also a $3s/nt$ sum-network) whose capacity is $\frac{k}{k+1}$. We conjecture that the capacity of an $ms/3t$ sum-network, where $m > 3$ (and also of a $3s/nt$ sum-network, where $n > 3$), is either $0$, $\geq 1$, or of the form $\frac{k}{k+1}$, where $k$ is a positive integer greater than or equal to $2$.
*Combinatorial Flow over Cyclic Linear Networks*- A combinatorial notion of flow is identified for time-invariant linear coding over non-layered deterministic linear networks that may contain cycles, broadcast and interference links. It reveals the matroidal structure for efficient code construction, and enables a seamless extension of the classical network coding results. In addition to a max-flow min-cut theorem for multicast, time-invariant routing is shown to be optimal for unicast.
*Outer Bounds and a Functional Study of the Edge Removal Problem*- In this paper, we investigate the impact of a single edge on the capacity region of a network of error-free, point-to-point links. A family of networks and edges is said to exhibit the ``edge removal property'' if for any network and edge in the family, removing a $\delta$-capacity edge changes the capacity region by at most $\delta$ in each dimension. We derive a sufficient condition on network coding functions to guarantee that the edge removal property holds when the network is operated using functions satisfying the condition. Also, we extend the family of network capacity bounds for which it is known that removing a single edge of capacity $\delta$ changes the capacity bound by at most $f(\delta)$ in each dimension. Specifically, we show that removing a single $\delta$-capacity edge changes the Generalized Network Sharing outer bound by at most $\delta$ in each dimension and the Linear Programming outer bound by at most a constant times $\delta$ in each dimension.
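The single-session baseline these network coding papers generalize is the classical butterfly network, where one XOR at the bottleneck lets both sinks recover both source bits; a toy sketch:

```python
# Butterfly network over GF(2): sources emit b1, b2; the bottleneck forwards b1 ^ b2.
def butterfly(b1, b2):
    """Each sink gets one source bit directly plus the coded bottleneck symbol."""
    coded = b1 ^ b2              # the single network-coded symbol
    sink1 = (b1, coded ^ b1)     # sees b1 and the XOR -> recovers b2
    sink2 = (coded ^ b2, b2)     # sees b2 and the XOR -> recovers b1
    return sink1, sink2

for b1 in (0, 1):
    for b2 in (0, 1):
        assert butterfly(b1, b2) == ((b1, b2), (b1, b2))
print("both sinks recover (b1, b2) in all four cases")
```

Plain routing cannot serve both sinks at this rate over the same graph, which is what makes the coded bottleneck symbol essential.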

### 11:30 - 12:50

#### WeC1: Lattice Codes

*Lattice Codes Based on Product Constructions over $\mathbb{F}_q^2$ with Applications to Compute-and-Forward*- A novel construction of lattices is proposed. This construction can be thought of as Construction A with linear codes that can be represented as the Cartesian product of two linear codes over $\mathbb{F}_q$; hence it is referred to as the product construction. The existence of a sequence of Poltyrev-good lattices generated by the product construction under some conditions is shown. This family of lattices is then used to generate signal constellations with $q^2$ elements which can be used in conjunction with multilevel coding with channel codes over $\mathbb{F}_{q}$ instead of $\mathbb{F}_{q^2}$ to design good coded modulation schemes for compute-and-forward.
*Precoded Integer-Forcing Universally Achieves the MIMO Capacity to Within a Constant Gap*- An open-loop single-user multiple-input multiple-output communication scheme is considered where a transmitter, equipped with multiple antennas, encodes the data into independent streams all taken from the same linear code. The coded streams are then linearly precoded using the encoding matrix of a perfect linear dispersion space-time code. At the receiver side, integer-forcing equalization is applied, followed by standard single-stream decoding. It is shown that this communication architecture achieves the capacity of any Gaussian multiple-input multiple-output channel up to a gap that depends only on the number of transmit antennas.
*Enabling Multiplication in Lattice Codes via Construction A*- As a first step towards distributed computations in a wireless network, we introduce ideal lattices, that is lattices built over an ideal of a ring of integers in a number field, as a tool for constructing lattice codes at the physical layer. These lattices are not only additive groups as all lattices, but they are also equipped with a multiplication, which enables polynomial operations at each node of the wireless network. In this paper, we show how some of these ideal lattices can be constructed from polynomial codes (generalization of cyclic codes) via Construction A, and illustrate how these lattices enable multiplication.
*Spatially-Coupled Low Density Lattices based on Construction A with Applications to Compute-and-Forward*- We consider a class of lattices built using Construction A, where the underlying code is a non-binary spatially-coupled low density parity check code. We refer to these lattices as spatially-coupled LDA (SCLDA) lattices. SCLDA lattices can be constructed over integers, Gaussian integers and Eisenstein integers. We empirically study the performance of SCLDA lattices under belief propagation (BP) decoding. Ignoring the rate loss from termination, simulation results show that the BP threshold of SCLDA lattices over the integers is 0.11 dB from the Poltyrev limit (0.34 dB with the rate loss) and the BP threshold of SCLDA lattices over Eisenstein integers is 0.08 dB from the limit (0.19 dB with the rate loss). Motivated by this result, we use SCLDA lattice codes over Eisenstein integers for implementing a compute-and-forward protocol. For the examples considered in this paper, the thresholds for the proposed lattice codes are within 0.28 dB from the achievable rate of this coding scheme and within 1.06 dB from the achievable computation rate of Nazer and Gastpar's coding scheme in [6] extended to Eisenstein integers.
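Construction A, used by several papers in this session, lifts a linear code $C \subseteq \mathbb{F}_q^n$ to the lattice $\Lambda = \{x \in \mathbb{Z}^n : x \bmod q \in C\}$. A minimal membership sketch with an arbitrary small code (the generator matrix is illustrative, not from any of the papers):

```python
import numpy as np

q = 5
# Generator matrix of a small [3,2] linear code over F_q (illustrative choice)
G = np.array([[1, 0, 2],
              [0, 1, 3]])
# Enumerate all q^2 codewords uG mod q
codewords = {tuple((np.array(u) @ G) % q) for u in np.ndindex(q, q)}

def in_lattice(x):
    """x in Z^n lies in the Construction-A lattice iff x mod q is a codeword."""
    return tuple(np.mod(x, q)) in codewords

c = np.array([1, 0, 2])                               # a codeword, hence a lattice point
assert in_lattice(c) and in_lattice(c + q * np.array([3, -1, 2]))  # stable under qZ^n shifts
assert in_lattice(c + np.array([0, 1, 3]))            # closed under addition (linearity of C)
print(len(codewords))                                 # q**2, since G has full rank
```

The same recipe applies over Gaussian or Eisenstein integers by replacing $\mathbb{Z}$ and the modulus with the appropriate ring and prime.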

#### WeC2: Entropy and Statistics of Sequences

*Information Theory for Atypical Sequences*- One characteristic of the information age is the exponential growth of information, and the ready availability of this information through networks, including the internet -- ``Big Data.'' The question is what to do with this enormous amount of information. One possibility is to characterize it through statistics -- think averages. The perspective in this paper is the opposite, namely that most of the value in the information is in the parts that deviate from the average, that are unusual, atypical. Think of art: the valuable paintings or writings are those that deviate from the norms, that are atypical. The same could be true for venture development and scientific research. The paper first discusses what exactly should be understood by ``atypical.'' This is by no means straightforward. It has to be a well defined theoretical concept corresponding to some intuitive idea of atypicality, which when applied gives useful results. This is followed by a simple example of iid binary sequences. This example is simple enough that complete algorithms can be developed and analyzed, which give insights into atypicality. We finally develop a more general algorithm based on the Context Tree Weighting algorithm and apply it to heart rate variability.
*A Phase Transition for the Uniform Distribution in the Pattern Maximum Likelihood Problem*- In this paper, we consider the setting of the pattern maximum likelihood (PML) problem studied by Orlitsky et al. We present a well-motivated heuristic algorithm for deciding the question of when the PML distribution of a given pattern is uniform. The algorithm is based on the concept of a ``uniform threshold''. This is a threshold at which the uniform distribution exhibits an interesting phase transition in the PML problem, going from being a local maximum to being a local minimum.
*Shannon Entropy Estimation from Convergence Results in the Countable Alphabet Case*- In this paper, new results for Shannon entropy estimation and for the estimation of distributions, consistent in information divergence, are presented in the countable alphabet case. Sufficient conditions for the entropy convergence are adopted, including scenarios with both finitely and infinitely supported distributions. From this approach, new estimates, strong consistency results and rates of convergence are derived for various plug-in histogram-based schemes.
*The Entropy of Sums and Ruzsa's Divergence on Abelian Groups*- Motivated by a series of recently discovered inequalities for the sum and difference of discrete or continuous random variables (by Tao and the authors), we argue that the most natural, general form of these results is in terms of a special case of a mutual information, which we call the Ruzsa divergence between two probability distributions. This can be defined for arbitrary pairs of random variables taking values in any discrete (countable) set, on n-dimensional Euclidean space, or in fact on any locally compact Hausdorff abelian group. We study the basic properties of the Ruzsa divergence and derive numerous consequences. In particular, we show that all of the inequalities mentioned can be stated and proved in a unified way, extending their validity to the present general setting. For example, consequences of the basic properties of the Ruzsa divergence developed here include the fact that the entropies of the sum and the difference of two independent random vectors severely constrain each other, as well as entropy analogues of a number of results in additive combinatorics. Although the setting is quite general, the results are already of interest (and new) in the case of real-valued random vectors. For instance, another consequence is an entropic analogue (in the setting of log-concave distributions) of the Rogers-Shephard inequality for convex bodies.
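The plug-in (histogram-based) estimators analyzed in the third abstract simply substitute empirical frequencies into the Shannon formula; a minimal sketch:

```python
import numpy as np
from collections import Counter

def plugin_entropy(samples):
    """Plug-in estimate: Shannon entropy (in bits) of the empirical distribution."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * np.log2(c / n) for c in counts.values())

rng = np.random.default_rng(3)
samples = rng.integers(0, 2, size=10000)   # fair-coin source, true entropy H = 1 bit
print(plugin_entropy(samples))             # close to 1.0
```

The subtleties the paper addresses arise for countably infinite alphabets, where the empirical histogram only ever sees finitely many symbols and consistency is no longer automatic.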

#### WeC3: Cooperation and Interference

*Cooperative Energy Harvesting Communications with Relaying and Energy Sharing*- This paper considers two-hop communication networks where the transmitters harvest their energy in an intermittent fashion. In this network, communication is carried out by signal cooperation, i.e., relaying. Additionally, the transmitters have the option of transferring energy to one another, i.e., energy cooperation. Energy is partially lost during transfer, exposing a trade-off between energy cooperation and use of harvested energy for transmission. A multi-access relay model is considered and transmit power allocation and energy transfer policies that jointly maximize the sum-rate are found. It is shown that a class of power policies achieves the optimal sum-rate, allowing a separation of optimal energy transfer and optimal power allocation problems. The optimal energy transfer policy is shown to be an ordered node selection, where nodes with better energy transfer efficiency and worse channels transfer all their energy to the relay or other source nodes via the relay. For the special case of a single source, the optimal policy requires the direction of energy transfer to remain unchanged unless either node depletes all of its energy. Overall, the findings provide the insight that cooperation of the source nodes by sharing energy with the relay node leads to them indirectly cooperating with each other, and that such cooperation can be carried out in a last-minute fashion.
*Zero vs.\ $\varepsilon$ Error in Interference Channels*- Traditional studies of multi-source, multi-terminal interference channels typically allow a vanishing probability of error in communication. Motivated by the study of network coding, this work addresses the task of quantifying the loss in rate when insisting on zero error communication in the context of interference channels.
*The Sum-Capacity of different $K$-user Cognitive Interference Channels in Strong Interference*- This work considers different $K$-user extensions of the two-user cognitive interference channel model. The models differ by the cognitive abilities of the transmitters. In particular, the primary message sharing model, in which only one user is cognitive and knows all messages, and the cumulative message sharing model, in which a user knows the messages of all users with lesser index, are analyzed. The central contribution is the characterization of the sum-capacity of both models under a strong interference condition, which amounts to having one receiver in the network that can decode all transmitted signals without loss of optimality. The sum-capacity is evaluated for the Gaussian noise channel, together with the conditions on the channel gains that guarantee strong interference.
*On Degrees-of-Freedom of Full-Duplex Uplink/Downlink Channel*- Feasibility of full-duplex opens up the possibility of applying it to cellular networks to operate uplink and downlink simultaneously for multiple users. However, simultaneous operation of uplink and downlink poses a new challenge of intra-cell inter-node interference. In this paper, we identify scenarios where inter-node interference can be managed to provide significant gain in degrees of freedom over the conventional half-duplex cellular design.

### 14:00 - 15:30

#### WeD: Wednesday's Plenaritas

"Information and Control in Sensing-Acting systems: Predictive information and the emergence of hierarchical representations"

Naftali Tishby

Abstract:

A major theoretical challenge for understanding intelligent systems is the connection between control and information processing. While it is clear that optimal control theory utilizes sensory information for state estimation, optimal control fails to properly describe active sensing, exploration and learning, in which information gains - rather than physical value - should be part of the utility function. I will describe a new theoretical framework for sensing-acting systems which correctly combines information gathering, learning, and control. It extends the standard optimal control paradigm (a.k.a.\ reinforcement learning) by including information constraints on sensory, memory, and control channels. We achieve this by applying large deviation bounds to the POMDP formulation of optimal control, and show that a proper formulation of the problem should balance between "value to-go" and "information to-go" in a way that precisely incorporates the capacity and actionable information constraints. I then argue that when there is learning, the "predictive information" - information between the past and future of the process - becomes sub-extensive (grows sublinearly with time) and the above principle must be modified to remain asymptotically consistent. One intriguing possible modification is the rescaling/renormalization of time with information. I will show that such time-renormalization of the optimal control (Bellman) equations can explain the discounting of rewards, as well as the emergence of hierarchical representations in both sensory perception and planning.

"Source Coding, Lists, and Rényi Entropy"

Amos Lapidoth

Abstract:

A sequence produced by a memoryless source is to be described using a fixed number of bits that is proportional to its length. Based on the description, a list that is guaranteed to contain the sequence must be produced. The trade-off between the description length and the moments of the list size is studied when the sequence's length tends to infinity. It is characterized by the source's R\'enyi entropy. Extensions to scenarios with side information are also studied, where the key is conditional R\'enyi entropy. The lossy case, where at least one of the elements of the list must be within a specified distortion from the source sequence, is also solved.
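
For reference, the R\'enyi entropy of order $\alpha$ of a source with distribution $P$ is
$$H_\alpha(X) = \frac{1}{1 - \alpha} \log \sum_x P(x)^\alpha, \qquad \alpha > 0, \ \alpha \neq 1,$$
which recovers the Shannon entropy in the limit $\alpha \to 1$.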

(Joint work with Christoph Bunte)

"The role of the hypercontractivity ribbon in information theory"

Venkat Anantharam

Abstract:

A communication channel defines a linear map from functions on the output alphabet to those on the input alphabet given by taking the conditional expectation. For each q >= 1 this map takes Lq functions to Lq functions and is a contraction. However, generally speaking, more is true, in that this map is a contraction from Lq functions to Lp functions for a range of p > q. This property, which has been studied by mathematicians for many years in the broader context of linear operators between functions on abstract spaces, is called hypercontractivity. Versions of this theory are also available for q in (0,1), and it has also begun to be extended recently to negative q, though of course Lq spaces for q smaller than 1 behave quite differently from those for q >= 1.

The set of p \geq q \geq 1 for which Lq functions are mapped contractively to Lp functions is called the forward part of the hypercontractivity ribbon associated to the channel (the full ribbon also has a backward part, for q \leq p \leq 1). The hypercontractivity ribbon is thus a characteristic of the channel. It turns out that the hypercontractivity ribbon has deep and intimate relevance to the traditional concerns of information theory over channels.
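
Concretely, writing $(Tf)(x) = \mathbb{E}[f(Y) \mid X = x]$ for the conditional-expectation operator of a channel with input $X$ and output $Y$, the forward part of the ribbon can be described as the set of pairs $p \geq q \geq 1$ for which
$$\|Tf\|_{L^p(P_X)} \;\leq\; \|f\|_{L^q(P_Y)} \quad \text{for all } f,$$
a sketch of one common formulation; normalization conventions vary across the literature.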

The talk will attempt to justify the preceding sentence, for discrete memoryless channels. In particular, we will discuss the role of the hypercontractivity ribbon in the distributed simulation of joint probability distributions, as well as its role in strong data processing inequalities.

(Parts of this talk are based on joint work with Sudeep Kamath, Chandra Nair, and Amin Aminzadeh Gohari)

## Thursday, September 12

### 08:40 - 09:40

#### ThA: Thursday's Plenary

In this talk we survey discrete-time, multi-period, sequential investment strategies for financial markets. Under a memoryless assumption on the underlying market process of relative prices, the best constantly rebalanced portfolio, called the log-optimal portfolio, is studied; it achieves the maximal asymptotic average growth rate. For generalized dynamic portfolio selection, when the market process is stationary and ergodic, universal, growth-optimal empirical methods are shown, using current principles of nonparametric regression estimation and machine learning algorithms. The empirical performance of the methods is illustrated on NYSE data.
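
The best constantly rebalanced portfolio can be illustrated on a toy memoryless market; the two-asset return process below is invented for the demo, and the grid search simply maximizes the empirical average log growth rate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy memoryless market: i.i.d. relative-price vectors for two assets,
# cash (constant return 1) and a volatile stock (drops 30% or gains 50%).
T = 5000
stock = rng.choice([0.7, 1.5], size=T)
X = np.column_stack([np.ones(T), stock])   # rows: market vectors

def avg_log_growth(b, X):
    """Empirical average log growth rate of a constantly rebalanced portfolio b."""
    return float(np.mean(np.log(X @ b)))

# Grid search over the two-asset simplex for the best CRP (log-optimal portfolio).
grid = np.linspace(0.0, 1.0, 201)
rates = [avg_log_growth(np.array([1.0 - w, w]), X) for w in grid]
best_w = float(grid[int(np.argmax(rates))])
print(round(best_w, 2))   # analytically, the optimum for this market is 2/3
```

With equal probabilities for the two stock outcomes, maximizing $\frac{1}{2}\log(1-0.3w) + \frac{1}{2}\log(1+0.5w)$ gives $w = 2/3$; the empirical maximizer fluctuates around it.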

### 09:50 - 11:10

#### ThB1: Algebraic Codes

*Computing the Camion's multivariate BCH bound*- P.\ Camion's apparent distance of an abelian code is a generalization of the notion of the BCH bound of cyclic codes [3]. In this work, we present a method for computing the apparent distance of multivariate abelian codes, based on manipulations of hypermatrices. Our algorithm needs fewer computations than any other known to us; in fact, in the case of two-dimensional abelian codes it has linear complexity. We give two applications. First, we construct abelian codes that multiply the dimension of a given cyclic code while preserving its BCH bound. The second is an approximation to a notion of a BCH multivariate code.
*Flag Orbit Codes and Their Expansion to Stiefel Codes*- We discuss group orbit codes in homogeneous spaces for the unitary group, known as flag manifolds. The distances used to describe the codes arise from embedding the flag manifolds into Euclidean hyperspheres, providing a generalization of the spherical embedding of Grassmann manifolds equipped with the so-called chordal distance. Flag orbits are constructed by acting with a unitary representation of a finite group. In the construction, the center of the finite group has no effect, and thus it is sufficient to consider its inner automorphism group. Accordingly, some explicit constructions from projective unitary representations of finite groups in 2 and 4 dimensions are described. We conclude with examples of codes on the Stiefel manifold constructed as orbits of the linear representation of the projective groups, and thus an expansion of the flag codes considered.
*New Geometrical Spectra of Linear Codes with Applications to Performance Analysis*- In this paper, new enumerating functions for linear codes are defined, including the triangle enumerating function and the tetrahedron enumerating function, both of which can be computed using a trellis-based algorithm over polynomial rings. The computational complexity is dominated by the complexity of the trellis. In addition, we show that these new enumerating functions can be used to improve existing performance bounds on the maximum likelihood decoding.
*Concatenated Permutation Block Codes based on Set Partitioning for Substitution and Deletion Error-Control*- A new class of permutation codes is presented where, instead of considering one permutation as a codeword, codewords consist of a sequence of permutations. The advantage of using permutations, i.e.\ their favourable symbol diversity properties, is preserved. Additionally, using sequences of permutations as codewords, code rates close to the optimum rate can be achieved. Firstly, the complete set of permutations is divided into subsets by using set partitioning. Binary data is then mapped to permutations from these subsets. These permutations, together with a parity permutation, will form the codeword. Two constructions will be presented: one capable of detecting and correcting substitution errors and the other capable of detecting and correcting either substitution or deletion errors.

#### ThB2: Statistics and Learning

*From Bandits to Experts: A Tale of Domination and Independence*- We consider the partial observability model for multi-armed bandits, introduced by Mannor and Shamir. Our main result is a characterization of regret in the directed observability model in terms of the dominating and independence numbers of the observability graph. We also show that in the undirected case, the learner can achieve optimal regret without even accessing the observability graph before selecting an action. Both results are shown using variants of the Exp3 algorithm operating on the observability graph in a time-efficient manner.
*On Semi-Probabilistic Universal Prediction*- We describe two scenarios of universal prediction, as well as some recent advances in the study of minimax regret and algorithmic development. We then propose an intermediate scenario, the Semi-Probabilistic Setting, and make progress towards understanding the associated minimax regret.
*Learning Joint Quantizers for Reconstruction and Prediction*- We consider the problem of empirical design of variable-rate quantizers for reconstruction and prediction. When a discriminative model (conditional distribution of the unobserved output given the observed input) is known or can be accurately estimated from a separate training set, we show that this problem reduces to designing a certain type of a generalized quantizer by means of empirical risk minimization on unlabeled input samples only. We derive a high-probability upper bound on the resulting expected performance of such a quantizer in terms of the training sample size and the complexity parameters of the reconstruction and the prediction problems. We also discuss two illustrative examples: binary classification with absolute loss and the information bottleneck.
*Prediction of Individual Sequences with Finite-Memory Universal Predictors*- Universal prediction of individual sequences is a well studied subject with applications in universal coding, portfolio theory, estimation and so on. The main theme of the problem assumes a constrained, non-universal predictor that observes the entire sequence, yet since it is constrained, e.g., to be a finite-state machine, its performance is limited. Universal prediction theory shows that in some cases a non-anticipating universal predictor can attain this performance of the non-universal predictor, for all individual sequences. The presented work considers the case where the universal predictor is also constrained to be a finite state machine.
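
The Exp3 algorithm referenced in the first abstract of this session, in its textbook full-bandit form, can be sketched as follows (a minimal illustration only, not the graph-feedback variants studied in the paper; the two-arm reward process is invented for the demo):

```python
import math
import random

def exp3(K, T, reward_fn, gamma=0.1, seed=0):
    """Textbook Exp3: exponential weights with importance-weighted
    reward estimates; rewards are assumed to lie in [0, 1]."""
    rng = random.Random(seed)
    w = [1.0] * K
    total = 0.0
    for _ in range(T):
        s = sum(w)
        # Mix the exponential-weights distribution with uniform exploration.
        p = [(1 - gamma) * wi / s + gamma / K for wi in w]
        arm = rng.choices(range(K), weights=p)[0]
        r = reward_fn(arm)
        total += r
        # Unbiased estimate r / p[arm] for the chosen arm only.
        w[arm] *= math.exp(gamma * (r / p[arm]) / K)
    return total

# Hypothetical two-arm bandit: arm 1 pays off far more often than arm 0.
payoff_rng = random.Random(1)
reward = lambda arm: 1.0 if payoff_rng.random() < (0.9 if arm == 1 else 0.1) else 0.0
total = exp3(K=2, T=2000, reward_fn=reward)
print(total)
```

Over 2000 rounds the cumulative reward should be well above the roughly 1000 that uniform play would collect on this instance.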

#### ThB3: Interference and Relay Channels

*Multilevel Topological Interference Management*- The robust principles of treating interference as noise (TIN) when it is sufficiently weak, and avoiding it when it is not, form the background for this work. Combining TIN with the topological interference management (TIM) framework that identifies optimal interference avoidance schemes, a baseline TIM-TIN approach is proposed which decomposes a network into TIN and TIM components, allocates the signal power levels to each user in the TIN component, allocates signal vector space dimensions to each user in the TIM component, and guarantees that the product of the two is an achievable number of signal dimensions available to each user in the original network.
*On the Interference Channel with Common Messages and the role of Rate-Sharing*- The capacity region of the interference channel with common messages is studied. This channel model is a variation of the classical two user interference channel modified such that each encoder has both a private message as well as a common message to be decoded at both receivers. Achievable rates for this channel model can be characterized by the classical Han-Kobayashi achievable rate region for the interference channel in which, at each encoder, the codeword embedding the private message is superimposed over the codeword for the common message. We show that the achievable rates for this region can be improved upon by rate-sharing, which consists of transmitting part of the private message within the common codeword. This improved region is shown to approach the capacity for a class of injective semi-deterministic channels. Moreover, we show that the Fourier-Motzkin elimination of the Han-Kobayashi region with rate-sharing contains fewer rate bounds than the one without rate-sharing. This approach provides an alternative proof of the simplification of the Han-Kobayashi region originally shown by Chong et al. This result is particularly interesting as it shows that simplifications in the spirit of Chong et al.\ for general channels can be performed through rate-sharing and Fourier-Motzkin elimination. This approach can be easily implemented algorithmically and is relevant in the context of the automatic derivation of achievable rate regions.
*The Stability Region of the Two-User Interference Channel*- The stable throughput region of the two-user interference channel is investigated here. First, the stability region for the general case is characterized. Second, we study the cases where the receivers treat interference as noise or perform successive interference cancelation. Finally, we provide conditions for the convexity/concavity of the stability region and for which a certain interference management strategy leads to a broader stability region.
*On the Reliable Transmission of Correlated Sources over Two-Relay Network*- In this paper, we investigate reliable transmission of three correlated discrete memoryless sources over a two-relay network. In our considered model, one of the sources is available at the sender, whereas the other two sources are known to the first and the second relay. We present both joint and separate source-channel coding schemes, and derive the corresponding sets of sufficient conditions for reliable source transmission. The manner of cooperation in both schemes is the decode-and-forward strategy. In the joint approach, we generalize the correlation preserving mapping technique to our model using nested backward decoding. Our proposed separate approach is based on Slepian-Wolf source coding and an irregular encoding/successive decoding strategy. Furthermore, we obtain necessary conditions for reliable source transmission over the network. Our results reduce to several known results in the literature.

### 11:30 - 12:50

#### ThC1: Source Coding

*Nonasymptotic Noisy Source Coding*- This paper shows new general nonasymptotic achievability and converse bounds and performs their dispersion analysis for the lossy compression problem in which the compressor observes the source through a noisy channel. While this problem is asymptotically equivalent to a noiseless lossy source coding problem with a modified distortion function, nonasymptotically there is a difference in how fast their minimum achievable coding rates approach the rate-distortion function, providing yet another example where at finite blocklengths one must put aside traditional asymptotic thinking.
*The Heegard-Berger Problem with Common Receiver Reconstructions*- The variant of the Heegard-Berger problem where the receivers require identical reconstructions is studied, and the rate-distortion function for the following three settings is derived: (a) the encoder is also required to generate the reconstructions common to the receivers; (b) the side information at the receivers are physically degraded; and (c) the side information at the receivers are stochastically degraded, and the source satisfies a particular full-support condition. The characterizations indicate that the receiver side information can be fully exploited to perform Wyner-Ziv-style binning. However, the reconstruction functions can depend only on the randomness common to the receivers in the Gács-Körner sense.
*Lattice Quantization Noise Revisited*- Dithered quantization is widely used in signal processing and coding due to its many desirable properties in theory and in practice. In this paper, we show that dither is unnecessary for Gaussian sources when the flatness factor of the quantization lattice is small. This means that the quantization noise behaves much like that in dithered quantization. In particular, it tends to be uniformly distributed over any fundamental region of the lattice and be uncorrelated with the signal; further, for optimum lattice quantizers, it approaches the rate-distortion bound of Gaussian sources with minimum mean-square error (MMSE) estimation.
*The Likelihood Encoder for Source Coding*- The likelihood encoder with a random codebook is demonstrated as an effective tool for source coding. Coupled with a soft covering lemma (associated with channel resolvability), likelihood encoders yield simple achievability proofs for known results, such as rate-distortion theory. They also produce a tractable analysis for secure rate-distortion theory and strong coordination.

#### ThC2: Information and Estimation

*Non-Parametric Prediction of the Mid-Price Dynamics in a Limit Order Book*- Many securities markets are organized as double auctions where each incoming limit order---i.e., an order to buy or sell at a specific price---is stored in a data structure called the limit order book. A trade happens whenever a marketable order arrives. This order flow is visible to every market participant in real time. We propose a novel non-parametric approach to short-term forecasting of the mid-price change in a limit order book (i.e., of the change in the average of the best offer and the best bid prices). We construct a state characterizing the order book at each time instant and compute a feature vector for each value of the state. The features get updated during the course of a trading day, as new order flow information arrives. Our prediction rules at every time instant during the trading day are based on the feature vector of the state observed at that time instant. Our approach differs from previous ones in that it does not impose a restrictive parametric model; the implicit assumptions of our method are very mild. Initial experiments with real order book data from NYSE suggest that our algorithms show promise. We illustrate their usage in a practical application of executing a large trade.
*One-shot bounds for various information theoretic problems using smooth min and max Rényi divergences*- One-shot analogues of various information theory results known in the asymptotic case are proven using smooth min and max Rényi divergences. In particular, we prove that the smooth min Rényi divergence can be used to prove a one-shot analogue of Stein's lemma. Using the smooth min Rényi divergence we prove a special case of the packing lemma in the one-shot setting. Furthermore, we prove a one-shot analogue of the covering lemma using the smooth max Rényi divergence. We also propose a one-shot achievable rate for source coding under a maximum distortion criterion. This achievable rate is quantified in terms of the smooth max Rényi divergence.
*Signals that can be easily time-frequency synchronized from their ambiguity function*- Delay and Doppler determination of a time-delayed and frequency-shifted signal is one of the fundamental problems in communication. The two-parameter estimation is reduced to two one-parameter estimation problems in time- and frequency-domain signals. Motivated by Gabor's communication theory, we develop further the parallelism between time- and frequency-domain signals and solve the two problems individually and cooperatively. Simulation results without prescribed information are reported.
*Novel Tight Classification Error Bounds under Mismatch Conditions based on $f$-Divergence*- By default, statistical classification/multiple hypothesis testing is faced with the model mismatch introduced by replacing the true distributions in the Bayes decision rule by model distributions estimated on training samples. Although a large number of statistical measures exist w.r.t.\ the mismatch introduced, these works rarely relate to the mismatch in accuracy, i.e.\ the difference between model error and Bayes error. In this work, the accuracy mismatch between the ideal Bayes decision rule/Bayes test and a mismatched decision rule in statistical classification/multiple hypothesis testing is investigated explicitly. A proof of a novel generalized tight statistical bound on the accuracy mismatch is presented. This result is compared to existing statistical bounds related to the total variational distance that can be extended to bounds of the accuracy mismatch. The analytic results are supported by distribution simulations.
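
For reference, an $f$-divergence is defined, for a convex $f$ with $f(1) = 0$, as
$$D_f(P \| Q) \;=\; \sum_x Q(x)\, f\!\left(\frac{P(x)}{Q(x)}\right),$$
and the total variational distance mentioned above is the special case $f(t) = \tfrac{1}{2}|t - 1|$.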

#### ThC3: Interference Networks

*Degrees of Freedom of the Rank-Deficient MIMO X channel*- The multiple-input multiple-output (MIMO) X channel with rank-deficient channels is considered. We characterize the total degrees of freedom (DoF) by defining an outer bound which is attained by the proposed precoding scheme. The total DoF are described in closed-form when all transmitters and receivers have $M$ and $N$ antennas, respectively. For an arbitrary number of antennas, the total DoF are obtained as the solution of a linear programming problem. The results show that, unlike point-to-point MIMO systems where DoF increase with the rank of the channel, the total DoF in a multipoint-to-multipoint MIMO system can increase up to a certain point if channels are rank-deficient. Hence, we corroborate the relevance of knowing the rank-deficiency in the MIMO X channel, a property already observed in the MIMO Interference channel.
*Interference Neutralization using Lattice Codes*- The deterministic approach of [1] models the interaction between the bits that are received at the same signal level by the modulo-2 sum of the bits, where the carry-overs that would happen with real addition are ignored. Under this model, in a multi-user setting, the receiver can distinguish the most significant bits (MSBs) of the stronger user without any noise. A faithful implementation of the deterministic model requires one to ``neutralize interference'' from previous carry over digits. This paper proposes a new implementation of ``interference neutralization'' [2] using structured lattice codes. We first present our implementation strategy and then, as an application, apply this strategy to a symmetric half-duplex Gaussian butterfly network. In this network, two transmitters communicate their information to two destinations via a half-duplex relay. For duplexing factor 0.5 and under certain conditions, we show that our proposed scheme based on superposition of nested lattice codes can achieve the capacity region at high SNR. Also, regardless of all channel parameters, the gap between its achievable rate region and the outer bound is at most 0.293 bits/sec/Hz.
*Cognitive Cooperative Communications on the Multiple Access Channel*- In this paper, we investigate a three-user cognitive communication network where a primary two-user Multiple Access Channel (MAC) suffers interference from a secondary point-to-point (P2P) channel, sharing the same medium. While the P2P channel transmitter---transmitter 3---causes interference at the primary MAC receiver, we assume that the primary channel transmitters---transmitters 1 and 2---do not cause any interference at the P2P receiver. It is assumed that one of the MAC transmitters has cognitive capabilities and cribs causally from the other MAC transmitter. Furthermore, we assume that the cognitive transmitter knows the message of transmitter 3 in a noncausal manner, thus introducing the three-user Multiple Access Cognitive Z-Interference Channel (MA-CZIC). We obtain inner and outer bounds on the capacity region of the three-user MA-CZIC for both strictly and nonstrictly causal cribbing cognitive encoders.
*State-Dependent Gaussian Z-Channel with Mismatched Side-Information and Interference*- A state-dependent Gaussian Z-interference channel model is investigated in the regime of high state power, in which transmitters 1 and 2 communicate with receivers 1 and 2, and only receiver 2 is interfered by transmitter 1's signal and a random state sequence. The state sequence is known noncausally only to transmitter 1, not to the corresponding transmitter 2. A layered coding scheme is designed for transmitter 1 to help interference cancelation at receiver 2 (using a cognitive dirty paper coding) and to transmit its own message to receiver 1. Inner and outer bounds are derived, and are further analyzed to characterize the boundary of the capacity region either fully or partially for all Gaussian channel parameters. Our results imply that the capacity region of such a channel with mismatched transmitter-side state cognition and receiver-side state interference is strictly less than that of the corresponding channel without state, which is in contrast to Costa-type dirty channels, for which dirty paper coding achieves the capacity of the corresponding channels without state.

### 14:00 - 15:00

#### ThD: Thursday's Plenaritas

"Detection of correlations and random geometric graphs"

Gabor Lugosi

Abstract:

In this talk we explore the limits of testing whether a correlation matrix of a multivariate normal population is the identity matrix. We focus on sparse classes of alternatives where only a few entries are nonzero and, in fact, positive. We derive a general lower bound applicable to various classes and study the performance of some near-optimal tests. The detection problem naturally leads to the study of the clique number of high-dimensional random geometric graphs. Interestingly, lower bounds for the risk of the likelihood ratio test lead to nontrivial bounds for the clique number.

(Joint work with S\'ebastien Bubeck (Princeton) and Ery Arias-Castro (UCSD))

"Coding for Chip-to-Chip Communication"

Amin Shokrollahi - Kandou Bus, and EPFL

Abstract:

Modern electronic devices consist of a multitude of IC components: the processor, the memory, the modem (in wireless devices), and the graphics processor are only some examples of components scattered throughout a device. The increase of the volume of digital data that needs to be accessed and processed by such devices calls for ever faster communication between these ICs. Faster communication, however, often translates to higher susceptibility to various types of noise, and inevitably to a higher power consumption in order to combat noise. This increase in the power consumption is, for the most part, far from linear, and cannot be easily compensated for by Moore's Law. In this talk I will give a short overview of problems encountered in chip-to-chip communication, and will advocate the use of novel coding techniques to solve those problems. I will also briefly talk about Kandou Bus, and some of the approaches the company is taking to design, implement, and market such solutions.

### 15:10 - 16:30

#### ThE1: Codes for Storage

*Exact-Regenerating Codes between MBR and MSR Points*- In this paper we study distributed storage systems with exact repair. We give a construction for regenerating codes between the minimum storage regenerating (MSR) and the minimum bandwidth regenerating (MBR) points and show that when the parameters $n$, $k$, and $d$ are close to each other, our constructions are close to optimal compared with the known capacity when only functional repair is required. We do this by showing that when the differences between the parameters $n$, $k$, and $d$ are fixed but their actual values tend to infinity, the ratio between the performance of our codes with exact repair and the known capacity of codes with functional repair approaches one.
*Prefixless $q$-ary Balanced Codes with ECC*- We present a Knuth-like method for balancing $q$-ary codewords, which is characterized by the absence of a prefix that carries the information of the balancing index. Look-up tables for coding and decoding the prefix are avoided. We also show that this method can be extended to include error correction of single channel errors.
*Constrained Rank Modulation Schemes*- Rank modulation schemes for non-volatile memories (NVMs) represent information by the relative rankings of cell charge levels. This approach has several benefits; in particular, the scheme resolves the ``write-asymmetry'' limitation that NVMs suffer from. However, cell writing is still affected by a common NVM problem: inter-cell coupling, which can result in inadvertently increasing the charge level of neighboring cells. This is a potential source of error in the rank modulation scheme. In this paper, we explore the idea of constrained coding over permutations. These constraints minimize the impact of inter-cell coupling while still allowing the use of the rank modulation scheme. We study various constraints and their resulting rates, capacities, and other properties, and introduce an explicit constrained rank modulation code construction.
*Maximum Likelihood Associative Memories*- Associative memories are structures that store data in such a way that it can later be retrieved given only a part of its content---a sort of error/erasure-resilience property. They are used in applications ranging from caches and memory management in CPUs to database engines. In this work we study associative memories built on the maximum likelihood principle. First, we derive minimum residual error rates when the stored data comes from a uniform binary source. Second, we determine the minimum amount of memory required to store the same data. Finally, we bound the computational complexity of message retrieval. We then compare these bounds with two existing associative memory architectures: the celebrated Hopfield neural networks and a neural network architecture introduced more recently by Gripon and Berrou.
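
As background for the Knuth-like construction in the second abstract of this session, classic binary Knuth balancing can be sketched in a few lines (a minimal illustration of the binary case only; the paper's prefixless $q$-ary scheme differs):

```python
def knuth_balance(bits):
    """Classic binary Knuth balancing: invert the first k bits, for the
    smallest k that yields a balanced word (equal numbers of 0s and 1s).
    The weight changes by exactly 1 at each step of k and sweeps from w
    to n - w, so it must pass through n/2 for even n."""
    n = len(bits)
    assert n % 2 == 0, "balancing needs an even length"
    for k in range(n + 1):
        candidate = [1 - b for b in bits[:k]] + bits[k:]
        if sum(candidate) == n // 2:
            return k, candidate
    raise AssertionError("unreachable for even n")

k, balanced = knuth_balance([1, 1, 1, 1, 0, 1])
print(k, balanced)   # k = 2 balances this word
```

In Knuth's original scheme the index $k$ is sent as a short balanced prefix; the paper's contribution is precisely to avoid such a prefix.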
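
For context on the Hopfield baseline mentioned in the last abstract above, a minimal Hebbian store/recall sketch (illustrative sizes and patterns, not the architectures compared in the paper):

```python
import numpy as np

def hopfield_store(patterns):
    """Hebbian rule: W is the (scaled) sum of outer products, zero diagonal."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def hopfield_recall(W, x, iters=20):
    """Iterate synchronous sign updates until a fixed point (or cap)."""
    for _ in range(iters):
        nxt = np.sign(W @ x)
        nxt[nxt == 0] = 1.0
        if np.array_equal(nxt, x):
            break
        x = nxt
    return x

# Store one +/-1 pattern and recover it from a single flipped bit.
p = np.array([1, -1, 1, 1, -1, -1, 1, -1], dtype=float)
W = hopfield_store(p[None, :])
noisy = p.copy()
noisy[0] = -noisy[0]
print(np.array_equal(hopfield_recall(W, noisy), p))   # True
```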

#### ThE2: Big Data: Statistics and Information Theory in High Dimensions and Undersampled Regimes

*Informational Confidence Bounds for Self-Normalized Averages and Applications*- We present deviation bounds for self-normalized averages and applications to estimation with a random number of observations. The results rely on a peeling argument in exponential martingale techniques that represents an alternative to the method of mixtures. The motivating examples of bandit problems and context tree estimation are detailed.
*MCUIUC - A New Framework for Metagenomic Read Compression*- Metagenomics is an emerging field of molecular biology concerned with analyzing the genomes of environmental samples comprising many diverse organisms. Given the nature of metagenomic data, one usually has to sequence the genomic material of all organisms in a batch, leading to a mix of reads coming from different DNA sequences. In deep high-throughput sequencing experiments, the volume of the raw reads is extremely high, frequently exceeding 600 Gb. With an ever increasing demand for storing such reads for future studies, the issue of efficient metagenomic compression becomes of paramount importance. We present the first known approach to metagenome read compression, termed MCUIUC (Metagenomic Compression at UIUC). The gist of the proposed algorithm is to perform classification of reads based on unique organism identifiers, followed by reference-based alignment of reads for individually identified organisms, and metagenomic assembly of unclassified reads. Once assembly and classification are completed, lossless reference-based compression is performed via positional encoding. We evaluate the performance of the algorithm on moderate-sized synthetic metagenomic samples involving 15 randomly selected organisms and describe future directions for improving the proposed compression method.
*Sparse Regression Codes: Recent Results and Future Directions*- Sparse Superposition or Sparse Regression codes were recently introduced by Barron and Joseph for communication over the AWGN channel. The code is defined in terms of a design matrix; codewords are linear combinations of subsets of columns of the matrix. These codes achieve the AWGN channel capacity with computationally feasible decoding. We have shown that they also achieve the optimal rate-distortion function for Gaussian sources. Further, the sparse regression codebook has a partitioned structure that facilitates random binning and superposition. In this paper, we review existing results concerning Sparse Regression codes and discuss directions for future research.
*Estimation of transition and stationary probabilities of slow mixing Markov processes*- We observe a length-$n$ sample generated by an unknown, stationary ergodic Markov process (model) over a finite alphabet $\mathcal{A}$. Motivated by applications in what are known as backplane channels, given any string $w$ of symbols from $\mathcal{A}$, we want estimates of the conditional probability distribution of symbols following $w$ (the model parameters), as well as the stationary probability of $w$. Two distinct problems that complicate estimation in this setting are (i) long memory, and (ii) slow mixing, which can occur even with only one bit of memory. Any consistent estimator can only converge pointwise over the class of all ergodic Markov models. Namely, given any estimator and any sample size $n$, the underlying model could be such that the estimator performs poorly on a sample of size $n$ with high probability. But can we look at a length-$n$ sample and identify whether an estimate is likely to be accurate? Since the memory is unknown a priori, a natural approach is to estimate a potentially coarser model with memory $k_n = O(\log n)$. As $n$ grows, the estimates get refined; this approach is consistent, and the above scaling of $k_n$ is known to be essentially optimal. But while effective asymptotically, the situation is quite different when we want the best possible answers from a length-$n$ sample, rather than just consistency. Combining results in universal compression with Aldous' coupling arguments, we obtain sufficient conditions on the length-$n$ sample (even for slow mixing models) to identify when naive (i) estimates of the model parameters and (ii) estimates related to the stationary probabilities are accurate; we also bound the deviations of the naive estimates from the true values.
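The naive estimates discussed in the abstract above are simple to state concretely. Below is a minimal sketch (our own illustration, using a hypothetical two-state chain) of the empirical conditional distribution following a context string $w$ and the empirical probability of $w$:

```python
# Naive (empirical) estimates of Markov model parameters from one sample path.
import random

def naive_estimates(sample, w, alphabet):
    """Empirical P(a | w) for each symbol a, and the empirical probability of w."""
    positions = [i for i in range(len(sample) - len(w))
                 if sample[i:i + len(w)] == w]
    counts = {a: 0 for a in alphabet}
    for i in positions:
        counts[sample[i + len(w)]] += 1
    cond = {a: c / len(positions) for a, c in counts.items()} if positions else None
    return cond, len(positions) / (len(sample) - len(w))

# Simulate a two-state chain with P(1|0) = 0.3 and P(0|1) = 0.4, whose
# stationary probability of '0' is 0.4 / 0.7 = 4/7.
rng = random.Random(0)
state, chain = '0', []
for _ in range(100_000):
    if rng.random() < (0.3 if state == '0' else 0.4):
        state = '1' if state == '0' else '0'
    chain.append(state)
sample = ''.join(chain)
cond, pi_w = naive_estimates(sample, '0', '01')
```

For this fast-mixing toy chain the naive estimates concentrate quickly; the point of the talk is precisely to certify when such estimates can be trusted in the slow-mixing regime.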

#### ThE3: Source Coding in Networks

*On Existence of Optimal Linear Encoders over Non-field Rings for Data Compression with Application to Computing*- This note proves that, for any finite set of correlated discrete i.i.d.\ sources, there always exists a sequence of linear encoders over some finite non-field ring which achieves the data compression limit, the Slepian-Wolf region. Based on this, we address a variation of the data compression problem in which the goal is to recover some discrete function of the data. It is demonstrated that, for encoding some functions, a linear encoder over a non-field ring strictly outperforms its field counterpart, achieving a strictly larger achievable region with a strictly smaller alphabet size.
*Sampling versus Random Binning for Multiple Descriptions of a Bandlimited Source*- Random binning is an efficient, yet complex, coding technique for the symmetric $L$-description source coding problem. We propose an alternative approach that uses the quantized samples of a bandlimited source as ``descriptions''. By the Nyquist condition, the source can be reconstructed if enough samples are received. We examine a coding scheme that combines sampling and noise-shaped quantization for a scenario in which either only $K < L$ descriptions or all $L$ descriptions are received. Some of the received $K$-sets of descriptions correspond to uniform sampling, while others correspond to non-uniform sampling. This scheme achieves the optimum rate-distortion performance for uniform-sampling $K$-sets, but suffers noise amplification for non-uniform-sampling $K$-sets. We then show that, by increasing the sampling rate and adding a random-binning stage, the optimal operating point is achieved for any $K$-set.
*A Deterministic Annealing Approach to Optimization of Zero-delay Source-Channel Codes*- This paper studies the optimization of zero-delay source-channel codes, and specifically the problem of obtaining globally optimal transformations that map between the source space and the channel space, under a given transmission power constraint and for the mean square error distortion. In particular, we focus on the setting where the decoder has access to side information, whose cost surface is known to be riddled with local minima. Prior work derived the necessary conditions for optimality of the encoder and decoder mappings, along with a greedy optimization algorithm that imposes these conditions iteratively, in conjunction with the heuristic ``noisy channel relaxation'' method to mitigate poor local minima. While noisy channel relaxation is arguably effective in simple settings, it fails to provide accurate global optimization results in more complicated settings, including the decoder-with-side-information setting considered in this paper. We propose a global optimization algorithm based on the ideas of ``deterministic annealing'', a non-convex optimization method derived from information-theoretic principles, with analogies to statistical physics, that has been successfully employed in several problems including clustering, vector quantization and regression. We present comparative numerical results that show strict superiority of the proposed algorithm over greedy optimization methods as well as over noisy channel relaxation.
*Capacity Region of Multi-Resolution Streaming in Peer-to-Peer Networks*- We consider multi-resolution streaming in fully-connected peer-to-peer networks, where transmission rates are constrained by arbitrarily specified upload capacities of the source and peers. We fully characterize the capacity region of rate vectors achievable with arbitrary coding, where an achievable rate vector describes a vector of throughputs of the different resolutions that can be supported by the network. We then prove that all rate vectors in the capacity region can be achieved using pure routing strategies. This shows that coding has no capacity advantage over routing in this scenario.

### 16:50 - 18:10

#### ThF1: Source and Channel Coding

*The Least Degraded and the Least Upgraded Channel with respect to a Channel Family*- Given a family of binary-input memoryless output-symmetric (BMS) channels having a fixed capacity, we derive the BMS channel having the highest (resp.\ lowest) capacity among all channels that are degraded (resp.\ upgraded) with respect to the whole family. We give an explicit characterization of this channel as well as an explicit formula for the capacity of this channel.
*On 2-D Non-Adjacent-Error Channel Models*- In this work, we consider two-dimensional (2-D) binary channels in which the 2-D error patterns are constrained so that errors cannot occur in adjacent horizontal or vertical positions. We consider probabilistic and combinatorial models for such channels. A probabilistic model is obtained from a 2-D random field defined by Roth, Siegel and Wolf (2001). Based on the conjectured ergodicity of this random field, we obtain an expression for the capacity of the 2-D non-adjacent-errors channel. We also derive an upper bound for the asymptotic coding rate in the combinatorial model.
*On the Mutual Information between Random Variables in Networks*- This paper presents a lower bound on the mutual information between any two sets of source/edge random variables in a general multi-source multi-sink network. This bound is useful for deriving a new class of improved information-theoretic upper bounds on the network coding capacity from existing edge-cut based bounds. Using the lower bound, a refined functional dependence bound is characterized from the functional dependence bound. It is demonstrated that the refined versions of the existing edge-cut based outer bounds obtained using the mutual information lower bound are stronger.
*A Connection between Good Rate-distortion Codes and Backward DMCs*- Let $X^n\in\mathcal{X}^n$ be a sequence drawn from a discrete memoryless source, and let $Y^n\in\mathcal{Y}^n$ be the corresponding reconstruction sequence that is output by a good rate-distortion code. This paper establishes a property of the joint distribution of $(X^n,Y^n)$. It is shown that for $D>0$, the input-output statistics of an $R(D)$-achieving rate-distortion code converge (in normalized relative entropy) to the output-input statistics of a discrete memoryless channel (DMC). The DMC is ``backward'' in that it is a channel from the reconstruction space $\mathcal{Y}^n$ to the source space $\mathcal{X}^n$. It is also shown that the property does not necessarily hold when normalized relative entropy is replaced by variational distance.

#### ThF2: Information Theoretic Problems in Computer Science

*Bypassing Correlation Decay for Matchings with an Application to XORSAT*- Many combinatorial optimization problems on sparse graphs do not exhibit the correlation decay property. In such cases, the cavity method remains a sophisticated heuristic with no rigorous proof. In this paper, we consider the maximum matching problem, which is one of the simplest such examples. We show that monotonicity properties of the problem allow us to define solutions for the cavity equations. More importantly, we are able to identify the `right' solution of these equations and then to compute the asymptotics for the size of a maximum matching. The results for finite graphs are self-contained. We give references to recent extensions making use of the notion of local weak convergence for graphs and the theory of unimodular networks. As an application, we consider the random XORSAT problem, which, according to the physics literature, has a `one-step replica symmetry breaking' (1RSB) glass phase. We derive new bounds on the satisfiability threshold valid for general graphs (and conjectured to be tight).
*Lovasz theta, SVMs and Applications*- The Lovász theta function was introduced to solve the problem of computing the Shannon capacity of the pentagon. It has subsequently become a fundamental tool in information theory, graph theory and combinatorial optimization. However, computing it requires solving an SDP, making it hard to scale to large graphs. Recently we discovered a connection with the one-class support vector machine (SVM) formulation, which yields an approximation to the Lovász theta function that can scale to large graphs. Here we review this connection and give a number of applications to fundamental graph theory and data mining problems such as coloring, max-$k$-cut and large sparse/dense subgraphs.
*Local Privacy, Quantitative Data Processing Inequalities, and Statistical Minimax Rates*- Working under local differential privacy---a model of privacy in which data remains private even from the statistician or learner---we study the tradeoff between privacy guarantees and the utility of the resulting statistical estimators. We show that local privacy acts as a type of contraction for information theoretic quantities including mutual information and Kullback-Leibler divergence, which thus influence estimation rates as a function of the amount of privacy preserved. Our bounds can be viewed as quantitative data-processing inequalities. When combined with minimax techniques such as Le Cam's and Fano's methods, these inequalities allow for a precise characterization of statistical rates under local privacy constraints. In this paper, we provide a complete treatment of two canonical problem families: mean estimation in location family models and estimation of multinomial probabilities. For these families, we provide lower and upper bounds that match up to constant factors.
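As a concrete point of reference for the abstract above, the canonical $\varepsilon$-locally-private mechanism is randomized response. The sketch below (our own illustration, not necessarily the estimators analyzed in the talk) privatizes bits and debiases the resulting mean estimate:

```python
# Randomized response: each user reports their bit truthfully with probability
# e^eps / (1 + e^eps), which satisfies eps-local differential privacy.
import math
import random

def randomized_response(bit, eps, rng):
    keep = math.exp(eps) / (1 + math.exp(eps))
    return bit if rng.random() < keep else 1 - bit

def debiased_mean(reports, eps):
    # E[report] = keep*mu + (1-keep)*(1-mu); invert this affine map.
    keep = math.exp(eps) / (1 + math.exp(eps))
    return (sum(reports) / len(reports) - (1 - keep)) / (2 * keep - 1)

rng = random.Random(1)
true_mean, eps, n = 0.25, 1.0, 200_000
data = [1 if rng.random() < true_mean else 0 for _ in range(n)]
reports = [randomized_response(b, eps, rng) for b in data]
estimate = debiased_mean(reports, eps)
```

The estimator's variance blows up as $\varepsilon \to 0$ (the factor $(2\cdot\text{keep}-1)^{-2}$), which is the privacy-utility tradeoff that the talk's data-processing inequalities quantify in general.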
*On the Information Complexity of Cascaded Norms with Small Domains*- We consider the problem of estimating cascaded norms in a data stream, a well-studied generalization of the classical norm estimation problem, where the data is aggregated in a cascaded fashion along multiple attributes. We show that when the number of attributes for each item is at most $d$, then estimating the cascaded norm $L_k \circ L_1$ requires space $\Omega(d n^{1-2/k})$ for every $d = O(n^{1/k})$. This result interpolates between the tight lower bounds known previously for the two extremes of $d = 1$ and $d = \Theta(n^{1/k})$ [1]. The proof uses the information complexity paradigm that has proved successful in obtaining tight lower bounds for several well-known problems. We use the above data stream problem as a motivation to sketch some of the key ideas of this paradigm. In particular, we give a unified and more general view of the key negative-type inequalities satisfied by the transcript distributions of communication protocols.

#### ThF3: Multicell and Broadcast

*Uplink Multi-Cell Processing: Approximate Sum Capacity under a Sum Backhaul Constraint*- This paper investigates an uplink multi-cell processing (MCP) model where the cell sites are linked to a central processor (CP) via noiseless backhaul links with limited capacity. A simple compress-and-forward scheme is employed, where the base-stations (BSs) quantize the received signals and send the quantized signals to the CP using distributed Wyner-Ziv compression. The CP decodes the quantization codewords first, then decodes the user messages as if the users and the CP form a virtual multiple-access channel. This paper formulates the problem of maximizing the overall sum-rate under a sum backhaul constraint for such a setting. It is shown that setting the quantization noise levels to be uniform across the BSs maximizes the achievable sum rate under high signal-to-noise ratio (SNR). Further, for general SNR a low-complexity fixed-point iteration algorithm is proposed to optimize the quantization noise levels. This paper further shows that with uniform quantization noise levels, the compress-and-forward scheme with Wyner-Ziv compression already achieves a sum-rate that is within a constant gap to the sum capacity of the uplink MCP model. The gap depends linearly on the number of BSs in the network but is independent of the SNR and the channel matrix.
*Multi-Cell Cooperation with Random User Locations under Arbitrary Signaling*- Base station cooperation in cellular networks has recently been recognized as a key technology for mitigating interference, thus providing significant improvements in system performance. In this paper, we consider a simple scenario consisting of two one-dimensional cells, where the base stations have fixed locations while the user terminals are randomly distributed on a line. Exploiting the replica method from statistical physics, we derive the ergodic sum-rate under arbitrary signaling for both the cooperative and non-cooperative scenarios as the system size grows large. The obtained results are analytically tractable and can be used to optimize the system parameters in a simple manner. Numerical examples show that the analysis provides good approximations for finite-sized systems.
*Any positive feedback rate increases the capacity of strictly less-noisy broadcast channels*- We propose two coding schemes for discrete memoryless broadcast channels (DMBCs) with rate-limited feedback from only one receiver. For any positive feedback rate and for the class of strictly less-noisy DMBCs, our schemes strictly improve over the no-feedback capacity region.
*On the Role of Interference Decoding in Simultaneous Broadcast Channels*- This work investigates the general three-message compound broadcast channel (BC), where an encoder wishes to communicate a common and two private messages to two groups of users. We focus on the largest achievable region when the encoder is constrained to use a single description per message, a restriction we refer to as ``simplified random codes''. In this setting, we investigate the different roles that decoders can play at the destinations. Surprisingly, we show that, in contrast with the standard BC, simultaneous decoding of both the intended and the interference messages at the destinations can strictly enlarge the rate region with respect to the worst case (over all channels) of Marton's inner bound.

### 20:30 - 23:00

#### ThG: Conference Banquet

## Friday, September 13

### 08:40 - 09:40

#### FrA: Friday's Plenary

What do the theory of computation, economics and related fields have to say about the emerging phenomena of crowdsourcing and social computing? Most successful applications of crowdsourcing to date have been on problems we might consider "embarrassingly parallelizable" from a computational perspective. But the power of the social computation approach is already evident, and the road cleared for applying it to more challenging problems.

In part towards this goal, for a number of years we have been conducting controlled human-subject experiments in distributed social computation in networks with only limited and local communication. These experiments cast a number of traditional computational problems --- including graph coloring, consensus, independent set, market equilibria, biased voting and network formation --- as games of strategic interaction in which subjects have financial incentives to collectively "compute" global solutions. I will overview and summarize the many behavioral findings from this line of experimentation, and draw broad comparisons to some of the predictions made by the theory of computation and microeconomics.

### 09:50 - 11:10

#### FrB1: Information Function Computation

*Distributed Function Computation Over a Tree Network*- This paper investigates a distributed function computation setting where the underlying network is a rooted directed tree and where the root wants to compute a function of the sources of information available at the nodes of the network. The main result provides the rate region for an arbitrary function under the assumption that the sources satisfy a general criterion. This criterion is satisfied, in particular, when the sources are independent.
*Two-Partition-Symmetrical Entropy Function Regions*- Consider the entropy function region for discrete random variables $X_i, i \in \mathcal{N}$ and partition $\mathcal{N}$ into $\mathcal{N}_1$ and $\mathcal{N}_2$ with $0 \leq |\mathcal{N}_1| \leq |\mathcal{N}_2|$. An entropy function $h$ is called $(\mathcal{N}_1,\mathcal{N}_2)$-symmetrical if for all $A, B\subset \mathcal{N}$, $h(A) = h(B)$ whenever $|A\cap\mathcal{N}_i| = | B\cap\mathcal{N}_i|$, $ i = 1, 2$. We prove that for $|\mathcal{N}_1| = 0$ or $1$, the closure of the $(\mathcal{N}_1, \mathcal{N}_2)$-symmetrical entropy function region is completely characterized by Shannon-type information inequalities.
*Bounding the Entropic Region via Information Geometry*- This paper suggests that information geometry may form a natural framework for dealing with the unknown part of the boundary of the entropic region. An application of information geometry shows that distributions associated with Shannon facets can be associated, in the right coordinates, with affine collections of distributions. This observation allows an information-geometric reinterpretation of the Shannon-type inequalities as arising from a Pythagorean-style relationship. The set of distributions which violate Ingleton's inequality, and hence are linked with the part of the entropic region that is yet undetermined, is also shown to have a surprising affine information-geometric structure in a special case involving four random variables and a certain support. These facts provide strong evidence for the link between information geometry and characterizing the boundary of the entropic region.
*Characterization of the Smooth Rényi Entropy Using Majorization*- This paper unveils a new connection between majorization theory and the smooth Rényi entropy of order $\alpha$. We completely characterize the subprobability distribution that attains the infimum in the definition of the smooth Rényi entropy $H_\alpha^\varepsilon(p)$ of order $\alpha$ by using the notions of majorization and Schur convexity/concavity, where $p$ denotes a probability distribution on a discrete alphabet and $\varepsilon \in [0, 1)$ is an arbitrarily given constant. The obtained result can be applied to characterize the asymptotic behavior of $\frac{1}{n}H_\alpha^\varepsilon(p)$ as $n \to \infty$ for general sources satisfying the strong converse property.
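For order $\alpha = \infty$, the characterization has an especially simple form: the optimal smoothing caps the largest probabilities at a common level, cutting exactly $\varepsilon$ of total mass, which is the majorization-minimal choice. A numerical sketch of this special case only (our reading; the talk treats general $\alpha$):

```python
# Smooth max-entropy H_inf^eps(p): cap p at a level lam chosen so that exactly
# eps of probability mass is removed, then H = -log2(lam). The cap level is
# found by bisection, since the removed mass is decreasing in lam.
from math import log2

def smooth_max_entropy(p, eps, iters=200):
    lo, hi = 0.0, max(p)
    for _ in range(iters):
        lam = (lo + hi) / 2
        removed = sum(max(pi - lam, 0.0) for pi in p)
        if removed > eps:
            lo = lam   # cap too low: removes more than eps of mass
        else:
            hi = lam
    return -log2(hi)

p = [0.5, 0.25, 0.125, 0.125]
h = smooth_max_entropy(p, 0.1)   # caps the 0.5 atom at 0.4, i.e. -log2(0.4)
```

With $\varepsilon = 0$ the function reduces to the ordinary min-entropy $-\log_2 \max_i p_i$, here 1 bit; smoothing strictly increases it.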

#### FrB2: Simulation and Approximation

*When is it possible to simulate a DMC channel from another?*- In this paper, we study the problem of simulating a DMC channel from another DMC channel. We assume that the input to the channel we are simulating is i.i.d.\ and that the transmitter and receivers are provided with common randomness at limited rates. We prove bounds for simulating point-to-point, MAC and broadcast channels. As a special case, we recover the achievability part of the result of Cuff for point-to-point channel simulation via a noiseless link and shared randomness.
*Joint Channel Intrinsic Randomness and Channel Resolvability*- This paper investigates the separation of channel intrinsic randomness and channel resolvability. We derive joint exponents, which are compared to the tandem exponents obtained with a separate approach. We prove at once, in a simple manner, achievability results for channel intrinsic randomness, random number generation, and channel resolvability. We also provide converse results in different special settings.
*Fixed-to-Variable Length Resolution Coding for Target Distributions*- The number of random bits required to approximate a target distribution in terms of un-normalized informational divergence is considered. It is shown that for a variable-to-variable length encoder, this number is lower bounded by the entropy of the target distribution. A fixed-to-variable length encoder is constructed using $M$-type quantization and Tunstall coding. It is shown that the encoder achieves in the limit an un-normalized informational divergence of zero with the number of random bits per generated symbol equal to the entropy of the target distribution. Numerical results show that the proposed encoder significantly outperforms the optimal block-to-block encoder in the finite length regime.
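The $M$-type quantization step mentioned in the abstract above is easy to make concrete: approximate the target pmf $p$ by $q = m/M$ with integer counts $m$ summing to $M$. A minimal sketch using largest-remainder rounding (one standard way to form an $M$-type; the Tunstall-coding stage of the paper's encoder is not shown):

```python
# M-type quantization of a target pmf, plus the informational divergence D(q||p).
from math import log2

def m_type(p, M):
    """Round M*p to integers summing to M by the largest-remainder method."""
    floors = [int(M * pi) for pi in p]
    by_remainder = sorted(range(len(p)),
                          key=lambda i: M * p[i] - floors[i], reverse=True)
    for i in by_remainder[:M - sum(floors)]:
        floors[i] += 1
    return [m / M for m in floors]

def divergence(q, p):
    """D(q || p) in bits; terms with q_i = 0 contribute zero."""
    return sum(qi * log2(qi / pi) for qi, pi in zip(q, p) if qi > 0)

p = [0.5, 0.3, 0.2]
coarse, fine = m_type(p, 16), m_type(p, 1024)
```

As $M$ grows, the divergence of the $M$-type from the target vanishes, consistent with the zero-divergence limit claimed in the abstract.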
*A New Unified Method for Intrinsic Randomness Problems of General Sources*- The purpose of this paper is to establish a new unified method for random number generation from general sources. Specifically, we introduce an alternative definition of the smooth Rényi entropy of order infinity, and show a unified approach to represent the intrinsic randomness in terms of this information quantity. Our definition of the smooth Rényi entropy is easy to calculate for finite block lengths. We also represent $\delta$-intrinsic randomness and the strong converse property in terms of the smooth Rényi entropy.

#### FrB3: Relay Networks

*On the Stability Region of a Relay-Assisted Multiple Access Scheme*- In this paper we study the impact of a relay node in a two-user network. We assume a random access collision channel model with erasures. In particular, we obtain an inner and an outer bound for the stability region.
*On information flow and feedback in relay networks*- We consider wireless relay networks where a source node communicates with a destination node with the help of multiple intermediate relay nodes. In wireless networks, if a node can send information to another node, it can typically also receive information from that node; inherently, therefore, there are many possibilities for feeding back information. However, transmissions are not isolated but are usually subject to broadcast and interference. In this paper, we ask the following question: can the information transfer in both directions of a link be critical to maximizing the end-to-end communication rate in such networks? Equivalently, could one of the directions in each bidirected link (and, more generally, at least one of the links forming a cycle) be shut down while the capacity of the network is approximately maintained? Our main result shows that in any arbitrary Gaussian relay network with bidirected edges and cycles, we can always identify a directed acyclic subnetwork that approximately maintains the capacity of the original network. The edges of this subnetwork can be identified as the information-carrying links, and the remaining links as feedback, which can provide only a limited contribution to capacity.
*Two-Unicast Two-Hop Interference Network: Finite-Field Model*- In this paper we present a novel framework for converting the $K$-user multiple access channel (MAC) over $\mathbb{F}_{p^m}$ into a $K$-user MAC over the ground field $\mathbb{F}_{p}$ with $m$ multiple inputs/outputs (MIMO). This framework makes it possible to develop coding schemes for the MIMO channel, as is done with symbol extension for time-varying channels. Using aligned network diagonalization based on this framework, we show that a sum-rate of $(2m-1)\log{p}$ is achievable for the $2\times 2\times 2$ interference channel over $\mathbb{F}_{p^m}$. We also draw a connection between field extension and symbol extension.
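The field-to-MIMO conversion behind this framework is visible in the smallest nontrivial case, $p = 2$, $m = 2$: multiplication by a fixed element of $\mathbb{F}_4$ is $\mathbb{F}_2$-linear, hence a $2\times 2$ binary matrix, which is exactly the MIMO view of the scalar channel coefficient. A self-contained toy instance (our own illustration):

```python
# GF(4) = F_2[x]/(x^2 + x + 1), elements as pairs (a0, a1) meaning a0 + a1*x.
# Multiplication by a fixed element b is F_2-linear, so it has a 2x2 matrix.

def gf4_mul(a, b):
    """Multiply a0 + a1*x by b0 + b1*x modulo x^2 + x + 1 over F_2."""
    a0, a1 = a
    b0, b1 = b
    # (a0 + a1 x)(b0 + b1 x) = a0b0 + (a0b1 + a1b0) x + a1b1 x^2, and x^2 = x + 1.
    return ((a0 * b0 + a1 * b1) % 2, (a0 * b1 + a1 * b0 + a1 * b1) % 2)

def matrix_of(b):
    """2x2 matrix over F_2 whose action equals multiplication by b."""
    col0 = gf4_mul((1, 0), b)   # image of the basis element 1
    col1 = gf4_mul((0, 1), b)   # image of the basis element x
    return [[col0[0], col1[0]], [col0[1], col1[1]]]

def apply(mat, a):
    return tuple(sum(mat[r][c] * a[c] for c in range(2)) % 2 for r in range(2))

elements = [(0, 0), (1, 0), (0, 1), (1, 1)]
ok = all(apply(matrix_of(b), a) == gf4_mul(a, b)
         for a in elements for b in elements)
```

The matrix of multiplication by $x$ is the companion matrix of $x^2 + x + 1$; replacing every scalar channel coefficient by such a matrix turns one $\mathbb{F}_{4}$ channel use into an equivalent $2\times 2$ MIMO use over $\mathbb{F}_2$.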
*Cyclic Interference Neutralization on the $2\times 2\times 2$ Full-Duplex Two-Way Relay-Interference Channel*- A two-way relay-interference channel describes a system of four communicating transceivers with two interjacent parallel relays arranged in a bidirectional $2\times 2\times 2$ relay-interference network. Two pairs of transceivers are each communicating bidirectionally with the aid of both relays. All transceivers and relays are assumed to operate in full-duplex mode. Since Interference Neutralization is known as a promising method to achieve the cut-set upper bounds on the data rates of the unidirectional relay-interference channel, we investigate a Cyclic Interference Neutralization scheme on the corresponding bidirectional relay-interference channel w.r.t.\ a conceptual channel model based on a polynomial ring. We show that, if the channel matrix satisfies a certain set of symmetry conditions, a total number of 4 degrees of freedom is asymptotically achievable.

### 11:30 - 12:50

#### FrC1: Error Exponent and Capacity

*On the Average-Listsize Capacity and the Cutoff Rate of Discrete Memoryless Channels with Feedback*- We study the cutoff rate and the average-listsize capacity of discrete memoryless channels (DMCs) with feedback. We show that feedback can increase the average-listsize capacity but not the cutoff rate. For DMCs with positive zero-error capacity, we show that the average-listsize capacity with feedback is equal to the cutoff rate. For all other DMCs, we derive a lower bound on the average-listsize capacity with feedback. The bound is asymptotically tight for low-noise channels. We also show that a multi-letter version of Forney's lower bound on the average-listsize capacity of DMCs without feedback is asymptotically tight.
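For context, the cutoff rate of a DMC without feedback, for input distribution $P$, is $R_0(P) = -\log_2 \sum_y \big(\sum_x P(x)\sqrt{W(y|x)}\big)^2$. The sketch below (our own illustration, not code from the paper) evaluates it for a binary symmetric channel, where the uniform input is optimal by symmetry and a closed form exists:

```python
# Cutoff rate of a DMC for a given input distribution, evaluated for a BSC.
from math import sqrt, log2

def cutoff_rate(P, W):
    """R0(P) = -log2 sum_y (sum_x P[x] * sqrt(W[x][y]))^2."""
    total = sum(
        sum(P[x] * sqrt(W[x][y]) for x in range(len(P))) ** 2
        for y in range(len(W[0]))
    )
    return -log2(total)

p = 0.1
bsc = [[1 - p, p], [p, 1 - p]]
r0 = cutoff_rate([0.5, 0.5], bsc)

# For the BSC the formula collapses to 1 - log2(1 + 2*sqrt(p(1-p))).
closed_form = 1 - log2(1 + 2 * sqrt(p * (1 - p)))
```

As expected, $R_0$ falls strictly below the Shannon capacity $1 - h(p)$ of the same channel.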
*Improved Capacity Lower Bounds for Channels with Deletions and Insertions*- New lower bounds are obtained for the capacity of a binary channel with deletions and insertions. Each input bit to the channel is deleted with probability $d$, or an extra bit is inserted after it with probability $i$, or it is transmitted unmodified with probability $1-d-i$. This paper builds on the idea introduced in [1] of using a sub-optimal decoder that decodes the positions of the deleted and inserted runs, in addition to the transmitted codeword. The mutual information between the channel input and output sequences is expressed as the sum of the rate achieved by this decoder and the rate loss due to its suboptimality. The main contribution is an analytical lower bound for the rate loss term which leads to an improvement in the capacity lower bound of [1]. For the special case of the deletion channel, the new bound is larger than the previous best lower bound for deletion probabilities up to $0.3$.
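The channel model in the abstract above is easy to simulate. In the sketch below (our own illustration), the inserted bit is taken to be an independent fair coin flip, which is one common convention; the paper's model may differ in this detail:

```python
# Deletion/insertion channel: each input bit is deleted w.p. d, has an extra
# bit inserted after it w.p. i, or passes through unchanged w.p. 1 - d - i.
import random

def del_ins_channel(bits, d, i, rng):
    out = []
    for b in bits:
        u = rng.random()
        if u < d:
            continue                       # deletion: the bit is lost
        out.append(b)
        if u < d + i:
            out.append(rng.randint(0, 1))  # insertion: random extra bit after b
    return out

rng = random.Random(7)
x = [rng.randint(0, 1) for _ in range(10_000)]
y = del_ins_channel(x, d=0.1, i=0.05, rng=rng)  # expected length n*(1 - d + i)
```

The output length concentrates around $n(1 - d + i)$; the hard part, which the paper addresses, is lower-bounding the mutual information across this synchronization-destroying map.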
*$\varepsilon$-Capacity and Strong Converse for Channels with General State*- We consider state-dependent memoryless channels with general state available at both encoder and decoder. We establish the $\varepsilon$-capacity and the optimistic $\varepsilon$-capacity. This allows us to prove a necessary and sufficient condition for the strong converse to hold. We also provide a simpler sufficient condition on the first- and second-order statistics of the state process that ensures that the strong converse holds.

#### FrC2: Topics in Compression

*Compression of Noisy Signals with Information Bottlenecks*- We consider a novel approach to the information bottleneck problem where the goal is to compress a noisy signal while retaining a significant amount of information about a correlated auxiliary signal. To facilitate analysis, we cast compression with side information as an optimization problem involving an information measure, which for jointly Gaussian random variables equals the classical mutual information. We provide closed-form expressions for locally optimal linear compression schemes; in particular, we show that the optimal solutions take the form of the product of an arbitrary full-rank matrix and the left eigenvectors corresponding to the smallest eigenvalues of a matrix related to the signals' covariance matrices. In addition, we study the influence of the sparsity level of the Bernoulli-Gaussian noise on the compression rate. We also highlight the similarities and differences between the noisy bottleneck problem and canonical correlation analysis (CCA), as well as the Gaussian information bottleneck problem.
*Distortion Minimization in Layered Broadcast Transmission of a Gaussian Source Over Rayleigh Channels*- We consider the problem of minimizing the expected distortion in the multilayer transmission of a Gaussian source using the broadcast approach with successive information enhancement. This minimization is contingent on the jointly optimal choice of the rates and power ratios of the different layers. This problem was tackled in the literature under the assumption that the fading channel has a finite number of states and the number of source layers matches the number of channel states. In this paper, we provide a more generic solution for a continuous Rayleigh fading channel and for any predetermined number of layers. We prove that the primal optimization problem has strong duality with the Lagrangian dual problem. Consequently, we propose a two-dimensional bisection search algorithm that can, for any number of layers, find the optimal solution of the dual problem, which coincides with the optimal solution of the primal problem. The complexity of the search algorithm is linear in the number of layers. We provide numerical results for the optimal rate and power allocation. Moreover, we show that with a small number of layers, we can approach the distortion lower bound that is achieved by transmitting an infinite number of layers.
*Sparse Signal Processing with Linear and Non-Linear Observations: A Unified Shannon Theoretic Approach*- In this work, we derive fundamental limits for many linear and non-linear sparse signal processing models, including group testing, quantized compressive sensing, multivariate regression, and observations with missing features. In general, sparse signal processing problems can be characterized by the following Markovian property. We are given a set of $N$ variables $X_1,X_2,\ldots,X_N$, and there is an unknown subset $S \subset \{1,\,2,\,\ldots, N\}$ of variables that are relevant for predicting the outcomes/outputs $Y$; in other words, when $Y$ is conditioned on $\{X_n\}_{n\in S}$, it is conditionally independent of the other variables $\{X_n\}_{n \not\in S}$. Our goal is to identify the set $S$ from samples of the variables $X$ and the associated outcomes $Y$. We characterize this problem as a version of the noisy channel coding theorem. Using asymptotic information-theoretic analyses, we describe mutual information formulas that provide necessary and sufficient conditions on the number of samples required to successfully recover the salient variables. This mutual information expression unifies conditions for both linear and non-linear observations. We then compute sample complexity bounds for the aforementioned models based on these mutual information expressions.
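The mutual information quantities that drive such sample complexity conditions can be evaluated directly from a joint distribution. A minimal sketch for a uniform binary input observed through a binary symmetric channel with crossover 0.1 (an illustrative observation model, not one taken from the paper):

```python
from math import log2

def mutual_information(joint):
    """I(X;Y) in bits from a joint pmf given as {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Uniform binary input through a BSC with crossover probability 0.1:
eps = 0.1
joint = {(0, 0): 0.5 * (1 - eps), (0, 1): 0.5 * eps,
         (1, 0): 0.5 * eps, (1, 1): 0.5 * (1 - eps)}
i_bits = mutual_information(joint)  # equals 1 - h2(0.1), about 0.531 bits
```

Larger mutual information per observation loosely translates into fewer samples needed to identify the salient set $S$, which is the flavor of the conditions the abstract describes.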
*Rate-Distortion Bounds for an Epsilon-Insensitive Distortion Measure*- Direct evaluation of the rate-distortion function has rarely been achieved when it is strictly greater than its Shannon lower bound. In this paper, we consider the rate-distortion function for the distortion measure defined by an $\varepsilon$-insensitive loss function. We first present the Shannon lower bound applicable to any source distribution with finite differential entropy. Then, focusing on the Laplacian and Gaussian sources, we prove that the rate-distortion functions of these sources are strictly greater than their Shannon lower bounds and obtain analytic upper bounds for the rate-distortion functions. The small-distortion limit and numerical evaluation of the bounds suggest that the Shannon lower bound provides a good approximation to the rate-distortion function for the $\varepsilon$-insensitive distortion measure.
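The $\varepsilon$-insensitive loss charges nothing for reconstruction errors below $\varepsilon$. A minimal sketch of the loss and a Monte Carlo distortion estimate for a Laplacian source under a uniform quantizer (the values of $\varepsilon$, the quantizer step, and the scale are arbitrary illustrative choices, not values from the paper):

```python
import random

def eps_insensitive(x, xhat, eps):
    """d(x, xhat) = max(|x - xhat| - eps, 0)."""
    return max(abs(x - xhat) - eps, 0.0)

def laplace(scale, rng):
    """Zero-mean Laplacian sample as a difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def quantize(x, step):
    """Uniform mid-tread scalar quantizer."""
    return step * round(x / step)

rng = random.Random(0)
eps, step, n = 0.1, 0.5, 100_000
distortion = sum(
    eps_insensitive(x, quantize(x, step), eps)
    for x in (laplace(1.0, rng) for _ in range(n))
) / n
```

Since the quantization error here never exceeds `step / 2 = 0.25`, the per-sample loss is at most `0.15`; the rate-distortion function studied in the paper characterizes the best achievable trade-off over all coding schemes, not just scalar quantizers.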

#### FrC3: Capacity of Relay Networks

*Improved Capacity Approximations for Gaussian Relay Networks*- Consider a Gaussian relay network where a number of sources communicate to a destination with the help of several layers of relays. Recent work has shown that a compress-and-forward based strategy at the relays can achieve the capacity of this network within an additive gap. In this strategy, the relays quantize their observations at the noise level and map them to a random Gaussian codebook. The resultant gap is independent of the SNRs of the channels in the network but linear in the total number of nodes. In this paper, we show that if the relays quantize their signals at a resolution decreasing with the number of nodes in the network, the additive gap to capacity can be made logarithmic in the number of nodes for a class of layered, time-varying wireless relay networks. This suggests that the rule of thumb used for compress-and-forward in the current literature, namely to quantize the received signals at the noise level, can be highly suboptimal.
*Gaussian Half-Duplex Relay Networks: Improved Gap and a Connection with the Assignment Problem*- This paper studies a Gaussian relay network where the relays can either transmit or receive at any given time, but not both. Known upper (cut-set) and lower (noisy network coding) bounds on the capacity of a memoryless full-duplex relay network are specialized to the half-duplex case and shown to be within a constant gap of one another. For a fairly broad range of relay network sizes, the derived gap is smaller than what is known in the literature, and it can be further reduced for more structured networks such as diamond networks. It is shown that the asymptotically optimal duration of the listen and transmit phases for the relays can be obtained by solving a linear program; the coefficients of the linear constraints of this linear program are the solutions of certain 'assignment problems' for which efficient numerical routines are available; this gives an interesting general connection between the high-SNR approximation of the capacity of a MIMO channel and the 'assignment problem' in graph theory. Finally, some results available for diamond networks are extended to general networks. For a general relay network with 2 relays, it is proved that, out of the 4 possible listen/transmit states, at most 3 have a strictly positive probability. Numerical results for a network with $K - 2 < 9$ relays (out of $K$ nodes in total) show that at most $K-1$ states have a strictly positive probability, which is conjectured to be true for any number of relays.
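The 'assignment problem' mentioned above is the standard minimum-cost matching of rows to columns of a cost matrix. A minimal stdlib sketch using brute force over permutations (fine only for tiny instances; the efficient routines the abstract alludes to, such as Hungarian-algorithm solvers, scale polynomially — the cost matrix below is an arbitrary example, not one from the paper):

```python
from itertools import permutations

def assignment(cost):
    """Min-cost assignment of rows to columns of a square cost matrix.

    Brute force over all n! permutations; only suitable for small n.
    """
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_cost:
            best_perm, best_cost = perm, c
    return best_perm, best_cost

cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
perm, total = assignment(cost)  # perm[i] is the column assigned to row i
```

In the paper's setting, such assignments arise when approximating MIMO capacities at high SNR, and their optimal values feed into the constraints of the linear program over listen/transmit schedules.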
*Capacity Region of a Class of Interfering Relay Channels*- This paper studies a new model for cooperative communication: the interfering relay channel. We show that the hash-forward scheme, introduced by Kim for the primitive relay channel, is capacity-achieving for a class of semideterministic interfering relay channels. The obtained capacity result generalizes and unifies earlier capacity results for a class of primitive relay channels and a class of deterministic interference channels.
*The Deterministic Capacity of Relay Networks with Relay Private Messages*- We study the capacity region of a deterministic 4-node network, where 3 nodes can only communicate via the fourth one. However, the fourth node is not merely a relay, since it can exchange private messages with all other nodes. This situation resembles the case where a base station relays messages between users and delivers messages between the backbone system and the users. We assume an asymmetric scenario where the channel between any two nodes is not reciprocal. First, an upper bound on the capacity region is obtained based on the notion of a single-sided genie. Subsequently, we construct an achievable scheme that attains this upper bound using a superposition of broadcasting node 4's messages and an achievable "detour" scheme for a reduced 3-user relay network.