Adaptive and Flexible Model-Based AI for Deep
Receivers in Dynamic Channels
Tomer Raviv, Sangwoo Park, Osvaldo Simeone, Yonina C. Eldar, and Nir Shlezinger
Abstract—Artificial intelligence (AI) is envisioned to play
a key role in future wireless technologies, with deep neural
networks (DNNs) enabling digital receivers to learn to operate in
challenging communication scenarios. However, wireless receiver
design poses unique challenges that fundamentally differ from
those encountered in traditional deep learning domains. The
main challenges arise from the limited power and computational
resources of wireless devices, as well as from the dynamic nature
of wireless communications, which causes continual changes
to the data distribution. These challenges impair conventional
AI based on highly-parameterized DNNs, motivating the devel-
opment of adaptive, flexible, and light-weight AI for wireless
communications, which is the focus of this article. Here, we
propose that AI-based design of wireless receivers requires
rethinking of the three main pillars of AI: architecture, data, and
training algorithms. In terms of architecture, we review how to
design compact DNNs via model-based deep learning. Then, we
discuss how to acquire training data for deep receivers without
compromising spectral efficiency. Finally, we review efficient,
reliable, and robust training algorithms via meta-learning and
generalized Bayesian learning. Numerical results are presented
to demonstrate the complementary effectiveness of each of the
surveyed methods. We conclude by presenting opportunities for
future research on the development of practical deep receivers.
I. INTRODUCTION
Wireless communication technologies are subject to esca-
lating demands for connectivity and throughput, with rapid
growth in media transmissions, including images, videos, and,
in the near future, augmented and virtual reality. Furthermore,
transformative applications such as the Internet of Things
(IoT), autonomous driving, and smart manufacturing are ex-
pected to play major roles in the new 5G-defined deployments
of ultra-reliable and low-latency communication (URLLC) and
massive machine-type communications (mMTC) services. To
accommodate these scenarios, communication systems must
meet strict performance requirements in reliability, latency, and
complexity [1].
This project has received funding from the Israeli 5G-WIN consortium,
the European Union's Horizon 2020 research and innovation program
under grants No. 646804-ERC-COG-BNYQ and 725731, and from the
European Union's Horizon Europe project CENTRIC (101096379). Support
is also acknowledged from the Israel Science Foundation under grant
No. 0100101, and from an Open Fellowship of the EPSRC with reference
EP/W024101/1. T. Raviv and N. Shlezinger are with the School
of ECE, Ben-Gurion University of the Negev, Beer-Sheva, Israel (e-mail:
tomerravi[email protected]; [email protected]). S. Park and O. Simeone are
with the Department of Engineering, King's College London, U.K. (e-mail:
{sangwoo.park; osvaldo.simeone}@kcl.ac.uk). Y. C. Eldar is with the Faculty
of Math and CS, Weizmann Institute of Science, Rehovot, Israel.

To facilitate meeting these performance requirements, emerging
technologies such as mmWave and THz communication, holographic
multiple-input multiple-output (MIMO), spectrum sharing, and
intelligent reconfigurable surfaces (IRSs) are currently being
investigated. While these tech-
nologies may support desired performance levels, they also
introduce substantial design and operating complexity [1]. For
instance, holographic MIMO hardware is likely to introduce
non-linearities on transmission and reception; the presence of
IRSs complicates channel estimation; and classical communi-
cation models may no longer apply in novel settings such as
the mmWave and THz spectrum, due to violations of far-field
assumptions and lossy propagation. This paper addresses the
latter source of complexity by focusing on efficient design of
receiver processing.
Traditional receiver processing design is model-based, re-
lying on simplified channel models, which, as mentioned,
may no longer be adequate to meet the requirements of next-
generation wireless systems. The rise of deep learning as an
enabling technology for artificial intelligence (AI) has revo-
lutionized various disciplines, ranging from computer vision
and natural language processing (NLP) to speech refinement
and biomedical signal processing. The ability of deep neural
networks (DNNs) to learn mappings from data has spurred
growing interest in their usage for receiver design in digital
communications [2], [3]. DNN-aided receivers, referred to
henceforth as deep receivers, have the ability to succeed where
classical algorithms may fail. Specifically, deep receivers
can learn a detection function in scenarios having no well-
established physics-based mathematical model, a situation
known as model-deficit; or in settings for which the model
is too complex to give rise to tractable and efficient model-
based algorithms, a situation known as algorithm-deficit.
Consequently, deep receivers have the potential to meet the
constantly growing requirements of wireless systems.
Despite their promise, several core challenges arise from
the fundamental differences between wireless communications
and traditional AI domains such as computer vision and
NLP, limiting the widespread applicability of deep learning in
wireless communications. The first challenge is attributed to
the nature of the devices employed in communication systems.
Wireless communication receivers are highly constrained in
terms of their computational ability, battery consumption, and
memory resources. However, deep learning inherently relies
on highly parameterized architectures, assuming the avail-
ability of powerful devices, e.g., high-performance computing
servers.
A second challenge stems from the nature of the wire-
less communication domain. Communication channels are
dynamic, implying that the receiver task, dictated by the
data distribution, changes over time. This makes the standard
arXiv:2305.07309v1 [cs.IT] 12 May 2023
Pillar              Method                      Literature
Architecture        Deep unfolding              [4]-[6]
                    DNN-aided inference         [7]
Data                Self-supervised training    [7]-[10]
                    Data augmentation           [11], [12]
Training            Meta-learning               [13]-[16]
Algorithm           Bayesian learning           [17], [18]
                    Modular training            [13]

Fig. 1: A summary of methods surveyed in this article that adapt the three pillars of AI to the requirements of deep wireless
receivers.
pipeline of data collection, annotation, and training highly
challenging. Specifically, DNNs rely on (typically labelled)
data sets to learn from the underlying unknown, but stationary,
data distributions. For example, machine translation tasks,
requiring the mapping of an origin language into a destination
language, do not change over time, enabling the collection of
a large volume of training data and the deployment of a pre-
trained, static, DNN. This is not the case for wireless receivers,
whose processing task depends on the time-varying channel,
restricting the size of the training data set representing the
task.
The two challenges outlined above imply that successfully
applying AI for wireless receiver design requires deviating
from conventional deep learning approaches. To this end, there
is a need to develop communication-oriented AI techniques,
which are the focus of this article. Previous tutorials on AI for
communications, e.g., [2], [3], have primarily concentrated on
surveying diverse challenges and applications of conventional
deep learning methods in the context of communication. In
contrast, the present article aims to review approaches that
address the unique challenges in the design of deep receivers
that arise from the mentioned limitations of wireless devices
and from the dynamic nature of the communication domain.
Our main objective is to provide a systematic review of
research directions that target the practical deployment of deep
receivers.
We commence by motivating the development of AI systems
that are light-weight, and thus operable on power and hardware
limited devices, as well as adaptive and flexible, enabling
online on-device adaptation. As illustrated in Fig. 1, we
then propose that AI-based wireless receiver design requires
revisiting the three main pillars of AI, namely (i) the archi-
tecture of AI models; (ii) the data used to train AI models;
and (iii) the training algorithm that optimizes the AI model
for generalization, i.e., to maximize performance outside the
training set (either on the same distribution or for a completely
new one).
For each of these AI pillars, we survey candidate approaches
for facilitating the operation of the deep receivers. (i) We first
discuss how to design light-weight trainable architectures via
model-based deep learning [19]. This methodology hinges on
the principled incorporation of model-based processing, ob-
tained from domain knowledge on optimized communication
algorithms, within AI architectures. (ii) Next, we investigate
how labelled data can be obtained without impairing spectral
efficiency, i.e., without increasing the pilot overhead. To this
end, we show how receivers can generate labelled data by self-
supervision aided by existing communication algorithms; and
how they can further enrich data sets via data augmentation
techniques that utilize invariance properties of communication
systems. (iii) Finally, we cover training algorithms for deep
receivers that are designed to meet requirements in terms of
efficiency, reliability, and robust adaptation of wireless com-
munication systems, avoiding overfitting from limited training
data while limiting training time. These methods include
communication-specific meta-learning as well as generalized
Bayesian learning and modular learning.
To illustrate the individual and complementary gains of the
reviewed approaches, we provide a numerical study consider-
ing finite-memory single-input single-output (SISO) channels
as well as multi-user MIMO systems. We conclude by dis-
cussing the road ahead, as well as key research challenges
that are yet to be addressed to enable adaptive and flexible
light-weight deep receivers.
II. DEEP RECEIVERS IN DYNAMIC CHANNELS
As discussed in the previous section, harnessing the
potential of deep learning in wireless systems requires
communication-specific AI schemes that are adaptive, flexible,
and light-weight. The light-weight requirement follows from
the power and computational constraints of wireless devices;
while the need for adaptivity and flexibility is entailed by the
dynamic nature of wireless channels. Classical model-based
receiver processing is inherently adaptive and flexible: The
receiver periodically estimates the channel using the available
pilots, and then uses this estimate to adapt the operation of
the receiver baseband chain, which is a direct function of
the channel coefficients. In contrast, for deep receivers, the
dependence of the weights of the DNN on the channel state
is indirect, and hence designing flexible, channel-adaptive,
DNN-based processing is a non-trivial task.
The current state of the art on deep receivers encompasses the
following three main approaches to address channel variations.
A1 Joint Learning: The most straightforward approach
amounts to optimizing a single DNN model to maximize
performance on average over a broad range of channel
Fig. 2: Overall illustration of online training of deep receivers in time-varying channels.
conditions. Methods in this class train a DNN using data
corresponding to an extensive set of expected channel
realizations, aiming to learn a mapping that is tailored
to the distribution of the channel [20]. Accordingly,
joint learning may be thought of as seeking the optimal
non-coherent receiver, which is agnostic to the current
channel realization. As a result, performance degradation
as compared to a coherent receiver is generally to be
expected.
A2 Channel as Input: An alternative approach uses an instan-
taneous estimate of the channel as an additional input
to the DNN [21]. Among the main drawbacks of this
approach are the limited flexibility in accommodating
different system dimensions, e.g., number of antennas or
number of users, and the lack of structure in the way
different inputs, such as received signals and channel state
information, are handled.
A3 Online Training: As illustrated in Fig. 2, in online train-
ing, decoded data from prior blocks is used, alongside
new pilots, to adapt the deep receiver to channel varia-
tions. This class of approaches inherits the limitations of
continual learning, such as catastrophic forgetting, and is
generally not suitable to ensure fast adaptation.
The shortcomings of the three existing approaches reviewed
above motivate a fundamental rethinking of
the application of machine learning tools to wireless receivers
along the three directions illustrated in Fig. 1.
The architecture of the DNN should be carefully selected
on the basis of domain knowledge so as to reduce data re-
quirements, while also ensuring efficient implementation
of the model. This amounts to improvements in terms of
the inductive bias on which learning is based.
The data used for learning should be augmented, when
possible, by leveraging the inherent redundancies of en-
coded signals.
The training algorithm should make use of historical
data while also preparing for quick adaptation to changing
channel conditions.
In the following sections we review candidate approaches for
each of these aspects, as summarized in Fig. 1.
III. ARCHITECTURE
The standard neural architectures employed in AI systems
for communication are based on highly-parameterized, un-
structured, deep neural models such as feed-forward neural
networks. The over-parameterization has been found to be
advantageous in a host of other tasks, such as NLP. However,
since deep receivers should adapt to time-varying conditions
using limited training data, this type of architecture is typi-
cally undesirable. In this section, we introduce ways to design
tailored model architectures by leveraging domain knowledge
with the goal of improving adaptivity and data efficiency. In
Sec. V, we will also study data-driven approaches for the
optimization of the inductive bias (also known as meta-
learning) and see how they can be combined with the model-
driven architectures introduced in this section to further reduce
the generalization gap.
In model-based deep learning, DNN architectures are de-
signed by drawing inspiration from model-based algorithms
tailored to the particular problem of interest [19]. In the context
of deep receivers, the dominant model-based deep learning
methodologies are deep unfolding and DNN-aided inference,
which are illustrated in Fig. 3 and discussed next.
Many model-based algorithms used by wireless receivers
rely on iterative optimizers that operate by gradually improv-
ing an optimization variable based on an objective function.
Deep unfolding converts an iterative optimizer into a discrim-
inative AI model by introducing trainable parameters within
each of a fixed number of iterations [19]. Training a deep
unfolding architecture can thus adapt an iterative optimizer on
the basis of available data for a given problem of interest. As
Fig. 3: Illustration of model-based, data-driven, and model-based deep learning framework for deep receivers.
we detail next, the aim is to address model and/or algorithmic
deficiencies of the original algorithm.
Specifically, deep unfolding enhances iterative optimizers in
the following ways (see [19] for further details).
Learned Hyperparameters: Iterative optimizers often in-
clude hyperparameters, such as step-sizes, damping fac-
tors, and regularization coefficients, that are typically
tuned by hand by the designer and shared among all
iterations. Deep unfolding can treat such hyperparameters
as trainable parameters. This is useful to cope with forms
of algorithm deficiency, whereby an iterative algorithm
requires too many iterations or struggles to converge to a
suitable decision. For example, the work [4] showed that
unfolding the orthogonal approximate message passing
algorithm for MIMO detection, and learning iteration-
dependent scaling coefficients, notably improves perfor-
mance, requiring only a few iterations.
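To make the idea concrete, the following sketch unfolds plain gradient descent on a toy least-squares detection objective into a fixed number of layers with per-iteration step sizes; in a deep-unfolded receiver these step sizes would be trained from data. This is an illustrative stand-in, not the OAMP-based detector of [4], and the channel, symbols, and step-size values are all made up for the example.

```python
import numpy as np

def unfolded_detector(y, H, step_sizes):
    """Run one gradient-descent iteration per entry of `step_sizes`.

    Each entry plays the role of a trainable, iteration-dependent
    hyperparameter; training would tune `step_sizes` from data.
    """
    x = np.zeros(H.shape[1])
    for mu in step_sizes:                      # fixed number of "layers"
        grad = H.T @ (H @ x - y)               # gradient of 0.5*||y - Hx||^2
        x = x - mu * grad                      # learned per-iteration step
    return x

# Toy example: 4x4 channel, noiseless observation.
rng = np.random.default_rng(0)
H = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
x_true = np.array([1.0, -1.0, 1.0, 1.0])       # BPSK-like symbols
y = H @ x_true

# Hand-picked step sizes stand in for trained ones.
x_hat = unfolded_detector(y, H, step_sizes=[0.5] * 50)
print(np.round(x_hat, 2))                      # close to x_true
```

Fixing the number of iterations at training time is what turns the optimizer into a trainable architecture with a predictable, low inference cost.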
Learned Objective: Deep unfolding can also enhance an
iterative algorithm by tuning the objective function ap-
proximately optimized at each iteration. This optimization
addresses algorithm deficiencies, in a manner similar to
the optimization of hyperparameters; as well as model
deficiencies by adapting the design criterion to observed
data, rather than to assumptions about the model. A
representative example is the MMNet architecture pro-
posed in [5] for unfolding MIMO detection. MMNet,
which is based on proximal gradient steps, parameterizes
the gradient computation procedure at each iteration,
effectively using an iteration-dependent design objective.
DNN Conversion: The third form of deep unfolding incor-
porates a full DNN module within each iteration in order
to implement some functionality of the solver in the most
flexible manner. DNN conversion is suitable for handling
model deficiency, since the DNN modules can learn how
to best realize model-independent internal computations
at each iteration. For instance, DeepSIC proposed in [6]
is derived from the iterative soft interference cancellation
(SIC) MIMO detection algorithm with the introduction of
DNN models for implementing each stage of interference
cancellation and soft detection in a manner agnostic to the
underlying channel model.
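A minimal sketch of this structure follows, assuming a two-user scalar channel and a bare tanh map in place of each per-user DNN module; it is a schematic of sequential soft interference cancellation in the spirit of DeepSIC, not the actual architecture of [6], and all channel values and weights are illustrative.

```python
import numpy as np

def sic_module(y, h_self, h_others, soft_others, w):
    """One SIC building block for a single user.

    A trainable function (here a bare linear map with weight `w`,
    standing in for a small DNN) maps the interference-cancelled
    observation to a soft BPSK symbol estimate.
    """
    y_clean = y - h_others @ soft_others       # soft interference cancellation
    return np.tanh(w * h_self * y_clean)       # soft symbol in (-1, 1)

def deep_sic_sketch(y, h, weights, num_iters=3):
    """Iterate per-user modules; `weights[q][k]` is user k's weight at iteration q."""
    K = len(h)
    soft = np.zeros(K)                         # initial soft estimates
    for q in range(num_iters):
        for k in range(K):                     # sequential per-user update
            others = np.delete(np.arange(K), k)
            soft[k] = sic_module(y, h[k], h[others], soft[others], weights[q][k])
    return np.sign(soft)

h = np.array([1.0, 0.7])                       # 2-user scalar channel
x = np.array([1.0, -1.0])                      # transmitted BPSK symbols
y = h @ x                                      # noiseless received sample
w = [[4.0, 4.0]] * 3                           # stand-in for trained weights
print(deep_sic_sketch(y, h, w))                # -> [ 1. -1.]
```

Because each module has a concrete per-user role, individual modules can later be retrained in isolation, which is exactly the property that modular learning (Sec. V) exploits.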
DNN-aided inference refers to a family of model-based
deep learning methods that incorporate DNNs into model-
based methods that do not implement iterative processing. A
representative example is the ViterbiNet equalizer proposed
in [7]. Viterbi equalization is applicable to any finite-memory
channel, as long as one can compute the conditional distri-
bution of channel output given the corresponding input, also
known as likelihood. Based on this observation, ViterbiNet
implements the Viterbi algorithm while using a DNN to
compute the likelihood. In this way, ViterbiNet addresses
model deficiencies by operating in a channel-model-agnostic
manner and requiring only the conventional finite-memory
modelling assumption to hold.
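The separation that ViterbiNet exploits can be sketched as follows: the trellis search is standard Viterbi, and the branch metric is an exchangeable function. Here `neg_log_likelihood` is a hand-crafted Gaussian metric for a known two-tap channel; ViterbiNet would replace exactly this function with a trained DNN while leaving the search untouched. All channel values are illustrative.

```python
import numpy as np

H = np.array([0.8, 0.4])                # 2-tap channel (known here for the demo)
SYMBOLS = (-1.0, 1.0)                   # BPSK alphabet

def neg_log_likelihood(y_t, x_t, x_prev):
    """Branch metric -log p(y_t | x_t, x_prev), up to constants.

    ViterbiNet would replace this hand-crafted Gaussian metric with a
    DNN trained to output the likelihood, keeping the trellis intact.
    """
    mean = H[0] * x_t + H[1] * x_prev
    return (y_t - mean) ** 2

def viterbi_detect(y):
    """Standard Viterbi search over states = previous symbol."""
    costs = {s: 0.0 for s in SYMBOLS}           # path cost per state
    paths = {s: [] for s in SYMBOLS}            # surviving path per state
    for y_t in y:
        new_costs, new_paths = {}, {}
        for x_t in SYMBOLS:                     # next state is x_t
            cands = [(costs[s] + neg_log_likelihood(y_t, x_t, s), s)
                     for s in SYMBOLS]
            best_cost, best_prev = min(cands)
            new_costs[x_t] = best_cost
            new_paths[x_t] = paths[best_prev] + [x_t]
        costs, paths = new_costs, new_paths
    return paths[min(costs, key=costs.get)]

x = [1.0, -1.0, -1.0, 1.0, 1.0]
y = [H[0] * x[t] + H[1] * (x[t - 1] if t else 0.0) for t in range(len(x))]
print(viterbi_detect(y))                # recovers x
```

The DNN thus only needs to learn a per-sample conditional distribution, a far smaller task than learning the entire sequence detector end-to-end.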
IV. DATA
The amount of data obtained from pilots is typically insuf-
ficient to train an AI model for a deep receiver. This motivates
the introduction of strategies that expand the available labelled
training data set without requiring the transmission of more
pilots. As we detail in this section, existing techniques apply
either self-supervised learning or data augmentation.
With self-supervised learning, training data is extended
using the redundancy of transmitted signals either at the
symbol level or at the codeword level. In contrast, in data
augmentation, the goal is to enrich the given labelled data set
by leveraging invariance properties of the data. As summarized
in Fig. 4, these approaches can be potentially combined, and
integrated with a number of different architectures (Sec. III)
and training algorithms (Sec. V).
Codeword-level self-supervision exploits the presence of
channel coding to generate labelled data from channel outputs.
It uses error correction codes to correct detection errors, and
then utilizes the corrected data as labelled data for training, as
long as the codewords are decoded successfully [7]–[9].
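A toy illustration of this pipeline follows, using a single parity-check bit per block as a stand-in for a full FEC decoder and its success indication (real systems would use the channel decoder and, e.g., a CRC); the block length and bit patterns are illustrative.

```python
import numpy as np

def harvest_labels(detected_bits, block_len=8):
    """Codeword-level self-supervision with a toy single parity-check code.

    Hard decisions from the detector are split into blocks whose bits
    must have even parity. Blocks that pass the check are assumed to be
    decoded correctly and recycled as labelled training data; the rest
    are discarded.
    """
    labelled = []
    for i in range(0, len(detected_bits) - block_len + 1, block_len):
        block = detected_bits[i:i + block_len]
        if int(np.sum(block)) % 2 == 0:          # parity check passes
            labelled.append(block)               # reuse as labelled data
    return labelled

# Two parity-consistent codewords and one corrupted by a single bit flip.
bits = np.array([1, 0, 1, 0, 1, 0, 1, 0,   # parity OK
                 1, 1, 1, 1, 1, 1, 1, 0,   # parity fails (one flip)
                 0, 0, 1, 1, 0, 0, 1, 1])  # parity OK
print(len(harvest_labels(bits)))           # -> 2
```

Only blocks that survive the check enter the training set, so no extra pilots are transmitted and label noise stays bounded by the code's undetected-error rate.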
Fig. 4: Data acquisition pipeline for deep receivers without
harming spectral efficiency.
Symbol-level self-supervision obtains labelled data from
information symbols without relying on channel decoding.
This is useful since some symbols can be correctly detected
even when decoding of the overall codeword fails. Symbol-level
self-supervision hence requires reliable soft detection measures
to indicate the degree to which each information symbol may
be considered to be correctly received [10].
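In its simplest form, this amounts to thresholding the detector's soft outputs; the sketch below (threshold and all values illustrative) keeps only confident hard decisions as labels, trading label quantity against label noise.

```python
import numpy as np

def select_confident(soft_probs, hard_symbols, threshold=0.9):
    """Symbol-level self-supervision: keep only decisions whose soft
    detection confidence exceeds `threshold`, and use them as labels.

    `soft_probs[t]` is the detector's probability for its own hard
    decision `hard_symbols[t]`; the threshold is a design knob.
    """
    mask = np.asarray(soft_probs) >= threshold
    return np.asarray(hard_symbols)[mask], mask

probs = [0.99, 0.55, 0.97, 0.62, 0.95]   # confidence per detected symbol
syms = [1, -1, -1, 1, 1]                 # hard decisions
labels, mask = select_confident(probs, syms)
print(labels)                            # -> [ 1 -1  1]
```

Note that this only works when the soft outputs are well calibrated, which is one reason reliability-oriented training (Sec. V) matters for deep receivers.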
Data augmentation is an established framework in conven-
tional AI domains to enrich training sets by leveraging known
invariances in the data. For instance, for image classification,
one can use a single image to generate multiple images
with the same label by rotating or cropping it. While data
augmentation is quite common in AI, it is highly geared to-
wards image and language data. Data augmentations for digital
communications have been explored in [11], and more recently
in [12]. The techniques studied in [12] include leveraging
the symmetry in digital constellations to project error pat-
terns between different symbols; exploiting the independence
between the noise and the transmitted symbols to generate
additional noisy realizations; and accounting for forms of
invariance to constellation-preserving rotations exhibited by
wireless channels.
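A sketch of two such augmentations for QPSK pilot pairs, loosely in the spirit of [12]: noise injection (valid because the additive noise is independent of the transmitted symbols) and 90-degree constellation rotation (valid because rotating both the received sample and its label preserves the channel law). The noise level, rotation set, and number of copies are illustrative choices.

```python
import numpy as np

def augment(pilot_rx, pilot_syms, noise_std, copies=2,
            rotations=(1, 1j, -1, -1j)):
    """Enrich (received sample, transmitted symbol) QPSK pilot pairs."""
    rng = np.random.default_rng(0)
    aug_rx, aug_syms = list(pilot_rx), list(pilot_syms)
    for r, s in zip(pilot_rx, pilot_syms):
        for _ in range(copies):                      # noise injection
            aug_rx.append(r + noise_std * (rng.standard_normal()
                                           + 1j * rng.standard_normal()))
            aug_syms.append(s)
        for rot in rotations[1:]:                    # constellation rotation
            aug_rx.append(rot * r)                   # rotate sample...
            aug_syms.append(rot * s)                 # ...and its label
    return np.array(aug_rx), np.array(aug_syms)

rx = np.array([0.9 + 1.1j])          # one received QPSK pilot sample
sym = np.array([1 + 1j])             # its transmitted symbol
a_rx, a_sym = augment(rx, sym, noise_std=0.1)
print(len(a_rx))                     # 1 original + 2 noisy + 3 rotated = 6
```

Each pilot thus yields several valid training pairs at zero cost in spectral efficiency.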
V. TRAINING
The training algorithm addresses the optimization of a data-
dependent loss function, with the goal of identifying models
with satisfactory generalization performance. The performance
of a training algorithm depends, in practice, on (i) the choice of
the loss function; (ii) the optimization algorithm; and (iii) the
relevance and quality of the data used to evaluate the train-
ing loss. In this section, we review communication-oriented
approaches for designing adaptive, data-efficient training al-
gorithms for deep receivers based on (i) meta-learning, (ii)
generalized Bayesian learning, and (iii) modular learning.
Meta-learning is a general framework that seeks to obtain a
data-efficient training procedure that is applicable for multiple
tasks of interest [15]. A training procedure that is data-, or
sample-, efficient is able to achieve a small generalization
gap, while using a small amount of training data. Meta-
learning and model-based learning (see Sec. III) are two
complementary approaches that reduce the generalization gap
under a fixed amount of training data: The former is data-
driven and typically optimizes the training algorithm; while
the latter is model-driven and optimizes the architecture. While
meta-learning encompasses a variety of conceptually distinct
methods, the prominent approaches for application to deep
receivers are gradient-based meta-learning and hypernetwork-
based meta-learning.
Gradient-based meta-learning: Gradient-based meta-
learning optimizes some of the hyperparameters of a
first-order training algorithm. While in principle, one
could “meta-learn” any hyperparameter, such as the learn-
ing rate, optimizing the initial weights of the DNNs
has been found to be extremely beneficial for boosting
adaptation and flexibility of training procedures in many
applications, including in wireless communications [15].
DNN initialization is a form of inductive bias, since the
parametric function space of the DNN becomes restricted
by enforcing adherence to the initialization through a
limited number of gradient-based updates. Meta-learning
can be combined with a model-based inductive bias, as
demonstrated in [13].
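The flavor of initialization learning can be conveyed with the first-order Reptile variant of gradient-based meta-learning, here on scalar toy "channels"; all models and numbers are illustrative, and an actual deep receiver would meta-learn DNN weights from past channel realizations as in [13].

```python
import numpy as np

def adapt(theta, task_data, lr=0.1, steps=5):
    """Inner loop: a few gradient steps on one task's pilot data
    under the toy model y = theta * x."""
    x, y = task_data
    for _ in range(steps):
        grad = np.mean(2 * x * (theta * x - y))   # d/dtheta of (theta*x - y)^2
        theta = theta - lr * grad
    return theta

def reptile(tasks, meta_lr=0.5, meta_iters=100):
    """Outer loop (Reptile, a first-order cousin of MAML): nudge the
    initialization toward each task's adapted weights, so future tasks
    are solvable in a handful of gradient steps."""
    theta0 = 0.0
    for it in range(meta_iters):
        task = tasks[it % len(tasks)]
        theta0 += meta_lr * (adapt(theta0, task) - theta0)
    return theta0

# Toy "channels": y = a * x with a in {1.8, 2.2}; a good initialization
# sits near a = 2, halving the adaptation work for any new task.
x = np.ones(10)
tasks = [(x, 1.8 * x), (x, 2.2 * x)]
theta0 = reptile(tasks)
print(round(theta0, 2))
```

The meta-learned initialization acts as the inductive bias discussed above: online adaptation then only needs a few inner-loop steps on fresh pilots.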
Hypernetwork-based meta-learning: Gradient-based
meta-learning requires running a number of (stochastic)
gradient updates. An alternative approach that does
not require any additional real-time optimization for
adaptation to new tasks incorporates a hypernetwork in
the system, alongside the main DNN. The hypernetwork
takes as input the available data, or any other context
information, regarding the task of interest, and produces
at the output the weights of the main DNN. More
precisely, typically, only a subset of weights of the
main DNN are updated; and/or each output of the
hypernetwork affects simultaneously a group of weights,
e.g., in the same layer, of the main DNN. Hypernetwork-
based meta-learning has been applied successfully
in wireless communication systems, including for
beamforming and MIMO detection [14], [16].
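Schematically, the hypernetwork maps context (here, a channel vector) directly to the weights of the main model, with no gradient steps at run time. Both networks are reduced to linear maps in this contrived sketch, chosen so that a matched filter is optimal and an identity hypernetwork is already "trained"; everything here is illustrative.

```python
import numpy as np

def hypernetwork(context, W_h):
    """Map context (e.g., a channel feature vector) to the weights of
    the main model; a bare linear map stands in for the hypernetwork DNN."""
    return W_h @ context

def main_model(x, w):
    """The main "DNN": a linear detector whose weights are generated,
    not trained, at run time."""
    return np.sign(w @ x)

W_h = np.eye(2)                        # stand-in for trained hypernetwork weights
h = np.array([0.9, 0.3])               # current channel (the context input)
x = h * 1.0                            # received signal for symbol +1, no noise
w = hypernetwork(h, W_h)               # weights produced without any SGD steps
print(main_model(x, w))                # -> 1.0
```

Adaptation cost is thus a single forward pass through the hypernetwork, which is what makes this approach attractive for fast-varying channels.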
Bayesian learning is the gold standard for training strate-
gies that aim at producing AI models offering a reliable as-
sessment of the uncertainty of their decisions. Such reliable AI
models must output confidence measures that reflect the true
accuracy of their decisions. Bayesian learning boosts reliability
by treating the model parameters as random variables, and
by accordingly maintaining a distribution over the weights
of a DNN. This distribution is meant to capture epistemic
uncertainty in the presence of limited training data.
Bayesian learning involves particle-based, deterministic or
stochastic, procedures, or optimization over the parameters of
the distribution in the model parameter space. Such optimiza-
tion addresses a training criterion that includes an information-
theoretic regularizer enforcing closeness to a prior distribution.
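The effect on soft outputs can be illustrated with particle-based Bayesian prediction: the predictive probability is an average over models drawn from the (approximate) posterior on the weights. The logistic models and posterior samples below are purely illustrative stand-ins for a Bayesian-trained DNN.

```python
import numpy as np

def ensemble_posterior(x, weight_samples):
    """Approximate Bayesian prediction: average the soft outputs of
    models drawn from the posterior over weights. Each "model" is a
    logistic classifier; `weight_samples` stand in for particles
    produced by a Bayesian learning procedure."""
    probs = [1.0 / (1.0 + np.exp(-w * x)) for w in weight_samples]
    return float(np.mean(probs))       # calibrated soft decision

# Near the decision boundary, weight uncertainty pulls the averaged
# confidence toward 0.5, unlike a single overconfident point estimate.
x = 0.3
posterior_samples = [0.5, 2.0, 8.0]    # illustrative posterior particles
p_bayes = ensemble_posterior(x, posterior_samples)
p_point = ensemble_posterior(x, [8.0]) # single (overconfident) model
print(round(p_bayes, 2), round(p_point, 2))
```

The tempered confidence of the averaged prediction is precisely what downstream soft decoders need in order to "trust" the detector's outputs.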
For deep receivers, boosting the reliability of a DNN
model allows the latter to provide informative soft decisions
to downstream DNN-based or model-based modules, e.g., for soft
decoding. This makes it possible for the different modules of
a deep receiver to “trust” the outputs of other modules [18].
Generalized forms of Bayesian learning allow for a flexible
choice of the regularization function, as well as of the data-
fitting part of the training objective. Such methods were shown
to be useful in wireless systems for their capacity to deal with
model misspecification and outliers [17].
Modular learning exploits the interpretable structure of
hybrid model-based deep receivers to facilitate rapid learning
from limited data. As opposed to meta-learning and Bayesian
learning, modular learning is specific to model-based deep
learning architectures. It builds on the fact that, unlike black-
box DNNs, in model-based deep learning architectures, one
can often assign a concrete functionality to different trainable
sub-modules of the architecture, and not just to its input and
output. Each functionality may then be adapted at different
rates and times, as some functionalities may require rapid
adaptation, while the others may be kept unchanged over a
longer time scale.
This approach was applied in [13] for online adaptation
of the DeepSIC MIMO receiver of [6]. There, the ability
to associate different users with sub-modules of the deep
receivers was leveraged to carry out the online training of
sub-modules associated with users that are identified as being
characterized by faster dynamics. The method was shown to
dramatically reduce the number of gradient-based updates and
the amount of data needed for online training.
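A schematic of this selective adaptation, with each per-user sub-module reduced to a scalar weight under a toy linear model; a real deep receiver would instead run SGD on the corresponding DNN sub-modules, as in [13], while the logic of updating only the flagged users is the same.

```python
import numpy as np

def online_update(modules, data_per_user, fast_users, lr=0.05, steps=10):
    """Modular online training: only the sub-modules of users flagged
    as fast-varying are adapted; the rest keep their current weights.

    Each module is a scalar weight w_k fit to user k's pilot pairs
    (x, y) under the toy model y = w_k * x.
    """
    for k in fast_users:                        # skip slowly-varying users
        x, y = data_per_user[k]
        for _ in range(steps):
            grad = np.mean(2 * x * (modules[k] * x - y))
            modules[k] -= lr * grad
    return modules

modules = [1.0, 1.0, 1.0]                       # per-user sub-module weights
x = np.ones(8)
data = [(x, 1.0 * x), (x, 3.0 * x), (x, 1.0 * x)]  # user 1's channel changed
updated = online_update(modules, data, fast_users=[1])
print(np.round(updated, 2))                     # only user 1's module moved
```

Restricting the updates this way is what yields the reported savings in both gradient computations and training data.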
VI. NUMERICAL RESULTS
In this section we showcase the impact of schemes designed
to facilitate light-weight, adaptive, and flexible AI across the
three AI pillars highlighted throughout this article. We focus
on finite-memory SISO channels (with 4 taps) and memoryless
4 × 4 multi-user MIMO time-varying channels with binary
phase shift keying (BPSK) and quadrature phase shift keying
(QPSK) symbols, respectively.¹
Architecture: In each channel, we consider a model-based
DNN architecture, as well as a black-box DNN having roughly
three times more parameters. For the SISO channel, with a
finite channel memory of L symbols, we compare ViterbiNet
[7] with a recurrent neural network (RNN)-based symbol
detector with a window size of L, followed by a linear
layer and the softmax function. For the MIMO channel, the
DeepSIC receiver [6] with three iterations is compared to a
fully connected DNN composed of four layers with ReLU
activations followed by the softmax layer.
Data: For each coherence duration, 200 pilot symbols are
available. We compare standard training with training that
leverages data augmentation. For the latter scheme, at each
time step, the pilot data is enriched with 600 artificial symbols
via a constellation-conserving projection and a translation-
preserving transformation [12].
Training: We consider the following training methodolo-
gies:
Joint training: The receiver is trained offline, using 5000
symbols simulated from a multitude of channel realiza-
tions. No additional training is done at run time.
Online training: The receiver is trained initially using 200
symbols, and then it adapts online by utilizing either the
pilot data or the augmented pilot data.
¹The source code used in our experiments is available at
https://github.com/tomerraviv95/facilitating-adaptation-deep-receivers
Online meta-learning: The training algorithm is opti-
mized via meta-learning that utilizes accumulated training
data from previous channel realizations, while adaptation
takes place via few gradient-based updates from the
online meta-learned initialization [13].
Results: Fig. 5a and Fig. 5b depict the average symbol error
rate (SER) as a function of signal-to-noise ratio (SNR). While
standard black-box models suffer from large generalization
gaps due to the limited availability of training data, deep
receivers with model-based architectures, namely ViterbiNet
[7] and DeepSIC [6], demonstrate successful detection perfor-
mance by adapting to the time-varying channel in an online
manner.
The performance is seen to be further improved by opti-
mizing the training algorithm via meta-learning, as well as by
increasing the data size via data augmentation. Overall, these
results indicate that the reviewed methods are complementary,
contributing to the challenges of adapting to time-varying
channels in different ways. This leads to the conclusion that
designing AI models for communications can benefit from a
rethinking of deep learning tools across all three AI pillars.
VII. FUTURE RESEARCH DIRECTIONS
We conclude by identifying some representative directions
for future research.
A. Deciding When to Train
The schemes surveyed thus far are geared towards enabling
efficient online on-device training. In this regard, a key open
question is how to determine when to train online. Periodically
re-training, e.g., at each coherence period, may be excessively
complex, particularly when channel variations are relatively
smooth. Efficient deep receivers would benefit from monitoring
mechanisms that can determine when to adapt the model
and/or when to meta-learn the inductive bias. One possible
way to achieve this goal is via data drift detection, a topic
widely studied in the machine learning literature [22]. While
some basic drift detection mechanisms can be directly applied
to communication systems, advanced mechanisms that lever-
age communication-specific characteristics may require further
development.
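As a baseline, even a simple window-based monitor over post-decoding error indicators can serve as such a trigger; the window length and threshold factor below are illustrative design knobs, and more refined detectors would track the training loss or soft-output statistics instead.

```python
import numpy as np

def should_retrain(recent_errors, baseline_rate, window=50, factor=2.0):
    """Trigger online re-training when the error rate over a sliding
    window exceeds the post-training baseline by a chosen margin."""
    if len(recent_errors) < window:
        return False                     # not enough evidence yet
    return float(np.mean(recent_errors[-window:])) > factor * baseline_rate

# Error indicators per block: quiet period, then a burst after a
# (simulated) abrupt channel change.
errors = [0] * 60 + [1, 0, 1, 1, 0] * 10
print(should_retrain(np.array(errors), baseline_rate=0.05))   # -> True
```

Coupling such a monitor to the training loop confines costly online adaptation to the periods where the channel has actually drifted.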
B. Fitting the Architecture to the Scenario
Deep receivers are often composed of multiple layers,
wherein each element takes part in the computation. Thus,
even for relatively light-weight architectures, full model com-
putations may incur computational overhead exceeding the
limited resources available, particularly for some edge devices.
This problem is typically tackled via pruning methods, which
aim to bridge the complexity-performance gap by removing
redundant parts of the models. While most existing pruning
methods find a single, input-independent, light-weight model,
for wireless communication systems it may be preferable to
adopt input-dependent, adaptive pruning methods [23], that
can adapt complexity to the current requirements.
(a) SISO: SER vs. SNR. (b) MIMO: SER vs. SNR.
Fig. 5: Average SER after transmission of 300 blocks in a time-varying channel as a function of SNR.
C. Hardware-Aware AI
The schemes surveyed in this paper do not make use of any
special characteristics of the hardware available at the host
device, focusing instead on generic improvements based on
limiting the architecture parameterization and/or the number
of training iterations. Larger efficiency gains are to be expected
with AI methods that are aware of the specific hardware at the
wireless receiver, which may encompass different technologies
such as emerging in-memory computing chips.
D. Continual Bayesian Learning
Bayesian learning was introduced in this article as a promis-
ing solution for deep receivers, thanks to the gains enabled
by more reliable AI modules. Another important advantage of Bayesian learning
is its capacity to support continual learning via the update
of the model parameter distribution [24]. The integration of
online adaptation with Bayesian learning may further enhance
the performance of deep receivers.
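To make the continual-learning mechanism concrete, the sketch below maintains a Gaussian posterior over the weights of a linear observation model and conditions it on each new observation, with the current posterior serving as the prior for the next block. This conjugate linear-Gaussian case is a simplified stand-in for the neural-network posteriors of [24]; all dimensions and variances are illustrative.

```python
import numpy as np

class OnlineGaussianPosterior:
    """Recursive Bayesian update of a Gaussian posterior over the weights
    of a linear model y = x^T w + noise (known noise variance). Each call
    to update() turns the posterior into the prior for the next sample,
    which is the essence of continual Bayesian learning."""

    def __init__(self, dim, prior_var=1.0, noise_var=0.1):
        self.mean = np.zeros(dim)
        self.cov = prior_var * np.eye(dim)
        self.noise_var = noise_var

    def update(self, x, y):
        """Condition the posterior on a single observation (x, y)."""
        s = x @ self.cov @ x + self.noise_var    # predictive variance
        gain = self.cov @ x / s                  # Kalman-style gain
        self.mean = self.mean + gain * (y - x @ self.mean)
        self.cov = self.cov - np.outer(gain, x @ self.cov)
        return self.mean
```

The retained covariance quantifies residual uncertainty, which is what enables the reliability benefits discussed above while protecting against forgetting.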
E. Collaborative Learning and Inference
As mentioned, deep receivers are practically constrained
by the hardware available at the host device. Since deep
receivers are likely to be deployed in environments containing
other, similar devices, this limitation may be mitigated via
resource sharing across devices. Such collaboration may entail
the exchange of data and/or model information, and it may
be supported by device-to-device communication capabilities.
This idea is deeply connected to federated learning [25] and
collaborative inference [26].
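The model-sharing form of such collaboration can be sketched as a weighted average of the parameters contributed by neighboring devices, in the spirit of federated learning [25]; the uniform default weighting below is an illustrative assumption (in practice, weights may reflect local data sizes).

```python
import numpy as np

def federated_average(local_models, weights=None):
    """Aggregate parameter vectors shared by neighboring devices into a
    single model via weighted averaging. Uniform weights are the default;
    data-size-proportional weights are a common alternative."""
    n = len(local_models)
    if weights is None:
        weights = [1.0 / n] * n          # uniform weighting across devices
    return sum(w * np.asarray(m) for w, m in zip(weights, local_models))
```

Each device would then continue local adaptation from the aggregated model, amortizing training cost across the network.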
REFERENCES
[1] “6G - the next hyper connected experience for all,” Samsung 6G Vision, 2020.
[2] L. Dai, R. Jiao, F. Adachi, H. V. Poor, and L. Hanzo, “Deep learning for wireless communications: An emerging interdisciplinary paradigm,” IEEE Wireless Commun., vol. 27, no. 4, pp. 133–139, 2020.
[3] W. Tong and G. Y. Li, “Nine challenges in artificial intelligence and wireless communications for 6G,” IEEE Wireless Commun., vol. 29, no. 4, pp. 140–145, 2022.
[4] H. He, C.-K. Wen, S. Jin, and G. Y. Li, “Model-driven deep learning for MIMO detection,” IEEE Trans. Signal Process., vol. 68, pp. 1702–1715, 2020.
[5] M. Khani, M. Alizadeh, J. Hoydis, and P. Fleming, “Adaptive neural signal detection for massive MIMO,” IEEE Trans. Wireless Commun., vol. 19, no. 8, pp. 5635–5648, 2020.
[6] N. Shlezinger, R. Fu, and Y. C. Eldar, “DeepSIC: Deep soft interference cancellation for multiuser MIMO detection,” IEEE Trans. Wireless Commun., vol. 20, no. 2, pp. 1349–1362, 2021.
[7] N. Shlezinger, N. Farsad, Y. C. Eldar, and A. J. Goldsmith, “ViterbiNet: A deep learning based Viterbi algorithm for symbol detection,” IEEE Trans. Wireless Commun., vol. 19, no. 5, pp. 3319–3331, 2020.
[8] M. B. Fischer, S. Dörner, S. Cammerer, T. Shimizu, H. Lu, and S. Ten Brink, “Adaptive neural network-based OFDM receivers,” in Proc. IEEE SPAWC, 2022.
[9] S. Schibisch, S. Cammerer, S. Dörner, J. Hoydis, and S. ten Brink, “Online label recovery for deep learning-based communication through error correcting codes,” in Proc. IEEE ISWCS, 2018.
[10] R. A. Finish, Y. Cohen, T. Raviv, and N. Shlezinger, “Symbol-level online channel tracking for deep receivers,” in Proc. IEEE ICASSP, 2022, pp. 8897–8901.
[11] L. Huang, W. Pan, Y. Zhang, L. Qian, N. Gao, and Y. Wu, “Data augmentation for deep learning-based radio modulation classification,” IEEE Access, vol. 8, pp. 1498–1506, 2019.
[12] T. Raviv and N. Shlezinger, “Data augmentation for deep receivers,” IEEE Trans. Wireless Commun., 2023, early access.
[13] T. Raviv, S. Park, O. Simeone, Y. C. Eldar, and N. Shlezinger, “Online meta-learning for hybrid model-based deep receivers,” IEEE Trans. Wireless Commun., 2023, early access.
[14] M. Goutay, F. A. Aoudia, and J. Hoydis, “Deep hypernetwork-based MIMO detection,” in Proc. IEEE SPAWC, 2020.
[15] L. Chen, S. T. Jose, I. Nikoloska, S. Park, T. Chen, and O. Simeone, “Learning with limited samples: Meta-learning and applications to communication systems,” Foundations and Trends® in Signal Processing, vol. 17, no. 2, pp. 79–208, 2023.
[16] Y. Liu and O. Simeone, “Learning how to transfer from uplink to downlink via hyper-recurrent neural network for FDD massive MIMO,” IEEE Trans. Wireless Commun., vol. 21, no. 10, pp. 7975–7989, 2022.
[17] M. Zecchin, S. Park, O. Simeone, M. Kountouris, and D. Gesbert, “Robust Bayesian learning for reliable wireless AI: Framework and applications,” IEEE Trans. on Cogn. Commun. Netw., 2023, early access.
[18] T. Raviv, S. Park, O. Simeone, and N. Shlezinger, “Modular model-based Bayesian learning for uncertainty-aware and reliable deep MIMO receivers,” in Proc. IEEE ICC, 2023.
[19] N. Shlezinger, Y. C. Eldar, and S. P. Boyd, “Model-based deep learning: On the intersection of deep learning and optimization,” IEEE Access, vol. 10, pp. 115384–115398, 2022.
[20] T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Trans. on Cogn. Commun. Netw., vol. 3, no. 4, pp. 563–575, 2017.
[21] M. Honkala, D. Korpi, and J. M. Huttunen, “DeepRx: Fully convolutional deep learning receiver,” IEEE Trans. Wireless Commun., vol. 20, no. 6, pp. 3925–3940, 2021.
[22] J. Lu, A. Liu, F. Dong, F. Gu, J. Gama, and G. Zhang, “Learning under concept drift: A review,” IEEE Trans. Knowl. Data Eng., vol. 31, no. 12, pp. 2346–2363, 2018.
[23] P. Singh, V. K. Verma, P. Rai, and V. P. Namboodiri, “Play and prune: Adaptive filter pruning for deep model compression,” arXiv preprint arXiv:1905.04446, 2019.
[24] P. G. Chang, K. P. Murphy, and M. Jones, “On diagonal approximations to the extended Kalman filter for online training of Bayesian neural networks,” in Continual Lifelong Learning Workshop at ACML 2022, 2022.
[25] T. Gafni, N. Shlezinger, K. Cohen, Y. C. Eldar, and H. V. Poor, “Federated learning: A signal processing perspective,” IEEE Signal Process. Mag., vol. 39, no. 3, pp. 14–41, 2022.
[26] N. Shlezinger and I. V. Bajić, “Collaborative inference for AI-empowered IoT devices,” IEEE Internet of Things Magazine, vol. 5, no. 4, pp. 92–98, 2022.
Tomer Raviv (tomerravi[email protected]) is currently pursuing his Ph.D.
degree in electrical engineering at Ben-Gurion University.
Sangwoo Park ([email protected]) is currently a research associate at
the Department of Engineering, King’s Communications, Learning and Infor-
mation Processing (KCLIP) lab, King’s College London, United Kingdom.
Osvaldo Simeone (osv[email protected]) is a Professor of Information
Engineering with the Centre for Telecommunications Research at the Depart-
ment of Engineering of King’s College London, where he directs the King’s
Communications, Learning and Information Processing lab.
Yonina C. Eldar ([email protected]) is a Professor in the Depart-
ment of Math and Computer Science, Weizmann Institute of Science, Israel,
where she heads the center for Biomedical Engineering and Signal Processing.
She is a member of the Israel Academy of Sciences and Humanities, an IEEE
Fellow and a EURASIP Fellow.
Nir Shlezinger ([email protected]) is an Assistant Professor in the School of
Electrical and Computer Engineering at Ben-Gurion University, Israel.