Cancellation of Impulse Noise in VDSL systems

VDSL and VDSL2 are digital subscriber line (DSL) technologies that provide high bit rates, and thus fast data transmission, over wired telephone lines. They use twisted pairs of copper wires or coaxial cables for transmission. The greatest impediment on DSL lines is crosstalk from other DSL lines in the same cable binder, but this crosstalk can be cancelled using vectoring, which jointly processes the signals of all users. The other major impairment on these wired lines is noise. Impulse noise is generally a high-power, intermittent noise that couples electromagnetically into the cable binder; common sources at the customer premises equipment (CPE) side are PLC modems and household appliances such as washing machines and treadmills. Impulse noise can be classified into two types: Repetitive Electrical Impulse Noise (REIN) and Prolonged Electrical Impulse Noise (PEIN). Reducing the disturbance caused by such noise is necessary for increasing the throughput of VDSL systems.

A twisted pair has two transmission modes: the common mode (CM) and the differential mode (DM). VDSL uses the differential mode for data transmission because it is robust to electromagnetically (EM) coupled noise. EM-coupled noise such as crosstalk and impulse noise couples onto the twisted pair as a common-mode signal, and a fraction of it leaks into the differential mode due to imbalances in the twisted pair, hampering transmission of the useful data signal. Because of this leakage between the common and differential modes, the signals observed at the CM and DM sensors are correlated. If the correlation can be estimated, the CM signal can be used to cancel the impulse noise in the DM.

The biggest challenge in impulse noise cancellation is that the correlation, or coupling function, has to be estimated in the presence of a high-power data signal during showtime, because the coupling depends on the noise source. The estimation of the CM-DM coupling therefore needs a large number of DMT symbols to average out the stronger data signal. In the case of PEIN and SHINE (single high impulse noise event) noise, which is transient in nature, the cancellation must converge faster, i.e. within a few symbols.
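The estimation problem described above can be sketched on a single tone. This is a toy illustration, not the method developed in this work: it assumes a scalar coupling `h_true`, unit-power QPSK data, and a Gaussian CM reference, all of which are hypothetical choices. The coupling is estimated by plain least squares, with the data signal acting as noise on the estimate.

```python
import numpy as np

def estimate_coupling(dm, cm):
    """Least-squares estimate of h in dm = data + h * cm (data acts as noise)."""
    return np.vdot(cm, dm) / np.vdot(cm, cm)

rng = np.random.default_rng(0)
num_symbols = 2000
h_true = 0.3 + 0.2j  # hypothetical CM-to-DM coupling on one tone

# Unit-power QPSK data; the leaked noise (power |h|^2 = 0.13) is weaker than it.
data = (rng.choice([-1.0, 1.0], num_symbols)
        + 1j * rng.choice([-1.0, 1.0], num_symbols)) / np.sqrt(2)
cm = (rng.standard_normal(num_symbols)
      + 1j * rng.standard_normal(num_symbols)) / np.sqrt(2)
dm = data + h_true * cm  # differential-mode observation

# Estimation error when only the first n symbols are available.
errors = {n: abs(estimate_coupling(dm[:n], cm[:n]) - h_true)
          for n in (5, 50, 2000)}

# Cancel the impulse noise using the estimate from all symbols.
h_hat = estimate_coupling(dm, cm)
cleaned = dm - h_hat * cm
noise_power_before = np.mean(np.abs(dm - data) ** 2)
noise_power_after = np.mean(np.abs(cleaned - data) ** 2)
print(errors, noise_power_before, noise_power_after)
```

In this sketch the error of the estimate typically shrinks only as more symbols are averaged, which is why a plain averaging estimator is a poor fit for noise that lasts just a few symbols.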

While several approaches attempt to cancel impulse noise using a CM sensor, they converge slowly in the presence of the data signal and rely on the repetitive nature of the noise. They are therefore not suitable for cancelling transient noise. To overcome this shortcoming, we address the problem in a decision-directed way, which can give faster convergence to near-optimal cancellation of impulse noise and is useful for transient as well as repetitive noise.

The power minimization challenge in G.fast

G.fast is the latest buzzword in broadband, promising rates as high as several hundred Mbps to customers without the need to run optical fiber to their households. This is made possible by a hybrid copper-fiber access technology: fiber from the exchange terminates in a Distribution Point Unit (DPU) a few hundred meters from the customers’ premises, while the remaining distance is covered by existing copper telephone wires. A DPU serves multiple users, with each user connected via a dedicated copper line.

Figure: The basic network architecture of G.fast

One of the key features of G.fast is the ability to power the DPU from the customer premises, known as reverse powering. This eliminates the need for a stand-alone power supply or for extracting power from the exchange. There’s just one caveat: not much power can be tapped from the subscribers, so equipment at the DPU must operate at low power levels.

To make the DPU nodes energy efficient, G.fast makes the following provisions with regard to transmission over the copper lines:

  • It uses Time Division Duplexing (TDD), whereby data transmission is organized in frames, each frame comprising a few time slots allocated to upstream transmission and a few to downstream transmission, all using the same frequency band. Compared to the Frequency Division Duplexing used in G.fast’s predecessor VDSL, TDD incurs lower analog front-end power consumption.
  • The G.fast standards define Discontinuous Operation, whereby lines may stop transmitting or receiving for certain time slots within a frame. Disabling a line for a time slot considerably lowers its power consumption for that slot.
Figure: Discontinuous operation in G.fast for one TDD frame; each line transmits for some time slots (active) and remains disabled for the rest

Given these provisions, minimizing the power consumption at the DPU appears to be a trivial task. The power consumption of a line scales linearly with the number of time slots for which the line is active. For each line, one could estimate the minimum number of slots in a frame for which it needs to remain active in order to satisfy its data rate requirement for the frame, and disable it for the remaining time slots. That would correspond to the minimum power consumption per line, seemingly solving the problem. But the practical scenario is far from simple, and vectoring is what complicates the situation.
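The seemingly trivial per-line calculation can be written down in a few lines. This is a minimal sketch under the naive assumption that each line carries a fixed number of bits per active slot regardless of what the other lines do; the frame length, per-slot capacity, demands, and power unit are all hypothetical numbers chosen for illustration.

```python
def min_active_slots(bits_needed, bits_per_slot, slots_per_frame):
    """Smallest number of active slots that carries bits_needed in one frame."""
    slots = -(-bits_needed // bits_per_slot)  # integer ceiling division
    if slots > slots_per_frame:
        raise ValueError("demand cannot be met within one frame")
    return slots

SLOTS_PER_FRAME = 8          # hypothetical frame length in slots
BITS_PER_SLOT = 5000         # hypothetical, assumed independent of other lines
POWER_PER_ACTIVE_SLOT = 1.0  # arbitrary units; power scales linearly with slots

demands = {"line_a": 12000, "line_b": 30000, "line_c": 4000}
active = {line: min_active_slots(bits, BITS_PER_SLOT, SLOTS_PER_FRAME)
          for line, bits in demands.items()}
power = {line: n * POWER_PER_ACTIVE_SLOT for line, n in active.items()}
print(active)  # {'line_a': 3, 'line_b': 6, 'line_c': 1}
```

The fixed `BITS_PER_SLOT` is exactly the assumption that breaks down in practice, since the rate of a line in a slot depends on crosstalk from whichever other lines are active.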

Because multiple lines within a binder operate in the same frequency band, electromagnetic coupling occurs among the lines, causing interference known as crosstalk, which is a major cause of performance degradation. The method for crosstalk cancellation for downstream data transmission in VDSL is known as vectoring or precoding, a concept somewhat similar to that used in noise-cancelling headphones. Prior to transmission, pre-distortion is added to the signals such that it cancels the crosstalk introduced in the binder, allowing each line to operate over a crosstalk-free channel.
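The pre-distortion idea can be illustrated with one common linear choice, a diagonalizing (zero-forcing) precoder on a single tone. This is only a sketch: the 3-line channel matrix below is made up, receiver noise and transmit power constraints are ignored, and real systems need not use this particular precoder.

```python
import numpy as np

def diagonalizing_precoder(channel):
    """Return P such that channel @ P keeps only the direct (diagonal) paths."""
    direct = np.diag(np.diag(channel))
    return np.linalg.inv(channel) @ direct

# Hypothetical channel on one tone: diagonal = direct paths, off-diagonal = crosstalk.
channel = np.array([[1.00, 0.05, 0.02],
                    [0.04, 0.90, 0.03],
                    [0.01, 0.06, 0.80]])
symbols = np.array([1.0, -1.0, 1.0])

received_plain = channel @ symbols  # crosstalk from the other lines is present
received_precoded = channel @ diagonalizing_precoder(channel) @ symbols
crosstalk_free = np.diag(channel) * symbols  # what each line would see alone
print(received_plain, received_precoded, crosstalk_free)
```

With the precoder in place each receiver sees only its own symbol scaled by its direct path, which is the crosstalk-free channel mentioned above.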

Integrating vectoring with discontinuous operation, however, is not an easy task. Because the set of active lines may vary with every time slot, deciding which lines to precode becomes crucial. A simple choice would be to precode all lines, active as well as disabled. This would cancel crosstalk among active lines, but would also lead to idle symbols being transmitted on the disabled lines, defeating the very purpose of discontinuous operation. Another option would be to precode only the active users in each time slot, but this would require changing the precoder in every time slot as the set of active users changes, which is not a feasible option either. This brings us to the following question:
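Both drawbacks can be seen in a small numerical sketch. It assumes a diagonalizing (zero-forcing) precoder and a made-up 3-line channel on one tone, with the third line disabled; neither is taken from a real deployment.

```python
import numpy as np

def diagonalizing_precoder(channel):
    """Return P such that channel @ P keeps only the direct (diagonal) paths."""
    return np.linalg.inv(channel) @ np.diag(np.diag(channel))

# Hypothetical channel on one tone; line index 2 is disabled in this slot.
channel = np.array([[1.00, 0.05, 0.02],
                    [0.04, 0.90, 0.03],
                    [0.01, 0.06, 0.80]])
full_precoder = diagonalizing_precoder(channel)

# Option 1: precode all lines. The disabled line carries no data (0.0),
# yet the precoder still asks it to transmit a compensation signal.
symbols = np.array([1.0, 1.0, 0.0])
transmitted = full_precoder @ symbols
signal_on_disabled_line = transmitted[2]

# Option 2: precode only the active lines. The precoder must be recomputed
# for this active set; it is not just a block of the full precoder.
active = [0, 1]
sub_channel = channel[np.ix_(active, active)]
sub_precoder = diagonalizing_precoder(sub_channel)
reused_block = full_precoder[np.ix_(active, active)]
print(signal_on_disabled_line)
print(sub_precoder - reused_block)
```

The nonzero output on the disabled line corresponds to the idle symbols mentioned above, and the mismatch between `sub_precoder` and `reused_block` is why the second option implies a fresh precoder whenever the active set changes.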

How do we incorporate vectoring into the discontinuous operation of G.fast in a computationally efficient manner, while maintaining energy efficiency?

While there have been studies proposing novel vectoring methods to tackle the power crunch, we have tried to answer a more fundamental question: How do we harness the very nature of discontinuous operation to minimize the power consumption of a set of lines over a frame, independent of vectoring? This must be done while conforming to the data rate requirement of each line, where the amount of data transmitted by an active line in a time slot depends on which other users are active in that slot. All we need, then, is a smart arrangement of the transmissions of the various lines over a frame such that power is minimized while each line meets its target data rate.
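To make the slot-assignment problem concrete, here is a toy brute-force search. It is not the method developed in this work: the rate model (each additional active line costs a fixed fraction of a line's solo rate), the rates, and the demands are hypothetical, and exhaustive search is only practical for tiny instances like this one.

```python
from itertools import product

NUM_LINES = 3
NUM_SLOTS = 4
SOLO_RATE = [10.0, 8.0, 6.0]  # hypothetical data per slot when a line is alone
DEMAND = [24.0, 8.0, 5.0]     # hypothetical data each line needs per frame
LOSS_PER_OTHER_LINE = 0.25    # hypothetical rate loss per other active line

def slot_rate(line, active):
    """Data carried by `line` in a slot where the lines in `active` transmit."""
    return SOLO_RATE[line] * (1 - LOSS_PER_OTHER_LINE * (len(active) - 1))

def best_schedule():
    """Exhaustively find the schedule with the fewest active line-slots."""
    subsets = [tuple(i for i in range(NUM_LINES) if (mask >> i) & 1)
               for mask in range(2 ** NUM_LINES)]
    best = None
    for schedule in product(subsets, repeat=NUM_SLOTS):
        delivered = [0.0] * NUM_LINES
        for active in schedule:
            for line in active:
                delivered[line] += slot_rate(line, active)
        if all(got >= need for got, need in zip(delivered, DEMAND)):
            cost = sum(len(active) for active in schedule)
            if best is None or cost < best[0]:
                best = (cost, schedule)
    return best

cost, schedule = best_schedule()
print(cost, schedule)
```

Taken alone, the three lines would need 3, 1 and 1 slots, but those five line-slots cannot all be solo slots in a four-slot frame, so some overlap is forced and the cheapest feasible schedule in this toy model uses six active line-slots.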

When performing such a slot assignment, the operator has the liberty to choose any type of precoder for crosstalk cancellation. An efficient assignment in fact eliminates the need for complex precoding techniques. With vectoring made easier to implement, the power bottleneck at the DPU can thus be overcome, while still delivering high data rates to customers.

 

Single Channel Audio Source Separation

Audio signals like speech and music form an integral part of our everyday life. But it is very common to come across a mixture of signals from various sources, such as during a conversation in which several people are talking at once, or when different instruments are playing simultaneously. While following one speaker or instrument in such scenarios is easy for human beings, it is rather difficult for computer audition. This is where source separation steps in. A common example application is the cocktail party problem: in a typical cocktail party, a number of people are talking and some music is playing, all at the same time, and following one particular speaker requires source separation. Another example is the karaoke problem. In karaoke we need only the music of a song, so the vocals must be separated from the music. Similarly, a speech recognition system tries to find the content of a speech signal, but if two people are talking together, it is difficult to pick out the speech of a single person. Source separation is therefore required as a pre-processing step for such systems.

While we want to separate the source signals, the number of microphones that recorded the mixture can be anything. More recordings, i.e. more observations, mean more information about the sources. The fewer the observations, the less information we have about the sources. In the extreme case, we have only one observation. This is single channel source separation, in which we want to separate the sources from a single observation. Mathematically, given a mixed signal

y(t) = s_1(t) + s_2(t) + ... + s_L(t)

the signals s_l(t), l = 1, 2, ..., L, are to be estimated. That is, L unknowns are to be found from this one equation, which makes the problem underdetermined, with infinitely many solutions. To recover the exact sources from this infinite set of solutions, one needs some prior information about the sources.

Example: Let’s say the mixed signal contains two sources, i.e. y(t) = s_1(t) + s_2(t). If the bases to which these sources belong were known and orthogonal, as shown in Figure 1, then a simple projection onto the bases would give perfect separation.

Figure 1 (orthogonal bases)

The problem with audio sources is that they don’t lie in such orthogonal spaces and so can’t be separated so easily and perfectly. Hence, single channel source separation is about finding models, structures, or bases for the underlying sources that have a ‘discriminative’ property for good separation, i.e. the models are able to reconstruct the sources well enough and also help in separating the sources from each other.
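The difference between the orthogonal case and the audio case can be sketched with two-dimensional vectors standing in for signals. This is a toy illustration with made-up bases and amplitudes: projecting the mixture onto a source's basis recovers that source exactly when the bases are orthogonal, and picks up leakage from the other source when they are not.

```python
import numpy as np

def project(signal, basis_vector):
    """Orthogonal projection of `signal` onto the line spanned by `basis_vector`."""
    unit = basis_vector / np.linalg.norm(basis_vector)
    return np.dot(signal, unit) * unit

# Orthogonal bases: projection separates the mixture perfectly.
b1 = np.array([1.0, 0.0])
b2 = np.array([0.0, 1.0])
s1 = 2.0 * b1
s2 = -3.0 * b2
mixture = s1 + s2
s1_estimate = project(mixture, b1)

# Non-orthogonal bases: the second source leaks into the first estimate.
c2 = np.array([1.0, 1.0]) / np.sqrt(2)
t2 = -3.0 * c2
mixture_overlap = s1 + t2
s1_leaky = project(mixture_overlap, b1)

print(s1_estimate, s1_leaky)
```

In the second case the estimate of the first source is contaminated by the component of the second source that lies along `b1`, which is the kind of overlap a discriminative model tries to reduce.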

Discriminative training of source models is thus important for single channel source separation. While many model-based methods for source separation have been proposed in the past, these approaches overlook a fundamental question: What are the right parameters of the models to be used for source separation? In addition, source separation algorithms are burdened with the task of separating all the sources at the same time, whereas retrieving one source at a time is more helpful for the quality of the separated sources.

We have developed a discriminative framework for single channel audio source separation that searches for the right parameters (dimension/sparsity) of the models so that they give a viable separation. This framework also retrieves one source at a time from the mixture instead of separating all the sources at once.