This is the natural way to model most sequential data.
We can think of the recurrent net as a layered feed-forward net with shared weights: unroll it over time, train the unrolled net as usual, and enforce the constraint that every copy of a weight stays equal by summing the gradients of all its copies.
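The unrolling idea can be sketched in code. This is a minimal, hypothetical example (a single recurrent unit with tanh activation, squared-error loss on the final state, names like `bptt` are ours, not from the source): the backward pass computes a gradient for each unrolled copy of the shared weights and adds them together, which is exactly the weight constraint.

```python
import numpy as np

def forward(xs, w, u):
    """Forward pass: record the hidden activity at every time step."""
    h = 0.0
    hs = [h]
    for x in xs:
        h = float(np.tanh(w * h + u * x))  # same w, u reused at every step
        hs.append(h)
    return hs

def bptt(xs, target, w, u):
    """Backward pass through the unrolled net. The gradients for all the
    copies of w (and of u) are summed, so the shared weight gets one
    combined update -- the weight constraint on the feed-forward net."""
    hs = forward(xs, w, u)
    delta = hs[-1] - target          # dL/dh at the last step, L = 0.5*(h - target)^2
    dw = du = 0.0
    for t in reversed(range(len(xs))):
        # hs[t+1] = tanh(w*hs[t] + u*xs[t]); slope fixed by the forward pass
        dpre = delta * (1.0 - hs[t + 1] ** 2)
        dw += dpre * hs[t]           # accumulate over all copies of w
        du += dpre * xs[t]           # accumulate over all copies of u
        delta = dpre * w             # propagate the error one step back in time
    return dw, du

xs = [0.5, -0.2, 0.1]
dw, du = bptt(xs, target=0.3, w=0.8, u=0.5)
```

The summed gradient agrees with a numerical derivative of the loss with respect to the single shared weight, which is what licenses training the unrolled net as an ordinary layered net.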
We can also think of this training algorithm in the time domain:
The forward pass determines the slope of the linear function used for backpropagating through each neuron: the backward pass multiplies the incoming error by the derivative of the activation function evaluated at the activity recorded on the forward pass.
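A small sketch of that point, assuming a logistic neuron (our choice of activation for illustration): once the forward pass has produced the output y, backpropagation through the neuron is just multiplication by the fixed slope y * (1 - y).

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

z = 0.4
y = logistic(z)            # forward pass: record the neuron's activity
slope = y * (1.0 - y)      # slope of the local linear function, fixed by y

# Backpropagating through the neuron is then a single multiplication:
upstream_error = 2.0       # hypothetical dL/dy arriving from above
error_wrt_z = upstream_error * slope
```

Because the slope is fixed once the forward pass is done, the backward pass is linear: errors pass through each neuron scaled by a constant.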