Any signal can be represented as a value at each time: x(0) = 1, x(1) = 2, ..., x(100) = 5, and so on. You can picture this as you shouting 1 at time 0, 2 at time 1, and 5 at time 100. Alternatively, we can do the same job with a larger number of people.
Representation using the Dirac delta
--------------------------------------
Let's say that you have one person at your disposal for each time step, roughly 100 people. You ask the first person to shout 1 at time 0, the second person to shout 2 at time 1, and the last person to shout 5 at time 100. At all other times they stay silent. So with these people you can represent the signal x. Together these people form a basis, and each one is a basis function. Mathematically they are delta functions of time (for a sampled signal like this one, unit impulses), i.e. each one is active only at its specified time and is 0 at all other times. The advantage of this representation is that you have fine control over the signal. If you want to modify the value at time = 5, you just inform the guy assigned to time 5.
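Here is a minimal sketch of the shouting scheme above in Python. The five signal values and the helper name `delta` are made up for illustration; it only shows that adding up everybody's shouts gives the signal back, and that changing one value involves one person.

```python
# Sketch: a short signal written as a sum of scaled unit impulses.
# The signal values are invented for this example.

def delta(n, k):
    """Person k's pattern: 1 at their own time k, 0 at every other time."""
    return 1 if n == k else 0

x = [1, 2, 4, 3, 5]          # x(0), x(1), ..., x(4)
N = len(x)

# Person k shouts x[k] * delta(n, k); summing over all people rebuilds x.
rebuilt = [sum(x[k] * delta(n, k) for k in range(N)) for n in range(N)]
print(rebuilt)               # [1, 2, 4, 3, 5]

# Fine control: changing the value at time 3 only involves person 3.
x[3] = 10
rebuilt = [sum(x[k] * delta(n, k) for k in range(N)) for n in range(N)]
print(rebuilt)               # [1, 2, 4, 10, 5]
```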
Introduction to bases
--------------------------
The delta functions are not the only possible basis. You can ask several guys to shout at several times each. They can even shout negative numbers. All you have to ensure is that everything they shout adds up to the value of x at every time. Between them, the guys must be able to produce any value that can occur in x. We call this property "span": the basis spans the space of signals.
Instead of 100 guys, we could have 200, i.e. 2 guys for each time, each shouting half of the original value. However, this is wasteful, since you are paying for extra guys who add nothing new. Strictly speaking, avoiding this waste only requires that no guy can be replaced by a combination of the others (linear independence). We usually ask for more: the bases should be orthogonal, i.e. each guy should have zero correlation with every other guy in the group, because then each guy's loudness can be worked out on its own. With uncorrelated, spanning guys, we can represent any signal.
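To make span and orthogonality concrete, here is a small sketch for a signal with only two samples. The signal, the two patterns `g0` and `g1`, and the helper `dot` are all assumptions chosen for illustration, not something fixed by the text above.

```python
# Sketch: an orthogonal basis other than impulses, for a 2-sample signal.

def dot(a, b):
    """Inner product: the 'correlation' between two shouting patterns."""
    return sum(p * q for p, q in zip(a, b))

x = [1, 2]                    # made-up signal: x(0) = 1, x(1) = 2

# Each guy shouts at both times; the second one uses a negative number.
g0 = [1, 1]
g1 = [1, -1]
print(dot(g0, g1))            # 0, so the two guys are orthogonal

# Because they are orthogonal, each guy's loudness is found on its own
# by projecting x onto his pattern.
c0 = dot(x, g0) / dot(g0, g0)     # 1.5
c1 = dot(x, g1) / dot(g1, g1)     # -0.5

rebuilt = [c0 * a + c1 * b for a, b in zip(g0, g1)]
print(rebuilt)                # [1.0, 2.0]

# The wasteful case: two guys assigned to the same time are fully correlated.
h0 = [1, 0]
h0_copy = [1, 0]
print(dot(h0, h0_copy))       # 1, not orthogonal
```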
Fourier transform
--------------------------
In the case of the Fourier transform, each guy shouts according to a sinusoidal wave. Let's say a sine wave, i.e. at time t guy 1 shouts the value of sin(f0 t), the second guy shouts the value of sin(f1 t), and so on. Here f0, f1, etc. are the frequencies assigned to each guy. It turns out that, for suitably chosen frequencies, these guys are orthogonal to each other. To span all signals, sines alone are not quite enough: each frequency also needs a cosine partner (or, equivalently, a complex exponential that bundles the two). With that, we have the Fourier transform. Instead of representing a signal as a value at each time step, we can represent it as a value at each frequency.
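The same idea can be sketched for a sampled signal, which gives the discrete Fourier transform. This sketch assumes complex sinusoids (cosine plus j times sine) as the guys, with frequencies chosen as whole numbers of cycles over the signal length. The signal values and the helper names `basis` and `dot` are made up for illustration.

```python
import cmath

def basis(k, N):
    """Guy k: a complex sinusoid completing k cycles over N samples."""
    return [cmath.exp(2j * cmath.pi * k * n / N) for n in range(N)]

def dot(a, b):
    """Inner product for complex sequences (second argument conjugated)."""
    return sum(p * q.conjugate() for p, q in zip(a, b))

x = [1, 2, 4, 3]              # made-up signal
N = len(x)

# Guys with different frequencies are orthogonal (up to rounding error).
print(abs(dot(basis(1, N), basis(2, N))) < 1e-9)    # True

# One coefficient per frequency: project x onto each guy.
X = [dot(x, basis(k, N)) / N for k in range(N)]

# Adding up every guy's scaled wave gives the signal back.
rebuilt = [sum(X[k] * basis(k, N)[n] for k in range(N)) for n in range(N)]
print([round(v.real, 6) for v in rebuilt])          # [1.0, 2.0, 4.0, 3.0]
```

The list `X` is the "value at each frequency" view of the same signal; real code would normally use an FFT routine instead of this direct sum.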
Why the Fourier transform
-------------------------
We have seen that as long as the bases span and are orthogonal, they define a transformation. But why is the Fourier transform so famous? The answer comes from the systems we use. The most common systems we deal with are LTI (linear time-invariant) systems. A key property of such a system is how it treats sinusoidal waves: if a sinusoid of frequency f is passed through an LTI system, the output is a sinusoid of the same frequency, and all the system can do is scale its amplitude and shift its phase (for a complex exponential, that is just multiplication by a complex scalar). Any other waveform is affected in a more complicated way. Hence, if we represent signals as sums of sinusoids, we can represent the system as just an amplifier at each frequency. This turns system analysis into a set of simple algebraic equations, one per frequency, which we are good at solving. So we love the Fourier transform.
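The "just a scalar" claim can be checked numerically. The sketch below uses a made-up LTI system, y[n] = x[n] + 0.5 x[n-1], with wrap-around at the ends so that edge effects do not get in the way. The system, the length 8, and the names `lti`, `wave`, and `ramp` are assumptions for illustration only.

```python
import cmath

def lti(x):
    """A made-up LTI system: y[n] = x[n] + 0.5 * x[n-1], wrapping at the ends."""
    N = len(x)
    return [x[n] + 0.5 * x[(n - 1) % N] for n in range(N)]

N, k = 8, 1

# Input 1: a complex sinusoid completing k cycles over N samples.
wave = [cmath.exp(2j * cmath.pi * k * n / N) for n in range(N)]
out = lti(wave)

# Output divided by input is the same complex number at every time.
ratios = [o / w for o, w in zip(out, wave)]
H = ratios[0]
print(all(abs(r - H) < 1e-9 for r in ratios))       # True
print(abs(H), cmath.phase(H))                       # gain and phase shift

# Input 2: a ramp, which is not a sinusoid. The ratio now changes with time.
ramp = [float(n + 1) for n in range(N)]
ramp_ratios = [o / r for o, r in zip(lti(ramp), ramp)]
print(ramp_ratios[0], ramp_ratios[1])               # 5.0 1.25
```

For the sinusoid, the single number `H` fully describes what the system did, which is exactly the "amplifier at each frequency" picture.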