Fundamentals of Image Processing demystified! Contd...

Hey again! 
Starting from where I left...

Transform: What is a transform? Yes, you are right, it changes a system/signal from one domain to another, but that is actually the transformation process. A transform is simply a linear operator. Linear here means superposition holds: the transform of a sum of signals is the sum of their transforms, and scaling the input scales the output by the same factor. On top of that, the transforms we use are invertible, so the signal itself does not change when you view it in the other domain; only its representation does. It's like going from one room to another: the room is different but I'm the same. Why is a transformation necessary? Because it separates out the frequency components, which helps us understand the signal better. If a transformation exists then its inverse should also exist, i.e. if I go from my room to another room, I should also be able to come back to my room, being the same person throughout. Let me write down some transformation equations:
What do you understand by these three equations? Let me explain it to you in a dramatic way :P I'm a Mumbaikar, and Mumbai is my main domain, where I know how to manage everything independently. This is what f(t) is in the time domain. If I go to Bangalore by flight/train or any such means, this is a new place to me and I know nothing here, so I'm dependent on someone. If this someone is of a higher authority, I stand no chance of enjoying my freedom and have to adjust myself accordingly. To draw the analogy: the flight/train is the transform used, Bangalore is the transformed domain and that someone of higher authority is the exponential function. As I lose my freedom, the function in the same way loses its time characteristics against the huge exponential signal (the rhs of the equation has no time functions). Now why choose an exponential signal? Because it is composed of cos and sin signals, which are known signals, just as the authority who will be my point of contact in Bangalore would be someone known to me. This gives us something to think about... is losing time going to be good? Just go through the following scribble:
We observe three different signals, but taking their Fourier Transform gives us exactly the same frequency composition. This happens simply because in the frequency domain you don't get any feel of time. But in the time domain you don't get any feel of frequency either. So we are in a fix: we cannot figure out which was the original signal from its frequency composition alone.

Coming back to the exponential signal, or that someone of higher authority: what if that someone now has less authority than you? In that case you can still enjoy your freedom. In this context, by making the time span of the exponential signal finite, the time characteristics can still be retained. This is the STFT, the Short-Time Fourier Transform. So there is only a minor difference between the STFT and the FT. In the STFT, the signal is divided into segments small enough that each segment (portion) of the signal can be assumed to be stationary. For this purpose, a window function "w" is chosen. The width of this window must be equal to the segment of the signal over which stationarity is valid. In the equations, the only change is that the signal is multiplied by the window w(t - t0), centred at time t0, before the exponential is applied, i.e. the time span is made finite.
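To see the "no feel of time" problem in numbers, here is a minimal sketch. It is not the scribble from the post: the two test signals, the segment length of 32 samples and the helper names (`dft`, `tone`, `dominant_bin`, `stft_peaks`) are all assumptions made for illustration, and the "window" is the simplest possible one, a rectangular cut. A toolbox FFT would do the same job much faster.

```python
import cmath
import math


def dft(x):
    """Plain O(N^2) discrete Fourier transform of a list of samples."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]


def tone(cycles, length):
    """A sine that completes `cycles` whole cycles in `length` samples."""
    return [math.sin(2 * math.pi * cycles * n / length) for n in range(length)]


def dominant_bin(segment):
    """Index of the strongest DFT bin in the lower half of the spectrum."""
    magnitudes = [abs(v) for v in dft(segment)[: len(segment) // 2 + 1]]
    return magnitudes.index(max(magnitudes))


SEGMENT = 32  # hypothetical segment (window) length in samples

# The same two tones, but in opposite order in time.
slow_then_fast = tone(2, SEGMENT) + tone(6, SEGMENT)
fast_then_slow = tone(6, SEGMENT) + tone(2, SEGMENT)

# Whole-signal FT: compare the magnitude spectra of the two signals.
spectrum_a = [abs(v) for v in dft(slow_then_fast)]
spectrum_b = [abs(v) for v in dft(fast_then_slow)]
largest_gap = max(abs(a - b) for a, b in zip(spectrum_a, spectrum_b))


def stft_peaks(signal):
    """Rectangular-window STFT: strongest bin of each consecutive segment."""
    return [dominant_bin(signal[i:i + SEGMENT])
            for i in range(0, len(signal), SEGMENT)]


print("largest difference between magnitude spectra:", largest_gap)
print("peak bin per segment:", stft_peaks(slow_then_fast), stft_peaks(fast_then_slow))
```

The whole-signal magnitude spectra agree to within rounding error, so they cannot tell the two signals apart, while the per-segment peaks come out in a different order for each signal.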

Let's observe one more interesting thing here. We know if we use a window of infinite length, we get the FT, which gives perfect frequency resolution, but no time information. So, 
Wide window ===>good frequency resolution, poor time resolution. 
Narrow window ===>good time resolution, poor frequency resolution.
Let us see that for ourselves below. Here we have taken a non-stationary signal with four different frequency components at different times. 

Now, let's look at its STFT and we find that these four peaks are located at different time intervals along the time axis.

The following figure shows four window functions.  I will now show the STFT of the same signal given above computed with these windows. 

First let's look at the first, narrowest window. We expect the STFT to have very good time resolution but relatively poor frequency resolution, and we note that the four peaks are well separated from each other in time.

Now let's take a wider window:

Even wider:

Note that the peaks are no longer well separated from each other in time; however, the frequency resolution is much better. Another thing we can infer is that low-frequency signals are better resolved when the window function is wider, since we then get more frequency resolution and less time resolution, while high-frequency signals are better resolved when the window function is narrower, since a high-frequency, i.e. fast-changing, signal needs better time resolution.
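The trade-off can also be put in rough numbers. The sketch below assumes a rectangular window and a hypothetical sampling rate of 1000 Hz: a window that lasts T seconds cannot place an event in time more finely than T, and its DFT bins are spaced 1/T Hz apart. Other window shapes change the constants but not the trade-off itself. The function name `stft_resolution` is made up for this example.

```python
def stft_resolution(sample_rate_hz, window_samples):
    """Return (time resolution in s, frequency resolution in Hz) of a window."""
    time_resolution = window_samples / sample_rate_hz
    frequency_resolution = sample_rate_hz / window_samples
    return time_resolution, frequency_resolution


SAMPLE_RATE_HZ = 1000  # hypothetical sampling rate

for window_samples in (10, 100, 1000):
    dt, df = stft_resolution(SAMPLE_RATE_HZ, window_samples)
    print(f"{window_samples:5d} samples: {dt:.3f} s in time, {df:.1f} Hz in frequency")
```

Whatever window length you pick, the product of the two numbers stays the same, so improving one always costs you the other.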


These examples should have illustrated the implicit problem of resolution of the STFT. Anyone who would like to use STFT is faced with this problem of resolution. What kind of a window to use? 

To make the problem of resolution clear once again: one cannot know the exact time-frequency representation of a signal, i.e., one cannot know what spectral components exist at what instants of time. What one can know are the time intervals in which certain bands of frequencies exist, which is a resolution problem. The Wavelet transform (WT) solves the dilemma of resolution to a certain extent. I'm not going into the details of the wavelet transform in this blog, though I have already given an idea of it towards the end of my first blog on Neural Networks.

Digital Filters: The last topic of this discussion. Designing the right filter is very important for any application, and we all realise this. This is again a very huge topic but I'll just summarise it in the following charts. 

It is very good to know the various filter design methods in depth (I haven't explained the methods here, just mentioned them above), because only that will allow us to grasp or visualise a problem and arrive at the right solution. That said, all these methods are available as ready-made functions in any toolbox, so you don't need to sit and implement these algorithms yourself; all you must know is when and where to use which method. So reading about the filter design frameworks and the different methods for both FIR and IIR filters in any standard DSP textbook would be helpful now.
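Just to make the FIR/IIR distinction concrete, here is a minimal sketch of the simplest low-pass filter of each kind. These two are illustrative choices of mine and not the design methods from the charts; the function names and the zero initial conditions are assumptions. The FIR filter uses only past inputs, while the IIR filter feeds its own previous output back in.

```python
def moving_average_fir(x, taps):
    """FIR low-pass: each output is the mean of the last `taps` inputs.

    Samples before the start of the signal are taken as zero.
    """
    padded = [0.0] * (taps - 1) + list(x)
    return [sum(padded[n:n + taps]) / taps for n in range(len(x))]


def one_pole_iir(x, alpha):
    """IIR low-pass: y[n] = alpha * x[n] + (1 - alpha) * y[n - 1], with y[-1] = 0."""
    y = []
    previous = 0.0
    for sample in x:
        previous = alpha * sample + (1 - alpha) * previous
        y.append(previous)
    return y


step = [1, 1, 1, 1]
print(moving_average_fir(step, taps=2))
print(one_pole_iir(step, alpha=0.5))
```

On a step input the FIR output settles after `taps` samples, while the IIR output keeps creeping towards 1 without ever using more than one stored value.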

With this I wrap up the topic and I hope you were able to follow me throughout this blog and the previous blog. :)
  
  
  


Fundamentals of Image Processing demystified!

Image Processing does sound so cool, doesn't it!? But are your signal processing fundamentals strong enough? After all, an image is also a signal, just a 2D one. Image processing can be implemented on various platforms like MATLAB, Python etc., but your expertise doesn't lie in the platform; what matters is your understanding of the subject and the clarity of your fundamentals, which is what lets a very simple algorithm be built for any complex problem. So here I'm not really concerned with explaining what image processing is, but I'll make sure I discuss certain concepts that will make learning image processing easy.
Certain key terms that you should be thorough with are:

  • Frequency
  • Signal
  • System
  • Convolution
  • Correlation
  • Transform
  • Digital Filters
Ask yourself what you understand by all the above terms. Just keep your answer in mind and read through the following explanation to form a better understanding.


Frequency: You are listening to PM Narendra Modi giving a speech and, on another channel, a cricket commentary. It goes without saying that you may find the speech easier to comprehend, with every word distinctly heard, while the commentary is a little more difficult to follow. What makes these two sounds different is how fast they change, that is, their frequencies. Hence, frequency is nothing but the rate of change of a signal. A low-frequency signal is a slow-changing signal and, in many applications, it carries most of the information of interest, while a high-frequency signal is fast changing, can be a little irritating to our ears at times, and is therefore often treated as noise. It is the same reason babies fall asleep to a lullaby rather than a rap song.
A slow-changing signal is thus of interest to us, but can you think of any problem with it? Yes: being slow changing it carries very little energy and so would die out soon without travelling a great distance. This can be tackled either by amplifying the signal or by modulation, for example frequency modulation. Modulation is nothing but letting the low-frequency signal ride on a high-frequency carrier signal that can take it a greater distance. It's like your lunch box: you carry the box but eat the food inside it, not the box, right :P
So, frequency plays an important role because looking at the frequency you can understand if that signal is of interest to you or not.
To give an idea of one application, suppose there is an image in which a tumor cell needs to be detected. Throughout most of the image you will find uniformity, or neighboring pixel values changing by a small amount (slow changing), but where a tumor cell is encountered, because it is captured differently from the normal cells around it, there is a sudden change in pixel value, creating a high-frequency region. Hence the algorithm narrows down to just finding the high-frequency regions.
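A toy version of that idea, on a single row of pixels, could look like the sketch below. The pixel values, the threshold of 20 and the function name are all made up for illustration; a real detector would work on the 2D image and be far more careful, but the core step is the same: look for places where neighboring values jump.

```python
def high_frequency_positions(row, threshold):
    """Indices i where the jump between pixel i and pixel i + 1 exceeds threshold."""
    return [i for i in range(len(row) - 1)
            if abs(row[i + 1] - row[i]) > threshold]


# Hypothetical row: mostly uniform tissue with a bright patch in the middle.
row = [10, 11, 10, 12, 80, 82, 81, 12, 11]
print(high_frequency_positions(row, threshold=20))  # prints [3, 6]
```

The two indices mark where the row enters and leaves the bright patch, i.e. the edges of the high-frequency region.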

Signal: Everyone knows what a signal is. Going by the technical definition, it is just a function of one or more independent variables; time is one such independent variable. It is important to know the characteristics of a signal in order to process it, and it has different characteristics in the time and frequency domains. It is also important to know what type of signal it is, for example whether it is causal, and whether the system it passes through is LTI. Taking me as an example: on my blog site I talk about technical stuff, but if you find me on Instagram or Twitter you will find me talking about various other things like travel photography, social causes etc. So, taking this blog site as one domain, you will get only a little idea about me, but if you change the domain you can find more of my characteristics. The same goes for any signal: you have to deal with both the time and frequency domains to deduce more characteristics and understand the signal better for processing.

System: Anything you see is a system, be it a mobile phone, a human body or any device. It is not at all important what happens inside the system; what is important is to have an understanding of the system. By understanding I mean you should know what the system does when you give it a certain input. You attend a call by pressing green and reject it by pressing red; that is understanding, and what happens inside is of no concern to us. It is also very necessary to know for what inputs the system will become unstable. Essentially there are three things you need to care about in any system: the dynamics of the system, i.e. the way the system changes with time; the transfer function of the system, which gives the relation between input and output; and finally the stability of the system.
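As a small illustration of stability, take a hypothetical first-order system y[n] = a * y[n - 1] + x[n] (my choice of example, not one from the post). Treated as a black box, you give it a single unit impulse and watch the output: for |a| < 1 the response dies out, for |a| > 1 it blows up.

```python
def impulse_response(a, length):
    """Response of y[n] = a * y[n - 1] + x[n] to a unit impulse at n = 0."""
    y = []
    previous = 0.0
    for n in range(length):
        x_n = 1.0 if n == 0 else 0.0
        previous = a * previous + x_n
        y.append(previous)
    return y


def is_stable(a):
    """This first-order system is stable when its feedback gain is below 1."""
    return abs(a) < 1


print(impulse_response(0.5, 4), is_stable(0.5))
print(impulse_response(2.0, 4), is_stable(2.0))
```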

All the signal processing is done on the computer. A real-life signal, being analog in nature, first needs to be converted to digital before it can be given as input to the computer. Knowing how analog-to-digital conversion and vice versa work is thus essential. Any analog signal is first sampled, i.e. discretised in time. At this step it is important to follow the sampling theorem (fs > 2fm, where fs is the sampling frequency and fm is the highest frequency present in the signal), because if the sampling wasn't done properly, then when you do the reverse to obtain analog from digital, the information will be lost or will overlap; this is called the aliasing effect. After sampling comes quantisation, where selecting the number of quantisation levels is important; it is given by 2^n, where n is the number of bits.
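Aliasing is easy to reproduce. In the sketch below (the frequencies and helper names are assumptions for illustration), a 7 Hz sine is sampled at only 10 Hz, which violates fs > 2fm. Its samples turn out to be exactly those of an inverted 3 Hz sine, so after sampling the two can no longer be told apart. The last helper is just the 2^n rule for quantisation levels.

```python
import math


def sample_sine(frequency_hz, sample_rate_hz, count):
    """First `count` samples of a unit sine taken at `sample_rate_hz`."""
    return [math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
            for n in range(count)]


def apparent_frequency(frequency_hz, sample_rate_hz):
    """Frequency seen after sampling, folded into 0 .. sample_rate_hz / 2."""
    return abs(frequency_hz - sample_rate_hz * round(frequency_hz / sample_rate_hz))


def quantisation_levels(bits):
    """Number of levels an n-bit quantiser can represent."""
    return 2 ** bits


too_fast = sample_sine(7, 10, 20)   # 7 Hz sampled at 10 Hz: undersampled
alias = sample_sine(3, 10, 20)      # 3 Hz sampled at 10 Hz: properly sampled
worst_mismatch = max(abs(a + b) for a, b in zip(too_fast, alias))

print("7 Hz sampled at 10 Hz looks like", apparent_frequency(7, 10), "Hz")
print("mismatch against an inverted 3 Hz sine:", worst_mismatch)
print("levels for 8 bits:", quantisation_levels(8))
```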

Convolution: This is the best part. It is at the root of most digital signal processing algorithms. First giving the mathematical understanding:
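In the symbols used in this section, the standard definition of discrete 1D convolution can be written as follows (a reference form in my own choice of notation, with n as the sample index and k as the summation index):

```latex
y[n] = (x * h)[n] = \sum_{k=-\infty}^{\infty} x[k]\, h[n-k]
```

The h[n-k] term is where the "flip" and "shift" come from: h is reversed in k and slid along by n, then multiplied with x and summed.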
So, 'x' basically denotes any signal along with noise and 'h' is the impulse response of the system. A filter can be the system, and convolution is actually a filtering process. You can view it this way: you want to approach your senior for some guidance but you don't know that person, so first you talk to others who know him and form a reference response; taking the reference to be an impulse, it becomes the impulse response. After that you approach him and finally form a complete response, taking both yourself and the reference into consideration. You are the input 'x', the senior is the system with the reference response 'h', and your final output response is the convolution, giving 'y' as the resultant response. This is an explanation of 1D convolution. Now, as we observed, we need to flip, shift, multiply and add, all in one cycle, which gets tedious. In such a situation you change the domain and do a simple multiplication, which gives the same result: convolution in the time domain is multiplication in the frequency domain. This is actually a property of convolution.
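Both routes can be checked against each other with a short sketch. The helper names and the tiny test signals are assumptions for illustration, and the plain O(N^2) DFT is used only to keep the example self-contained; a toolbox would use an FFT. Note that both signals are zero-padded to the full output length before transforming, otherwise the product in the frequency domain gives a circular convolution instead.

```python
import cmath
import math


def convolve_direct(x, h):
    """Flip, shift, multiply, add: y[n] = sum over k of x[k] * h[n - k]."""
    y = [0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(x)):
            if 0 <= n - k < len(h):
                y[n] += x[k] * h[n - k]
    return y


def dft(x, inverse=False):
    """Plain O(N^2) DFT; with inverse=True it applies the 1/N scaled inverse."""
    n_samples = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]
    return [v / n_samples for v in out] if inverse else out


def convolve_via_dft(x, h):
    """Change domain, multiply point by point, change back."""
    size = len(x) + len(h) - 1
    x_f = dft(list(x) + [0] * (size - len(x)))  # zero-pad to output length
    h_f = dft(list(h) + [0] * (size - len(h)))
    y_f = [a * b for a, b in zip(x_f, h_f)]
    return [v.real for v in dft(y_f, inverse=True)]


x = [1, 2, 3]   # hypothetical input signal
h = [1, 1]      # hypothetical impulse response
print(convolve_direct(x, h))
print(convolve_via_dft(x, h))
```

The second result matches the first up to floating-point rounding.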

Correlation: This gives the degree of similarity between two signals, often expressed as a percentage. There are two types of correlation: auto-correlation and cross-correlation. Auto-correlation gives the similarity between a signal and the same signal after a certain period of time, and therefore allows us to predict ahead of time (note: convolution looked back in time). Cross-correlation gives the degree of similarity between any two signals. A few application-specific examples: a patient is under observation, an ECG is taken every hour and compared with the previous ones to draw the similarity and predict the future response, which helps detect any heart problem. Speaker recognition is also based on this concept: my speech is compared with my own saved speech and the similarity is drawn. In the same way, in biometrics the degree of similarity between fingerprints is found and compared against a set threshold; in highly secure places the degree of similarity required might be 90%, and so on.
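One common way to turn "degree of similarity" into a percentage is the normalised correlation at zero lag, sketched below. This is an assumption about the measure, not necessarily what any particular ECG or fingerprint system uses, and the function names are made up. It assumes equal-length signals that are not all zeros. A value of 100 means identical shape, and a negative value means the shape is inverted.

```python
import math


def similarity_percent(a, b):
    """Normalised zero-lag cross-correlation of two equal-length signals, in percent."""
    energy = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    return 100.0 * sum(p * q for p, q in zip(a, b)) / energy


def autocorrelation_percent(x, lag):
    """Similarity between a signal and itself delayed by `lag` samples."""
    return similarity_percent(x[:len(x) - lag], x[lag:])


stored = [1, 2, 3]          # hypothetical saved template
measured = [2, 4, 6]        # same shape, different amplitude
print(similarity_percent(stored, measured))

periodic = [1, -1] * 4      # hypothetical signal repeating every 2 samples
print(autocorrelation_percent(periodic, lag=2))
print(autocorrelation_percent(periodic, lag=1))
```

A threshold check such as `similarity_percent(stored, measured) >= 90` then gives the accept/reject decision described above.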

There are two more concepts, Transforms and Digital Filters, that I'm yet to cover. I'm keeping them for the next blog as I don't want to make this one unnecessarily long. If you want to dig more into these concepts, sure, do; there is a hell of a lot to explore. Everything I have written can be explained with maths and formulas, which I have avoided here.



My first Code-along workshop

I had one of the most satisfying Saturdays last weekend, and that feeling is the reason I'm writing a blog after almost a year. I often...