Fundamentals of Image Processing demystified! Contd...

Hey again! 
Starting from where I left...

Transform: What is a transform? Yes, you are right, it changes a signal from one domain to another, but that, strictly speaking, is the transformation process. A transform itself is simply a linear operator. Linearity just means that the information in the signal does not change when you view it in the other domain. It's like going from one room to another: the room is different, but I'm still the same. Why is a transformation necessary? Because it separates out the frequency components, which helps us understand the signal better. If a transformation exists, then its inverse should also exist, i.e. if I go from my room to the other room, I should also be able to come back to my room with me being the same throughout. Let me write down some transformation equations:
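The scribble with the equations didn't make it into this post; the three equations referred to are presumably the Fourier transform pair together with Euler's relation for the exponential (my reconstruction, not the original scribble):

```latex
F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt
\qquad
f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\, d\omega
\qquad
e^{j\omega t} = \cos(\omega t) + j\sin(\omega t)
```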
What do you understand by these three equations? Let me explain in a dramatic way :P I'm a Mumbaikar, and Mumbai is my main domain, where I know how to manage everything independently. That is f(t) in the time domain. If by flight/train or any such means I go to Bangalore, this is a new place to me and I know nothing here, so I'm dependent on someone. If this someone is of a higher authority, I stand no chance of enjoying my freedom and have to adjust myself accordingly. To draw the analogy: the flight/train is the transform used, Bangalore is the transformed domain, and that someone of higher authority is the exponential function. Just as I lose my freedom, the function loses its time characteristics against the huge exponential signal (the RHS of the equation has no time dependence left). Now, why choose an exponential signal? Because it is composed of cos and sin signals, which are known signals — like the authority who will be my point of contact in Bangalore being someone known to me. This opens something for us to think about... is losing time going to be good? Just go through the following scribble:
We observe three different signals, yet taking their Fourier transform gives exactly the same frequency composition. This happens because in the frequency domain you don't get the feel of time, just as in the time domain you don't get the feel of frequency. So we are in a fix, unable to figure out which was the original signal from its Fourier transform alone. Coming back to the exponential signal, or that someone of higher authority — what if that someone now has less authority than you? Then you can still enjoy your freedom; in this context, by making the time extent finite, the time characteristics can still be retained. This is the STFT — the Short-Time Fourier Transform. There is only a minor difference between the STFT and the FT. In the STFT, the signal is divided into segments small enough that each segment (portion) of the signal can be assumed to be stationary. For this purpose, a window function "w" is chosen, whose width must equal the segment of the signal over which stationarity is valid. The change in the equations reflecting this is just the shifted window w(t-t0) entering the integral, i.e. making the time extent finite.

Let's observe one more interesting thing here. We know if we use a window of infinite length, we get the FT, which gives perfect frequency resolution, but no time information. So, 
Wide window ===>good frequency resolution, poor time resolution. 
Narrow window ===>good time resolution, poor frequency resolution.
Let us see that for ourselves below. Here we have taken a non-stationary signal with four different frequency components at different times. 

Now, let's look at its STFT and we find that these four peaks are located at different time intervals
along the time axis. 
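To see this concretely, here is a minimal sketch of mine (not the original figure) building such a four-component signal and taking its STFT with NumPy/SciPy; the frequencies 50/120/200/350 Hz and every other parameter are arbitrary illustrative choices:

```python
# My sketch (not the original scribble): a non-stationary signal with
# four frequency components occurring one after the other, and its STFT.
import numpy as np
from scipy.signal import stft

fs = 1000                        # sampling rate (Hz)
seg = np.arange(0, 0.25, 1/fs)   # 0.25 s per segment
freqs = [50, 120, 200, 350]      # the four components, in time order
x = np.concatenate([np.sin(2*np.pi*f0*seg) for f0 in freqs])

# STFT: chop the signal into short windows and Fourier-transform each.
f, t, Zxx = stft(x, fs=fs, nperseg=128)

# Dominant frequency in each window: the four peaks show up at
# different positions along the time axis, one per component.
dominant = f[np.abs(Zxx).argmax(axis=0)]
```

Printing `dominant` shows roughly 50 Hz near the start of the time axis and roughly 350 Hz near the end — exactly the time information a plain FT would have thrown away.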

The following figure shows four window functions.  I will now show the STFT of the same signal given above computed with these windows. 

First, let's look at the narrowest window. We expect the STFT to have very good time resolution but relatively poor frequency resolution, and we note that the four peaks are well separated from each other in time.

Now let's take a wider window:

Even wider:

Note that the peaks are no longer well separated from each other in time; however, the frequency resolution is much better. Another thing we can infer is that low-frequency signals are better resolved when the window function is wider, as we then get more frequency resolution and less time resolution, while high-frequency signals are better resolved when the window function is narrower, since a fast-changing signal needs better time resolution.
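The window-width trade-off above can be put in numbers. A small sketch of my own (parameters arbitrary) comparing a narrow and a wide STFT window in SciPy:

```python
# My sketch of the window-width trade-off: frequency-bin width and
# number of time frames for a narrow vs a wide STFT window.
import numpy as np
from scipy.signal import stft

fs = 1000
x = np.random.randn(4000)                       # any test signal will do

f_nar, t_nar, _ = stft(x, fs=fs, nperseg=32)    # narrow window
f_wid, t_wid, _ = stft(x, fs=fs, nperseg=512)   # wide window

df_nar = f_nar[1] - f_nar[0]   # coarse frequency bins (fs/32 Hz wide)
df_wid = f_wid[1] - f_wid[0]   # fine frequency bins (fs/512 Hz wide)
# Narrow window: many time frames, coarse frequency bins.
# Wide window:   few time frames, fine frequency bins.
```

The bin width and frame count move in opposite directions — you cannot sharpen both at once with a fixed window, which is precisely the STFT's dilemma.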


These examples should have illustrated the implicit problem of resolution of the STFT. Anyone who would like to use STFT is faced with this problem of resolution. What kind of a window to use? 

To make the resolution problem clear once more: one cannot know the exact time-frequency representation of a signal, i.e., one cannot know which spectral components exist at which instants of time. What one can know are the time intervals in which a certain band of frequencies exists — which is a resolution problem. The Wavelet Transform (WT) solves this dilemma of resolution to a certain extent. I'm not covering the details of the wavelet transform in this blog, though I have already given an idea of it towards the end of my first blog on Neural Networks.

Digital Filters: The last topic of this discussion. Designing the right filter is very important for any application, and we all realise this. This is again a very big topic, but I'll just summarise it in the following charts.

It is very good to know in depth how a filter is designed by the various methods (I haven't explained the methods here, just mentioned them above), because only that will let us grasp and visualise a problem and get to the right solution. That said, all these methods are available as direct functions in any toolbox, so you don't need to sit and implement the algorithms yourself — what you must know is when and where to use which method. Reading about the filter design frameworks and the different methods for both FIR and IIR from any standard DSP textbook would be helpful now.
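As a taste of those "direct functions in any toolbox", here is a small sketch of mine (not from the charts; the cut-off, order and tap count are arbitrary) designing one FIR and one IIR low-pass in SciPy:

```python
# My sketch of FIR and IIR low-pass design using ready-made toolbox
# functions, then applying both filters to a two-tone test signal.
import numpy as np
from scipy.signal import firwin, butter, lfilter

fs = 1000          # sampling rate (Hz)
cutoff = 100       # cut-off frequency (Hz)

fir_taps = firwin(101, cutoff, fs=fs)   # FIR: window method, 101 taps
b, a = butter(4, cutoff, fs=fs)         # IIR: 4th-order Butterworth

# Apply both to a two-tone signal; only the 50 Hz tone should survive.
t = np.arange(0, 1, 1/fs)
x = np.sin(2*np.pi*50*t) + np.sin(2*np.pi*300*t)
y_fir = lfilter(fir_taps, 1.0, x)
y_iir = lfilter(b, a, x)
```

Either way the 300 Hz tone is strongly attenuated; knowing when an FIR (linear phase, more taps) or an IIR (lower order, phase distortion) is appropriate is the part no toolbox does for you.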

With this I wrap up the topic and I hope you were able to follow me throughout this blog and the previous blog. :)
  
  
  


Fundamentals of Image Processing demystified!

Image processing does sound so cool, doesn't it!? But are your signal processing fundamentals strong enough? After all, an image is also a 2D signal. Image processing can be implemented on various platforms like MATLAB, Python etc., but your expertise doesn't lie there; what matters is your understanding of the subject and the clarity of your fundamentals, which lets a very simple algorithm be built for any complex problem. So, here I'm not really concerned with explaining what image processing is, but I'll make sure I discuss certain concepts that will make learning image processing easy.
Certain key terms that you should be thorough with are:

  • Frequency
  • Signal
  • System
  • Convolution
  • Correlation
  • Transform
  • Digital Filters
Ask yourself what you understand by all the above terms. Keep your answer in mind and read through the following explanations to form a better understanding.


Frequency: You are hearing PM Narendra Modi giving a speech, and on another channel, cricket commentary. It goes without saying that you may find the speech easier to comprehend, with every word heard distinctly, while the commentary is a little more difficult to follow. Clearly, what makes these two sounds different is their frequencies. Frequency is nothing but the rate of change of a signal. A signal with low frequency contains a lot of information and is said to be slow changing, while a signal with high frequency can be a little irritating to our ears at times, as it is fast changing, and so is treated as noise. The same goes for babies sleeping to a lullaby rather than a rap song.
The slow-changing signal is thus of interest to us, but can you think of a problem with it? Yes — being slow changing, it carries very little energy and would die out without reaching a great distance. This can be tackled by either amplifying the signal or doing frequency modulation. Frequency modulation is nothing but riding the low-frequency signal on a high-frequency carrier that can take it a greater distance. Like your lunch box: you eat the food inside, not the box, right :P
So, frequency plays an important role because looking at the frequency you can understand if that signal is of interest to you or not.
To give an idea of one application: there is an image with a tumor cell that needs to be detected. Throughout the image you may find uniformity, with neighboring pixel values changing by only a small amount (slow changing), but when a tumor cell is encountered, because it is captured differently from the normal cells around it, there is a sudden change in pixel value, creating a high-frequency region. Hence the algorithm narrows down to just finding the high-frequency regions.
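A toy sketch of that idea (my own illustration, not an actual tumor-detection algorithm): on a smooth synthetic image with one abrupt bright blob, thresholding the local gradient magnitude picks out exactly the high-frequency region:

```python
# My toy sketch: a slowly varying synthetic image with one sudden bright
# blob; the blob's edges are the "high-frequency region" and are found
# by thresholding the local gradient magnitude.
import numpy as np

img = np.tile(np.linspace(0, 1, 64), (64, 1))  # slowly varying background
img[28:36, 28:36] = 5.0                        # sudden jump = "tumor"

gy, gx = np.gradient(img)          # pixel-to-pixel rate of change
grad = np.hypot(gx, gy)            # gradient magnitude

mask = grad > 10 * grad.mean()     # pixels changing much faster than typical
rows, cols = np.nonzero(mask)      # these cluster around the blob's edge
```

The flagged pixels all lie on the boundary of the blob — the only place where neighbouring values change abruptly.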

Signal: Everyone knows what a signal is. By the technical definition, it is just a function of one or more independent variables — time, for example. It is important to know the characteristics of a signal in order to process it, and it has different characteristics in the time and frequency domains. It is also important to know what type of signal it is — causal, LTI, and so on. Taking me as an example: on my blog site I talk about technical stuff, but if you find me on Instagram or Twitter, you'll see me talking about various other things like travel photography, social causes etc. So taking this blog site as one domain, you get only a little idea about me, but if you change the domain you can find more of my characteristics. The same goes for any signal: you have to deal with both the time and frequency domains to deduce more characteristics and understand the signal better for processing.

System: Anything you see is a system, be it a mobile phone, a human body or any device. What happens inside the system is not at all important; what is important is to have an understanding of the system. By understanding I mean you should know what the system does when you give it a certain input. You attend a call by pressing green and reject it using red — that is understanding; what happens inside is of no concern to us. It is also very necessary to know for which inputs the system will become unstable. Essentially, there are three things you need to bother about for any system: the dynamics of the system, i.e. the way it changes with time; the transfer function, which gives the relation between input and output; and finally the stability of the system.

All signal processing is done in a computer. Real-life signals, being analog in nature, need to be converted to digital before being given as input to the computer, so knowing analog-to-digital conversion and vice versa is essential. An analog signal is first sampled, i.e. discretised in time. At this step, following the sampling theorem (fs > 2fm) is important: if the sampling wasn't done properly, then when you reverse the process to obtain the analog signal, the spectral copies overlap and information is lost — the aliasing effect. After sampling comes quantisation, where selecting the number of quantisation levels is important; it is given by 2^n, where n is the number of bits.
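Both digitisation steps can be sketched in a few lines (my illustration; the 60 Hz tone, the sampling rates and the 3-bit quantiser are arbitrary choices):

```python
# My sketch of sampling and quantisation. A 60 Hz tone sampled above
# 2*fm keeps its frequency; sampled below 2*fm it aliases to fs - fm.
import numpy as np

fm = 60  # analog tone frequency (Hz)

def dominant_freq(fs):
    """Sample a 60 Hz tone at rate fs; return the frequency the sampled
    data appears to contain (location of its spectral peak)."""
    t = np.arange(0, 1, 1/fs)
    sig = np.sin(2*np.pi*fm*t)
    spectrum = np.abs(np.fft.rfft(sig))
    return np.fft.rfftfreq(len(sig), 1/fs)[spectrum.argmax()]

ok = dominant_freq(500)      # fs > 2*fm: the tone is seen at 60 Hz
alias = dominant_freq(100)   # fs < 2*fm: it folds down to fs - fm = 40 Hz

# Quantisation: n bits give 2**n levels; snap each sample to its
# nearest level.
n_bits = 3
levels = np.linspace(-1, 1, 2**n_bits)
samples = np.sin(2*np.pi*fm*np.arange(0, 1, 1/500))
quantised = levels[np.abs(samples[:, None] - levels[None, :]).argmin(axis=1)]
```

With fs = 100 < 2·60, the 60 Hz tone is indistinguishable from a 40 Hz one after sampling — the information is irrecoverably overlapped, which is exactly why the sampling theorem must be respected up front.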

Convolution: This is the best part. It is actually at the origin of every digital signal processing algorithm. First, the mathematical understanding:
So, 'x' basically denotes a signal (along with noise) and 'h' is the impulse response of the system. The filter can be the system — convolution is actually a filtering process. One way to view it: you want to approach a senior for some guidance, but you don't know that person, so first you talk to others who know him and form a reference response; taking the reference to be an impulse, it becomes the impulse response. After that you approach him and form the complete response, taking both yourself and the reference into consideration. You are the input 'x', the senior is the system with reference response 'h', and your final output response is the convolution, giving 'y' as the resultant response. This explains 1D convolution. Now, as we observed, we need to flip, shift, multiply and add, all in one cycle, which gets tedious; in such a situation you change the domain and do a simple multiplication, which gives the same result — this is actually a property of convolution.
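That last property can be checked in a couple of lines (a sketch of mine using NumPy; the signals are arbitrary):

```python
# My sketch of the convolution property: flip-shift-multiply-add in the
# time domain equals a plain multiplication in the frequency domain.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # input signal
h = np.array([0.5, 0.5])             # impulse response (2-point average)

y_direct = np.convolve(x, h)         # time-domain convolution

N = len(x) + len(h) - 1              # full length of the convolution
y_freq = np.fft.irfft(np.fft.rfft(x, N) * np.fft.rfft(h, N), N)
# Both routes give the same y; the FFT route is cheaper for long signals.
```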

Correlation: This gives the degree of similarity between two signals, as a percentage. There are two types of correlation: autocorrelation and cross-correlation. Autocorrelation gives the similarity between a signal and the same signal after a certain period of time, and therefore allows prediction ahead of time (note: convolution looked back in time). Cross-correlation gives the degree of similarity between any two signals. A few application-specific examples: a patient under observation has an ECG taken every hour and compared with the previous one to draw similarity and predict the future response, which helps detect any heart problem. Speaker recognition is also based on this concept, where my speech is compared with my own saved speech and the similarity is computed. In the same way, in biometrics, the degree of similarity between fingerprints is found and checked against a set threshold — in highly secure places the degree of similarity required might be 90%, and so on.
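A small sketch of mine of correlation as a similarity score; here the "percentage" corresponds to the normalised correlation coefficient, with 1.0 playing the role of 100% similar (signals and noise level are arbitrary choices):

```python
# My sketch: normalised cross-correlation at zero lag as a similarity
# score between a reference signal, a noisy copy, and an unrelated one.
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 500)
s1 = np.sin(2*np.pi*5*t)                  # reference signal
s2 = s1 + 0.1*rng.standard_normal(500)    # noisy copy of it
s3 = rng.standard_normal(500)             # unrelated signal

def similarity(a, b):
    """Normalised cross-correlation at zero lag, in [-1, 1]."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.dot(a, b) / len(a))

same_score = similarity(s1, s2)    # near 1: almost the same signal
other_score = similarity(s1, s3)   # near 0: no similarity
```

Thresholding a score like this is exactly the biometric-matching idea above: accept the fingerprint only if the similarity clears, say, 0.9.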

There are two more concepts, Transforms and Digital Filters, that I'm yet to cover. I'm keeping them for the next blog, as I don't want to unnecessarily make this one very long. If you want to dig more into these concepts, sure do — there is a hell of a lot to explore. Everything I have written can be explained with maths and formulas, which I have avoided here.



Shape your project ideas to Empower The Nation!

Here is a blog on my weekend. 


On 11th May 1998, India achieved a major technological breakthrough by successfully carrying out nuclear tests at Pokhran. The first indigenous aircraft, "Hansa-3", was also test flown at Bangalore on this day, and India performed a successful test firing of the Trishul missile on the same day. Considering these technological achievements, 11th May was chosen to be commemorated as National Technology Day.
On account of National Technology Day, there was an exhibition of BARC technologies in the field of Electronics and Computers held on the weekend. There were around 34 amazing exhibits under the theme Electronics and Computer Technology- Empowering the Nation. 

Just to throw some light on few of the projects:
  • Seeker for BrahMos Missile
    • The missile hits the target, but the target is also intelligent enough to predict an attack and shift its position. Controlling the target is not in our hands (obviously — it is the enemy), but what we can do instead is detect the moving target and hit it at the right coordinates. This seeker, located at the apex of the missile, does that job. It is actually a transceiver that works on the monopulse detection method; applying some ratiometric analysis, the coordinates of the target are determined with great accuracy.
  • Robot assisted Neuro Navigation and Neurosurgical Suite
    • For a normal surgery, the doctor needs to cut open a part of the skull to see properly inside while performing the operation. The idea of this project, instead, is to find the coordinates inside the skull where the operation needs to be done through a normal CT scan, and then make a hole just big enough for the surgical needle to get in. Certain markers act as reference points that assist in getting the needle to the right position. All this can be viewed on a monitor, where the doctor can see how the robotic arm with the needle is moving into the hole and inside the skull. The monitor image of the skull can be cross-sectioned in whichever way needed, so the doctor gets an idea of the inside of the skull without actually performing the cross-section on the patient. The whole task can be either automated or performed manually, depending upon the surgical requirements. It was developed in the ROS (Robot Operating System) environment. I wish I could have clicked pictures of the demonstration to give you all a better idea.
  • Digital Medical Imaging System
    • With conventional X-rays, a film is produced which is then examined by the doctors. With this system, a digital image of the X-ray is produced directly on the computer and can be sent to a doctor anywhere. For different examination purposes — angiography, radiography, tomography etc. — different types of machines are normally required, so this is like one machine for all purposes. It is like a bed on which the patient lies, with the X-rays emitted from below. On top there is a flat panel, a stack of multiple layers performing different functions, where the X-rays are received. The first layer is the scintillator, which converts X-rays to light, and then there is a CCD (charge-coupled device) layer which converts the analog light signal to a digital output that can be given to the computer. Even taking X-ray videos becomes easy. Not that this technology doesn't exist — there are big hospitals that have it installed — but the specialty of this product lies in being indigenous and really affordable.
  • Electronic Nose
    • This device imitates a nose by detecting the presence of different gases. It has a sensor array, each segment of which is a different semiconductor composition meant to detect a particular gas. For example, the segment with an SnO2:CuO film can detect H2S gas. It is basically a chemiresistive sensor: when H2S reacts with this film, CuS is formed, which is metallic, so the electrical resistance changes; this change is calibrated and recorded as a concentration in ppm. The reaction is reversible, which is why the sensor can be reused. Along similar lines, the other segments in the array work for different gases. A particular gas may also be detected by multiple segments, in which case PCA (Principal Component Analysis) is carried out to determine the gas.
  • Application of Signal Processing in Health Monitoring of Buried Oil pipelines
    • There is a caterpillar-like device which is inserted at one end of the pipeline and taken out at the other end to process the data it collected throughout its journey. It works on the principle of detecting changes in magnetic flux and eddy currents using Hall-effect sensors. It is an in-house system with a magnetic module, a DAQ (data acquisition system) and a power module built in. When there is a crack, the magnetic flux increases; this way the data is recorded and the pipeline's health is monitored.
  • Object Identification and Face Detection
    • If you read my previous post, this is just an application of neural networks. The training data, covering 80 different objects, was taken from the Microsoft COCO database. For images, CNNs (convolutional neural networks) come in handy, and that is the idea that was applied. The backpropagation algorithm was used for training, which took about 200 hours — and that on GPUs! The idea is that each pixel value is given to a separate input neuron, and the output layer has 80 neurons giving the probabilities of the 80 different objects. The number of hidden layers and the number of neurons in each layer is a matter of trial and error. Once training is done and the final weights of the neural net are recorded, the architecture can be ported onto any other device, like a mobile, and if any random image is provided it can detect which object it is within just 5-10 secs.
  • Depth estimation and application in Robotics
    • Depth estimation is a fundamental problem in robotics and computer vision. This project used laser-assisted stereo vision. We have two eyes for a purpose: to perceive depth. If you view a nearby object with only one eye, then close that eye and view it with the other, the apparent shift of the object is larger than when the object is far away (try it out yourself). Stereo cameras are generally used for depth estimation, but with the disadvantage of being very difficult to calibrate, and using a laser is very helpful here. The algorithm was built using principles like triangulation and time of flight; the shift, or disparity, was estimated and a depth map generated. The end product was a cart carrying a laser beam generator and detector with a span of 0 to 270 degrees; as it navigates, it generates a map of the whole area on the screen using the Simultaneous Localisation and Mapping (SLAM) method. All of this was developed on ROS (a distributed system).
  • X-Ray 3D Computed Tomography for Non-destructive Evaluation
    • The idea in one sentence: grey-scale X-ray images are taken at different angles, and by applying a back-projection algorithm the whole volume of the object is reconstructed. This can be performed using the VTK software.
I've tried describing a few of the projects above, at least their ideas; because I'm from an Electronics background, my descriptions have been a little Electronics-centric and not so much about Computers. To name a few more projects:
  • Cyber Security Monitoring in BARC
  • Handheld Biosensor for detecting Methyl Parathion Pesticides
  • Body Composition Analyser
  • Speech Analytics with Machine Learning
  • Aerial Radiation Monitoring Systems
  • Real time intelligent perimeter intrusion detection system
All in all the weekend was amazing. I spent a total of 5 hours at the exhibition, immersing myself in the wide spectrum of emerging technologies, appreciating all of them, and at the same time thinking of possible ways I could contribute to this field.

Cheers!!

The Idea of Neural Networks

Last semester I had a very interesting course on Neural Networks and Fuzzy Logic. A big thanks to our professor for making the course as interesting as it sounds.

Do we not all realize our brain is a supercomputing, massively fascinating biological entity? The way we build memories and later connect the dots to figure things out is not at all as easy as it sounds. We made computers to do things that we are not good at, like huge mathematical computations. But imagine the power of computers that can learn and make decisions the way humans do. That's what it is! Artificial Intelligence!

Through this post I will jump straight into the concepts and key points, dividing them into ten subtopics:

  1. Sigmoid Neurons
  2. Backpropagation Algorithm
  3. Training vs Testing Error
  4. RBF: Radial Basis Function
  5. PCA: Principal Component Analysis
  6. Supervised vs Unsupervised learning
  7. Classification vs Regression
  8. SVM: Support Vector Machine
  9. CNN: Convolutional Neural Network
  10. Fourier vs Wavelets
Let me first share the doc link in which I have summarized every lecture- https://docs.google.com/document/d/1hVTkZZg0dv_VdqvnsozQwxmCHgarCSnuo0zpEwmV2Tk/edit?usp=sharing

Here I share my scribbles of the sub topics I just mentioned.














I could have made this post very elaborate, but I limited myself to providing just the key ideas. I'll share a few references here:
http://videolectures.net/DLRLsummerschool2018_toronto/
https://drive.google.com/drive/folders/0B41Zbb4c8HVyUndGdGdJSXd5d3M
https://www.udacity.com/course/intro-to-machine-learning--ud120
http://neuralnetworksanddeeplearning.com/about.html
https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/

This subject is very interesting and the more you explore the more you'll understand. So, enjoy the exploration folks! 
