Thursday, 8 March 2012

Neural Networks - Primer


About the Post
This post is Part I of a three-part introduction to artificial neural networks. Part I introduces the basic concepts of neural networks, Part II will cover the common types of neural networks, and Part III will provide programming examples that illustrate the implementation of basic neural networks.

Part – I: Introduction to Artificial Neural Networks.

Introduction
Our brains perform sophisticated information processing tasks using hardware and operating rules that are quite different from the ones on which conventional computers are based. The processors in the brain, the neurons (nerve cells), are rather noisy elements that operate in parallel. They are organized in dense networks and they communicate through a huge number of inter-neuron connections (the so-called synapses). These connections represent the 'program' of a network. By continuously updating the strengths of its connections, a network as a whole can modify and optimize its 'program', 'learn' from experience and adapt to changing circumstances.

The term neural network was traditionally used to refer to a network or circuit of biological neurons. However, modern usage of the term often refers to artificial neural networks, which are composed of artificial neurons or nodes.
Artificial neural networks represent a type of computing that is based on the way that the brain performs computations. Neural networks are good at fitting non-linear functions and recognizing patterns. Consequently, they are used in the aerospace, automotive, banking, defense, electronics, entertainment, financial, insurance, manufacturing, oil and gas, robotics, telecommunications, and transportation industries.

Non-linearity of Neurons:
The conventional computers that we use today are better suited to solving problems that exhibit linearity.

You might recall from your high-school math classes that equations are typically classified as linear, polynomial and so on. A linear function is one that exhibits additivity and homogeneity:
  • additivity: f(x + y) = f(x) + f(y);
  • homogeneity: f(αx) = αf(x).
(Additivity implies homogeneity for any rational α and, for continuous functions, for any real α. For a complex α, homogeneity does not follow from additivity; for example, an antilinear map is additive but not homogeneous.)
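As a quick illustration, the two properties can be spot-checked numerically; the functions and test points below are arbitrary examples chosen for this sketch:

```python
# Spot-check additivity f(x+y) = f(x) + f(y) and homogeneity f(a*x) = a*f(x)
# at one set of points (a numeric check, not a proof of linearity).
def is_linear_at(f, x, y, a):
    additive = abs(f(x + y) - (f(x) + f(y))) < 1e-9
    homogeneous = abs(f(a * x) - a * f(x)) < 1e-9
    return additive and homogeneous

f_lin = lambda x: 3 * x   # linear: scaling and adding inputs scales and adds outputs
f_sq = lambda x: x * x    # non-linear: (x + y)^2 != x^2 + y^2 in general

print(is_linear_at(f_lin, 2.0, 5.0, 4.0))  # True
print(is_linear_at(f_sq, 2.0, 5.0, 4.0))   # False
```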


So, conventional computers simply execute a detailed set of instructions, requiring programmers to know exactly which data can be expected and how to respond. Subsequent changes in the actual situation, not foreseen by the programmer, lead to trouble.
Neural networks, on the other hand, can adapt to changing circumstances, i.e. they can solve non-linear problems. Neural networks are superior to conventional computers at dealing with real-world tasks, such as communication (vision, speech recognition), movement coordination (robotics) and experience-based decision making (classification, prediction, system control), where data are often messy, uncertain or even inconsistent, where the number of possible situations is infinite and where perfect solutions are for all practical purposes non-existent.

The following table shows a brief comparison between conventional computers and biological neural networks:

                      Conventional computers      Biological neural networks
  Operation speed     ~10^8 Hz                    ~10^2 Hz
  Signal/noise ratio  ~∞                          ~1
  Signal velocity     ~10^8 m/sec                 ~1 m/sec
  Execution model     Sequential operation        Parallel operation
  Programming model   External programming        Self-programming & adaptation
  Resilience          Hardware failure or         Robust against hardware
                      unforeseen data is          failure & unforeseen data
                      almost fatal



Artificial Neural Networks:

An artificial neural network consists of a set of processing units (called neurons or nodes) that communicate with each other by sending signals over a large number of weighted connections. The neurons together with their interconnections are referred to as a neural net (network).

Single-layer Neural Networks:

These are the simplest of neural networks: a single input layer of neurons connected directly to one or more output neurons.

The output is the weighted sum of all the inputs. The output neuron has a threshold t: if the weighted sum of the inputs is >= t, the output neuron fires (i.e. output = 1).
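This weighted-sum-and-threshold rule can be sketched in a few lines of Python; the weights and threshold below are illustrative values chosen by hand, not trained ones:

```python
# A single output unit: fires (returns 1) when the weighted sum of its
# inputs reaches the threshold t, otherwise stays silent (returns 0).
def neuron_output(inputs, weights, t):
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s >= t else 0

# With weights (1, 1) and t = 1.5, the unit computes logical AND:
print(neuron_output([1, 1], [1, 1], 1.5))  # 1
print(neuron_output([1, 0], [1, 1], 1.5))  # 0
```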

Single-layer neural networks have many advantages:
  • Easy to set up and train
  • Explicit link to statistical models
    • Shared covariance Gaussian density function
    • Sigmoid output functions allow a link to posterior probabilities
  • Outputs are weighted sum of inputs: interpretable representation

But they have some big limitations:
  • Can only represent a limited set of functions
  • Decision boundaries must be hyper-planes
  • Can only perfectly separate linearly separable data


Multi-layer Neural Networks:
Multi-layer networks can model more general functions by stacking layers of processing units. They can solve the classification problem for non-linearly separable sets by employing hidden layers, whose neurons are not directly connected to either the inputs or the outputs. The additional hidden layers can be interpreted geometrically as additional hyper-planes, which enhance the separation capacity of the network.
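As a sketch of this idea, a tiny two-layer network with hand-picked weights can compute XOR, which no single-layer network can separate; the thresholds below are illustrative choices:

```python
# Threshold unit: fires when the weighted input sum reaches t.
def step(s, t):
    return 1 if s >= t else 0

def xor_net(x1, x2):
    # Hidden layer: each unit is one hyper-plane in the input space.
    h_or = step(x1 + x2, 0.5)    # fires if at least one input is on (OR)
    h_and = step(x1 + x2, 1.5)   # fires only if both inputs are on (AND)
    # Output layer: "OR but not AND", i.e. exactly one input on = XOR.
    return step(h_or - h_and, 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))
```

No single hyper-plane separates XOR's positive and negative points, but the two hidden hyper-planes carve the input space into regions the output unit can combine.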


Training of artificial neural networks

A neural network has to be configured such that the application of a set of inputs produces the desired set of outputs. Various methods to set the strengths of the connections exist. One way is to set the weights explicitly, using a priori knowledge. Another way is to 'train' the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule.

There are multiple learning paradigms that can be employed to train a neural network:

  • Supervised learning or associative learning, in which the network is trained by providing it with input and matching output patterns. These input-output pairs can be provided by an external teacher, or by the system that contains the neural network (self-supervised).
  • Unsupervised learning or self-organization, in which an (output) neuron is trained to respond to clusters of patterns within the input. In this paradigm the system is supposed to discover statistically salient features of the input population. Unlike the supervised paradigm, there is no a priori set of categories into which the patterns are to be classified; rather, the system must develop its own representation of the input stimuli.
  • Reinforcement learning, which may be considered an intermediate form of the above two. Here the learning machine performs some action on the environment and gets a feedback response from the environment. The learning system grades its action as good (rewarding) or bad (punishable) based on the environmental response and adjusts its parameters accordingly. Parameter adjustment generally continues until an equilibrium state is reached, after which the parameters no longer change. Self-organizing neural learning may also be categorized under this type of learning.
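As a small illustration of the supervised paradigm, the classic perceptron learning rule nudges each weight in proportion to the output error on each training pair. The learning rate, epoch count and the AND task below are illustrative choices for this sketch:

```python
# Perceptron learning rule: w <- w + lr * (target - output) * input.
def train_perceptron(samples, lr=0.1, epochs=20):
    # samples: list of ((x1, x2), target) pairs; the third weight is a bias.
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for (x1, x2), target in samples:
            xs = (x1, x2, 1.0)  # constant 1.0 feeds the bias weight
            out = 1 if sum(x * wi for x, wi in zip(xs, w)) >= 0 else 0
            err = target - out  # 0 when correct, +/-1 when wrong
            w = [wi + lr * err * x for wi, x in zip(w, xs)]
    return w

# Logical AND is linearly separable, so the rule converges.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(and_data)
print([1 if x1 * w[0] + x2 * w[1] + w[2] >= 0 else 0
       for (x1, x2), _ in and_data])  # [0, 0, 0, 1]
```

Because AND is linearly separable, the weights settle into an equilibrium; on XOR the updates would never settle, which is exactly the single-layer limitation noted earlier.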
Summary:
This concludes the introductory part to artificial neural networks, in the next part of this series of posts we will see the different types of neural networks.








