
Thursday, March 10, 2011

A Python Script to Fit an Ellipse to Noisy Data

[Figure: an ellipse fitted to noisy samples]

Problem statement

Given a set of data points representing noisy samples from the perimeter of an ellipse, estimate the parameters which describe the underlying ellipse.

Discussion

There are two general ways to fit an ellipse: algebraic and geometric approaches. In an algebraic approach, the parameters of an algebraic description of the ellipse (a general conic) are fit subject to constraints which guarantee that the parameters actually describe an ellipse. In a geometric approach, geometric characteristics of the ellipse are fit directly.
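As a minimal sketch of the algebraic route, for contrast (this is not the method used below, and fit_conic is an illustrative name): fit the general conic a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0 by taking the null-space direction of the design matrix with an SVD. Note that without an added constraint such as b^2 - 4ac < 0, the minimizer may be some other conic, which is why algebraic methods need those constraints.

import numpy as np

def fit_conic(x, y):
    # design matrix for a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0
    D = np.column_stack([x*x, x*y, y*y, x, y, np.ones_like(x)])
    # the right singular vector for the smallest singular value
    # minimizes ||D p|| subject to ||p|| = 1
    _, _, Vt = np.linalg.svd(D)
    return Vt[-1]

# noisy samples from an axis-aligned ellipse
theta = np.linspace(0, 2*np.pi, 100)
x = 3*np.cos(theta) + 0.05*np.random.randn(100)
y = 2*np.sin(theta) + 0.05*np.random.randn(100)
p = fit_conic(x, y)
print(p, p[1]**2 - 4*p[0]*p[2] < 0)   # second value is True for an ellipse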

The code below uses a method described by Yu, Kulkarni & Poor. The locations of the foci and the length of the line segments from the foci to a point on the perimeter of the ellipse are found through an optimization problem. Because the fitting objective is not convex and has a minimum at infinity, a penalty cost is added to prevent the foci from wandering off.

Code

'''
Script to fit an ellipse to a set of points.
- The ellipse is represented by the two foci and the length of a 
     line segment which is drawn from the foci to the 
     point where the ellipse intersects the minor axis.
    
- Fitting algorithm from Yu, Kulkarni & Poor

'''

__author__ = 'Ed Tate'
__email__  = 'edtategmail-dot-com'
__website__ = 'exnumerus.blogspot.com'
__license__ = 'Creative Commons Attribution - http://creativecommons.org/licenses/by/3.0/us/'

####################################################
# create ellipse with random noise in points
from random import uniform,normalvariate
from math import pi, sin, cos, exp, sqrt  # note: shadowed by the numpy import below
from openopt import NLP
from numpy import *
from numpy import linalg as LA
import matplotlib.pylab as pp

def gen_ellipse_pts(a,foci1,foci2,
                    num_pts=200, angles=None,
                    x_noise = None, y_noise=None):

    '''
       Generate points for an ellipse given
          the foci, and
          the distance to the intersection of the minor axis and ellipse.
    
   Optionally,
          the number of points can be specified,
          the angles of the points with respect to the centroid of the ellipse, and
          a noise offset for each point in the x and y directions.
    '''
    c = (1/2.0)*LA.norm(foci1-foci2)
    b = sqrt(a**2-c**2)
    x1 = foci1[0]
    y1 = foci1[1]
    x2 = foci2[0]
    y2 = foci2[1]
    if angles is None:
        t = arange(0,2*pi,2*pi/float(num_pts))
    else:
        t = array(angles)
            
    ellipse_x = (x1+x2)/2 +(x2-x1)/(2*c)*a*cos(t) - (y2-y1)/(2*c)*b*sin(t)
    ellipse_y = (y1+y2)/2 +(y2-y1)/(2*c)*a*cos(t) + (x2-x1)/(2*c)*b*sin(t)
    try:
        # try adding noise to the ellipse points
        ellipse_x = ellipse_x + x_noise
        ellipse_y = ellipse_y + y_noise
    except TypeError:
        pass
    return (ellipse_x,ellipse_y)

####################################################################

# setup the reference ellipse

# define the foci locations
foci1_ref = array([2,-1])
foci2_ref = array([-2,1])
# pick distance from foci to ellipse
a_ref = 2.5

# generate points for reference ellipse without noise
ref_ellipse_x,ref_ellipse_y = gen_ellipse_pts(a_ref,foci1_ref,foci2_ref)

# generate list of noisy samples on the ellipse
num_samples = 1000
angles = [uniform(-pi,pi) for i in range(0,num_samples)]
sigma = 0.2
x_noise = [normalvariate(0,sigma) for t in angles]
y_noise = [normalvariate(0,sigma) for t in angles]
x_list,y_list = gen_ellipse_pts(a_ref,foci1_ref,foci2_ref,
                                angles  = angles,
                                x_noise = x_noise,
                                y_noise = y_noise)

point_list = []
for x,y in zip(x_list,y_list):
    point_list.append(array([x,y]))    

# draw the reference ellipse and the noisy samples    
pp.figure()
pp.plot(x_list,y_list,'.b', alpha=0.5)
pp.plot(ref_ellipse_x,ref_ellipse_y,'g',lw=2)
pp.plot(foci1_ref[0],foci1_ref[1],'o')
pp.plot(foci2_ref[0],foci2_ref[1],'o')

#####################################################

def initialize():
    '''
    Determine the initial value for the optimization problem.
    '''
    # find x mean
    x_mean = array(x_list).mean()
    # find y mean
    y_mean = array(y_list).mean()
    # find point farthest away from mean
    points = array(zip(x_list,y_list))
    center = array([x_mean,y_mean])
    distances = zeros((len(x_list),1))
    for i,point in enumerate(points):
        distances[i,0]=LA.norm(point-center)
    ind = where(distances==distances.max())
    max_pt = points[ind[0],:][0]
    # find point between mean and max point
    foci1 = (max_pt+center)/2.0
    # find the point opposite the farthest point, reflected through the center
    foci2 = 2*center - max_pt
    return [distances.max(), foci1[0],foci1[1],foci2[0],foci2[1]]


def objective(x):
    '''
    Calculate the objective cost in the optimization problem.
    '''
    foci1 = array([x[1],x[2]])
    foci2 = array([x[3],x[4]])
    a     = x[0]
    n = float(len(point_list))
    _lambda =0.1
    _sigma = sigma
    total = 0
    for point in point_list:
        total += ((LA.norm(point-foci1,2)+LA.norm(point-foci2,2)-2*a)**2)/n
    # penalty term which keeps the foci from wandering off to infinity;
    # ahat_max is set at module level before the solver runs
    total += _lambda*ahat_max*_sigma*exp((a/ahat_max)**4)
    return total

# solve the optimization problem
x0 = initialize()
ahat_max = x0[0]
print x0
p = NLP(objective, x0)
r = p.solve('ralg')
print r.xf

# get the results from the optimization problem
xf = r.xf
# unload the specific values from the result vector
foci1 = array([xf[1],xf[2]])
foci2 = array([xf[3],xf[4]])
a     = xf[0]

# swap the foci, if needed, so foci1 pairs with the nearest reference focus
if LA.norm(foci1-foci1_ref)>LA.norm(foci1-foci2_ref):
    foci1, foci2 = foci2, foci1

####################################################
# plot the fitted ellipse foci
pp.plot([foci1[0]],[foci1[1]],'xk')
pp.plot([foci2[0]],[foci2[1]],'xk')

# plot a line between the fitted ellipse foci and the reference foci
pp.plot([foci1[0],foci1_ref[0]],[foci1[1],foci1_ref[1]],'m-')
pp.plot([foci2[0],foci2_ref[0]],[foci2[1],foci2_ref[1]],'m-')

# plot fitted ellipse
(ellipse_x,ellipse_y) = gen_ellipse_pts(a,foci1,foci2,num_pts=1000)  
pp.plot(ellipse_x,ellipse_y,'r-',lw=3,alpha=0.5)

# scale the axes for a square display
x_max = max(x_list)
x_min = min(x_list)
y_max = max(y_list)
y_min = min(y_list)

box_max = max([x_max,y_max])
box_min = min([x_min,y_min])
pp.axis([box_min, box_max, box_min, box_max])

pp.show()


This work is licensed under a Creative Commons Attribution license.

Wednesday, July 21, 2010

Regression and Curve Fitting in Python – Pt 2

Weighted Curve Fitting.

[Figure: 2-sigma error band]

Introduction

When using least-squares linear regression, typical implementations assume the noise is Gaussian, white, and has the same statistics for every measurement. All of the solutions discussed in part 1 of this tutorial, including the polyfit function, make this assumption. This means the regression will only work correctly if the measurement device always has the same error statistics. Sometimes, when data is collected, the noise statistics vary with each measurement; for example, this can happen when the background noise changes over time. Weighted least squares is a way to fit a curve or find parameters when this occurs.

An Example


Consider a simplified ballistic problem. Ignoring the effect of air resistance, the vertical position of a projectile is governed by its initial position, initial velocity, and the force of gravity. Since air resistance is not considered, the measured altitude of the projectile is a closed-form function of time:

$$y(t) = p_0 + v_0\,t + \tfrac{1}{2}\,a\,t^2 + \epsilon(t)$$

The noise term at the end of the equation is the error introduced by the measurement system. The noise varies as a function of time and is represented by a normal distribution with time-varying variance. For this problem, assume the noise gets worse with each subsequent sample: the error in each measurement has zero mean and a standard deviation equal to 1 plus 4 times the number of seconds since the measurements started:

$$\epsilon(t) \sim N\!\left(0,\,\sigma(t)^2\right), \qquad \sigma(t) = 1 + 4t$$

This might happen because the measurement hardware heats up with each sample, or the projectile moves away from the sensor. To visualize how this distribution of measurement errors looks, the measurements are taken many times. Each experiment is plotted over the previous experiments. The density of the measurements provides a visual feel for how the increase in error spreads out the measurements. This is illustrated here:

[Figure: cloud of repeated noisy measurements]

The challenge in this kind of problem is using the model of the behavior (e.g. the ballistics) and the model of the noise to find an optimal fit. If a suboptimal method is used, the error in the fit is significantly greater than if an optimal method is used. A single comparison on one data set is insufficient to see the benefit of one approach versus another. When a large number of experiments is performed where both the true values and the estimated values are known, the distribution of the estimation errors for the different approaches becomes visible. The following graph shows how weighted least squares improves the distribution of errors compared to least squares with constant weights.

[Figure: distributions of estimation errors]

These graphs were generated from the following script:

from random import normalvariate
from pylab import *

class System(object):
    def __init__(self,p0=0.0,v0=10.0,a=-9.8):
        self.p0 = p0
        self.v0 = v0
        self.a = a

    def position(self,t):
        return self.p0+self.v0*t+0.5*self.a*t**2
        
class Sensor(object):
    def __init__(self,system=System(),errorStd=0.0):
        self.system = system
        # stored under a private name so the value does not shadow
        # the PositionSensor.errorStd method below
        self._errorStd = errorStd
    def error(self,t=0):
        try:
            # assume the error std is a function of time
            std = self._errorStd(t)
            err = normalvariate(0,std)
        except TypeError:
            # fall back to a constant error std
            err = normalvariate(0,self._errorStd)
        return err

class PositionSensor(Sensor):
    def measure(self,t):
        err = Sensor.error(self,t=t)
        return self.system.position(t)+err

    def errorStd(self,t):
        try:
            return self._errorStd(t)
        except TypeError:
            return self._errorStd


def weightedPolyFit(xList,yList,sigmaList,order=1,crossesAtZero=False):
    '''fit the data using a weighted least squares for a polynomial model
        xList is a list of input values
        yList is a list of output values
        sigmaList is a list with the standard deviation for each y value
        order defines the order of the model, 
            order = 1 -> linear model
            order = 2 -> quadratic model
        crossesAtZero specifies whether the polynomial must be equal to zero
            at x = 0
    '''
    fList = [(lambda x,n=n: x**n) for n in range(order,-1,-1)]
    if crossesAtZero: 
        # eliminate the first column so the model crosses zero at x = 0
        del fList[0]
    # build row for each element in y
    bList = []
    A_List = []
    for (thisX,thisY) in zip(xList,yList):
        bList.append(thisY)
        A_Row = [f(thisX) for f in fList]
        A_List.append(A_Row)
    W = diag([1.0 /sigma for sigma in sigmaList])
    b = matrix(bList).T
    A = matrix(A_List)
    b2 = W*b
    A2 = W*A
    w = inv(A2.T*A2)*A2.T*b2
    return w.T.tolist()[0]


if __name__=='__main__':
    import random
    
    errStdFun = lambda t: 1 + 4.0*t
    
    random.seed(0)

    p0True = 0.0
    v0True = 10.0
    aTrue = -9.8   # matches the default acceleration in System

    s = System()
    p = PositionSensor(s,errStdFun)
    tSamples = arange(0,2,0.025)
 
    pErr1List = []
    pErr2List = []
    vErr1List = []
    vErr2List = []
    aErr1List = []
    aErr2List = []
    
    for i in range(0,20000):
        measuredPosition = [p.measure(t) for t in tSamples]
        
        xList = tSamples
        yListMeasured = measuredPosition
        sigmaList = [p.errorStd(t) for t in tSamples]
        w =  weightedPolyFit(xList,yListMeasured,sigmaList,order=2)
        p0Est = w[2]
        v0Est = w[1]
        aEst  = w[0]*2
        pErr1List.append(p0Est-p0True)
        vErr1List.append(v0Est-v0True)
        aErr1List.append(aEst-aTrue)
        
        w2 =  polyfit(xList,yListMeasured,2)
        p0Est2 = w2[2]
        v0Est2 = w2[1]
        aEst2  = w2[0]*2
        pErr2List.append(p0Est2-p0True)
        vErr2List.append(v0Est2-v0True)
        aErr2List.append(aEst2-aTrue)

    figure(1)
    subplot(3,1,1)
    hist(pErr1List,50,normed=True)
    hist(pErr2List,50,normed=True,alpha=0.5)
    grid(True)
    title('Error in initial position estimate')
    
    
    subplot(3,1,2)
    hist(vErr1List,50,normed=True)
    hist(vErr2List,50,normed=True,alpha=0.5)
    grid(True)
    title('Error in initial velocity estimate')

    subplot(3,1,3)
    hist(aErr1List,50,normed=True,label='Weighted Least Sq Fit')
    hist(aErr2List,50,normed=True,label='Least Sq Fit',alpha=0.5)
    legend()
    grid(True)
    xlabel('error')
    title('Error in acceleration estimate')


    show() 

Some Theory: Finding the best fit

Recall from part 1 that a least-squares fit is performed by reducing the equations to a matrix expression,

$$\min_w \;\lVert A w - b \rVert_2^2,$$

then using the KKT conditions to find the weights which minimize the sum of squared errors.

In weighted least squares, each error can have a different relative importance in the minimization problem. Usually this weighting is equal to the inverse of the standard deviation of each error, and the errors are assumed to be uncorrelated. If these conditions are met, the relative weighting is a diagonal matrix W with W_ii = 1/sigma_i. Using the KKT conditions, this minimization problem,

$$\min_w \;\lVert W\,(A w - b) \rVert_2^2,$$

is solved using

$$w^* = \left(A^\top W^\top W A\right)^{-1} A^\top W^\top W\, b$$
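A minimal numpy sketch of that closed form, assuming uncorrelated errors with known standard deviations (the helper name weighted_lstsq is illustrative, not from the script below):

import numpy as np

def weighted_lstsq(A, b, sigmas):
    # scale each row of A and b by 1/sigma_i, then solve the normal equations
    W = np.diag(1.0/np.asarray(sigmas))
    Aw, bw = W.dot(A), W.dot(b)
    return np.linalg.solve(Aw.T.dot(Aw), Aw.T.dot(bw))

# quadratic ballistic model with noise that grows over time
t = np.arange(0, 2, 0.1)
b = 10.0*t - 4.9*t**2 + (1 + 4*t)*np.random.randn(len(t))
print(weighted_lstsq(np.vander(t, 3), b, 1 + 4*t))   # roughly [-4.9, 10, 0]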

The solved example

To solve this problem, a system class and a sensor class are created, and the sensor class is subclassed into a position sensor. The system class provides a model of the true behavior of the system. The sensor class provides a model of the data measured by a sensor which detects the true position of the system in the presence of noise.

The weightedPolyFit function, in the listing,  provides the logic to generate a weighted fit for parameters in a polynomial equation, which describes the position of the projectile.

 

This plot shows the true trajectory of the projectile, the measured positions, and the estimated positions.

[Figure: Position and Measurements]

The full code for the example

from random import normalvariate
from pylab import *

class System(object):
    def __init__(self,p0=0.0,v0=10.0,a=-9.8):
        self.p0 = p0
        self.v0 = v0
        self.a = a

    def position(self,t):
        return self.p0+self.v0*t+0.5*self.a*t**2
        
class Sensor(object):
    def __init__(self,system=System(),errorStd=0.0):
        self.system = system
        # stored under a private name so the value does not shadow
        # the PositionSensor.errorStd method below
        self._errorStd = errorStd
    def error(self,t=0):
        try:
            # assume the error std is a function of time
            std = self._errorStd(t)
            err = normalvariate(0,std)
        except TypeError:
            # fall back to a constant error std
            err = normalvariate(0,self._errorStd)
        return err

class PositionSensor(Sensor):
    def measure(self,t):
        err = Sensor.error(self,t=t)
        return self.system.position(t)+err

    def errorStd(self,t):
        try:
            return self._errorStd(t)
        except TypeError:
            return self._errorStd


def weightedPolyFit(xList,yList,sigmaList,order=1,crossesAtZero=False):
    '''fit the data using a weighted least squares for a polynomial model
        xList is a list of input values
        yList is a list of output values
        sigmaList is a list with the standard deviation for each y value
        order defines the order of the model, 
            order = 1 -> linear model
            order = 2 -> quadratic model
        crossesAtZero specifies whether the polynomial must be equal to zero
            at x = 0
    '''
    fList = [(lambda x,n=n: x**n) for n in range(order,-1,-1)]
    if crossesAtZero: 
        # eliminate the first column so the model crosses zero at x = 0
        del fList[0]
    # build row for each element in y
    bList = []
    A_List = []
    for (thisX,thisY) in zip(xList,yList):
        bList.append(thisY)
        A_Row = [f(thisX) for f in fList]
        A_List.append(A_Row)
    W = diag([1.0 /sigma for sigma in sigmaList])
    b = matrix(bList).T
    A = matrix(A_List)
    b2 = W*b
    A2 = W*A
    w = inv(A2.T*A2)*A2.T*b2
    return w.T.tolist()[0]


if __name__=='__main__':
    import random
    
    errStdFun = lambda t: 1 + 4.0*t
    
    random.seed(0)
    s = System()
    p = PositionSensor(s,errStdFun)
    
    tSamples = arange(0,2,0.025)
    truePosition = [s.position(t) for t in tSamples]
    measuredPosition = [p.measure(t) for t in tSamples]
    
    xList = tSamples
    yListMeasured = measuredPosition
    sigmaList = [p.errorStd(t) for t in tSamples]
   
    w =  weightedPolyFit(xList,yListMeasured,sigmaList,order=2)
    p0Est = w[2]
    v0Est = w[1]
    aEst  = w[0]*2
    print '..p0Est = %f, v0Est=%f, aEst = %f' % (p0Est,v0Est,aEst)
    sEst = System(p0=p0Est,v0=v0Est,a=aEst)
    est1Position = [sEst.position(t) for t in tSamples] 
 
    figure(1)
    plot(tSamples,truePosition)
    plot(tSamples,est1Position,'--k')
    plot(tSamples,measuredPosition,'+r',markeredgewidth=2)
    grid(True)
    ylabel('Position [meters]')
    xlabel('Time [sec]')
    title('Position, Measurements, and Estimated Position')
    legend(['True Position','Estimated Position','Measured Position'])
    savefig('Position and Measurements.png')

    figure(2)
    err2Sigma = [2.0*p.errorStd(t) for t in tSamples]
    errorbar(tSamples,truePosition,err2Sigma)
    #plot(tSamples,truePosition)
    plot(tSamples,est1Position,'--k')
    plot(tSamples,measuredPosition,'+r',markeredgewidth=2)
    grid(True)
    ylabel('Position [meters]')
    xlabel('Time [sec]')
    title('2 sigma error band')
    savefig('ErrorBand.png')

    figure(3)
    for i in range(0,250):
        measuredPosition = [p.measure(t) for t in tSamples]
        plot(tSamples,measuredPosition,'xr',alpha=0.1,markeredgewidth=2)
    plot(tSamples,truePosition,'-b')
    upperLimit = [ 2.0*p.errorStd(t)+s.position(t) for t in tSamples]
    lowerLimit = [-2.0*p.errorStd(t)+s.position(t) for t in tSamples]
    plot(tSamples,upperLimit,'--k')
    plot(tSamples,lowerLimit,'--k')
    grid(True)
    ylabel('Position [meters]')
    xlabel('Time [sec]')
    title('True Position, Measurements, & 95% Bounds on Measurements')

    show()

 


See part 1 here.


All text is copyright © 2010, Ed Tate, All Rights Reserved.

All software and example codes are subject to the MIT License

Copyright (c) 2010, Ed Tate, Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Wednesday, April 7, 2010

How to fit exponential decay – An example in Python

[Figure: exponential decay fit]

Linear least squares can be used to fit an exponential decay. However, the linear least-squares problem that is formed has a structure and behavior that require some careful consideration to fully understand. Usually, fitting is used because the data is noisy. If only as many data points as there are free variables in the system of equations are used, the parameter estimates will generally be poor.

A common problem is estimating the coefficients of a cooling process. For example, a mass is heated to a steady temperature, then left to cool. Ignoring a lot of detail, a model of this behavior can be described by a simple first-order, ordinary differential equation:

$$\frac{dT}{dt} = -h\,\left(T - T_0\right)$$

In this equation, T is the temperature of the object, T0 is the ambient temperature, and h is a coefficient of heat transfer. When T0 is held constant and T(t=0) is not equal to T0, T(t) is described by an exponential decay function.

An exponential decay function is

$$y(t) = A\,e^{-t/\tau} + y_{ss}$$

For a system whose behavior can be defined by exponential decay, the parameters of the decay function can be found using least squares. Since the data usually has measurement errors, measured data from an exponential decay will contain an error term.

$$y_i = A\,e^{-t_i/\tau} + y_{ss} + \epsilon_i$$

Ideally, this equation could be set up directly as a linear least-squares problem. However, minimizing the norm of epsilon requires solution via methods other than linear least squares. To formulate this problem as a linear least-squares minimization, a new error term inside the exponent, delta, is introduced.

$$y_i = A\,e^{-(t_i + \delta_i)/\tau} + y_{ss} + \epsilon_i$$

The usual way to set this problem up is to minimize the norm of epsilon.

$$\min_{A,\tau,\epsilon}\;\lVert \epsilon \rVert_2 \quad \text{subject to} \quad y_i = A\,e^{-t_i/\tau} + y_{ss} + \epsilon_i$$

However, if the problem is set up to minimize the 2-norm of delta, then a linear least-squares minimization can be formed.

$$\min_{A,\tau,\delta}\;\lVert \delta \rVert_2 \quad \text{subject to} \quad y_i = A\,e^{-(t_i + \delta_i)/\tau} + y_{ss} + \epsilon_i$$

To linearize this problem, the terms in the constraints are rearranged, the natural log of each side is taken, and the properties of logarithms are used to isolate terms.

$$\ln\!\left(y_i - y_{ss} - \epsilon_i\right) = \ln A - \frac{t_i + \delta_i}{\tau}$$

The problem statement is simplified by eliminating the epsilon term.

$$\min_{A,\tau,\delta}\;\lVert \delta \rVert_2 \quad \text{subject to} \quad \ln\!\left(y_i - y_{ss}\right) = \ln A - \frac{t_i + \delta_i}{\tau}$$

Furthermore, this constrained optimization problem is restated as an unconstrained optimization problem.

$$\min_{A,\tau}\;\sum_i \left(\tau \ln A - t_i - \tau \ln\!\left(y_i - y_{ss}\right)\right)^2$$

Least squares can be used to solve this problem: with the substitutions w0 = ln A and w1 = -1/tau, and ignoring the overall factor of tau (as the code below does), the residual ln(y_i - y_ss) - w0 - w1*t_i is linear in the unknowns.

The reason for this development is to understand what is really solved by this formulation. When this technique is used to solve for an exponential decay function's parameters, the measurement errors are not minimized; instead, an artificial term which resembles an error in time is minimized.
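For contrast, here is a hedged sketch of a fit that minimizes the measurement error epsilon directly, using scipy's curve_fit nonlinear least-squares routine (the decay helper and the specific numbers are illustrative, not part of the script below):

import numpy as np
from scipy.optimize import curve_fit

def decay(t, amplitude, tau, ySS):
    return amplitude*np.exp(-t/tau) + ySS

np.random.seed(0)
tSamples = np.arange(0.0, 1.0, 0.2)
yMeasured = decay(tSamples, 3.0, 0.3, 3.0) \
            + np.random.normal(0, 0.05, tSamples.shape)

# p0 seeds the iterative solver; the log-linear fit below is one way to get it
popt, pcov = curve_fit(decay, tSamples, yMeasured, p0=[1.0, 0.5, 0.0])
print('amplitude=%f, tau=%f, ySS=%f' % tuple(popt))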

The following Python code shows how to solve this kind of problem with the log-linear formulation.

 

from pylab import *
from math import log

def fitExponent(tList,yList,ySS=0):
   '''
   Fit an exponential decay, y = amplitude*exp(-t/tau) + ySS, to data.
       tList in sec
       yList - measurements
       ySS - the steady state value of y
   returns
       amplitude of the exponent
       tau - the time constant
   '''
   bList = [log(max(y-ySS,1e-6)) for y in yList]
   b = matrix(bList).T
   rows = [ [1,t] for t in tList]
   A = matrix(rows)
   #w = (pinv(A)*b)
   (w,residuals,rank,sing_vals) = lstsq(A,b)
   tau = -1.0/w[1,0]
   amplitude = exp(w[0,0])
   return (amplitude,tau)

if __name__=='__main__':
   import random

   tList = arange(0.0,1.0,0.001)
   tSamples = arange(0.0,1.0,0.2)
   random.seed(0.0)
   tau = 0.3
   amplitude = 3
   ySS = 3
   yList = amplitude*(exp(-tList/tau))+ySS
   ySamples = amplitude*(exp(-tSamples/tau))+ySS
   yMeasured = [y+random.normalvariate(0,0.05) for y in ySamples]
   #print yList
   (amplitudeEst,tauEst) = fitExponent(tSamples,yMeasured,ySS)
   print ('Amplitude estimate = %f, tau estimate = %f'
       % (amplitudeEst,tauEst))
       
   yEst = amplitudeEst*(exp(-tList/tauEst))+ySS

   figure(1)
   plot(tList,yList,'b')
   plot(tSamples,yMeasured,'+r',markersize=12,markeredgewidth=2)
   plot(tList,yEst,'--g')
   xlabel('seconds')
   legend(['True value','Measured values','Estimated value'])
   grid(True)
   show()

 


All text is copyright © 2010, Ed Tate, All Rights Reserved.

All software and example codes are subject to the MIT License

Copyright (c) 2010, Ed Tate, Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Tuesday, April 6, 2010

How to fit a sine wave – An example in Python

[Figure: sine wave fit]

If the frequency of a signal is known, the amplitude, phase, and bias of the signal can be estimated using least-squares regression. The key concept that makes this possible is the fact that a sine wave of arbitrary phase can be represented by the sum of a sine wave and a cosine wave.

$$A\,\sin(\omega t + \phi) = A\cos\phi\,\sin(\omega t) + A\sin\phi\,\cos(\omega t)$$
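A quick numeric check of this identity (a throwaway sketch; the specific numbers are arbitrary):

import numpy as np

x, A, phi = np.linspace(0, 2*np.pi, 9), 3.0, np.radians(65)
lhs = A*np.sin(x + phi)
rhs = A*np.cos(phi)*np.sin(x) + A*np.sin(phi)*np.cos(x)
print(np.allclose(lhs, rhs))   # True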

The regression problem to find the amplitude and phase is an optimization problem. However, it is not easily solved using the amplitude and phase directly, because that formulation is nonconvex; it has multiple minima. By applying trigonometric identities, an equivalent problem, which is convex, is formed.

$$y(t) = a\,\sin(2\pi f t) + b\,\cos(2\pi f t) + c, \qquad A = \sqrt{a^2 + b^2}, \quad \phi = \operatorname{atan2}(b, a)$$

Once the regression problem is in this form, the solution is found by forming a linear least-squares problem. The Python function below illustrates how to do this. Since Python's trigonometric functions work in radians but most people prefer hertz and degrees, the script also performs those conversions.

 

from pylab import *
from math import atan2

def fitSine(tList,yList,freq):
   '''
   Fit a sine wave of known frequency to measurements.
       freq in Hz
       tList in sec
   returns
       phase in degrees, amplitude, and bias
   '''
   b = matrix(yList).T
   rows = [ [sin(freq*2*pi*t), cos(freq*2*pi*t), 1] for t in tList]
   A = matrix(rows)
   (w,residuals,rank,sing_vals) = lstsq(A,b)
   phase = atan2(w[1,0],w[0,0])*180/pi
   amplitude = norm([w[0,0],w[1,0]],2)
   bias = w[2,0]
   return (phase,amplitude,bias)

if __name__=='__main__':
   import random

   tList = arange(0.0,1.0,0.001)
   tSamples = arange(0.0,1.0,0.05)
   random.seed(0.0)
   phase = 65
   amplitude = 3
   bias = -0.3
   frequency = 4
   yList = amplitude*sin(tList*frequency*2*pi+phase*pi/180.0)+bias
   ySamples = amplitude*sin(tSamples*frequency*2*pi+phase*pi/180.0)+bias
   yMeasured = [y+random.normalvariate(0,2) for y in ySamples]
   #print yList
   (phaseEst,amplitudeEst,biasEst) = fitSine(tSamples,yMeasured,frequency)
   print ('Phase estimate = %f, Amplitude estimate = %f, Bias estimate = %f'
       % (phaseEst,amplitudeEst,biasEst))
       
   yEst = amplitudeEst*sin(tList*frequency*2*pi+phaseEst*pi/180.0)+biasEst

   figure(1)
   plot(tList,yList,'b')
   plot(tSamples,yMeasured,'+r',markersize=12,markeredgewidth=2)
   plot(tList,yEst,'-g')
   xlabel('seconds')
   legend(['True value','Measured values','Estimated value'])
   grid(True)
   show()

 


All text is copyright © 2010, Ed Tate, All Rights Reserved.

All software and example codes are subject to the MIT License

Copyright (c) 2010, Ed Tate, Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Sunday, March 28, 2010

Regression & Curve Fitting in Python – pt 1

[Figure: trajectory fit to noisy data]

Background

There are several good tutorials on linear regression and curve fitting using python already available. See here, here, here, and here.  There is a quick note on curve fitting using genetic algorithms here. There is even an interesting foray into Bayesian Logistic Regression here. For simple regression problems involving only polynomials, look at the polyfit function. For other regression problems, the curve_fit function in scipy is available.
This sequence of tutorials will introduce a less common approach to linear regression based on convex optimization. An excellent text on this topic (although very dense reading) is Convex Optimization by Boyd and Vandenberghe.
These tutorials show
  • how to scale a set of functions to best approximate a set of data: curve fitting, regression, approximation, smoothing,  interpolation, and extrapolation; 
  • what are the conditions for that fit to be best;
  • how to use different functions like sine, cos, tan, log, and exp to find an analytic expression that ‘best’ describes arbitrary data; and
  • how to use knowledge about the final function to improve a fit: monotonicity, convexity, extreme values, and limits.

Introduction

Let's start with a picture. This graph shows a trajectory fit to noisy data. Measurements on the trajectory are shown as red crosses and the regressed trajectory is shown as the black line. By the last entry of this tutorial, solving this kind of problem will be easy with a few lines of Python.
[Figure: trajectory fit - measurements (red crosses) and regressed trajectory (black line)]

The Basics – Linear Regression using Polynomials

The usual regression question is how to fit a polynomial to a set of data. There is more to this question than first appears. Fitting data involves answering the question of what is 'best'. Linear regression answers that question by minimizing the sum of squared differences between the fit and the data. This answer is useful in many cases, but not always! There are other answers.
When fitting data to a polynomial, regression minimizes this expression:
$$\min_w \;\sum_i \left(\sum_{n=0}^{N} w_n\, x_i^{\,n} - y_i\right)^2$$
In this expression, x_i and y_i are a data tuple and w_n is the weight applied to each power of x_i. The w_n values are selected to minimize the squared difference between the estimate, which is a function of x, and the measurement y. This expression is used because it is easy to solve (once you know how), and because it describes the maximum-likelihood answer when the polynomial describes the true relation between x and y (e.g. if there were no errors, the equations would perfectly describe what is happening), the measurement errors are uncorrelated (independent and identically distributed - iid), and the errors have a zero-mean Gaussian distribution. Even when this is not the case, this approach is pretty good.
This equation can be solved in many ways with readily available software packages. In the Numpy libraries there is polyfit. Numpy also has lstsq, which solves a least squares fit.  Argonne National Labs has a least squares fit package here that can find the best polynomial (or other families of functions). For this tutorial, things will be solved the hard way before existing libraries are used.
If the values of y are formed into a vector, and a special matrix, known as the Vandermonde matrix, is formed from the values of x, then the result is a linear system of equations.
$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} \approx \begin{bmatrix} x_1^N & \cdots & x_1 & 1 \\ x_2^N & \cdots & x_2 & 1 \\ \vdots & & \vdots & \vdots \\ x_m^N & \cdots & x_m & 1 \end{bmatrix} \begin{bmatrix} w_N \\ \vdots \\ w_1 \\ w_0 \end{bmatrix}$$

This matrix can be formed using the vander function in Numpy.
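For example, a quick check of vander's output (highest power first):

from numpy import vander
print(vander([1.0, 2.0, 3.0], 3))
# [[1. 1. 1.]
#  [4. 2. 1.]
#  [9. 3. 1.]]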
Once a fitting problem is in this form, it can be neatly expressed using matrix notation,

$$b = A w + \epsilon.$$

Using this matrix expression, the least-squares fitting problem is stated in just a few lines.
$$\min_w \;\lVert A w - b \rVert_2^2$$

This can be solved numerically using an optimizer in scipy like fmin_slsqp, but that can be a very computationally expensive way to solve the problem (i.e. it takes a long time). A more computationally efficient (i.e. faster) way to solve this problem is to use the KKT conditions. This not only works, but yields the global optimum, because this problem is convex. This is important, because not all problems can easily be solved for the global optimum; some problems have many local optima, which make it difficult to find the best overall answer. Using the KKT conditions, the optimal values w* are found using simple matrix operations.
$$w^* = \left(A^\top A\right)^{-1} A^\top b$$
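As a hedged sketch (assuming scipy is available; the problem data is made up), both routes give the same answer on a small quadratic fit:

import numpy as np
from scipy.optimize import fmin_slsqp

x = np.arange(-3.0, 3.0, 0.5)
A = np.vander(x, 3)                                 # quadratic model
b = A.dot([0.03, -0.21, 0.4]) + 0.1*np.random.randn(len(x))

obj = lambda w: np.sum((A.dot(w) - b)**2)           # sum of squared errors
w_opt = fmin_slsqp(obj, np.zeros(3), iprint=0)      # iterative optimizer
w_kkt = np.linalg.inv(A.T.dot(A)).dot(A.T).dot(b)   # KKT closed form
print(w_opt, w_kkt)                                 # should agree closely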

A first example, the hard way…

Before fitting a curve to data, it helps to have data. The following Python script will generate data for a quadratic relationship between x and y, with the measurements of y corrupted by Gaussian (normal) noise. This model is expressed as
$$y = a\,x^2 + b\,x + c + \epsilon$$
with the random noise described by
$$\epsilon \sim N(0,\,1)$$
This python script will build a useful data set.
from pylab import *
from random import normalvariate

a = 0.03
b = -0.21
c = 0.4
f = lambda x,a=a,b=b,c=c:a*x**2+b*x+c
xList = arange(-10,10,0.25)
yListTrue = [f(x) for x in xList]
yListMeasured = [y+normalvariate(0,1) for y in yListTrue]
This script takes the lists of points and finds the best quadratic fit for the data.
# fit the data
def polyFit(xList,yList,order=1):
    '''fit the data using a least squares and polynomial'''
    fList = [(lambda x,n=n: x**n) for n in range(order,-1,-1)]
    # build row for each element in y
    bList = []
    A_List = []
    for (thisX,thisY) in zip(xList,yList):
        bList.append(thisY)
        A_Row = [f(thisX) for f in fList]
        A_List.append(A_Row)
    b = matrix(bList).T
    A = matrix(A_List)
    w = inv(A.T*A)*A.T*b
    return w.T.tolist()[0]
    
w = polyFit(xList,yListMeasured,order=2)
aHat = w[0]
bHat = w[1]
cHat = w[2]

# summarize the results
print 'Data model is   :%4.2f*x^2 + %4.2f*x + %4.2f' % (a,b,c)
print 'Fit equation is :%4.2f*x^2 + %4.2f*x + %4.2f' % (aHat,bHat,cHat)

# plot, for visual comparison
def getPolyF(w):
    '''create a function using the fit values'''
    return lambda x,w=w: sum([thisW*(x**(len(w)-n-1)) 
                            for n,thisW in enumerate(w)])

fHat = getPolyF(w)
xPlotList = arange(-10,10,0.1)
yEstList = [fHat(x) for x in xPlotList]
fTrue = getPolyF([a,b,c])
yTrueList = [fTrue(x) for x in xPlotList]

figure(1)
plot(xPlotList,yEstList,'--g',linewidth=2)
plot(xPlotList,yTrueList,'b',linewidth=2)
plot(xList,yListMeasured,'+r',markersize = 12,markeredgewidth=2)
xlabel('x')
ylabel('y')
legend(['estimate','true','measured'])
grid(True)
savefig('secondOrderFitEx.png')
show()
When this script is run, the console output is
Data model is   :0.03*x^2 + -0.21*x + 0.40 
Fit equation is :0.03*x^2 + -0.26*x + 0.46
Different answers will occur on each run because random numbers are used.
[Figure: secondOrderFitEx.png - estimated, true, and measured values]

Another example, simpler this time…

In the first example, a lot of the code was built by hand. To make the task easier, the libraries in Numpy & Scipy are used.
The first change is to incorporate the vander and pseudo-inverse (pinv) functions into the polyFit function. The vander function builds the Vandermonde matrix, and the pinv function performs the same operation as inv(A.T*A)*A.T.
def polyFit(xList,yList,order=1):
    '''fit the data using a least squares and polynomial'''
    A = vander(xList,order+1)
    b = matrix(yList).T
    w = pinv(A)*b
    return w.T.tolist()[0]
Alternatively, the polyFit function could be created using the lstsq function. This function is nice because it provides additional information that is useful for checking the quality of a fit.
def polyFit(xList,yList,order=1):
    '''fit the data using a least squares and polynomial'''
    A = vander(xList,order+1)
    b = matrix(yList).T
    (w,residuals,rank,sing_vals) = lstsq(A,b)
    return w.T.tolist()[0]
Finally, the polyFit function could be eliminated entirely, and replaced with the polyfit function.
The second change is to replace the getPolyF function with the poly1d function in Numpy. This gets rid of a few lines of code.
fHat = poly1d((aHat,bHat,cHat))
xPlotList = arange(-10,10,0.1)
yEstList = [fHat(x) for x in xPlotList]
fTrue = poly1d((a,b,c))
yTrueList = [fTrue(x) for x in xPlotList]
Combining all of these changes, the example script becomes:
from pylab import *
from random import normalvariate

# generate the data
a = 0.03
b = -0.21
c = 0.4
f = lambda x,a=a,b=b,c=c:a*x**2+b*x+c
xList = arange(-10,10,0.5)
yListTrue = [f(x) for x in xList]
yListMeasured = [y+normalvariate(0,1) for y in yListTrue]

# fit the data
w = polyfit(xList,yListMeasured,2)
aHat = w[0]
bHat = w[1]
cHat = w[2]

# summarize the results
print 'Data model is   :%4.2f*x^2 + %4.2f*x + %4.2f' % (a,b,c)
print 'Fit equation is :%4.2f*x^2 + %4.2f*x + %4.2f' % (aHat,bHat,cHat)

# plot, for visual comparison
fHat = poly1d((aHat,bHat,cHat))
xPlotList = arange(-10,10,0.1)
yEstList = [fHat(x) for x in xPlotList]
fTrue = poly1d((a,b,c))
yTrueList = [fTrue(x) for x in xPlotList]

figure(1)
plot(xPlotList,yEstList,'--g',linewidth=2)
plot(xPlotList,yTrueList,'b',linewidth=2)
plot(xList,yListMeasured,'+r',markersize = 12,markeredgewidth=2)
xlabel('x')
ylabel('y')
legend(['estimate','true','measured'])
grid(True)
savefig('secondOrderFitEx.png')
show()
The real work of fitting the polynomial is now done by one line of code, and the reconstruction of the curve by another.
The reason for performing the fits using custom code is so that, later, more interesting fits can be found.

See part 2 for a tutorial on weighted fitting & regression.


All text is copyright © 2010, Ed Tate, All Rights Reserved.
All software and example codes are subject to the MIT License
Copyright (c) 2010, Ed Tate, Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.