
Another one for the mathematicians...

Posted: Mon Sep 13, 2010 7:19 am
by Seymour Clufley
I’m pretty bad at maths. I need to know a general “thing”. The answer may be pathetically obvious but please be gentle!

Is it possible to come up with an equation for the trend formed by a number of non-repeating, possibly random points?

For example, could there be an equation to describe a trend like this:

Image

In my mathematical ignorance, I doubt there is such an equation.

The goal is this: from a number of known levels, derive a generic "tendency". I want to be able to play around with different lifestyle factors, to see how they influence a person's (likely) behaviour - specifically, how likely they are to own a car.

Obviously having the money available will make it more likely that they'll get a car. And age will be involved, etc.

Now I can trend these two factors separately and get fairly random looking trends.

Image
Image

What I want to be able to do is say: "let's set everyone's income to $20k and see how that affects their car-buying".
Or even: "divide salary influence by age influence and plot the resulting likely car ownership"

Does anyone have any recommendations on how to approach this?

I don't think it can be done by simply operating on the points along both trends. The trends represent completely different things. For example, 40% of the way along the trends is 44 (for the age graph), and $96k (for the salary graph). If you were to get the median of those two levels, you’d be using a salary of $96k for everyone of age 44, no matter what their real salary was.

Therefore, I need to turn each trend into a generic “tendency”, a “gear”... and it would seem an equation would be the way to do that. Get the equation for one trend, and you can process the other trend using the equation.

Or maybe that way is just not possible... so, is this pie-in-the-sky or does anyone know a way to do it?

Thanks for reading,
Seymour.

Re: Another one for the mathematicians...

Posted: Mon Sep 13, 2010 8:58 am
by blueznl
Whip up the cream... pie time I fear.

Re: Another one for the mathematicians...

Posted: Mon Sep 13, 2010 3:39 pm
by srod
A Lagrange polynomial would allow you to shove a polynomial (quadratic, cubic, quartic etc.) through a given set of points, or a set of cubic splines would probably do as well.

It just depends on what you are trying to do here, because it sounds a bit 'muddled' to me.

Other than that... as blueznl says, it's pie time! :)
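
As a rough illustration of the Lagrange idea mentioned above, here is a minimal PureBasic sketch. The procedure name LagrangeAt and the three sample points are made up for the example, and it assumes all the x values are distinct (otherwise it divides by zero).

```purebasic
; Hypothetical helper, not from any library: evaluates the Lagrange polynomial
; through the points (xs(i), ys(i)), i = 0..ArraySize(xs()), at position x.
; Assumes every xs(i) is distinct.
Procedure.d LagrangeAt(Array xs.d(1), Array ys.d(1), x.d)
  Protected i.i, j.i, n.i
  Protected term.d, result.d

  n = ArraySize(xs())
  result = 0.0

  For i = 0 To n
    ; Basis polynomial i is 1 at xs(i) and 0 at every other xs(j)
    term = ys(i)
    For j = 0 To n
      If j <> i
        term = term * (x - xs(j)) / (xs(i) - xs(j))
      EndIf
    Next
    result = result + term
  Next

  ProcedureReturn result
EndProcedure

; Three sample points taken from y = x*x
Dim xs.d(2)
Dim ys.d(2)
xs(0) = 1 : ys(0) = 1
xs(1) = 2 : ys(1) = 4
xs(2) = 3 : ys(2) = 9

Debug LagrangeAt(xs(), ys(), 2.5) ; expect 6.25, since the points lie on y = x*x
```

Bear in mind that the polynomial passes exactly through every point you give it, so with noisy data it reproduces the noise rather than describing an overall trend.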

Re: Another one for the mathematicians...

Posted: Mon Sep 13, 2010 3:49 pm
by gnasen
I'm not sure what you mean, but if it is what srod said, then you should read this thread: http://www.purebasic.fr/english/viewtop ... 16&t=43376

And where is the pie? Now I'm hungry.

Re: Another one for the mathematicians...

Posted: Mon Sep 13, 2010 7:18 pm
by Rook Zimbabwe
Ahhhh SADISTICS!!! I took many classes in SADISTICS when I was in Grad School for my EdD.

You are only counting percentages of people that report car ownership... Both by ALLEGED salary and Age...

(Keep in mind your salary table will be skewed more heavily with fibbing most likely!)

Since you are just counting the number or percentage of people that own a car in each age range, I am unsure what the issue is... Are you planning to group by age/salary? Even then I see no real issue. Develop it as a bar chart, but only draw a little square at the TOP of the value for each column...

Image

You can use smoothing if you want by checking the end/start of the previous/next column... link them... :mrgreen:

Re: Another one for the mathematicians...

Posted: Mon Sep 13, 2010 8:02 pm
by Little John
Rook Zimbabwe wrote:Ahhhh SADISTICS!!! I took many classes in SADISTICS when I was in Grad School for my EdD.
:lol: :lol:

Re: Another one for the mathematicians...

Posted: Mon Sep 13, 2010 9:20 pm
by idle
If you're wanting a moving trend, that's a bit beyond my brain's sanity threshold, though if you have two parameters, like number of coders and lines of code written, then you can probably use a linear trend.

I don't know if this is right or not; I've only had one coffee.

Code: Select all

Structure myData
  x.f
  y.f 
EndStructure   

Structure resData
  yest.f
  resi.f
EndStructure   

Structure TrendData
  yint.f
  slope.f
EndStructure 

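Procedure leastSquare(Array mInput.myData(1), Array mOutput.resData(1), *dat.TrendData)
  ; Fits y = slope*x + yintercept to elements 1..n of mInput() (element 0 is unused).
  ; Doubles are used for the running sums to limit rounding error on large values.
  Protected a.i, n.i, sumX.d, sumY.d, sumXY.d, sumXX.d
  Protected slope.d, yintercept.d, yestimate.d, resi.d, sumRes.d

  n = ArraySize(mInput())

  ReDim mOutput.resData(n)

  For a = 1 To n
    sumX + mInput(a)\x
    sumY + mInput(a)\y
    sumXY + mInput(a)\x * mInput(a)\y
    sumXX + mInput(a)\x * mInput(a)\x
  Next

  ; Numerator and denominator are both negated relative to the textbook form,
  ; so the quotient is the same.
  slope = ((sumX*sumY) - (n*sumXY)) / ((sumX*sumX) - (n*sumXX))
  yintercept = (sumY - slope*sumX) / n

  For a = 1 To n
    yestimate = slope * mInput(a)\x + yintercept ; value predicted by the fitted line
    resi = mInput(a)\y - yestimate               ; residual: actual minus predicted
    sumRes + resi*resi                           ; sum of squared residuals (not returned)
    mOutput(a)\yest = yestimate
    mOutput(a)\resi = resi
  Next

  *dat\slope = slope
  *dat\yint = yintercept

EndProcedure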

width = 800 
height = 600 
points = 50

Dim MyInput.myData(points) 
Dim MyOutput.resData(points)
myTrend.TrendData 

my.f = (1-height) / (1-(points*points))
mx.f = (1-width) / (1-(points*points)) 

OpenWindow(0,0,0,width,height,"test") 
StartDrawing(WindowOutput(0))
For a = 1 To points 
  MyInput(a)\x = a*a ;number of coders 
  MyInput(a)\y = a*a + (Random(a*a) * (Random(1)-1)) ;lines of code written  
  x = myinput(a)\x*mx 
  y = height-(myinput(a)\y*my)
  If y > 0 And y < height And lpx
    LineXY(lpx,lpy,x,y,RGB(255,0,0))
    Circle(x,y,3,RGB(255,0,0))
  EndIf
  lpx= x 
  lpy=y
Next 
 
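leastSquare(MyInput(),MyOutput(),@myTrend) ; pass the address of the result structure

; StrF() keeps the fractional part; Str() would round the floats to whole numbers
Debug "Y intercept " + StrF(myTrend\yint)
Debug "Slope " + StrF(myTrend\slope)
lpx=0
For a = 1 To points 
  row.s = StrF(MyInput(a)\y) + " " + StrF(MyOutput(a)\yest) + " " + StrF(MyOutput(a)\resi)
  Debug row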
  y = MyOutput(a)\yest*my
  If y > 0 And y < height And lpx 
    LineXY(lpx,lpy,myinput(a)\x*mx,height-y,RGB(0,255,0))
    Circle(myinput(a)\x*mx,height-y,3,RGB(0,255,0))
  EndIf
  lpx = myinput(a)\x*mx
  lpy = height-y
Next 

StopDrawing() 


Repeat
  ev = WaitWindowEvent()
Until ev = #PB_Event_CloseWindow ; portable PureBasic event constant instead of the Windows-only #WM_CLOSE



Re: Another one for the mathematicians...

Posted: Thu Sep 16, 2010 7:05 am
by Seymour Clufley
Thanks for this, Idle.

I've only now got a chance to try it out. I'm curious about how it works... I'm not even sure what has been achieved at the end of it.
  • The "regressed" line doesn't go along the most frequent trend. It's between the most frequent trend and the divergent points.
  • Using the slope to draw a new line, it matches the most frequent trends - except if you resize the graph in either dimension, then it seems to mess up.
By the way, shouldn't you put the code in Tips&Tricks?

Re: Another one for the mathematicians...

Posted: Thu Sep 16, 2010 7:48 am
by idle
Least squares is what is used to find a linear trend: the line that fits through the center of a scatter plot, a bit like the center of mass. It's not of much use if your data is exponential or if you want a moving trend, though.
I'm not sure if the code is right. If someone can verify that it's right, I'll post it in Tips & Tricks.
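
For anyone wanting to check the procedure against the textbook, the usual closed-form least-squares fit of a line to n points (x_i, y_i) can be written as below. The symbols m (slope), b (y intercept), \hat{y}_i (estimate) and r_i (residual) are notation chosen here, not names from the code. The first form of m is how the posted code arranges it; the second is the more common textbook arrangement, with numerator and denominator both negated.

```latex
\[
m = \frac{\sum x_i \sum y_i - n \sum x_i y_i}{\left(\sum x_i\right)^2 - n \sum x_i^2}
  = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2},
\qquad
b = \frac{\sum y_i - m \sum x_i}{n}
\]
\[
\hat{y}_i = m x_i + b, \qquad r_i = y_i - \hat{y}_i
\]
```

These correspond to slope, yintercept, yest and resi in the procedure, so as far as the formulas go the code appears to compute an ordinary least-squares line.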