Is there any code available in the public domain for PB statistics functions? I haven't seen anything on the code archive site. e.g. mean, mode, variance, standard deviation, binomial distribution etc.
Thanks.
Statistics functions?
Re: Statistics functions?
I havent seen something like this here until now. Otherwise it shouldnt be that hard to implement it, as long as its only discrete stochastic.
pb 5.11
Re: Statistics functions?
For float-lists I have something like this:
Code: Select all
Enumeration
#Average_Arithmetic
#Average_Geometric
#Average_Harmonic
#Average_Median
EndEnumeration
Procedure.f AverageList(List Float.f(), Mode.i=#Average_Arithmetic)
Protected Average.f
Select Mode
Case #Average_Arithmetic
Average = 0.0
ForEach Float()
Average + Float()
Next
Average = Average / ListSize(Float())
Case #Average_Geometric
Average = 1.0
ForEach Float()
Average * Float()
Next
Average = Pow(Average, 1/ListSize(Float()))
Case #Average_Harmonic
Average = 0.0
ForEach Float()
Average + 1/Float()
Next
Average = ListSize(Float()) / Average
Case #Average_Median
If ListSize(Float()) % 2
SelectElement(Float(), (ListSize(Float())-1)/2)
Average = Float()
Else
SelectElement(Float(), ListSize(Float())/2-1)
Average = Float()/2
NextElement(Float())
Average + Float()/2
EndIf
EndSelect
ProcedureReturn Average
EndProcedure
NewList Test.f()
AddElement(Test()) : Test() = 1
AddElement(Test()) : Test() = 2
AddElement(Test()) : Test() = 3
AddElement(Test()) : Test() = 4
AddElement(Test()) : Test() = 7
Debug "Arithmetic: "+StrF(AverageList(Test(), #Average_Arithmetic))
Debug "Geometric: "+StrF(AverageList(Test(), #Average_Geometric))
Debug "Harmonic: "+StrF(AverageList(Test(), #Average_Harmonic))
Debug "Median: "+StrF(AverageList(Test(), #Average_Median))
PB 6.01 ― Win 10, 21H2 ― Ryzen 9 3900X, 32 GB ― NVIDIA GeForce RTX 3080 ― Vivaldi 6.0 ― www.unionbytes.de
Lizard - Script language for symbolic calculations and more ― Typeface - Sprite-based font include/module
Lizard - Script language for symbolic calculations and more ― Typeface - Sprite-based font include/module
-
Little John
- Addict

- Posts: 4869
- Joined: Thu Jun 07, 2007 3:25 pm
- Location: Berlin, Germany
Re: Statistics functions?
Code: Select all
Procedure MeanAndVariance (Array x.d(1), *mean.Double, *variance.Double, last=-1)
; -- Fast calculation of Arithmetic Mean and Estimated Population Variance
; (only for *quantitative* data)
; in : x() : array of measured data
; last: last element that is taken into account
; out: *mean, *variance
Protected i.i, n.i
Protected.d sum=0, sum_q=0
If last = -1
last = ArraySize(x())
EndIf
For i = 0 To last
sum + x(i)
sum_q + x(i)*x(i)
Next
n = last + 1
*mean\d = sum / n
*variance\d = (sum_q - *mean\d*sum) / (n-1)
EndProcedure
StandardDeviation = Sqr(Variance)Regards, Little John
Re: Statistics functions?
Thanks Little John and STARGATE. It shouldn't involve too much work to write the procedures I need, but maybe I'll put in a request to have some stats routines added to PB for future versions. 
Re: Statistics functions?
Some old code I use for stats of 1 dimensional arrays of Doubles and ignoring nulls if necessary.
Code: Select all
EnableExplicit
;-{ CONSTANTS
#small1 = 1e-16 ; small '1' to add to double calcs to round up or prevent overflows.
#nNull = -999.00000000000000000
#nNullSTOP = -999999.00000000000000
;- Array Math Operation Units
#arm_unitsNA = 0
#arm_unitslinear = 1
#arm_unitsLog = 2
#arm_unitsLogPwr = 3
;- Array Stats
#ars_nPts = 0 ; nPts of array
#ars_min = 1
#ars_max = 2
#ars_range = 3
#ars_sum = 4
#ars_avg = 5
#ars_stddev = 6
#ars_SkipNull = 7 ; nPts - nulls
#ars_pcstddev = 8 ; %Stddev = Sigma/Avg*100
#ars_mnsigma = 9 ; minus n Sigma
#ars_pnsigma = 10 ; plus n Sigma
#ars_Skew = 11 ; Skew of data set
#ars_Kurt = 12 ; Kurtosis of data set
#ars_Xmonotonic = 13
#ars_Ymonotonic = 14
#ars_minp = 15 ; Pointer to min
#ars_maxp = 16 ; Pointer to max
#ars_pcrange = 17 ; %range = range/Avg*100
#ars_nStats = 18 ; Current number of Stats Fields
;-}
Macro UC_mW_dB(X_mW)
; Convert mW To dB
Log10(X_mW) * 20.0
EndMacro
Macro UC_mW_dBm(X_mW)
; Convert mW To dBm
Log10(X_mW) * 10.0
EndMacro
Macro UC_dB_LinMag(Y)
Pow(10.0, ((y) / 20.0))
EndMacro
Macro UC_dBm_LinMag(Y)
Pow(10.0, ((y) / 10.0))
EndMacro
Procedure.i arDelD(Array X.d(1), N.d=#nNull)
; Delete any N Values from the array, then redim the array
Protected.i nPts
nPts = ArraySize(x()) + 1
If nPts > 1
Protected.i r, k
Protected Dim Xt.d(nPts - 1)
k = 0
For r = 0 To nPts - 1
If X(r) <> N
Xt(k) = X(r)
k = k + 1
EndIf
Next r
; Assign the reduced array to the calling array
If k > 0
ReDim Xt(k-1)
; copy array contents to X()
CopyArray(Xt(), X()) ; Dim X() not req'd
Else ; no non-null pts found
k = 1
Dim X.d(0)
X(0) = N
EndIf
ProcedureReturn k
Else
ProcedureReturn 0
EndIf
EndProcedure
Procedure arFillD(Array x.d(1), nPts.i, N.d=#nNull)
Protected.i r
Dim x.d(nPts - 1)
For r = 0 To nPts - 1
x(r) = N
Next r
EndProcedure
Procedure.i CT_LoadDataD(DataLabel.i, Array Y.D(1))
; REV: 110507, skywalk
; Code tool to dynamically load sample data from a DataSection.
; Change data type suffixes as required. ie; CT_LoadDataS
Protected.i k
ReDim Y.d(10)
!MOV eax, [p.v_DataLabel]
!MOV [PB_DataPointer], eax
Repeat
Read.d Y(k)
If Y(k) <> #nNullSTOP ; End of List
k + 1
Else
Break
EndIf
If (k % 10) = 0
ReDim Y(k+10)
EndIf
ForEver
ReDim Y(k-1)
ProcedureReturn k
EndProcedure
Procedure.i ML_arrStats(Array x.d(1), Array arS.d(1), SkipNull.i=0, nSigma.d=3.0, isLog.i=0)
; REV: 040927, skywalk
; Changed sub to function returning errcode if any, otherwise = 0
; REV: 110110, skywalk
; Changed return value to skipnull, since PB does not return arrays()
; arS(1) is passed ByRef, so edited contents will be seen globally.
; #ars_nPts = 0 ; nPts of array
; #ars_min = 1
; #ars_max = 2
; #ars_range = 3
; #ars_sum = 4
; #ars_avg = 5
; #ars_stddev = 6
; #ars_SkipNull = 7 ; nPts - nulls
; #ars_pcstddev = 8 ; %Stddev = Sigma/Avg*100
; #ars_m3sigma = 9 ; minus 3 Sigma
; #ars_p3sigma = 10 ; plus 3 Sigma
; #ars_Skew = 11 ; Skew of data set
; #ars_Kurt = 12 ; Kurtosis of data set
; #ars_Xmonotonic = 13
; #ars_Ymonotonic = 14
; #ars_minp = 15 ; Pointer to min
; #ars_maxp = 16 ; Pointer to max
; #ars_pcrange = 17 ; %range = range/Avg*100
Protected.d SumSq, d
Dim ars.d(#ars_nStats-1)
arFillD(ars(), #ars_nStats) ; Pre-fill with nulls <> 0
Protected.i i, nPts = ArraySize(X()) + 1
ars(#ars_nPts) = nPts
If SkipNull ; delete all nulls 1st and then proceed w/non-null array
Protected Dim Xn.d(nPts-1)
CopyArray(x(), Xn())
nPts = arDelD(Xn()) ; Del all nulls and recalc npts
skipnull = nPts ; Could = 0 if there were no actual nulls
ars(#ars_SkipNull) = skipnull
EndIf
If isLog > #arm_unitslinear ; Use Xn(), since log data will always ignore nulls.
Dim xc.d(nPts - 1) ; Use the Xc() convert array for log data
If isLog = #arm_unitsLogPwr ; Convert to linear before computing stats.
For i = 0 To nPts - 1
xc(i) = UC_dBm_LinMag(Xn(i))
Next i
ElseIf isLog = #arm_unitsLog ; Convert to linear before computing stats.
For i = 0 To nPts - 1
xc(i) = UC_dB_LinMag(Xn(i))
Next i
EndIf
EndIf
If skipnull
ars(#ars_min) = Xn(0)
Else
ars(#ars_min) = x(0)
EndIf
ars(#ars_minp) = 0
ars(#ars_max) = ars(#ars_min)
ars(#ars_sum) = 0
SumSq = 0
For i = 0 To nPts - 1
If skipnull
d = Xn(i)
Else
d = x(i)
EndIf
If d < ars(#ars_min)
ars(#ars_min) = d
ars(#ars_minp) = i
EndIf
If d > ars(#ars_max)
ars(#ars_max) = d
ars(#ars_maxp) = i
EndIf
If isLog > #arm_unitslinear
ars(#ars_sum) + xc(i)
SumSq + xc(i) * xc(i)
Else
ars(#ars_sum) + d
SumSq + d * d
EndIf
Next i
ars(#ars_range) = ars(#ars_max) - ars(#ars_min)
ars(#ars_avg) = ars(#ars_sum) / nPts
If nPts < 2 ; There are nPts samples. Samples >=2 for StdDev.
ars(#ars_stddev) = 0
Else
ars(#ars_stddev) = (SumSq - nPts * ars(#ars_avg) * ars(#ars_avg)) / (nPts - 1)
If ars(#ars_stddev) <= 0
ars(#ars_stddev) = #small1
Else
ars(#ars_stddev) = Sqr((SumSq - nPts * ars(#ars_avg) * ars(#ars_avg)) / (nPts - 1))
EndIf
EndIf
; avg, stddev are in linear at this point
ars(#ars_pcstddev) = ars(#ars_stddev) / (ars(#ars_avg) + #small1) * 100.0
ars(#ars_mnsigma) = ars(#ars_avg) - nSigma * ars(#ars_stddev)
ars(#ars_pnsigma) = ars(#ars_avg) + nSigma * ars(#ars_stddev)
ars(#ars_pcrange) = ars(#ars_range) / (ars(#ars_avg) + #small1) * 100.0
If isLog = #arm_unitsLogPwr ; Convert back if necessary
ars(#ars_pcstddev) = UC_mW_dBm(ars(#ars_pcstddev))
ars(#ars_mnsigma) = UC_mW_dBm(ars(#ars_mnsigma))
ars(#ars_pnsigma) = UC_mW_dBm(ars(#ars_pnsigma))
ars(#ars_pcrange) = UC_dBm_LinMag(ars(#ars_range)) / (ars(#ars_avg) + #small1) * 100
ars(#ars_pcrange) = UC_mW_dBm(ars(#ars_pcrange))
ars(#ars_avg) = UC_mW_dBm(ars(#ars_avg))
ars(#ars_stddev) = UC_mW_dBm(ars(#ars_stddev))
ElseIf isLog = #arm_unitsLog ; Convert back if necessary
ars(#ars_pcstddev) = UC_mW_dB(ars(#ars_pcstddev))
ars(#ars_mnsigma) = UC_mW_dB(ars(#ars_mnsigma))
ars(#ars_pnsigma) = UC_mW_dB(ars(#ars_pnsigma))
ars(#ars_pcrange) = UC_dB_LinMag(ars(#ars_range)) / (ars(#ars_avg) + #small1) * 100
ars(#ars_pcrange) = UC_mW_dB(ars(#ars_pcrange))
ars(#ars_avg) = UC_mW_dB(ars(#ars_avg))
ars(#ars_stddev) = UC_mW_dB(ars(#ars_stddev))
EndIf
; TBD...
ars(#ars_Skew) = #nNull
ars(#ars_Kurt) = #nNull
ars(#ars_Xmonotonic) = 0
ars(#ars_Ymonotonic) = 0
ProcedureReturn skipnull
EndProcedure
;-{ TEST
Dim X.d(0)
Dim Stats.d(0)
CT_LoadDataD(?Samples_D, X())
ML_arrStats(X(),Stats(),0)
Debug "; -- Stats[X()] --"
Debug "; nPts = " + Str(Stats(#ars_nPts))
Debug "; noNulls = " + Str(Stats(#ars_SkipNull))
Debug "; Min = " + StrD(Stats(#ars_min),2)
Debug "; Max = " + StrD(Stats(#ars_max),2)
Debug "; Minp = " + Str(Stats(#ars_minp))
Debug "; Maxp = " + Str(Stats(#ars_maxp))
Debug "; Avg = " + StrD(Stats(#ars_avg),2)
Debug "; Rng = " + StrD(Stats(#ars_range),2)
Debug "; %Rng = " + StrD(Stats(#ars_pcrange),2)
Debug "; Sum = " + StrD(Stats(#ars_sum),2)
Debug "; Std = " + StrD(Stats(#ars_stddev),2)
Debug "; -3Std = " + StrD(Stats(#ars_mnsigma),2)
Debug "; +3Std = " + StrD(Stats(#ars_pnsigma),2)
Debug "; %Std = " + StrD(Stats(#ars_pcstddev),2)
ML_arrStats(X(),Stats(),1)
Debug "; -- Stats[X()], Nulls ignored --"
Debug "; nPts = " + Str(Stats(#ars_nPts))
Debug "; noNulls = " + Str(Stats(#ars_SkipNull))
Debug "; Min = " + StrD(Stats(#ars_min),2)
Debug "; Max = " + StrD(Stats(#ars_max),2)
Debug "; Minp = " + Str(Stats(#ars_minp))
Debug "; Maxp = " + Str(Stats(#ars_maxp))
Debug "; Avg = " + StrD(Stats(#ars_avg),2)
Debug "; Rng = " + StrD(Stats(#ars_range),2)
Debug "; %Rng = " + StrD(Stats(#ars_pcrange),2)
Debug "; Sum = " + StrD(Stats(#ars_sum),2)
Debug "; Std = " + StrD(Stats(#ars_stddev),2)
Debug "; -3Std = " + StrD(Stats(#ars_mnsigma),2)
Debug "; +3Std = " + StrD(Stats(#ars_pnsigma),2)
Debug "; %Std = " + StrD(Stats(#ars_pcstddev),2)
DataSection
Samples_D:
; Append examples here...
; Doubles
Data.d 1 ; 0
Data.d 1 ; 1
Data.d 1 ; 2
Data.d 1 ; 3
Data.d 10 ; 4
Data.d -999 ; 5 <-- Ignored if SkipNull=1
Data.d 10 ; 6
Data.d 1 ; 7
Data.d -999 ; 8 <-- Ignored if SkipNull=1
Data.d 1 ; 9
Data.d #nNullSTOP
EndDataSection
;-}
; -- Stats[X()] --
; nPts = 10
; noNulls = -999
; Min = -999.00
; Max = 10.00
; Minp = 5
; Maxp = 4
; Avg = -197.20
; Rng = 1009.00
; %Rng = -511.66
; Sum = -1972.00
; Std = 422.60
; -3Std = -1465.01
; +3Std = 1070.61
; %Std = -214.30
; -- Stats[X()], Nulls ignored --
; nPts = 10
; noNulls = 8
; Min = 1.00
; Max = 10.00
; Minp = 0
; Maxp = 4
; Avg = 3.25
; Rng = 9.00
; %Rng = 276.92
; Sum = 26.00
; Std = 4.17
; -3Std = -9.25
; +3Std = 15.75
; %Std = 128.19
Last edited by skywalk on Thu Jul 12, 2012 6:23 am, edited 1 time in total.
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
-
DarkDragon
- Addict

- Posts: 2348
- Joined: Mon Jun 02, 2003 9:16 am
- Location: Germany
- Contact:
Re: Statistics functions?
I made a principal component analysis which contains a few statistical functions. Just the row echelon form has one bug, maybe I'll fix it when I have enough time: http://www.purebasic.fr/english/viewtop ... 67#p329267
bye,
Daniel
Daniel
Re: Statistics functions?
Probably of no use but just for fun some x86 ASM
( It might make a difference if you have a lot of data to process )
To make it work on x64 each edx should be replaced with rdx .
It's based on this explanation http://www.mathsisfun.com/data/standard-deviation.html
( It might make a difference if you have a lot of data to process )
To make it work on x64 each edx should be replaced with rdx .
It's based on this explanation http://www.mathsisfun.com/data/standard-deviation.html
Code: Select all
Structure dArrayInfo
Sum.d
Min.d
Max.d
Avg.d
PopulationVariance.d
PopulationStdDev.d
SampleVariance.d
SampleStdDev.d
EndStructure
Procedure DoubleArrayInfo(Array Values.d(1), *ArrayInfo.dArrayInfo)
Protected.i ArrAddr = @Values(0)
Protected.i ArrLen = ArraySize(Values()) + 1
!mov ecx, [p.v_ArrLen]
!mov edx, [p.v_ArrAddr]
!pxor xmm0, xmm0
!cvtsi2sd xmm2, ecx ; xmm2 = Array length as double
!mov eax, 0x7f800000
!movd xmm3, eax ; xmm3 = +Infinity (used for Min)
!cvtss2sd xmm3, xmm3
!pxor xmm4, xmm4
!subsd xmm4, xmm3 ; xmm4 = -Infinity (used for Max)
!dArrayInfoLoop1:
!movsd xmm1, [edx]
!minsd xmm3, xmm1
!maxsd xmm4, xmm1
!addsd xmm0, xmm1
!add edx, 8
!dec ecx
!jnz dArrayInfoLoop1
!mov edx, [p.p_ArrayInfo]
!movsd [edx], xmm0
!movsd [edx + 8], xmm3
!movsd [edx + 16], xmm4
!divsd xmm0, xmm2
!movsd [edx + 24], xmm0
!mov ecx, [p.v_ArrLen]
!mov edx, [p.v_ArrAddr]
!movsd xmm3, xmm0 ; xmm3 = Avg
!lea eax, [ecx - 1]
!cvtsi2sd xmm4, eax ; xmm4 = Array length - 1
!pxor xmm0, xmm0
!dArrayInfoLoop2:
!movsd xmm1, [edx]
!subsd xmm1, xmm3
!mulsd xmm1, xmm1
!addsd xmm0, xmm1
!add edx, 8
!dec ecx
!jnz dArrayInfoLoop2
!mov edx, [p.p_ArrayInfo]
!movsd xmm1, xmm0
!divsd xmm0, xmm2
!movsd [edx + 32], xmm0
!sqrtsd xmm0, xmm0
!movsd [edx + 40], xmm0
!divsd xmm1, xmm4
!movsd [edx + 48], xmm1
!sqrtsd xmm1, xmm1
!movsd [edx + 56], xmm1
EndProcedure
Dim A.d(4)
A(0) = 600
A(1) = 470
A(2) = 170
A(3) = 430
A(4) = 300
DoubleArrayInfo(A(), ArrayInfo.dArrayInfo)
Info.s = "Sum = " + StrD(ArrayInfo\Sum) + #LF$
Info + "Min = " + StrD(ArrayInfo\Min) + #LF$
Info + "Max = " + StrD(ArrayInfo\Max) + #LF$
Info + "Avg = " + StrD(ArrayInfo\Avg) + #LF$
Info + "Population variance = " + StrD(ArrayInfo\PopulationVariance) + #LF$
Info + "Population standard deviation = " + StrD(ArrayInfo\PopulationStdDev) + #LF$
Info + "Sample variance = " + StrD(ArrayInfo\SampleVariance) + #LF$
Info + "Sample standard deviation = " + StrD(ArrayInfo\SampleStdDev)
MessageRequester("Info", Info)Re: Statistics functions?
Checkout the "Rosetta Code" website. You'll find something at: http://rosettacode.org/wiki/Statistics/Basic#PureBasicDavy wrote:Is there any code available in the public domain for PB statistics functions? I haven't seen anything on the code archive site. e.g. mean, mode, variance, standard deviation, binomial distribution etc.
Thanks.
(Big thanks to Demivec and others for all their hard work)
blueb
- It was too lonely at the top.
System : PB 6.21(x64) and Win 11 Pro (x64)
Hardware: AMD Ryzen 9 5900X w/64 gigs Ram, AMD RX 6950 XT Graphics w/16gigs Mem
System : PB 6.21(x64) and Win 11 Pro (x64)
Hardware: AMD Ryzen 9 5900X w/64 gigs Ram, AMD RX 6950 XT Graphics w/16gigs Mem


