SD and Variance (SIMPOL Programming forum)

Tagged: regression, standard deviation, variance

- This topic has 6 replies, 2 voices, and was last updated 7 years, 4 months ago by JD Kromkowski.

- April 3, 2017 at 1:32 pm #3493 JD Kromkowski (Participant)
I’d like to add standard deviation and variance to the report aggregation options. I don’t see the underlying code for quickreportlib.sml or repguilib.sml.

Here’s some generic code I have worked up as a start, to make sure I have the math correct:

function main()
  array data
  number variance, sd
  string result

  data =@ array.new()
  data[1] = 1
  data[2] = 2
  data[3] = 3
  data[4] = 4
  data[5] = 5
  data[6] = 6
  data[7] = 7
  data[8] = 8
  data[9] = 9
  data[10] = 10

  calcSD(data)

  result = "total - " + .tostr(data["sum"], 10) + "{d}{a}"
  result = result + "average - " + .tostr(data["average"], 10) + "{d}{a}"
  result = result + "variance - " + .tostr(data["var"], 10) + "{d}{a}"
  result = result + "SD - " + .tostr(data["SD"], 10)
end function result

function calcSD(array data)
  number sum
  integer count, i

  // Take the count before the named entries ("sum", "average", ...) are added
  count = data.count()
  i = 1
  sum = 0
  while
    sum = sum + data[i]
    i = i + 1
  end while (i > count)
  data["sum"] = sum
  data["average"] = sum / count

  // Accumulate the squared deviations from the mean
  data["var"] = 0
  i = 1
  while
    data["var"] = data["var"] + raisetopower((data[i] - data["average"]), 2)
    i = i + 1
  end while (i > count)
  // Population variance: divide by count.
  // For the sample variance and SD, divide by (count - 1) instead.
  data["var"] = data["var"] / count
  data["SD"] = sqrt(data["var"], 14)
end function

- April 10, 2017 at 11:38 am #3494 Michael (Keymaster)
Hi John,

I have added the Variance to the list of aggregates supported by the Quick Report and Graphic Report engines. It will be available in the next release.

Ciao, Neil

- April 10, 2017 at 2:56 pm #3497 JD Kromkowski (Participant)
Great. Thanks. SD too? Because that is really more useful than variance, although you have to calculate the variance to get to the SD.

If I’m greedy, which I am, I’d love UCL and LCL (upper control limit and lower control limit) which is Mean +/- 3*SD!

JDK
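As a cross-check on the arithmetic discussed so far, here is a minimal Python sketch (not SIMPOL, and not the Quick Report implementation) that computes the population and sample variance, the SD, and the control limits as Mean +/- 3*SD. The function name `summarize` and the dictionary keys are made up for illustration, and using the population SD for the limits is an assumption; swap in the sample SD if that is what your charts expect.

```python
import math


def summarize(values):
    """Return sum, mean, variance, SD and 3-sigma control limits.

    Hypothetical helper for illustration only; names are not from SIMPOL.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    total = sum(values)
    mean = total / n
    # Sum of squared deviations from the mean
    ss = sum((v - mean) ** 2 for v in values)
    pop_var = ss / n            # population variance (divide by n)
    sample_var = ss / (n - 1)   # sample variance (divide by n - 1)
    pop_sd = math.sqrt(pop_var)
    return {
        "sum": total,
        "average": mean,
        "var": pop_var,
        "sample_var": sample_var,
        "SD": pop_sd,
        "sample_SD": math.sqrt(sample_var),
        # Assumption: control limits are built from the population SD
        "UCL": mean + 3 * pop_sd,
        "LCL": mean - 3 * pop_sd,
    }


if __name__ == "__main__":
    stats = summarize([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    for key, value in stats.items():
        print(key, "-", value)
```

For the test data 1 through 10 this gives a total of 55, an average of 5.5 and a population variance of 8.25, which are handy reference values when checking the SIMPOL version.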

- April 20, 2017 at 11:16 am #3500 Michael (Keymaster)
Hi John,

Since the SD can be calculated as sqrt(Variance), I decided at this stage to add only the variance. We are considering other options, such as those you described, for future changes. I will see about adding a wishlist capability to the online system.

Ciao, Neil

- April 29, 2017 at 8:51 pm #3501 JD Kromkowski (Participant)
Before I write correlation, regression, r^2, etc.: these haven’t already been written, have they? I am getting tired of exporting to spreadsheets and want to automate this to reduce dimensionality in a dataset.

My niece and nephews think I should learn R for machine learning, but it seems like SBNG could do the stuff.

Might also be another market for SBNG?

- May 2, 2017 at 1:20 pm #3502 JD Kromkowski (Participant)
I’ve started writing a regression function. I’d like to give it a GUI. Is there code already available for picking fields (like the field selection dialog in Personal) that I could have? The function I’m writing basically takes fieldA and fieldB and outputs the regression equation and R squared.

function regression(sbme1field fieldA, sbme1field fieldB)

…

end function RegressionInfo

- May 3, 2017 at 2:56 pm #3503 JD Kromkowski (Participant)
// Regression, John Kromkowski, 5/2/2017
// Beginnings; still need to add a routine for the equation/constant, etc.

function main()

  string s, s1, sfieldnameX, sfieldnameY
  number Rsquared
  sbme1 sbme
  integer e, error, count1, count2
  sbme1table tablename
  sbme1field sbfieldX, sbfieldY
  boolean found
  array b

  b =@ array.new()
  s = ""
  count1 = 4
  count2 = 0
  e = 0
  error = 0
  sbme =@ sbme1.new("C:\SIMPOL\Ethnic\ARHE\cdancestry.sbm", "O", error=e)
  if sbme =@= .nul
    s = "Error number " + .tostr(e, 10) + " opening the SBME file: 'cdancestry.sbm'{d}{a}"
  else
    tablename =@ sbme.opentable("CDANSCESTRY", error=e)
    if e
      s = "Error number " + .tostr(e, 10) + " opening the table{d}{a}"
    else
      // Initialize the found flag to .false
      found = .false
      b =@ getfieldinfoarray(tablename)
      // Regress field 4 (X) against each of fields 4 through 157 (Y) in turn
      while count1 < 158
        sfieldnameX = b[4]
        sfieldnameY = b[count1]
        sbfieldX =@ getfield(tablename, sfieldnameX)
        sbfieldY =@ getfield(tablename, sfieldnameY)
        Rsquared = regression(sbfieldX, sbfieldY)
        s = sfieldnameX + " - " + sfieldnameY + " R-squared: "
        s1 = .tostr(Rsquared, 10)
        wxmessagedialog(message = s + s1)
        count1 = count1 + 1
      end while
    end if
  end if
end function s

function regression(sbme1field sbfieldX, sbme1field sbfieldY)

  array x, y, xy, x2, y2
  number Pearson, Rsquared, numer, denom
  number sumX, sumY, sumXY, sumX2, sumY2
  integer i, n
  sbme1record r

  x =@ array.new()
  y =@ array.new()
  xy =@ array.new()
  x2 =@ array.new()
  y2 =@ array.new()

  i = sbfieldX.table.recordcount()
  n = i
  // Read the first record into slot n, then fill slots n-1 down to 1
  r =@ sbfieldX.table.select(lastrecord=.false)
  x[i] = .fix(r.get(sbfieldX), 1000000)
  y[i] = .fix(r.get(sbfieldY), 1000000)
  xy[i] = x[i] * y[i]
  x2[i] = raisetopower(x[i], 2)
  y2[i] = raisetopower(y[i], 2)
  while i > 1
    r =@ r.select(previousrecord = .false)
    i = i - 1
    x[i] = .fix(r.get(sbfieldX), 1000000)
    y[i] = .fix(r.get(sbfieldY), 1000000)
    xy[i] = x[i] * y[i]
    x2[i] = raisetopower(x[i], 2)
    y2[i] = raisetopower(y[i], 2)
  end while

  sumX = sumarray(x)
  sumY = sumarray(y)
  sumXY = sumarray(xy)
  sumX2 = sumarray(x2)
  sumY2 = sumarray(y2)

  // Pearson r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
  numer = ((n*sumXY) - (sumX*sumY))
  denom = sqrt(((n*sumX2) - raisetopower(sumX, 2)) * ((n*sumY2) - raisetopower(sumY, 2)), 14)
  Pearson = .fix(numer/denom, 1000000000)
  Rsquared = raisetopower(Pearson, 2)
end function Rsquared

function sumarray(array data)

  number sum
  integer count, i

  count = data.count()
  i = 1
  sum = 0
  while
    sum = sum + data[i]
    i = i + 1
  end while (i > count)
end function sum
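
The post above notes that a routine for the equation/constant still needs to be added. As a rough sketch of what that could look like, here is the same sums-based calculation in Python (not SIMPOL), extended with the slope and intercept of the least-squares line y = slope*x + intercept. The function name `linear_regression` and its return order are assumptions made for this illustration, not part of any SIMPOL library.

```python
import math


def linear_regression(xs, ys):
    """Return (slope, intercept, r_squared) for simple linear regression.

    Hypothetical helper; uses the same running sums as the forum code.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, at least 2")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numer = n * sum_xy - sum_x * sum_y
    spread_x = n * sum_x2 - sum_x ** 2
    spread_y = n * sum_y2 - sum_y ** 2
    if spread_x == 0 or spread_y == 0:
        # A constant field has no spread, so r is undefined
        raise ValueError("x and y must each contain at least two distinct values")

    pearson = numer / math.sqrt(spread_x * spread_y)
    slope = numer / spread_x
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept, pearson ** 2


if __name__ == "__main__":
    slope, intercept, r2 = linear_regression([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
    print("y =", slope, "* x +", intercept, " R-squared:", r2)
```

Note the guard for a constant field: in the loop above, the first pass regresses field 4 against itself, and any field with no variation would make the denominator zero, so the SIMPOL version may want a similar check before dividing.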
