Wednesday, October 19, 2011

Client Driven Modifications

There are now three thousand entries in the database. A thousand came from a single evening's work by my niece. Another 500 came from a nephew. The remainder probably came from my daughters. A number of design changes are prompted by the data so far.

The first is a mechanism to prevent multiple errors. If a student can't solve, say, 20 divided by 10, nothing is gained by having them submit 10 consecutive wrong answers. It wouldn't happen in a pencil and paper test and it shouldn't happen in my Java Math Test. For the purpose of pure Rasch theory, all that matters is that one student did not know the answer to one question. So including 10 wrong answers in the database gives rise to quite a serious distortion. But from the perspective of the user, allowing only a single attempt might be confusing; or would it?

My first instinct had been to add a "cheat" or "panic" button, to get the user out of a bad loop. And the current code just leaves the item there for a second (third, fourth ...) attempt. But, after thinking about the theory, and the traditional pencil and paper test, perhaps the correct procedure is to log the wrong answer, and go straight on to the generation of the next item. In this case, something needs to be added to the messages, to remind the student what the item was, and to inform them of the correct answer.
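A minimal sketch of the single-attempt idea, assuming the item is a pair of integers and an operator. The class and method names are hypothetical and not taken from the Applet. The answer is marked once, and the message restates the item and gives the correct answer so that the next item can be generated straight away:

```java
// Hypothetical sketch: names are illustrative, not taken from the Applet.
class SingleAttempt {
    /** Builds the feedback shown after one attempt; the item is never re-presented. */
    static String feedback(int left, int right, char op, int given) {
        int correct;
        switch (op) {
            case '+': correct = left + right; break;
            case '-': correct = left - right; break;
            case '*': correct = left * right; break;
            case '/': correct = left / right; break; // assumes the division is exact
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
        String item = left + " " + op + " " + right;
        if (given == correct) {
            return "Correct: " + item + " = " + correct;
        }
        // Remind the student what the item was and tell them the right answer
        return "Not quite: " + item + " = " + correct + " (you answered " + given + ")";
    }

    public static void main(String[] args) {
        System.out.println(feedback(20, 10, '/', 3));
    }
}
```

The caller would log the response once, show this message, and move on, whether or not the answer was right.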

The second is a mechanism for me to track more accurately when the Applet is being used by outsiders as opposed to family members. Getting the IP address of the visitor to a web page is pretty easy with PHP, but the question is how to store it. In the current format, the SQL query is more or less written by the time a PHP script is called, and there is no field in the database for an IP address to be stored.

The third change, requested by a school in the Philippines, is the ability to track individual students. I had deliberately left this out, because of the Australian/Western obsession with privacy.

Australian data protection legislation enshrines ten privacy principles, which begin with:

An organisation must not collect personal information unless the information is necessary for one or more of its functions or activities.

The key word here is personal. I don't need personal information, but if users elect to personalize their own transactions for their own purposes, they must deem it necessary. Furthermore, if schools are encouraged to use avatar ids, which only they can associate with real students, my data will remain impersonal to anyone except those who created it.

Another principle worth mentioning (Number 8 on the list) is:

Wherever it is lawful and practicable, individuals must have the option of not identifying themselves when entering transactions with an organisation.

So notwithstanding whatever tracking mechanisms certain schools elect to use for their own purposes, as long as I maintain the open (and anonymous) portal, the project as a whole will comply with this principle.

Of course if schools or individuals ignore my recommendation to preserve anonymity, there are eight other principles to consider, including data quality, security, openness, access and sensitivity. I think I need to cut and paste a suitable "conditions of use" from somewhere for those who elect to track their own data.

Leaving aside these potential legal issues, if I am changing the database to enable schools to track their students, I can pop in a field for the IP address at the same time. And while I'm at it, I'll modify the Applet structure to collect some data before use as well as storing data after use. In my first post on Applet/Javascript Communication, information was passed from a web page to an Applet, so I can use that model to pass the student id and IP address to the Applet as parameters. They can then be woven into SQL statements to be passed back to the host page.

In the current version, users are taken straight to the page hosting the applet. In future versions, users will go first to a PHP page, which offers the option to log in, or to remain anonymous. They will then go to the page hosting the applet, and carry with them an id parameter, which will either be a real id for those who have logged in, or an anonymous id generated from the date stamp and IP address. Technically of course, the IP address could be used to identify a user, but most ISPs would only give personal information to government agencies in special circumstances. As the users of my applet are not doing anything illegal, this is not going to happen.
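The choice between a real id and an anonymous one can be sketched as below. This only illustrates the shape of the id: in the plan above it would be built on the PHP page, and the "anon-" prefix, the separator and the class name are all assumptions.

```java
// Hypothetical sketch of the id rule; the real id would be built on the PHP page.
class SessionId {
    /**
     * Returns the login id when one was supplied, otherwise an anonymous id
     * built from the session start time and the visitor's IP address.
     * The "anon-" prefix and the "-" separator are assumptions for illustration.
     */
    static String resolve(String loginId, long startMillis, String ipAddress) {
        if (loginId != null && !loginId.trim().isEmpty()) {
            return loginId.trim();
        }
        return "anon-" + startMillis + "-" + ipAddress;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(resolve(null, now, "203.0.113.7"));       // anonymous visitor
        System.out.println(resolve("avatar42", now, "203.0.113.7")); // logged-in visitor
    }
}
```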

The fourth change, which might as well be woven into all these others, is to allow operations with decimals. The current Items table only allows operations with integers. In fact the field type is actually smallint, which further restricts the fields to plus or minus 32767. This is fine for multiples of ten in the order of magnitude, but it could lead to problems with larger multiples. I think for the foreseeable future, a single precision float will be more than adequate to store decimals in the item columns, as well as larger numbers if required.
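The two limits can be checked with a few lines of Java, assuming the smallint column behaves like a 16-bit signed integer (Java's short) and the float column like an IEEE 754 single (Java's float). The class and method names are hypothetical. A single precision float holds whole numbers exactly only up to 2^24 (16,777,216), which is far beyond the smallint range but is still a limit:

```java
// Hypothetical helper names; the range checks themselves are standard Java.
class ItemRange {
    /** True if the value fits a 16-bit signed integer column. */
    static boolean fitsSmallint(long value) {
        return value >= Short.MIN_VALUE && value <= Short.MAX_VALUE;
    }

    /** True if the whole number survives a round trip through a single precision float. */
    static boolean exactAsFloat(int value) {
        return (long) (float) value == value;
    }

    public static void main(String[] args) {
        System.out.println("smallint max: " + Short.MAX_VALUE);
        System.out.println("40000 fits smallint? " + fitsSmallint(40000));
        System.out.println("40000 exact as float? " + exactAsFloat(40000));
        System.out.println("16777217 exact as float? " + exactAsFloat(16_777_217));
    }
}
```

Decimal fractions such as 0.1 are stored only approximately in a float, so answers involving decimals are better compared within a small tolerance than tested for exact equality.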

The question is, should the IP address be stored alongside the id with every transaction? It is a bit of a Catch 22. With the current infrequent use, the start time in milliseconds should be more than adequate for unique separation of sessions, because the odds of two sessions beginning simultaneously are very low. But with the current infrequent use, space is not an issue so IP could be included without running into storage problems. And as the frequency of use increases, and the odds of two simultaneous starts increase, so space becomes more of an issue as well. I think for now, I'll shove the IP address in. After all, computer space is pretty cheap these days, so even if I hit my web host limit, I can always siphon data off and store it at home.

For the handful of schools who want unique tracking, I can add an extra table to store their user ids and any other data they want to store alongside them.

Tuesday, October 18, 2011

Local Command Line Diagnostics

Being stuck in the middle of a typhoon with no Internet connection, I found myself browsing my computer for the pre-Applet version of my Java Math Test. And when I stumbled across a collection of code, I couldn't believe how advanced it was - with all the core functionality, but a slightly simpler interface - identical in fact with the April 2009 Applet. It ran perfectly, and better still, ticking away on the command line, were all the diagnostic messages, which are just not practical to display on a production or near production Applet.

So while the storm raged outside, I busied myself bringing the front end of this local version up to the level of the current applet and used it as a test bed for new code. This worked much better than testing code in a public web space, because no matter how much you label it as a test version, somebody will stumble across it, try it, and get put off if it doesn't work. And it pretty well eliminated the need to have diagnostic messages painting themselves across the face of the Applet.

Tuesday, October 4, 2011

Coding for Improved Flexibility

My Java Math Test Applet is designed to respond to student ability, but I have found difficulty fine tuning the responsiveness for able and less able children. The most able children want to race through the difficulty levels, so that they can get to the top level by the end of the activity. The less able children get put off if the items get too hard too quickly, so they want a more gentle gradient.

The current promotion code, called by a correct answer, is as follows:

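void raiseLevelnum(int oldLevel) {
    // Promote after more than two correct answers in a row, if the last response was quick
    if (promoscore > 2 && timeonTask <= 3) {
        if (oldLevel < 5) {
            levelnum = oldLevel + 1;
        } else {
            levelnum = oldLevel; // already at the top level
        }
        promoscore = 0;
    }
}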

The relegation code, called by an incorrect answer, is:

void reduceLevelnum(int oldLevel) {
    // Drop one level, but never below level 1
    if (oldLevel > 1) {
        levelnum = oldLevel - 1;
    } else {
        levelnum = 1;
    }
}

So to get promoted, the student needs 3 consecutive correct answers and a rapid response on the last item. That is about right for an Australian Year 2 or 3 student, but it is too slow for a Year 5 or 6 student. My plan is therefore to offer students a choice of gradient, but I also need to tighten up the student responsiveness code.

And while I'm at it, there is a hard-coded maximum difficulty level of 5 in:

if (oldLevel < 5) {

This needs changing to a soft parameter so as to make future changes to the number of levels easier to implement and to allow for different numbers of levels for different operations:

if (oldLevel < levelMax) {
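One way to hold a different maximum for each operation is a small lookup table, sketched below. The enum, the map and the limits themselves (8 for addition, 5 elsewhere) are hypothetical placeholders and are not values from the Applet.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch: the operation names and limits are placeholders.
class LevelLimits {
    enum Operation { ADD, SUBTRACT, MULTIPLY, DIVIDE }

    // Soft parameter: one maximum level per operation instead of a literal 5
    private static final Map<Operation, Integer> LEVEL_MAX = new EnumMap<>(Operation.class);
    static {
        LEVEL_MAX.put(Operation.ADD, 8);
        LEVEL_MAX.put(Operation.SUBTRACT, 5);
        LEVEL_MAX.put(Operation.MULTIPLY, 5);
        LEVEL_MAX.put(Operation.DIVIDE, 5);
    }

    /** Returns the next level, capped at the maximum for this operation. */
    static int promote(Operation op, int oldLevel) {
        int levelMax = LEVEL_MAX.get(op);
        return (oldLevel < levelMax) ? oldLevel + 1 : oldLevel;
    }

    public static void main(String[] args) {
        System.out.println(promote(Operation.ADD, 5));
        System.out.println(promote(Operation.SUBTRACT, 5));
    }
}
```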

Now giving students a choice of gradient requires a new dropdown box on the front end, which in turn requires changing the gridy value of everything below. This all seems a bit archaic, but it can be researched another day.

The gradient selection needs to be recorded for use in the promotion code, and then the code needs adjusting:

void raiseLevelnum(int oldLevel) {
    // Thresholds all scale with the gradient chosen in the dropdown
    int promoParam = 3 - gradIndex;
    float paramRate = (float)(gradIndex);
    float relegRate = 4*paramRate;
    paramRate = 10*paramRate;
    // Promote on a high enough raw score and a fast enough rate
    if (promoscore > promoParam && promoRate > paramRate) {
        if (oldLevel < levelMax) {
            levelnum = oldLevel + 1;
        } else {
            levelnum = oldLevel;
        }
        resetPromo();
    }
    // Relegate when the rate is too slow, even if the answers are correct
    if (promoRate < relegRate) {
        if (oldLevel > 1) {
            levelnum = oldLevel - 1;
        } else {
            levelnum = 1;
        }
        resetPromo();
    }
}

I have included in the adjustment a promotion rate as well as a promotion raw score, and I have added in a relegation rate. I have observed with children using the software that when you set an addition or subtraction item which is too hard for, say, a Year 2 or Year 3 student, they will use their fingers to produce the correct answer, although it takes them a very long time. So an incorrect answer on its own is not a sufficient trigger for bringing them back to the appropriate difficulty level; a slow rate of correct answers has to count as well.
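To see what the three thresholds work out to, the arithmetic from the adjusted method can be pulled into a small stand-alone class. The class name and the assumption that gradIndex runs over 0, 1 and 2 are illustrative; the post does not say how many gradients the dropdown offers.

```java
// Stand-alone copy of the threshold arithmetic; the gradIndex range is an assumption.
class GradientThresholds {
    /** Raw score that must be exceeded before promotion. */
    static int promoParam(int gradIndex) { return 3 - gradIndex; }

    /** Rate that must be exceeded before promotion. */
    static float paramRate(int gradIndex) { return 10f * gradIndex; }

    /** Rate below which the student is relegated. */
    static float relegRate(int gradIndex) { return 4f * gradIndex; }

    public static void main(String[] args) {
        for (int g = 0; g <= 2; g++) {
            System.out.println("gradIndex " + g
                    + ": promoscore > " + promoParam(g)
                    + ", promoRate > " + paramRate(g)
                    + ", relegate if promoRate < " + relegRate(g));
        }
    }
}
```

If gradIndex can be 0, both rate thresholds come out as zero, so relegation by rate could only happen on a negative promoRate. That may or may not be the intended behaviour for the gentlest gradient.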

Monday, October 3, 2011

Applying Theory to the Item Arrays

The high point, or perhaps end point of my theoretical thinking was 25 August 2009. It all went to shit after that. I was disappointed by the results of my further iterations, and by the imbecility of the contemporary local school teachers.

I don't know why I was obsessed with the idea of multiple iterations. A single adjustment of item difficulty to compensate for the ability of the students tackling them, and of student ability to compensate for the difficulty of the items they tackle seems, on reflection, more than adequate. So I shall now revisit the calculations I did 2 years ago and focus on the results of the first pass.

A couple of things occur to me as I run my eyes over the data again. The first is the size of the dataset: at around 15,000 records, it is much larger than I remember. The second is the depth of the item list, at around 380 items for addition alone. So there is room for many more than the five difficulty levels that I currently use.

I am not sure what got me into such a negative frame of mind. I dug myself into a Catch 22 mindset: that I needed more data to make it better, and I needed to make it better to get more data. But in fact I was already sitting on enough data at least to make some improvement.

So now I am sorting the item list by numeric value of the left and right hand numbers, and observing the completeness of the set. The number 1 has been combined with the numbers 1 to 7. The number 2 combined with the numbers 1 to 6. The number 3 combined with the numbers 1 to 5. The number 4 was combined with the numbers 1 to 8. So there are some gaps, but they are quite narrow. Certainly I think the first step now is to use all the number combinations in the existing data, and then later the gaps can be filled. I shall also order the items exactly by the results in the data set, with no personal juggling.

One of the problems with never having documented what I did in the past is that, although I spent hours writing something to produce item arrays from sorted items, I now have no idea what it was or where it is.

My first thought was that the arrays would have been produced with a few lines of Java, but the folder, where I found text files populated with arrays, had no sign of any java code, and a hunt for source files containing the term array yielded nothing. I then found a spreadsheet, which had probably been used to produce the files, containing cuttings from an Access query. Eventually I found an Access database dated March 2009, six months prior to my attempts at systematic estimation of item difficulty. So the logical thing to do now is to cut and paste those queries into the September database.

The first problem is that the September database uses compound items, whereas the March one uses the individual terms. Note to self - why did I ever put the compound item into a database and thank goodness I've changed that. A second is that after spitting the dummy two years ago, I did no analysis of the operations other than addition.

But with a bit of juggling and a tiny bit of cheating, I produced four new item arrays. The cheating is not an issue, because this is just an opening array set. They will all be adjusted regularly as new data comes in.

Sunday, October 2, 2011

Reducing Transaction Frequency

My niece was using the Applet last night, and in a couple of hours she generated a thousand lines of data, each one requiring its own transaction with the database. A quick forum post confirmed that I need to modify the code so as to send more data with each transaction.

My first instinct was to upload data at the end of every activity, and I coded for that.

So the addItem3 method, which previously posted data after every item, was cut down to accumulate a multiline SQL insert command in a new string, and the guts of it were moved into a new addItem4 method, which posts the now multiline command:

private void addItem3(String newWord) {
    // Accumulate rows for one multiline insert, separated by commas
    if (firstPass) {
        firstPass = false;
        sqlbuffer = sqlbuffer + newWord;
    } else {
        sqlbuffer = sqlbuffer + ", ";
        sqlbuffer = sqlbuffer + newWord;
    }
}

private void addItem4(String newWord) {
    if (LIVE) {
        // Only add a separator when rows are already waiting in the buffer
        if (!firstPass) {
            sqlbuffer = sqlbuffer + ", ";
        }
        sqlbuffer = sqlbuffer + newWord;
        firstPass = true;
        if (jso != null) {
            try {
                // Post the whole batch to the host page in one call
                jso.call("updateWebPage", new String[] {sqlbuffer});
                sqlbuffer = "";
            } catch (Exception ex) {
                addItem2("jso call failed... ");
                ex.printStackTrace();
            }
        }
    }
}

But problems arose with posting 15 and 20 line inserts. So I modified the code again such that on longer activities, addItem4 gets called after 10 items, as well as on completion of the activity:

if ((oldItem == 10) && (NoOfItems > 10)) {
    addItem4(qTrack.datInsert()); // flush part-way through a longer activity
} else {
    addItem3(qTrack.datInsert()); // otherwise keep accumulating
}
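The same accumulate-then-post pattern can be sketched more generally as a small batching class that flushes after a fixed number of rows and again on demand at the end of an activity. This is an alternative outline and not the Applet's code: the class name, the Consumer callback standing in for the jso call, and the row format are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: the sender stands in for the call back to the host page.
class InsertBatcher {
    private final int batchSize;
    private final Consumer<String> sender;
    private final StringBuilder buffer = new StringBuilder();
    private int pending = 0;

    InsertBatcher(int batchSize, Consumer<String> sender) {
        this.batchSize = batchSize;
        this.sender = sender;
    }

    /** Adds one row and sends the batch once it reaches the size limit. */
    void add(String row) {
        if (pending > 0) {
            buffer.append(", ");
        }
        buffer.append(row);
        pending++;
        if (pending >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is waiting; call this when the activity ends. */
    void flush() {
        if (pending == 0) {
            return; // nothing to send
        }
        sender.accept(buffer.toString());
        buffer.setLength(0);
        pending = 0;
    }

    public static void main(String[] args) {
        List<String> sent = new ArrayList<>();
        InsertBatcher batcher = new InsertBatcher(10, sent::add);
        for (int i = 1; i <= 15; i++) {
            batcher.add("(" + i + ")");
        }
        batcher.flush();
        System.out.println(sent.size() + " transactions for 15 rows");
    }
}
```

Counting the pending rows, instead of tracking a separate first-pass flag, keeps the separator logic in one place, so a flush on an empty buffer or a batch of one row cannot produce a stray comma.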

Then there was a typhoon, which prevented anything happening for a few days.