
COMP20252: Mobile systems

This year, I had a go at doing some teaching on the COMP20252 computer science course. I will try to record some of the things I tried so that they may be useful to others. You may wish to jump directly to the summary before reading the body.

Previous years

This course has been running for four years now. I originally designed the lab element and have helped out in the labs ever since, so I got to know the issues involved. I was most concerned with the lab element, where the restriction placed on me when designing it was that students had to work in groups, holding weekly meetings, with many of the marks awarded for group work. The idea behind this was that it would teach them to work in groups, as they would have to in real jobs. This is great in theory, but I felt this approach was very unsuccessful.

In the very first year of the course, the groups consisted of four students. In their first meeting, the students elected a manager and proceeded to implement the different parts of the task. The manager would do no coding and would just instruct the others. This was a recipe for disaster. The students quickly realised that the reward for their effort would be divided by four and shared with lazy colleagues. In each group, about one student would switch course (they can do so within the first three weeks). This dropped most groups down to three, which they would use as an excuse for doing no work. The typical scene was one student sitting at a computer and three sitting around them playing with their mobile phones. Students soon realised they were not necessary to the process and would stop attending. Morale was incredibly low and nearly all the questions in the labs, rather than being technical, began with “this isn’t fair because…”.

Things to learn

I learned a number of things about students from this, each of them a problem I wanted to solve.

Students just want marks

Turning up to labs is not something that many people would do voluntarily. Most students do turn up, but not because they are fascinated by the subject and have decided that the perfect time to satisfy their curiosity is 9am on a miserable Thursday morning. They are there to do a job. And that is the way the lab component is described to them, “a job”. The job is to get up and receive their marks so they can go back to bed. They are interested in the most direct route to getting their points. After they get these, the work they have done is thrown away, so there is little desire to make the work expandable, tidy or correctly commented.

Students hate each other

They will happily cut their noses off to spite their face and then present you with the said nose crying “look what they made me do”. When working in a group with one or two lazy individuals, students will simply refuse to work. Because they see the marks as the result of their work, giving these away to uncooperative colleagues feels wrong.

Students look for excuses

One of the most shocking things I noticed was how many students put a fantastic amount of effort into getting compassionate marking. Interestingly, this was much more prevalent for tasks which they had never faced before, yet which could be performed very quickly. Boring, tedious tasks were performed happily without any agitation. As a student, you quickly learn that you may get stuck for hours, maybe days, on a problem and have little to show for your effort. This is difficult to appreciate as someone who understands how to do the task, but imagine picking up a 700 page reference manual for something you don’t really understand and being expected to get something working within a couple of hours. So even trying to perform these tasks risks being a huge waste of time. Finding an excuse gives much more consistent results and doesn’t damage your self-respect.

Preparation

The lecture course was only using 12 of its 24 lecture slots, expecting students to use the extra hour per week on personal study of the field (you can guess what happened). I asked Doug if I could give six extra lectures explaining the lab and he agreed. This was in December so I only had two months to prepare.

Psychology

I felt the majority of the challenge was psychological, so I downloaded a lecture course on developmental psychology (search torrent sites). These are 30 minutes per lecture, but by time stretching them, you can get them down to about 18 minutes. I could do two lectures on my commute in to work and two on the way back. This way I listened to 20 lectures per week without it impacting my work time. Although many concepts are rather woolly and unscientific, I can wholeheartedly recommend these if you are trying to be a good educator or parent.

This is a massive field of study but for the purposes of this post I will try to summarise it. Developmental psychology examines the change of character that is necessary for individuals to progress through their different roles in life. Here I concentrated on Erik Erikson’s theory of industry and identity. It felt a bit silly using approaches that should only be applicable to children 7-11 years old, but these are very important, as years of school grading, target chasing and little appreciation for extra effort have regressed these people into grade producing zombies. Before you can get people to be motivated to work voluntarily, without the assurance of success and without a direct connection to a reward or punishment, several barriers must be broken.

  1. People feel worthless and unappreciated. Much of the effort on their part is not recognised.
  2. People don’t identify their expertise as valuable. They dread being asked what they do as they think other people will think poorly of them.
  3. People feel anonymous and hope they can fade into the background. There is always one correct answer to all questions, so thinking differently is usually punished.
  4. People are afraid of decisions or committing effort in case they are wrong. Putting effort into the wrong direction is usually punished, which makes people afraid of commitment.

Subject

The subject involved a lot of hardcore mathematics. This is not only very hard, but many find it rather dull. I wanted to learn it to a depth well beyond what is useful in the lab exercises. Not because I was afraid of hard questions, but because I wanted to find other uses for each element taught. Advanced mathematics is one of the subjects that has very little function in day to day life, so committing yourself to learn something of little practical use is somewhat hard.

I wanted to move the lab to use techniques implemented by current devices. This had a double use, as students would believe they were learning something novel and practical, rather than academic. For example, Hamming versus Reed-Solomon error correction. Virtually every error correction system nowadays uses Reed-Solomon, while Hamming remains an academic curiosity. When students see this, they know they are learning something they will be able to apply, rather than learning for the sake of learning.

Lectures

In total I presented six lectures. The structure of each lecture was the following:

  • 5 minute subject motivation
  • 40 minute content of subject
  • 5 minute off-topic cool-down

Many people would turn up a couple of minutes late, so it is a good idea not to put something too important right at the start. Here I presented why the subject talked about in that lecture is not just important, but also exciting.

Every area of computer science has gone through massive changes over a relatively short period of time. It is possible for people to come into a field and make great leaps forward which will be regarded as amazing by the community, and useful by millions of users. These advances were not made by people sitting in rooms and emulating each other, but rather by people who decided to pick up the baton and run ahead. And if you want to be awesome and advance our civilization, then concentrate for the next 40 minutes and I will show you where the current frontier of knowledge is.

This only works if you are leading to the most advanced systems out there. Following this with some ideas from the sixties would not work.

The body of the lecture was intense. I tend to do one slide per minute when talking to experienced researchers and they have problems keeping up. When doing these lectures, I went even faster, in the knowledge that if some parts were not understood, I would get feedback and explain them through other means. Getting the feedback was easy (I will explain below) and supplying any extra info using Moodle seemed to work. The second element essential to covering a lot of material is good handouts. Slides make terrible handouts, and instead I went by my philosophy that every lecture can be summarised down to two sides of paper. Below are the handouts of one of the lectures. You may notice that unless you were there, they may not make sense. These provide a reminder of the ideas discussed along with values which are useful when implementing the ideas in the lab. They are not even the same examples as the ones presented in the lecture, so they serve as an exercise for people who think they understood the lecture.

The cool-down is very important. If you have just sat through a 40 minute information blast and you feel grumpy, tired and you now want to forget everything, you need a reminder that whatever effort you make will be appreciated. Here I would begin with a review of the lab progress (looking at the lines of committed code) followed by tips on how and why they could do better. For the first couple of weeks I talked about coding discipline, showing how you can increase your productivity through self-discipline. Then I moved on to how hard work will get you noticed. An interesting factoid is that several large companies pride themselves on hiring the top 5% of the graduates. If you look at the productivity of the 45 active students on the course you can see why. Below is a graph of the lines of code produced by week 4 for each student. The top 10% did more work than the remaining 90%. So you can see why the companies do this (you get someone ten times more productive and you pay them say two times as much), and why you should work hard to get there. The companies who chase the students offer delicious carrots, while working as a code monkey only gets you a regular beating with the motivating stick. If you want to be rewarded, then work hard without the threat of a stick. If you only work when threatened, then that’s what your employers will do.

The next point is to say “Do something!”. If you want to show you are capable of working without threats, then do something. You don’t achieve anything by just sitting there being clever. Use your brain to do something awesome. Computer science is the best subject in the world. There is no subject that comes close. The following needs to be shouted:

There are professors in the English department saying if you study some course well you may be able to write a poem or something and someone might care about your pathetic worthless existence for another minute. In computer science you can radically revolutionise the way everyone carries out everyday tasks. The people who created the processor or the software on your computer are producing more copies of their work than any book, song, movie or any other piece of work. If you want to change the world, here is where you start…

This then leads on to encouragement to try creating something using the skills they have learned. There are so many open source projects out there that people can contribute to. Next time you meet someone you like at a party you could say “What web browser do you use? … Oh! I wrote parts of that!”. Next time you’re at a job interview, you can say “I’m not afraid of writing code, in fact your company already relies on things I have written”.

(cc) xkcd

Labs

The tasks in the lab remained the same. I couldn’t scrap the group system, but I could change the way the tasks were divided. This time there were groups of three, with each member being able to perform their portions independently.

Make submit

To make it easy for students to get help, I added a “submit” target in the makefile. This would create a tar file with their files and copy it into a write-only directory in my home space, with their name, task name, time and date in the filename. Along with the source files, the tar would also contain the .make_dates file. Whenever the student ran make (every time they made a change), it would record the time and date in that file.
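The mechanism described above can be sketched roughly as follows. This is not the original makefile: the function names (record_build, submit_work), the timestamp format and the archive naming scheme are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for a "submit" make target; all names are illustrative.

# Append the current date and time to .make_dates. Meant to be called
# from the default build target so that every run of make is logged.
record_build() {
    date '+%Y-%m-%d %H:%M:%S' >> .make_dates
}

# submit_work TASK DROPDIR FILE...
# Packs the given files (plus .make_dates, if present) into a tar named
# <user>_<task>_<date>-<time>.tar, copies it to DROPDIR and prints the name.
submit_work() {
    task=$1
    dropdir=$2
    shift 2
    user=${USER:-$(id -un)}
    archive="${user}_${task}_$(date '+%Y%m%d-%H%M%S').tar"
    if [ -f .make_dates ]; then
        set -- "$@" .make_dates
    fi
    staging=$(mktemp -d) || return 1
    tar -cf "$staging/$archive" "$@" &&
        cp "$staging/$archive" "$dropdir/" &&
        echo "$archive"
    status=$?
    rm -rf "$staging"
    return $status
}
```

A makefile rule could source this file and call submit_work with the task name, the drop directory and the list of source files. Building the archive in a temporary directory first means the drop directory only ever needs to be written to, never read.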

At my end, I had a series of scripts which made life easier. Every minute a notify-send call would tell me if there were submissions in the queue. When I was marking, a script would expand their submission into a marking code tree, and would compile everything to show me any compile errors. Secondly it would diff the current submission with the last submission in the old-submits directory. Finally it would open a Thunderbird write window with the address, subject and part of the body filled in.
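A minimal sketch of the unpack-and-diff part of such a marking script is given below. The function names, the queue layout and the idea of passing the previous submission as a directory are assumptions, not the scripts actually used.

```shell
#!/bin/sh
# Hypothetical marking helpers; directory layout and names are assumptions.

# pending_count QUEUEDIR
# Prints how many .tar submissions are waiting in QUEUEDIR.
pending_count() {
    find "$1" -maxdepth 1 -type f -name '*.tar' | wc -l | tr -d ' '
}

# review_submission ARCHIVE PREVDIR
# Unpacks ARCHIVE into a fresh directory, builds it if it ships a Makefile,
# and prints a unified diff against the previous submission in PREVDIR.
review_submission() {
    archive=$1
    prev=$2
    work=$(mktemp -d) || return 1
    tar -xf "$archive" -C "$work" || return 1
    # Show any compiler output straight away.
    if [ -f "$work/Makefile" ]; then
        make -C "$work" 2>&1
    fi
    # diff exits with 1 when the trees differ, which is not an error here.
    diff -ruN "$prev" "$work"
    [ $? -le 1 ]
}
```

A loop or cron job run every minute could pass the result of pending_count to notify-send as the notification body. The unpacked directory is deliberately left in place so the program can then be run under valgrind or ddd.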

I could then run the compiled program in valgrind or ddd, look at the diff and within seconds send back help or comments. The diff would also show the number and times of the compiles since the last submission, which is useful to see how long someone has been struggling over a problem. About half the submissions contained code with errors. The majority of these were trivial issues the students had never seen before and were easy to explain. Some had bigger problems but lacked the knowledge of how to do deep debugging. Here I replied with the series of commands I had used to find the problem. I didn’t want to give away the answer, so sometimes I would get a reply saying they were still confused, to which I would respond with more details on the specific area they were lost in. The other half of the submissions had working code. I encouraged students to submit after every hour of work or at every milestone in their program. They would do this and receive a reply saying which bits were good or bad, giving tips and asking if they knew what they were going to do next. This was very useful as they would say what they were planning to do in the next hour, and then do it. This helped them divide the problem up and gave them targets. Here they would also mention any elements of the lectures they did not understand, as they were about to implement something that they were unsure about. Most students had a unique misunderstanding, but when there was a series of similar problems, a Moodle posting could clarify things.

When I timed myself, I could deal with most submissions in less than three minutes (trivial problems or just progress updates) and the average wait time was about five minutes. I kept this down to the absolute minimum as I knew that if students were stuck, they would walk away from the computer after ten minutes of frustration. This meant being on call pretty much all day and night. I was at the computer at those times anyway, so it was just a case of putting down what I was working on for a few minutes. The graph below shows the number of times students ran make during the different hours of the day. The labs were 9-11, yet most work was done outside the labs. Also note that students do like to work quite late.
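An hour-by-hour count like the one behind that graph could be produced from the collected .make_dates files with something along these lines. The log format is an assumption: one "YYYY-MM-DD HH:MM:SS" timestamp per line. The function name builds_per_hour is made up for this sketch.

```shell
#!/bin/sh
# builds_per_hour FILE...
# Assumes each line of a .make_dates file looks like "YYYY-MM-DD HH:MM:SS"
# (an assumed format; the real log layout is not documented here).
# Prints "HH count" for every hour that saw at least one run of make.
builds_per_hour() {
    awk '{ split($2, t, ":"); runs[t[1]]++ }
         END { for (h in runs) print h, runs[h] }' "$@" | sort
}
```

Passing every student's .make_dates file at once gives the totals for the whole class, which can then be fed to any plotting tool.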

Memorising the state of over 40 students’ code is pretty hard, but it is worth it. They can see that every bit of effort they make is appreciated and noted. Secondly, they don’t feel like a chump who is working harder and later than others for nothing. You can see this clearly with the first submission they make in the late evening: upon receiving a reply almost instantly, they are assured that working hard is normal. Also, a few simple words of praise work very well for increasing their morale. Follow these with a reassurance that they should be able to tackle the next large element, and they will believe you and go for it.

In total there were over 700 code submissions and 1000 sent emails for the first task.

Code exchange

In the first task each student was working on a different unrelated part. In the second task, they rotate their work and perform optimisation on another group member’s code. To do this properly, they have to understand what that code does. Here the idea is that each student can proudly show their work to someone else. By this point they are hopefully confident that their work is impressive and are not shy about their accomplishments. Even after this initial explanation, the students would have to go to each other to ask for help. Hopefully they would notice that if you want a colleague to help you, you have to be nice to them, and that writing good clear code would give them less work explaining things later.

Each optimisation patch had to first pass by the original coder, who would make sure it didn’t damage the system. If it did, the original coder would receive penalty marks for not testing thoroughly. This made people triple check code before sending it to the original coder, as their clumsiness might cost their friend some marks. Even a consistent indenting style was important (something few students bothered with until that point).

In this task there were over 400 code based email exchanges that I was a part of. Sadly the majority of these were in the last three days of the three week task. There was a slight deadlock as each group member was waiting for another to submit the first patch. Two groups had an early patch exchange and finished over a week before the deadline. A compulsory trivial patch exchange in the first week may have been a good idea to get rid of the fear of being the first to commit.

Summary

I volunteered to work on this course, but I was only allowed to do this on the condition that it would not impact my day job. So the work was done in my spare time, while keeping my normal work up to date. This was well worth the effort to get the experience in management. Sadly I will not be doing this next year, so I hope some of the methods I described above are attempted by other people. I am very proud of the effort that the students have put in. The strong students produced an exceptional amount of work, and many of the disaffected students started engaging and putting in real effort. The biggest failure was the lecturers putting the increase in productivity down to their own threats at the beginning of the semester, which encouraged weak students to go to another course. I promised students recognition from the staff, and I somewhat failed to deliver on this.

I would like to thank all the students for working incredibly hard, and hope the skills they learned will come in useful. I am sure some individuals will go on to do great things.

Coding discipline

Over years of programming I have learned how to be most productive, but I found it strange we do not teach this to the students. We teach obscure features of academic systems they will never see again, but we fail to teach how to tackle large projects. Course lab tasks usually consist of writing around 30 lines of code, which doesn’t expose the students to the challenges of large code bases. Here are some things I tried to teach this year in COMP20252.

The one hour cycle

I code in cycles. During these I try to disconnect as much from environmental distractions as possible (headphones). These cycles are roughly one hour long, including a 5 minute break at the end to commit the code to the repository, have a biscuit and a stretch. There are three reasons for this and I think it makes sense for students to learn this as a sensible method of being productive.

Set-up and burnout

When starting a new change on the code base, I spend about five to ten minutes trying to locate all the areas that will need to be changed and generally planning how to tackle the problem. After about 15 minutes, I am in full swing as I have everything I need to know cached in my head. At this point I can code for hours straight, but generally I aim to finish after a further half an hour and move on to testing and debugging. The reason for stopping so early is that there is a good chance you will burn out (become tired and clumsy).

The first ten minutes are not productive, yet most students will only work for 10 minutes at a time before updating their status to “Bored now” and having a chat with someone before starting again from the beginning.

Testing and debugging

In half an hour of coding you will probably write about 50 lines scattered between several files. The rule of thumb is one bug for every ten lines of code. So, you now have five bugs to find in 50 lines of code. Testing while you still remember what you may have broken is very easy compared to testing something that was written by someone else, or by you but months ago. Knowing that the bugs are somewhere in the small blocks of code you have just written gives you a massive boost in locating them. This is the main reason why you should never abruptly walk away from the computer without doing some testing. Testing and debugging will hopefully take five or ten minutes, but can often take an hour. If it does take an hour you will be grateful that you didn’t carry on coding until you were exhausted.

There is one even greater sin than that of walking away from a computer leaving untested code, and that is leaving the code in a broken state. Each programming task involves breaking an already working piece of code in order to add functionality, change the behaviour etc. In the set-up you will probably want to examine the current behaviour of the system to ascertain the areas that need to be changed. This is very difficult if the system doesn’t work.

Students rarely do any testing as they never see their code coming back to them. After writing something and finding that it broke the system, they would generally walk away hoping the bug would fix itself in their absence. The next week they would ask a demonstrator to fix it for them while saying “I can’t remember what I did”. Their original buggy code coming back week after week scared many students in COMP20252. In the feedback forms, that was one of their main criticisms. I say “GOOD! Be afraid. Very afraid”.

Divide and conquer

Forcing yourself to have a working system every hour partitions large tasks down to sensibly sized components. Many of the students, from the start, wanted to create a large system involving several ambitious components. They would begin by writing a massive monolithic block containing all the features, expecting that once they wrote that last line of code, the program would work. This is bad on two counts. Firstly, it won’t work due to the bug problems outlined above, and secondly there is no way you are going to keep so much state in your head. Even if you can keep track of the state of every variable you’ve used, all the possible input combinations and every possible error that could arise, you are making the job unnecessarily hard for yourself and for anyone else reading your code.

There is one more bonus reason why you want to return to working code sooner rather than later, and that is the frequency of random unblockable interrupts. I receive a “stop what you’re doing and help me with this” request several times a day, and with small changes it is still possible to revert the changes (infinite undo in nedit) and play them back to remind myself what I was doing. If I do have to revert and start again, then I have wasted at most half an hour, but at least I know what I am doing the second time round.

Fedora on USB sticks

I ordered some USB sticks to give away to the better students to encourage them to contribute to open source software. The idea is that they can run their own installation where they can install development libraries etc. I’ll write more about this in a few weeks when I know how successful this has been.

Installing Fedora on the disks is relatively easy. Nowadays I install computers using a USB drive, by simply DDing the iso directly to the device.

dd if=Fedora-12-i686-Live.iso of=/dev/sdb

The target USB stick will look just like any other hard drive. You just have to make sure you install the bootloader onto the target stick by overriding the BIOS boot order in the grub installation screen.

Once installed, I didn’t want to actually boot the device, as I wanted the students to go through the first boot process of setting up their own user. But I wanted to install some development packages and do a full system update. This can be done by mounting the device, chrooting and running yum commands. The live image has a /mnt/sysimage which is already set up to do something like this by already having /proc and /dev correctly set up.

mount /dev/sdb1 /mnt/sysimage
chroot /mnt/sysimage

The biggest issue with running from USB sticks is that they have no on-device cache, thus each fsync call takes absolutely ages. Yum, correctly, makes heavy use of fsync to make sure it leaves the system in a sensible state even if interrupted. To speed things up I tried libeatmydata, which worked surprisingly well. The update ran several times faster. LibEatMyData is so named because of its real ability to screw things up royally, but in this situation if anything went wrong, I could just restart. Maybe some yum devels could mention if this is outright dangerous, or a fairly safe trick if you can guarantee no interruptions.

Of course at this point I only have one stick installed, and making six this way is downright boring. So long as the other disks are the same size (or larger), you can clone the disks from one to another. You need a bit of storage space so it is best to do this from another machine.
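One way of applying the trick is a small wrapper like the one below. It is only a sketch: it assumes the eatmydata wrapper script that ships with libeatmydata is on the PATH, and the function name run_fast is made up for this example.

```shell
#!/bin/sh
# run_fast COMMAND [ARG...]
# Runs COMMAND under the eatmydata wrapper when it is installed, so that
# fsync and similar calls become no-ops; otherwise runs COMMAND normally.
# Only sensible when the whole job can simply be redone after a crash.
run_fast() {
    if command -v eatmydata >/dev/null 2>&1; then
        eatmydata "$@"
    else
        "$@"
    fi
}
```

Inside the chroot this would be used as, for example, run_fast yum -y update. Falling back to a plain run keeps the same script usable on machines without libeatmydata, just more slowly.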

dd if=/dev/sdb of=master_image
dd if=master_image of=/dev/sdc
dd if=master_image of=/dev/sdd
dd if=master_image of=/dev/sde
...

Watch out though: if you use this method, all the partitions will end up with the same UUID, which will confuse the system when more than one stick is plugged into a single machine.
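The repeated dd commands can be wrapped in a loop that also checks each copy. The sketch below is an illustration rather than the procedure actually used: clone_image is a made-up name, and the targets are whatever device nodes the sticks show up as on your machine.

```shell
#!/bin/sh
# clone_image IMAGE TARGET...
# Writes IMAGE to each TARGET in turn and checks the copy.
# TARGETs would be whole devices such as /dev/sdc; double-check the names,
# because dd overwrites whatever it is pointed at.
clone_image() {
    image=$1
    shift
    size=$(wc -c < "$image" | tr -d ' ')
    for target in "$@"; do
        dd if="$image" of="$target" bs=1048576 2>/dev/null || return 1
        sync
        # Compare only the first $size bytes: a larger stick has spare room.
        # Note the read-back may be served from the page cache, so this is
        # a sanity check rather than proof the data reached the flash.
        if head -c "$size" "$target" | cmp -s - "$image"; then
            echo "ok: $target"
        else
            echo "MISMATCH: $target" >&2
            return 1
        fi
    done
}
```

For example, clone_image master_image /dev/sdc /dev/sdd /dev/sde. On the UUID clash: for ext2/3/4 partitions, tune2fs -U random can assign a fresh UUID to a cloned partition, but anything on the stick that refers to the old UUID (such as /etc/fstab or the grub configuration) would presumably need updating to match, so treat that as something to test on one stick first.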

The postage costs are annoying so I went with play.com, who offer free postage (which is nice). What was ridiculous is that they post each item in a separate box which is way too big. For a tiny piece of plastic, there is a Kingston presentation box, each placed in its own massive cardboard box and posted separately. I hear this is because they have some kind of tax loophole where parcels of value below some threshold are not taxed.

The entire CS department has been refitted with awful Dell machines which have some screwy USB chipsets which allow booting off a memory stick only from the back ports, and only with sticks from some manufacturers. I did something really stupid by accidentally mentioning to duty-office that it was possible to boot the departmental machines off a USB device. Now they are going to go through and disable this feature (Grrr).