Friday, November 05, 2010

Action points and plan B

1. I've been working on an N900 port of the character drawing exercise.
It demonstrates that FA technology is still valid and can be reused. Spend 2 weeks finalizing the proof-of-concept prototype.

2. Post an introduction to the N900 port.

3. Write a plea for CALL research partners on this blog.

4. Make a list of CALL researchers by browsing the researcher lists on Finnish university web pages.

5. Write cold call mails to CALL researchers, tell them about my track record in CALL, and ask if they want to talk about research. Add links to my thesis and blog.

If step 5 produces no contacts, it means that cold calling doesn't work (what a surprise.) In that case I have to use personal contacts to get a research topic for graduate studies in 2011. Usually people don't talk about their work, including researchers. This leaves me with few options.

6. Contact Yoe to ask for a research topic in bioinformatics. This post hints that she might have a suitable topic for someone trained in math and programming. I don't know Yoe, so to avoid cold calling, I would ask for recommendations from Vera and Janka. ("We have tracked Simo for years and he is what he claims to be. If you have a suitable research topic in bioinformatics, you'd probably benefit a lot from assigning it to Simo.")

Good: Bioinformatics has a reputation as a down-to-earth and useful branch of applied research. I'll learn new things because of the 'bio' part.

Bad: No earlier track record. Not enough background to evaluate if the research topic would pass the scrutiny in Hamming's advice.

If 6 fails:

7. Ask The Scientist for an applied research topic in model checking based on this post.

Good: I know The Scientist personally. Also, FA already made me familiar with finite languages and state machines. For example, I implemented deterministic state machine minimization to make some vocabulary state machines faster to handle, while The Scientist is working with algorithms to simplify nondeterministic state machines.

Bad: When studying, we had a course about DisCo and temporal logic of actions. While the theory part was a fun trip to a different worldview, the DisCo toolset left a really bad taste in my mouth. It had an "ivory towery" feel: it could never become useful for solving practical problems, no matter how well the researchers reach their research goals.

I'm aware that The Scientist doesn't work with DisCo, and it's just my stupidity that I don't understand the field. After all, he writes more often, for more readers, about a wider range of topics, and so on. But that does not change the fact that it would be insane for me to do research in an area where I don't understand the big picture.

Asking The Scientist for a research topic may lead to a very embarrassing situation where I have to say no even if he gives me everything I ask for.

If 7 fails:

8. Write a plan for 2011 which does not include graduate studies.


Tiedemies said...

I only just noticed the backtrack. No embarrassment necessary, I would have a topic for you, if you like, but there are two ifs'n'buts:
1. I have no funds for graduate students, so I can't hire you.
2. I cannot be an "official" supervisor for anyone just yet, as I don't have an adjunct professorship.

I'll be back in Finland, and at my office the week from 29.11. to 3.12, and again after the 13th, for a while. Please, come and talk to me.

The big picture is very simple. We take a computer program, make an automaton of it, we take a specification and make another automaton of its negation, and check the combination for nonemptiness. We have other methods as well, ones that would suit you fine, and I can help you with any and all that you might wish, to find a niche of your own.
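The recipe above can be illustrated with a toy sketch. It rests on simplifying assumptions that are not from the comment itself: both automata read finite words (real model checkers typically work with automata over infinite runs), and the lock example and every name in it (`find_counterexample`, `buggy_program`, `fixed_program`, `negated_spec`) are hypothetical. The "combination" is the synchronous product of the two automata; if it accepts any word, that word is a program behaviour that violates the specification.

```python
from collections import deque

def find_counterexample(prog, bad):
    """Search the product of two automata for a word both of them accept.

    Each automaton is a dict with 'start', 'accept' (a set of states) and
    'delta', which maps (state, symbol) -> set of successor states.
    Returns a shortest commonly accepted word as a list of symbols,
    or None when the intersection is empty.
    """
    start = (prog["start"], bad["start"])
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (p, b), word = queue.popleft()
        if p in prog["accept"] and b in bad["accept"]:
            return word
        for (state, sym), targets in prog["delta"].items():
            if state != p:
                continue
            # Both automata must be able to take the same symbol.
            for b2 in bad["delta"].get((b, sym), ()):
                for p2 in targets:
                    nxt = (p2, b2)
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, word + [sym]))
    return None

# Hypothetical toy program: a lock that, by mistake, can be acquired while held.
buggy_program = {
    "start": 0,
    "accept": {0, 1},  # every finite prefix counts as a behaviour
    "delta": {(0, "acquire"): {1}, (1, "release"): {0}, (1, "acquire"): {1}},
}
# The same program with the faulty transition removed.
fixed_program = {
    "start": 0,
    "accept": {0, 1},
    "delta": {(0, "acquire"): {1}, (1, "release"): {0}},
}
# Negation of the specification "never acquire twice in a row":
# this automaton accepts exactly the words that break the rule.
negated_spec = {
    "start": "ok",
    "accept": {"bad"},
    "delta": {
        ("ok", "acquire"): {"one"}, ("ok", "release"): {"ok"},
        ("one", "acquire"): {"bad"}, ("one", "release"): {"ok"},
        ("bad", "acquire"): {"bad"}, ("bad", "release"): {"bad"},
    },
}

print(find_counterexample(buggy_program, negated_spec))
print(find_counterexample(fixed_program, negated_spec))
```

A returned word is a concrete trace of the bug; `None` means that, within this toy model, no behaviour of the program breaks the specification.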

In fact, I can even try to coax some people to give me money. I don't know how to do this, but I will ask around. If you are a good programmer, I would have use for you and you can make some science in the process. The only "buts" are listed above, the latter one might be fixed within a year or so. The money is a more severe problem.

Simo said...

Sorry for the delay, I was away from the Internet for the whole weekend.

Regarding funds, my current employer pays me a monthly salary, so I don't need extra funding. Moving to an academic position would involve a big pay cut; for example, the 1690e a month offered by researcher schools is much lower than industry salaries. I don't love science that much. It's different for you, as you made the right moves to become a lecturer already while studying.

The big picture question is a bit different from the one you answered. For example, in Finnish Annotator, the goal was to make language learning easier. Building an annotator for Finnish was one approach to make Finnish reading comprehension easier. Writing a two-level morphology engine was one way to decode inflection for Finnish words. Writing a state machine minimization algorithm was needed to make the precomputation phase of two-level morphology run in reasonable time. Without minimization, combining various rule and vocabulary state machines took forever. There was a link between making language learning easier and implementing a state machine minimization algorithm. The link was very mediated, but it was there and each step was justified.
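For readers unfamiliar with the minimization step, here is a minimal sketch. It is not the Finnish Annotator code: the post does not say which algorithm was used, so Moore-style partition refinement on a complete deterministic automaton is an assumption, and the function name `minimize_dfa` and the toy automaton are made up for illustration. The idea is to merge states that no input can tell apart, so that later combinations of machines work on fewer states.

```python
def minimize_dfa(alphabet, delta, start, accepting):
    """Group equivalent states of a complete DFA (Moore-style refinement).

    alphabet  -- list of symbols
    delta     -- dict mapping (state, symbol) -> state, defined for every pair
    start     -- the start state
    accepting -- set of accepting states
    Returns a dict mapping each reachable state to the number of its
    equivalence class; states with the same number can be merged.
    """
    # Unreachable states never matter, so drop them first.
    reachable = {start}
    stack = [start]
    while stack:
        s = stack.pop()
        for a in alphabet:
            t = delta[(s, a)]
            if t not in reachable:
                reachable.add(t)
                stack.append(t)

    # Start from the coarsest split: accepting versus non-accepting.
    block = {s: int(s in accepting) for s in reachable}
    while True:
        # Two states stay together only if they are in the same block now
        # and every symbol leads them into the same blocks.
        ids = {}
        new_block = {}
        for s in sorted(reachable):
            signature = (block[s],) + tuple(block[delta[(s, a)]] for a in alphabet)
            new_block[s] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return new_block  # no block was split, so the partition is stable
        block = new_block

# Hypothetical example: strings over {a, b} that end in 'a', built with
# duplicated states (0 ~ 2 and 1 ~ 3) plus an unreachable state 4.
alphabet = ["a", "b"]
delta = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 3, (2, "b"): 0,
    (3, "a"): 3, (3, "b"): 0,
    (4, "a"): 4, (4, "b"): 4,
}
classes = minimize_dfa(alphabet, delta, start=0, accepting={1, 3})
print(classes)
```

In this toy case four reachable states collapse into two classes. Hopcroft's algorithm reaches the same result asymptotically faster, which matters more for large vocabulary machines.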

What I don't understand is how your research differs from solving Rubik's cube. There have been academic papers about solving Rubik's cube. They use formal methods and proofs, and can be justified by saying that it is basic research in abstract algebra, and that the mathematical methods developed might some day be applied to some real-life problem. My opinion is that it is better to research useful questions directly.

Simo said...

Even many very productive theoretical researchers make an effort to nail their research down to concrete goals. For example, Mehryar Mohri makes it clear that part of his research is "all about" speech recognition, although he has nowadays done too many things to have just one area of application. His research includes highly theoretical papers on state machine minimization. There are also very productive researchers like R.J. Lipton, for whom there is no such link. This is a question of research style.

Making a PhD thesis is hard. Tampere University CS department has about 100 graduate students and produces about 5 PhD theses a year. For every graduate student who graduates, there are 1 - 3 who never reach the finishing line. Research deals with unknown things by definition. In Finnish Annotator I saw that it is easy to show bad judgement and go the wrong way even when you have a clear goal. The risk is even bigger if the goal is unclear and I don't know how research in model checking differs from solving Rubik's cube.

I know that, for example, the FAA imposes severe proving and testing requirements on all aviation software. However, in the course about temporal logic of actions and DisCo, there was no link whatsoever between the practical application of proving program correctness and the theoretical methods of TLA and the DisCo toolset. There were no examples like "This 120-page proof about Boeing 747 gas engine can be reduced to 90 pages, thanks to the algorithms and tools we have developed". If the link is not there, I don't see how the work differs from solving Rubik's cube. This is what rang my alarm bells in that particular course.

Simo said...

Anyway, I'll come to talk about this to your office.

yoe said...


Why exactly do you want to have a PhD? (The answer to this question should guide you to your next move.)

- yoe

Simo said...

1. Research would be a constructive, goal-oriented, useful hobby with a long time scale. It would replace studying Chinese.

2. Although I learned a lot from the Finnish Annotator project, others put it in the category "minor, insignificant hobby project". PhD certificates are comprehensible to other people as well.

A colleague of mine, Jukka Ollikainen, is doing bioinformatics research on genetic fingerprints as a hobby, while working at the same company as I do. Tampere University DARG research group has produced many licentiate and PhD theses in bioinformatics.

yoe said...

Doing research as a hobby sounds fun :) And I am certain there are plenty of informatics projects that will benefit from the expertise you have, and you will enjoy solving the problems.

I don't think you really need a PhD certificate for this purpose, however. The reason why so many people drop out of PhD studies is in fact that getting a PhD is often explicitly non-fun, rather than non-important ... i.e., hard work requiring your entire concentration.

If you don't need the actual certificate for your work, and don't want to become a "scientist", there is no need for spending time and resources - both yours and whoever would be granting you the PhD - on that.

A friend of mine, currently nearing completion of his PhD in a field related to bioinformatics, recently calculated that going for the "certificate" has cost him 1 million USD in tuition + lost income. Of course, we don't do this for the money, but I would suggest you (and everyone else considering taking the PhD) carefully consider what it is you actually believe you will gain from that piece of paper.

If you just want recognition ... the so-called "lay people" also tend to appreciate science-related popular books, perhaps even more than PhDs. Write a book on your area of expertise!

Cheers, and good luck on whichever route you choose to take,

- Yoe

Simo said...

The thing I want to get away from is that 80% of the projects I've worked on have never had a single end user. The projects are discontinued, or go on for years without delivering, or are prototypes to begin with, or the usability is so abysmal that users use different methods.

You wrote that the need and inability to do necromancy for a close relative with a brain disease drove you to neuromancy. You wouldn't think of firing up a laptop during a film if 80% of the time some stipulation in the funding contract would render your research output totally useless, so that it could never play any role whatsoever in the long and winding path to curing diseases, and you could only sigh "Oh well, the stars weren't right in this project either." During the past 100 years, technology has made people's lives much easier. For 15 years I saw programming work as a continuation of this trend, but not anymore. This has happened under 8 different managers and on the premises of 3 companies, so it is not something to be solved by changing employer.

What kind of PhD do I have in mind? Basically, I would write a program, which solves a new kind of problem using mathematical methods. In programming, there are many small tools written by a single person, for example Darcs or Winmerge. Limited scope ensures that the project is doable by a single person.

This would require a reviewer, who ensures that the program is useful for its intended purpose. He or she would review progress every 3 months, concentrating on big picture questions like: What requirements have to be met before the program is useful? Are there any unintuitive or lousy user interface defects? Is the program obsolete because new knowledge changes the big picture, or are there new, closely related problems which the program could solve with little modification? Are there any assumptions sabotaging the progress? The reviewer should have a permanent interest in getting quality output from the project; only mutual benefit ensures high-quality outputs.

Jukka's work illustrates this for both good and bad. He researches ways to search big databases of genetic fingerprints. He started in 2002, and got the research topic through personal relations. The problem was new; big genetic fingerprint databases were just emerging. You may also be in a position where you hear first about such emerging problems, which are limited in scope and on the sidelines of science, but need to be addressed anyway.

My impression is that Jukka and the others had problems getting things done. They did get inquiries from potential customers, but were unable to answer them, even if their core algorithms were quite fast. I would never let such a project wither on for 9 years, being more self-critical and more willing to cut the losses and move on to other pursuits. You said that a PhD imposes a cost on the institution giving it, and a 9-year drag certainly does. In my case, any such project would have to spread its wings and fly or sink in a few years at most. Good reviewers would hardly watch the project drag on for 9 years.

Simo said...

Finnish Annotator was my hobby project and master's thesis topic. It aimed to produce an MDBG-style annotator for Finnish. It is relevant, because it used mathematical methods to solve a new kind of problem. When I took the thesis to the professor, he said that it contained novel material for "one, maybe two" publications. This surprised me, because I only aimed at making a language-learning tool. I was reading Paul Graham and didn't even think of continuing studies. Had I had the sentence "I am a PhD student and to graduate, I need to gather publication points" etched into my head, it would have been easy to generate two publications. The first would have covered the minor algorithmic improvements the professor was talking about. For the second, had I found a research-minded teacher of Finnish for foreigners, a solid field test in a Finnish course would have easily generated a publication in a journal like Language Learning & Technology.

FA started with a literature search for methods to parse Finnish, about which I had no prior knowledge.

The Achilles' heel of Finnish Annotator was exactly the lack of review. I didn't use free vocabularies, because they could not be used for commercial purposes, such as building something like Lukutulkki; their quality was also low. Any reviewer with a stake in making it work would have easily noticed that. He or she would also have noticed that the site had a cluttered look, as it tried to do too many things. However, even the professor guiding the thesis saw it only as a writing assignment and not as a software project, so I never got any review. (Btw. you're providing quite good review on my PhD plans.)

I'm never again going to start anything of that scale without the review structure in place. Good, regular feedback from a reviewer with a stake beats any accumulated experience, even if the review is done quickly to save effort.

Finnish Annotator lasted for 2 summers, during which I didn't work. My previous pay was 2000 euros a month, so it cost me at least 14000 euros in lost income - these calculations are familiar. The poor bank is going to have to wait a few more years for its mortgage payments!

FA demonstrates that solving new problems with mathematical methods automatically produces publishable findings, especially if you aim at it. With a little 'faking it' attitude, those required 4 - 7 publications may come out as a side effect of uncompromising product development for a new problem.

Regarding the listed action points, The Scientist couldn't explain in plain terms what real-world problems his research is solving. The 3 CALL researchers in Tampere don't seem very interested in writing new software, judging by the lack of contact and lack of specifications. If you don't have a suitable topic, or you believe that it is better to let real graduate students/post-docs look at it, then what is left is executing Action Point 8. Scouting for better options to surface is not a bad option at all. As you say, there are many downsides to graduate studies, as they take a lot of time and effort.