Friday, December 03, 2010

Slow maybe

So I found a contact person from Tampere University and met her on Tuesday. She's a practising teacher, which means that she has visibility into the frontline trenches of computer-aided language learning. The 30-minute discussion that we had already clarified many things.

The primary difference between the educational establishment and the Chinese/Japanese self-study scene is that the educational establishment is content-rich (they have paid, experienced and talented professionals writing content) and algorithm-poor. By contrast, the Chinese/Japanese self-study scene is content-poor (they have only a few good databases, such as EDICT/CEDICT for vocabulary), and they squeeze the last ounce of juice from those by applying high-quality algorithmics.

AFAIK, the flashcard sentence deck that I used was created simply by searching Juku for example sentences for each HSK character. Likewise, the MDBG annotator works by looking up characters in CEDICT, the one extensive free Chinese vocabulary that exists, even though CEDICT doesn't even break words into several meanings. This way, the algorithmics make full use of the few available databases.
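To make the idea concrete, here is a minimal sketch of how such a sentence deck could be built. I don't know how the actual deck was generated, so everything here is an assumption: the function name build_deck, the per_char limit and the toy data are all made up for illustration, and a real run would use the HSK character list and a sentence corpus instead.

```python
def build_deck(characters, sentences, per_char=1):
    """Map each character to at most per_char example sentences containing it."""
    deck = {}
    for ch in characters:
        # Plain substring search: keep the sentences that contain the character.
        matches = [s for s in sentences if ch in s]
        deck[ch] = matches[:per_char]
    return deck


# Hypothetical toy data, standing in for an HSK list and a sentence corpus.
characters = ["我", "好", "书"]
sentences = ["我很好。", "你好吗?", "这是我的书。"]

deck = build_deck(characters, sentences)
for ch, examples in deck.items():
    print(ch, examples)
```

The point is that the algorithm is trivial. All the value comes from applying it systematically to the few databases that are freely available.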

By contrast, in the educational establishment we have professionally collected dictionaries where each entry gives plenty of information about phrases and alternative meanings, but where the algorithmics are controlled by companies that have no interest in making Firefox extensions, flashcard programs or annotators. The difference is even bigger in textbooks.

It seems to me that the teachers in the trenches don't need another slow and difficult site like the translation sentence site. Instead, they can get far with customizations to existing tools. Let's hope that my contact person has the patience to meet week after week and month after month to discuss the needs and uses of technology in language teaching. It will be like peeling an onion: I need to uncover their needs layer by layer through small customizations (at most one week of work each) which give them what they want.

In the long run, to get a thesis topic I need to find an area where a longer computer science effort is needed to implement useful features in CALL tools. However, getting from small customizations to that level will take months. Let's hope that the teaching/language contact person has the patience to go through the technology mapping discussions instead of getting impatient with the apparent lack of concrete results.

I'd rather spend months in analysis paralysis, considering various research topics, than pick a research topic only to find out years afterwards that my research output is completely useless because I failed at due diligence in the beginning. That nightmare scenario actually happened with Finnish Annotator. I'd rather spend months doing due diligence, only to find out that there is no CALL research topic, than start prematurely and waste effort.
