Archive for the ‘Features’ Category

Anime Study Set Smorgasbord

Thursday, November 6th, 2014

Summary: 127 new study sets for Anime series (1000s of episodes) have been added to the Sets repository. Check them out!

KanjiBox’s dedicated Study Sets section is doing very well these days, with hundreds of public sets available to any user for subscription (and thousands of private ones).

I have a particular fondness for sets that compile vocab/kanji needed to tackle a specific film/anime/manga, as I think it is a great way to get the most out of an otherwise fun activity. I even posted a small tutorial on how to make your own vocab set based on subtitle files for a film.

Even though most Japanese anime series aren’t really my cuppa, I know that many KB users love them (indeed, it is even the main motivation to learn for some) and they are quite popular in the subscriptions. So, when I happened upon a massive archive of anime series Japanese subtitles, I thought it would be worth spending a bit of time on a semi-automated tool to bulk import them into KB sets.

The results so far:

– 127 new Study Sets in KB’s Anime section.

– over 3000 anime episodes (most sets contain vocab for about 20 episodes at a time).

– complete episode coverage of 10 anime series so far: Black Jack, Crayon Shin-chan, Detective Conan, Doraemon, From the New World, Neon Genesis Evangelion, One Piece, Sailor Moon, Space Pirate Captain Harlock.

– for each series, in addition to a number of volumes covering new vocabulary words for episodes (bundled in groups of 10-20 episodes per volume), a separate set (“Volume 0”) offers an overview of common words across the entire series. This set has the added bonus of being a good starting point for people who may want to check other material related to the title (e.g., a manga version or a film adaptation of the anime).
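
The volume layout described above can be sketched roughly as follows. This is only an illustration, not the actual batch-import tool: the function name `build_volumes`, the `common_ratio` threshold and the idea that each episode has already been reduced to a set of dictionary-form words are all assumptions made for the example.

```python
from collections import Counter


def build_volumes(episodes, chunk_size=20, common_ratio=0.5):
    """Split per-episode vocabulary into a "Volume 0" plus numbered volumes.

    episodes: list of sets of words, one set per episode, in broadcast order.
    chunk_size: number of episodes bundled into each numbered volume.
    common_ratio: a word goes to Volume 0 when it appears in at least this
        fraction of all episodes (an assumed threshold, not KanjiBox's).
    """
    # Count in how many episodes each word appears.
    episode_counts = Counter(word for ep in episodes for word in ep)
    threshold = common_ratio * len(episodes)
    volume_zero = {w for w, n in episode_counts.items() if n >= threshold}

    volumes = []
    seen = set(volume_zero)
    for start in range(0, len(episodes), chunk_size):
        chunk = episodes[start:start + chunk_size]
        # Keep only words not already covered by Volume 0 or earlier volumes.
        new_words = set().union(*chunk) - seen
        volumes.append(new_words)
        seen |= new_words
    return volume_zero, volumes


# Made-up sample data: four very short "episodes".
episodes = [
    {"海賊", "仲間", "船"},
    {"海賊", "仲間", "宝"},
    {"海賊", "島", "船"},
    {"海賊", "仲間", "嵐"},
]
volume_zero, volumes = build_volumes(episodes, chunk_size=2)
```

With this sample, the words shared by at least half of the episodes end up in Volume 0, and each later volume only lists words that have not been seen before.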

Because even batch parsing takes time, and because there probably is such a thing as too many sets in the database (at least until I have the time to program better search tools), I had to select a few anime out of the hundreds I have subtitles for (I mainly went for those I knew, liked and/or had heard of). I will happily take any request for other series and add them to the next batch (assuming I do have the subtitles for them). Even better if you do have the subtitles and would like me to use the batch-import tool on them: contact me.

As usual, welcoming any feedback and suggestions!

Fixing JLPT Lists

Monday, November 25th, 2013


Short version:

1. JLPT vocab lists suck (all of them).

2. The ones used by KanjiBox now suck a little bit less, thanks to the magic of statistics and computational linguistics.

Long version:


Watch Movies with KanjiBox! (guide to Custom Study Sets)

Friday, November 4th, 2011

2012 update: Sets can now be used with both online and iOS versions of KB (and shared using the ‘Sync’ feature).
There is also a new public site where everybody can browse for sets without the need to log in (you still need to be logged in to subscribe, edit or train with the sets).

Avid users of KB may have noticed that after being introduced in the iPhone version a while ago, Custom Study Sets have finally made their way to the Online version.

I thought I’d walk you through a cool example of what can be done with study sets in the online version. In addition to that specific example, the current post should give you a decent overview of how Study Sets work.


Learning to write your address in Japanese

Wednesday, September 7th, 2011

Today, a detailed walkthrough on using KanjiBox’s Learning Sets and KanjiDraw features together to make your life in Japan much easier. This tip focusses on learning to write your own address in Japanese, but works just the same for any other particular set of kanji.

If you are anything like me (and the vast majority of Japanese students out there), your reading skills are vastly superior to your writing skills. Which is just OK most of the time, since we live in the 21st century and nobody handwrites anymore.

Well, besides the fact that you should definitely reconsider (I know I have) and spend some time learning to write (yay: KanjiDraw!) in order to better read… There are also these pesky occasions where (hand)writing is mandatory. Usually, that’s when you are standing at the Kuyakusho/Shiyakusho, sweating bullets in front of a standard form asking who you are and where you live. Of course, you could always try filling it in in romaji, but 1) that’s not always an option and 2) it would be no fun.

Fear not: with the tip below and a couple spare minutes during your morning train commute, you will soon be able to impress any bored local government employee with your amazing kanji-writing skills.

Note: click on any of the pictures to see a slideshow with full-size versions.

Built-in Japanese Screen Reader in iOS

Tuesday, May 24th, 2011

While working on implementing accessibility features for the upcoming release of KanjiBox (more on that soon), I realised that iOS’s built-in VoiceOver feature made an incredibly useful tool for reading Japanese text.

I could kick myself for not finding it earlier (it’s been in iOS for nearly a year).

Detailed instructions, complete with screenshots, on enabling and using iOS’s VoiceOver feature.

In slightly related news: the upcoming version of KB (ETA: early June 2011) brings advanced support for VoiceOver, which should make some of the drills and screens playable by blind and visually impaired users (please get in touch if you are interested in beta-testing it and giving me first-hand feedback on accessibility).

KanjiDraw and KanaDraw

Tuesday, November 9th, 2010

Version 1.4 of KanjiBox brought KanjiDraw; version 1.5 is bringing KanaDraw (among many other cool and exciting features).

These two features add a completely new dimension to KanjiBox, allowing you to improve something at the heart of Japanese studies (and, until now, extremely hard to practice without a real teacher): handwriting!

Like most Japanese students (me included), you probably barely ever need to write Japanese by hand. The ubiquitous use of phones/computers/etc. makes it nearly redundant. And yet, knowing how to properly write by hand is much more important than it may originally seem:

  • A perfect mastery of kana is obviously crucial. Without it, you are functionally illiterate in Japan, unable to properly fill in any of the forms or other pieces of administrative paper that fill your daily life.
  • Aside from the obvious direct use of knowing how to write kanji and kana, knowing how to write them, having paid attention to every single stroke, will dramatically improve your ability to read and remember them. You cannot hope to go beyond a certain level of Japanese without a working knowledge of stroke order (useful for lookups) and kanji sub-radicals: it is all too easy to learn a few hundred kanji by memorising their overall aspect (and not really paying attention to their radicals), but it will come and bite you in the arse when you start learning more and more complex variations.

Both KanjiDraw and KanaDraw use even more complex algorithms than the original Drill&Quiz methods to analyse your strokes and propose custom corrections. They are extremely strict on stroke order (no way around that), but allow a fair bit of leeway on the shapes etc., in order to make up for the difficulty of tracing the characters with a finger on a touchscreen. The difficulty (including the level of strictness) increases with your performance.

Of course, these new modes use the same adaptive learning algorithm as all other parts of KanjiBox, meaning that entries are automatically selected on the basis of how good your past performance has been.

Have fun with these new features and don’t hesitate to leave your impressions here or contact me directly…

Supporting New JLPT Levels

Tuesday, September 28th, 2010

As you probably know if you are planning to take the JLPT in the near future, the test has received a massive overhaul this year. Among the major changes (along with new passing requirements etc.) was the introduction of a new level breakdown, replacing the old 4級-1級 (“4-kyuu” etc., henceforth referred to as J4-J1, for simplicity’s sake):

N1: the same passing level as the original level 1, but able to gauge slightly more advanced skills, possibly through equating of test scores.
N2: the same as the original level 2
N3: in between the original level 2 and level 3
N4: the same as the original level 3
N5: the same as the original level 4
Source: official guidelines, via Wikipedia

As you can see, the only real change is the addition of a new intermediate N3 level, between former J2 and J3 levels.

Unfortunately, another feature was introduced in the revised JLPT: there is no official content specification, “so as to discourage people from studying exclusively from lists of words or kanji” [1]. This means there is absolutely no way to know what words and kanji belong to level N3 (for our sake, we will pretend that the lists inherited from former levels won’t change all that much, since the required proficiency level remains the same).

What about all these webpages/applications that have “N3 Vocabulary/Kanji Lists” etc.?

They are all 100% based on guesswork. How sound that guesswork is depends a lot on the site/application (some lists out there are particularly wonky), as well as on the section concerned: kanji can be reasonably broken down from the original J2 set (about 750 kanji) into two sets of approximately equal size using native Japanese school levels (grades 3 and 4 make up N3, while grades 5 and 6 belong to N2). For vocabulary lists, however, there is no easy way to separate them: most sites use newspaper word frequency to make their guess, a criterion that is anything but reliable, if past levels are any indication.
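
To make the grade-based kanji breakdown concrete, here is a minimal sketch. The function name `split_j2_kanji` and the sample data are made up for illustration: the sample is not the actual J2 list, and a real run would need the full school-grade table.

```python
def split_j2_kanji(j2_kanji, grade_of):
    """Split former J2 kanji into guessed N3 and N2 lists by school grade.

    j2_kanji: iterable of kanji from the old J2 set.
    grade_of: dict mapping a kanji to its Japanese school grade (1-6).
    Kanji with no grade, or a grade outside 3-6, are returned separately
    so that they can be sorted by hand.
    """
    n3, n2, unassigned = [], [], []
    for kanji in j2_kanji:
        grade = grade_of.get(kanji)
        if grade in (3, 4):
            n3.append(kanji)
        elif grade in (5, 6):
            n2.append(kanji)
        else:
            unassigned.append(kanji)
    return n3, n2, unassigned


# Tiny illustrative grade table (grades 3 to 6 only).
grade_of = {"悪": 3, "安": 3, "愛": 4, "案": 4, "圧": 5, "移": 5, "異": 6, "遺": 6}
# Illustrative input; "曖" is not taught in primary school, so it has no grade.
sample = ["愛", "圧", "悪", "異", "案", "移", "遺", "安", "曖"]
n3, n2, unassigned = split_j2_kanji(sample, grade_of)
```

Keeping an explicit `unassigned` bucket is a design choice of this sketch: it makes visible how many kanji the grade criterion simply cannot place.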

What about those textbooks that offer preparation for N3?

It is very *cough* unclear how publishers know what words and kanji to include in their N3 textbooks. On the off chance that they know what they are doing, we (the KanjiBox community at large) have started compiling lists of kanji and words spotted in N3 textbooks (as well as official exams and mock exams). But as you can imagine, it will be some time before such lists can reach sufficient maturity to be used in KanjiBox (feel free to contribute, though!).

So, what about KanjiBox, then?

Here is the plan for future versions of KanjiBox, with regard to revised JLPT levels:

1. In the near future, KanjiBox will start using the new JLPT level names, meaning that J4, J3, J2, J1 will become N5, N4, N2 and N1 respectively. Absolutely nothing about each level’s content will change.

2. Within a few months, KanjiBox will start offering an “experimental” N3 level, with lists based on a mix of guesswork and textbook curation (see above).
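
The renaming in step 1 boils down to a simple lookup, sketched below. The names `OLD_TO_NEW` and `rename_level` are hypothetical and only serve the example; they are not taken from KanjiBox itself.

```python
# Old level name -> new level name, as listed in step 1 above.
OLD_TO_NEW = {"J4": "N5", "J3": "N4", "J2": "N2", "J1": "N1"}

NEW_LEVELS = ["N5", "N4", "N3", "N2", "N1"]


def rename_level(old_level):
    """Return the new JLPT name for a former level (content unchanged)."""
    return OLD_TO_NEW[old_level]


# New levels that no former level maps to: these need brand new lists.
levels_without_predecessor = [
    level for level in NEW_LEVELS if level not in OLD_TO_NEW.values()
]
```

Only N3 ends up in `levels_without_predecessor`, which is why it is the one level that needs the “experimental” lists from step 2.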

In the meantime, I would recommend that people planning to take N3 first set their goal at J3 and later update their settings to J2 (keeping in mind that N3 should cover considerably less content than former J2, but since there are no clear lists, it cannot hurt to extend your knowledge a little).

PS: since we are on the topic of levels: many people have contacted me to request the possibility of setting separate levels for each mode (kanji, vocab etc). This is coming! (most likely in the next upgrade, sometime in October).

  1. A laudable goal, really… made ever so slightly suspicious by the fact that it hasn’t stopped publishers from putting out revised JLPT textbooks that mysteriously seem to know exactly what content belongs to what level. But only the most cynical minds would suggest some collusion between JLPT people and their publisher buddies, with lists changing hands over sake and dinner. So we won’t.

New JLPT Levels: N5 to N1…

Sunday, March 21st, 2010

Many people have asked when I would add support for new JLPT levels (N5, N4, N3, N2 and N1… replacing old 4級 through 1級). Short answer is: Real Soon Now™.

The more nuanced answer is that we are actively working on it, but gathering data for the new level (only N3 differs significantly from previous levels: the others should roughly match the old J4-J1 levels) is a very slow and uncertain process. There are no official lists of kanji or vocabulary for the new levels and the only way to proceed is to cull them from existing textbook exercises and training tests (on the mostly valid assumption that said textbooks’ editors routinely have business “lunches” with JLPT people where secret lists of kanji might change hands). Since I would prefer the lists to be somewhat validated by official tests, it might be difficult to release an update before late Spring (when the next JLPT exam takes place)…

As an aside: take all the alleged “updated JLPT” lists you see floating around with a large grain of salt: they are all based on interpolation and essentially just break down the former J2 level into two groups along grade level (for kanji) and frequency (for vocabulary), a breakdown that is far from verified in practice.

Until then, I would simply advise people who want to train for N3 to use 2級 for now (keeping in mind that it is largely above N3, but still the closest we have).

In the meantime, stats syncing and online scores should arrive soon!