I actually maintain several online Finnish learning resources now, including a flashcard deck of the 10,000 most common words from the YLE study way back [1], a command line lemmatizer [2], and a website that archives Selkouutiset with YYYY/MM/DD URLs (and whose permissions I need to refresh ASAP) [3].
Indeed, building these tools was what got me back into software development as a profession, after a long absence.
[1]: https://ankiweb.net/shared/info/1149950470
The tasks: scrape all the selko articles, then generate fi->en translation tasks in the form of phrases (using an LLM). Same thing for en->fi. Free-form task generation, where I tend to use HS articles (so, like, adult Finnish). And the one I'm working on now: Finnish transcription from auto-generated speech, again from selko.
A lot of my evaluation is based on embeddings + cosine distance. LLMs are truly bringing a golden age for language learning.
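As a minimal sketch of the embedding + cosine distance idea (not the actual pipeline): the toy vectors below stand in for real sentence embeddings, and the `grade` helper and its 0.2 threshold are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """1 - cosine similarity: 0 for same direction, 1 for orthogonal."""
    return 1.0 - cosine_similarity(a, b)


def grade(reference_vec, answer_vec, threshold=0.2):
    """Hypothetical grader: accept if the answer is close to the reference.

    The threshold is an assumption and would need tuning per embedding model.
    """
    return cosine_distance(reference_vec, answer_vec) <= threshold


# Toy 3-dimensional vectors; real embeddings have hundreds of dimensions
# and would come from whatever embedding model is in use.
reference = [0.9, 0.1, 0.3]
close_answer = [0.8, 0.2, 0.3]
far_answer = [-0.1, 0.9, -0.4]

print(grade(reference, close_answer))
print(grade(reference, far_answer))
```

The appeal of grading this way is that a learner's translation doesn't have to match the reference word for word, only land near it in meaning.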
I was planning on doing a similar thing but my wife didn't seem terribly enthused by the current SOTA with Finnish text generation. Sometimes I do ask GPT-4 to "kerro minulle jotain kiinnostavaa [aiheesta]" ("tell me something interesting about [topic]") and throw the generated longform text into a single Anki card to read later on. The style definitely feels different, and it hallucinates a lot, but hey, it's still net beneficial on the margin.
Let's keep in touch, it's always good to meet fellow Finnward folk. My email is in my bio if you ever need anything.
Finnish has 15 noun cases, but it's probably better thought of as 4+6+5 cases. The first 4 are pretty straightforward, except for the partitive, which is kind of a catch-all case. The middle 6 correspond to certain spatial relationships. Very roughly you can imagine these as {inside, outside} × {unmoving, moving closer, moving further}. Huone = room, huoneessa = in room, huoneeseen = into (=moving closer to the inside of) room. That kind of stuff. The last 5 have niche, special uses. That's how I mentally imagine them at least; there are a lot of details you only pick up by reading a lot.
The trade-off is that Finnish has virtually no prepositions, which English has a lot of, and which are similarly very confusing for beginners and even intermediate English speakers. There are a few postpositions, but even these are mostly things you can pick up by ear.
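The 2 × 3 grid for the middle 6 cases can be written out as a small lookup table. This is just a sketch: the forms of huone are hardcoded rather than generated, the `locative` helper is made up for illustration, and mapping "outside" onto the so-called external cases (on/at/near) is my rough reading of the grid, not a precise grammar.

```python
# (place, motion) -> (case name, form of "huone" = room)
LOCATIVE_FORMS = {
    ("inside", "unmoving"): ("inessive", "huoneessa"),         # in the room
    ("inside", "moving closer"): ("illative", "huoneeseen"),   # into the room
    ("inside", "moving further"): ("elative", "huoneesta"),    # out of the room
    ("outside", "unmoving"): ("adessive", "huoneella"),        # at/on the room
    ("outside", "moving closer"): ("allative", "huoneelle"),   # to/onto the room
    ("outside", "moving further"): ("ablative", "huoneelta"),  # from the room
}


def locative(place, motion):
    """Look up the case name and form of 'huone' for one cell of the grid."""
    return LOCATIVE_FORMS[(place, motion)]


for (place, motion), (case, form) in LOCATIVE_FORMS.items():
    print(f"{place:8} {motion:15} {case:9} {form}")
```

Real usage is messier than the grid suggests (the external cases also cover possession, instruments, and so on), which is where the reading comes in.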
Verbs have a similar story. If you've ever learned Latin, Russian or Spanish you'll feel right at home with Finnish verbs, which pack a lot of info into the conjugation, but with the benefit of requiring fewer actual words per sentence.
I'm pretty happy with translation, even with GPT-3.5. I haven't used it for native text generation. Happy to keep in touch :D.