alternative algorithm #191
Replies: 36 comments 59 replies
-
@Vilhelm-Ian I'm studying Chinese, so I don't have to worry about morphs vs lemmas. Is your desire that all morphs for the same lemma get treated the same in all cases (for calculating frequency, for recalc, other?) Is the lemma always a real word? What language are you learning? If it's not Japanese, we'll need to make sure that whatever gets implemented also works for Japanese. My current thoughts on calculating the card score are to take a combination of
I could be convinced that there are better ways, but my main goal is to focus on getting the most frequent morphs first. # 1 is different from the Ankimorphs difficulty calculation, because the frequency is only taken into account for the unknown morph(s). Ankimorphs difficulty adds up the frequency of all of the morphs in the sentence, which means that the other words in the sentence have a large effect, and short sentences are preferred over frequent morphs. # 2 tries to find cards that have close to a target number of morphs. Right now I have it set to 5, but I'm not sure how good that is. The basic idea is to avoid cards that are too short (where you don't get context) and too long (where you get overwhelmed with reading the card). If we are trying to get a single algorithm, I think either emphasizing the length a lot might work for mortii, or using the Ankimorph difficulty and then scaling it back might work. Assuming that you just want "treat all morphs with a given lemma the same", I think we'd just have to figure out how to do that, and then if there were a flag to turn this on and off, we could make it work with the same algorithm.
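A minimal sketch of the two-part score described above, assuming lower scores are scheduled first. The names `card_score` and `frequency_rank`, the weights, and the fallback rank are hypothetical and not taken from Ankimorphs; only the target length of 5 comes from the comment.

```python
# Hypothetical sketch, not the Ankimorphs implementation.
# Lower score = card is shown earlier.
TARGET_LENGTH = 5     # desired number of morphs per card (value from the comment)
UNKNOWN_WEIGHT = 100  # assumed weight, large so that frequency dominates
LENGTH_WEIGHT = 1     # assumed weight


def card_score(morphs, known, frequency_rank, default_rank=100_000):
    """Score a card from its unknown morphs and its length only.

    frequency_rank maps morph -> rank, where rank 1 is the most frequent.
    Morphs missing from the mapping get default_rank (an assumption).
    """
    unknowns = [m for m in morphs if m not in known]
    # Part 1: only the unknown morphs contribute their frequency rank.
    unknown_penalty = sum(frequency_rank.get(m, default_rank) for m in unknowns)
    # Part 2: penalize the distance from the target length.
    length_penalty = abs(len(morphs) - TARGET_LENGTH)
    return UNKNOWN_WEIGHT * unknown_penalty + LENGTH_WEIGHT * length_penalty
```

With these made-up weights and sentences of ordinary length, the rank of the unknown morph decides the order, and the length penalty only orders cards that share the same unknown morph.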
-
I am learning German. Morphman had a check box that allowed you to treat all forms of a lemma as the same. In the past I've used Morphman to learn Japanese, and even then I found the option really useful. But with German it is crucial: because of the cases, singular and plural forms, and tenses, one word can have many, many forms. As for the first thing you are trying to achieve: I think it's desirable to get a sentence with the most common possible words, since that would make the sentence more comprehensible. Morphman had an option that would either skip or deprioritize (I don't remember which) cards that are too short or too long. I think that would fix both problems. I like the idea of prioritizing cards with recently learned morphs. I have been yearning for that feature for years. So long that I forgot about it.
Currently we penalize morphs even if they are known but not in the frequency list. Could we, instead of penalizing them, just skip them in the calculation? That would have the unintended effect of prioritizing words that are not in the frequency list over words that are. As I am typing this I get what the issue is. What if we added all the known morphs to the end of the frequency list? That way they won't be prioritized over words in the list, and won't be penalized as much as words that are neither in the list nor known. I would also like to mention the discussion "include grammar difficulty/usefulness #115"; maybe you will get some inspiration.
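The "append known morphs to the end of the frequency list" idea could look roughly like this. It is only a sketch: `extend_frequency_list` and `priority` are hypothetical helpers, and the frequency list is assumed to be a plain list ordered from most to least frequent.

```python
# Hypothetical sketch of appending known morphs to the frequency list.
def extend_frequency_list(frequency_list, known_morphs):
    """Return a new list with known-but-unlisted morphs appended at the end."""
    listed = set(frequency_list)
    # Sorting is only there to make the result deterministic.
    extras = sorted(m for m in known_morphs if m not in listed)
    return list(frequency_list) + extras


def priority(morph, extended_list):
    """Position in the extended list; lower is better.

    Morphs that are neither listed nor known get the worst priority.
    """
    ranks = {m: i for i, m in enumerate(extended_list)}
    return ranks.get(morph, len(extended_list))
```

A known morph that is missing from the original list then ranks after every listed morph, but still ahead of a morph that is neither listed nor known.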
-
I'm not going to interject into this discussion (unless I'm explicitly asked a question) because I think it would taint the process. I'm seeing interesting points already so this is great stuff! 👍
-
@mortii @Vilhelm-Ian "Treat all the different forms of a lemma the same" feels like it is separate from the card scoring (other than that you want all of the forms to be considered the same for the purpose of scoring). To actually treat all the forms the same, you'd have to collapse them in every place they exist, including the frequency list. I guess that would require all of the places where the morph is used to be replaced with something that indicates whether you're using the inflected form or the base form. Does that sound right to you? If so, perhaps @Vilhelm-Ian could work on that?
This is in favor of using the Ankimorph way of calculating sentence difficulty. I like the combination of this along with some constraints / preferences on sentence length, because I wouldn't want really short or really long sentences. I agree that we need to think carefully about how to count words that don't exist in the word frequency list. I will look at the grammar thread again, but I think that integrating it will be beyond my ability.
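On collapsing inflections in the frequency list, as raised at the start of this comment: one possible shape is to sum the counts of every inflection that shares a lemma. This is a sketch under assumptions. The `lemma_of` mapping is hypothetical (in practice it would have to come from whatever morphemizer is in use), and `collapse_by_lemma` is not an existing Ankimorphs function.

```python
from collections import Counter


# Hypothetical sketch: merge inflection counts into lemma counts.
def collapse_by_lemma(inflection_counts, lemma_of):
    """Sum the occurrence counts of all inflections that share a lemma.

    inflection_counts: dict of inflected form -> occurrence count
    lemma_of: dict of inflected form -> lemma (assumed to be provided)
    Forms without a known lemma are treated as their own lemma.
    """
    lemma_counts = Counter()
    for inflection, count in inflection_counts.items():
        lemma_counts[lemma_of.get(inflection, inflection)] += count
    return lemma_counts
```

A flag could then choose between looking up a morph by its inflected form in the original counts or by its lemma in the collapsed counts, which keeps the scoring code itself unchanged.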
-
@xofm31 another idea I had: what if we don't count young morphs in the calculation? That way they won't add to the difficulty score of the card, so they are more likely to appear in new cards.
-
I refactored |
-
I hadn't gotten very far, so I will take your advice to start from scratch. I've been really busy, so I haven't had a chance to spend much time on it. I got sidetracked by wanting to bump up cards that had a word that was in the learning stage to reinforce those words. There was a separate discussion about the longest-interval, and I'd still like to investigate that a bit. But maybe I should table that to put together a first draft of a more flexible (and hopefully not overly complicated) score calculation.
-
Here is a sketch of what I was thinking. https://github.com/xofm31/anki-morphs/blob/calc_score/ankimorphs/calc_score.py The main idea is that for each of the criteria, you have a target, and then you have a penalty if the card misses that target. Each of the criteria has a weight. I was thinking that perhaps each of the penalties should be in its own function for readability purposes, but I think that would require looping through all of the morphs on the card multiple times. I defined the target values and weights at the top of the file, but if this were going to be integrated into Ankimorphs, at least some of those would have to be user selections in a menu.
I'm not sure exactly how this would be implemented. Is this part of the sentence_difficulty?
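The target / penalty / weight idea from the sketch linked above could be structured so that the morphs are only looped over once while each penalty still stays readable. This is not the code from `calc_score.py`. The `Criterion` class, the two criteria, and their values are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    target: float  # value the card should ideally hit
    weight: float  # how much a miss matters relative to other criteria

    def penalty(self, actual):
        """Weighted distance between the actual value and the target."""
        return self.weight * abs(actual - self.target)


# Assumed example values; in an add-on these would be user settings.
LENGTH = Criterion(target=5, weight=1.0)
LEARNING = Criterion(target=1, weight=2.0)


def score_card(morph_states):
    """morph_states holds one of 'known', 'learning', 'unknown' per morph."""
    length = 0
    learning = 0
    for state in morph_states:  # single pass that gathers every statistic
        length += 1
        if state == "learning":
            learning += 1
    # Each penalty is computed from the gathered statistics, not from the morphs.
    return LENGTH.penalty(length) + LEARNING.penalty(learning)
```

Gathering the statistics in one loop and computing the penalties afterwards avoids the repeated looping while keeping one small function per penalty.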
-
It is semi-functional. I haven't thought about all of the use cases, and I discovered today when I tried to use it on a frequency file rather than the Collection, that it crashed when the morph wasn't in the frequency list. I corrected it so now it sets it to the unknown morph score - 1. It gives me approximately what I was hoping for, which is to always give the unknown morph that is highest on the frequency list, but order the cards with that morph according to desired length and having another morph that is in the learning stage. I set the weight for the difficulty of the sentence (which should be essentially your algorithm) to 0. If you like this approach, then we'd need to work out the best way to score the different factors (for example, should there be one target length, or a range; how to measure the penalty for not being the right length). It might be tricky to figure out how to give options for a user to measure the relative weights of the different factors. I was thinking maybe your sentence-level difficulty could be changed to the average morph difficulty, which ought to make the scores more compatible with the scores from the "unknown morph usefulness". It would also mean that someone could opt for easy words but longer lengths. I won't have much time to look at this over the next week or two.
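To illustrate the "average morph difficulty" suggestion together with a fallback for morphs that are missing from the frequency list, here is a small sketch. The function name and the `fallback_rank` parameter are hypothetical, and the fallback value is chosen by the caller rather than taken from the branch.

```python
# Hypothetical sketch: average instead of summed morph difficulty.
def average_morph_priority(morphs, frequency_rank, fallback_rank):
    """Mean frequency rank of the morphs on a card.

    Morphs missing from frequency_rank use fallback_rank instead of
    raising a KeyError. An empty card scores 0.0.
    """
    if not morphs:
        return 0.0
    total = sum(frequency_rank.get(m, fallback_rank) for m in morphs)
    return total / len(morphs)
```

Because the result is an average, a long sentence made of easy words does not score worse than a short one, which is what would let someone opt for easy words but longer lengths.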
-
I cleaned it up a bit. Compared to my initial draft:
I did get shorter sentences by making the length penalty significantly higher than the difficulty penalty and having a desired sentence length of 0. But unless you only want short sentences, I think the ideal settings will vary greatly based on how long your frequency list is and how many morphs you already know. I can't think of a way to make the equations take this into account, other than using something like the average morph_priority of the top 100 unknown morphs. That would get ugly, and I'm not sure if it would work. This also applies to the usefulness of an unknown morph, but for me it is not so much of an issue because I want it to always prioritize the usefulness, and just have the other criteria be secondary, basically to order the sentences with that morph. Here's the branch: Previously there was discussion about treating all inflections of the same lemma the same. I didn't look at that at all - I'm not sure if @Vilhelm-Ian has or not. Basically, I think it would require changing
-
I don't want to step on your toes @Vilhelm-Ian, but finishing this algorithm is now first priority, so I'll start implementing the lemma stuff now to get it done. Maybe you could help test it out eventually since you have a good understanding of the problem?
-
I feel like these should be in the reverse order: card id: 1691325167622
maybe the Also this is bugged: it should be
Edit: I pushed the code and some changes to the https://github.com/mortii/anki-morphs/tree/algorithm branch. Let's use this branch as the origin and then make pull requests on that when we want to make changes. Don't worry about the git history, we will clean that up at the end.
-
@xofm31 Using names like "difficulty" and "usefulness" is not great because they are completely subjective, which essentially makes them opaque, so I renamed most of the identifiers. After doing that, I see that the same thing is basically done twice, but I don't see the reason for it: anki-morphs/ankimorphs/calc_score.py Lines 88 to 93 in bc43fee anki-morphs/ankimorphs/calc_score.py Lines 99 to 103 in bc43fee and the score looks like this: anki-morphs/ankimorphs/calc_score.py Lines 113 to 118 in bc43fee what is being achieved by
Edit: Nvm, I realize you want to disentangle the unknowns from the rest of the sentence since you mostly care about the unknowns, that's fine.
-
Originally posted by @mortii in #191 (reply in thread) Trying to come up with a general purpose algorithm is a fool's errand, and we should instead transition into making an api where people can specify their own algorithm. I'll make a sketch shortly. EDIT: thoughts?
-
So the default could be something like this: and then this would be how I would personally modify it: Edit: ^should be difference, not distance
-
Okay, I have a (very badly coded) test version that has implemented the new algorithm and an option to choose between lemma priority and inflection priority! This version can be found in the latest algorithm branch (https://github.com/mortii/anki-morphs/tree/algorithm), or you can download it from Google Drive (GitHub doesn't like .addon files): ankimorphs-v3-0-0-testing-1 (https://drive.google.com/file/d/1jMrVYax4aAXOTDYxMT1NMzXIuXL-_Mp_/view?usp=sharing)
The algorithm "morph targets" stuff looks complicated, but all you do is define a range with no punishment, and then on either side of that range you can specify a punishment curve (ax^2+bx+c). The default is this: Screenshot.from.2024-05-18.11-49-09.png (https://github.com/mortii/anki-morphs/assets/15674619/913466c4-1057-4a15-9499-e7516f738e52) (graph link: https://www.geogebra.org/graphing/ta3eqb8y)
Any and all feedback would be very much appreciated!
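The "range with no punishment, plus a curve on either side" idea can be sketched as a small piecewise function. This is an illustration only, not the add-on's code. The function name, the coefficient tuples, and the choice to measure x as the difference from the nearest edge of the range are assumptions.

```python
# Hypothetical sketch of a morph-target punishment curve.
def punishment(value, low, high, below=(1, 0, 0), above=(1, 0, 0)):
    """Return 0 inside [low, high], otherwise a*x^2 + b*x + c.

    x is how far value lies outside the range (assumption).
    below / above hold the (a, b, c) coefficients for each side,
    so the two sides can be punished differently.
    """
    if low <= value <= high:
        return 0
    if value < low:
        x, (a, b, c) = low - value, below
    else:
        x, (a, b, c) = value - high, above
    return a * x**2 + b * x + c
```

For example, `punishment(2, 4, 6)` uses the default curve below the range with x = 2, and `punishment(8, 4, 6, above=(0, 3, 1))` applies a linear curve above it.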
-
I installed the new version and noticed some different behavior.
I have to disable the regular version of Ankimorphs to run it (lowercase new version, uppercase stable). That's what I expected.
I left the Algorithm settings just as they installed. I included a screenshot.
I had to run recalc twice to get the "shift cards that are not the first to have an unknown..." part to run and move duplicates out.
Since I had added the new fields into the notetype that ankimorphs uses, I had to upload a full sync of the decks through AnkiWeb. AnkiWeb immediately choked and spat the collection out as too big. I had a ridiculously big collection, 192000 cards anyway, so I deleted a bunch of stuff to get it down to a reasonable size. No problem, I had been meaning to do this anyway.
I synced again and I got some strange-looking results in how things are sorting. I included a screenshot showing a few fields for the top priority card (no am-unknowns!), and another card where things may be working.
I think I may have messed up my collection with my deletion of around 100000 cards to reduce the size of the collection.
-
Oh, that m-unknowns column in my screenshot of my browser window is
am-unknowns, of course.
-
I deleted the am-known-automatically tag and the am-ready tag from the
collection and did a recalc, and nothing happened.
I'll play with this some more (quietly). :-)
-
Is it working the way it is supposed to? Is it giving me some cards with no am-unknowns that have morphs that are still in the learning stage?
-
I'll try the images on GitHub.
Added back some of the cards that I deleted, and the algorithm gave me a bunch of cards (maybe 40) without am-unknowns.
My impression is they were very good cards, nice reviews.
The cards I added back in were from a couple of Murakami books, which are often different from the usual anime deck cards.
On Sat, May 18, 2024, 11:11 AM mortii wrote:
> Ah, interesting. Could you give some examples?
-
Let me try to attach the screenshots here.
-
Here are a couple more, focusing on the part of the screen that might be interesting.
-
I think what is happening is that the cards without am-unknowns are not moving to the end of the deck. location after recalc:10000639 am-unknowns:0 am-score:2047483647 tags:am-ready, Bakuman_S02∷14
-
I'll do that in a couple of hours.
I did delete a bunch of cards just before I used the new version, and I wonder if I broke something...
On Sun, May 19, 2024, 9:12 AM mortii wrote:
@fuquasteve <https://github.com/fuquasteve> I'm having problems reproducing the behaviour, could you share your settings?
Go to: Tools -> Add-ons -> ankimorphs -> "Config" button on the lower right sidebar, and then copy-paste everything, e.g.:
"algorithm_all_morphs_target_distance": 1,
"algorithm_average_priority_all_morphs": 0,
"algorithm_inflection_priority": true,
"algorithm_learning_morphs_target_distance": 5,
"algorithm_lemma_priority": false,
...
...
-
Maybe messed up tags or something...
-
Here is the result of I hope this helps { |
-
Released a new testing build in the v3 megathread (#222)
@fuquasteve The bug looks pretty bad, and I don't think it's because you did anything wrong, so it should be fixed. I'm not able to reproduce it with my card collection, so could you potentially share yours? If you go to your Anki profile folder there is a file called "collection.anki2"; if you upload that to Google Drive, or any other file sharing platform where I could download it, that would be amazing 🙏
-
Released a new test version:
Originally posted by @mortii in #222 (comment) @Vilhelm-Ian does this build work better? It includes the changes discussed in #141.
-
Included in the v3 update: https://github.com/mortii/anki-morphs/releases/tag/v3.0.0 Thank you all!
-
Continuing the discussion in this issue #189