Donmai

[Userscript] IndexedAutocomplete

Posted under General

CodeKyuubi said:

It would be nice if the tag completion sorted by usage, so that, say, trying to tag 'dark_skin' with just 'dark', it wouldn't lead off with 'dark_hair' and 'dark_eyes' first.

It sort of already does, as dark_hair -> black_hair has a count of 490K and dark_eyes -> black_eyes has a count of 71K, both of which are greater than dark_skin at just 65K. That's the order that Danbooru returns the data, so that's currently the order things are shown in.

Would your idea be to instead show all of the aliases last in the autocomplete popup?

BrokenEagle98 said:

It sort of already does, as dark_hair -> black_hair has a count of 490K and dark_eyes -> black_eyes has a count of 71K, both of which are greater than dark_skin at just 65K. That's the order that Danbooru returns the data, so that's currently the order things are shown in.

Would your idea be to instead show all of the aliases last in the autocomplete popup?

My bad, I meant sorted by personal usage, like it does with Danbooru's old autocompletion extension, or is that infeasible to program? And yeah I think it would be useful in my experience to move aliases beyond non-aliases.

CodeKyuubi said:

My bad, I meant sorted by personal usage, like it does with Danbooru's old autocompletion extension, or is that infeasible to program? And yeah I think it would be useful in my experience to move aliases beyond non-aliases.

I just looked into that, and it should be doable.

Tag order

So here's what I'm thinking the order should be:

1. Used tags sorted by usage
2. Unused tags sorted by post count
3. Aliased tags sorted by post count

Note: Limited to 10 tags as per the current setup.

For #1, there are a couple ways that usage could be sorted. It could be by most frequent. It could also be by most recent. I'm leaning more towards the latter, as with the former you could have just been tagging one thing in abundance, so when you switch it up to something else you'd be forced to continually use the order mandated by your previous usage. Unless there's a convincing need to go with frequency method, or perhaps even a different method altogether, I'll just plan on going with the recent usage method.
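As a rough illustration of the three-tier order above (not the script's actual code), the sorting could look something like the following. The item shape (name, postCount, isAlias, lastUsed) and the function name are assumptions made up for this sketch. The post counts for the first three tags come from the discussion above; the fourth entry is invented to show a used tag.

```javascript
// Hypothetical item shape: { name, postCount, isAlias, lastUsed }.
// lastUsed is a timestamp in ms, or undefined if the tag was never chosen.
function orderSuggestions(items, limit = 10) {
    const byPostCount = (a, b) => b.postCount - a.postCount;
    // 1. Used tags, most recently used first.
    const used = items
        .filter((item) => item.lastUsed !== undefined)
        .sort((a, b) => b.lastUsed - a.lastUsed);
    // 2. Unused, non-aliased tags by post count.
    const unused = items
        .filter((item) => item.lastUsed === undefined && !item.isAlias)
        .sort(byPostCount);
    // 3. Unused aliased tags by post count.
    const aliases = items
        .filter((item) => item.lastUsed === undefined && item.isAlias)
        .sort(byPostCount);
    return [...used, ...unused, ...aliases].slice(0, limit);
}

const suggestions = orderSuggestions([
    { name: 'dark_hair', postCount: 490000, isAlias: true },
    { name: 'dark_eyes', postCount: 71000, isAlias: true },
    { name: 'dark_skin', postCount: 65000, isAlias: false },
    { name: 'dark_persona', postCount: 9000, isAlias: false, lastUsed: 1535000000000 },
]);
console.log(suggestions.map((item) => item.name));
// → [ 'dark_persona', 'dark_skin', 'dark_hair', 'dark_eyes' ]
```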

Other thoughts

Propagation

Propagate the tag down all the way to the first letter. For instance, the tag girls_und_panzer doesn't show up until you type "gi", but if you were to select that tag, then it would now show up at the top of the "g" listing.

Expiration

Have each used tag have an expiration so that stale tags don't continue to be propagated to the list. I'm thinking maybe a day for this. Whenever a tag gets used, the expiration will get renewed, but after a day of no use the system will in a sense reset itself.

The expiration period could always be extended after a trial period, but I don't want to expand too far initially then have to wait a period of time for things to reset themselves.
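To make the propagation and expiration ideas concrete, here is a minimal sketch under stated assumptions: the in-memory Map, the function names, and the one-day constant are all hypothetical, and the real script may store its data quite differently.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical store: prefix -> Map(tag -> expiry timestamp in ms).
const choices = new Map();

// Record a selection under every prefix of the tag ("g", "gi", "gir", ...),
// renewing its expiration each time it is used.
function recordChoice(tag, now = Date.now()) {
    for (let end = 1; end <= tag.length; end++) {
        const prefix = tag.slice(0, end);
        if (!choices.has(prefix)) choices.set(prefix, new Map());
        choices.get(prefix).set(tag, now + DAY_MS);
    }
}

// Return the unexpired choices for a prefix, dropping stale ones.
function activeChoices(prefix, now = Date.now()) {
    const entries = choices.get(prefix) ?? new Map();
    for (const [tag, expires] of entries) {
        if (expires <= now) entries.delete(tag);
    }
    return [...entries.keys()];
}

recordChoice('girls_und_panzer', 0);
console.log(activeChoices('g', 1000));   // → [ 'girls_und_panzer' ]
console.log(activeChoices('g', DAY_MS)); // → [] (a day has passed without reuse)
```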

Pushed Version 16 which added a user choice mechanism to the autocomplete. Whenever a user selects a data item with Tab or Enter, that choice will appear at the top of results for at least the next 24-hour period. Reusing that same choice renews that period. Chosen items appear first, ordered from most to least recently used. Beyond that, all aliased tags are pushed to the bottom of the list unless they are also a chosen item.

Minor versions
  • (2018-08-25)
    • 16.1 Added minimum length validation to autocomplete

Updated

Pushed Version 17 which modifies the choice mechanism to account for usage. I've been experimenting this past week to find something less jittery than Version 16, and eventually settled on using an incremental count with a decrementing power function backoff.

The new schema works as such:

  • For the chosen element, a value of 1 is added to its count
  • For all other unchosen shown elements, multiply their count by 0.9
  • The count will max out at 20
  • The expiration period has been extended to 2 days

I found that the above works pretty well through simulation and also testing. Since some may want to adjust the values or turn off the choice schema altogether, I'll be working on a user interface for this script, and hopefully deliver something by next week.
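The counting rules listed above can be sketched roughly as follows. This is only an illustration of the arithmetic: the Map of counts, the constant names, and the function name are assumptions, not the script's real identifiers.

```javascript
const USAGE_INCREMENT = 1;
const USAGE_DECAY = 0.9;
const USAGE_MAX = 20;

// counts: hypothetical Map(tag -> usage count).
// shown: the tags listed in the popup; chosen: the one the user picked.
function applyChoice(counts, shown, chosen) {
    // Every shown-but-unchosen tag with a count decays by a factor of 0.9.
    for (const tag of shown) {
        if (tag !== chosen && counts.has(tag)) {
            counts.set(tag, counts.get(tag) * USAGE_DECAY);
        }
    }
    // The chosen tag gains 1, capped at the maximum of 20.
    const updated = (counts.get(chosen) ?? 0) + USAGE_INCREMENT;
    counts.set(chosen, Math.min(updated, USAGE_MAX));
    return counts;
}

const counts = new Map([['dark_skin', 5]]);
for (let i = 0; i < 3; i++) {
    applyChoice(counts, ['dark_hair', 'dark_skin'], 'dark_hair');
}
// dark_hair is now at 3, while dark_skin has decayed to about 5 * 0.9^3 = 3.645.
console.log(counts);
```

With these numbers, a tag that was chosen five times still outranks a newcomer after three selections of the newcomer, and only loses the top spot on the fourth, which is the kind of damping that keeps the list from jumping around.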

Minor versions
  • (2018-08-26)
    • 17.1 Fixed how usage maximum was being applied
  • (2018-08-28)
    • 17.2 Fixed several minor misspellings and usages
  • (2018-08-29)
    • 17.3 Removed asterisk on tag autocomplete as it is no longer required
  • (2018-08-30)
    • 17.4
      • Switched to alternate tag retrieval mechanism
      • See issue #3854

Updated

Pushed Version 18 which adds a settings menu available on the same page as the Danbooru settings (My Account >> Settings) under Userscript Menus. The specific settings for this script are under the tab IndexedAutocomplete. The settings "should" instantly reflect in all other open tabs with the latest version of the script running.

Minor versions
  • (2018-09-04)
    • 18.1 Fixed limit for alternate tag method
  • (2018-09-07)
    • 18.2 Added setting for moving aliases to last
  • (2018-09-12)
    • 18.3 Updated library to Version 5
      • Fixed handling of tab autocomplete
      • Check for Danbooru namebound events
  • (2018-09-16)
    • 18.4 Update selection method for tab autocomplete
      • Fix rendering of inputs using a default render
      • Fix saved searches so it can handle multiple entries
      • Add in missing autocomplete inputs
      • Fix selection method so it doesn't replace with key arrows

Updated

Pushed Version 19 which adds styles and groupings for the various tag autocomplete sources.

Highlighting

By default, the autocomplete highlighting is turned on. The program adds CSS classes that facilitate this, which can be overridden by the user.

  • .iac-user-choice - The user selected data
  • .iac-tag-highlight - All tag sources
  • .iac-tag-exact - The exact source
  • .iac-tag-prefix - The prefix source
  • .iac-tag-alias - The alias source
  • .iac-tag-correct - The fuzzy source, i.e. misspellings

The following is the current CSS code used:

.iac-user-choice a {
    font-weight: bold;
}
.iac-tag-alias a {
    font-style: italic;
}
.iac-tag-highlight {
    margin-top: -5px;
}
.iac-tag-highlight > div::before {
    content: "●";
    padding-right: 4px;
    font-weight: bold;
    font-size: 150%;
}
.iac-tag-exact > div::before {
    color: #EEE;
}
.iac-tag-prefix > div::before {
    color: hotpink;
}
.iac-tag-alias > div::before {
    color: gold;
}
.iac-tag-correct > div::before {
    color: cyan;
}

If the highlight option is turned off, the classes will no longer be added. However, styling by source is still available through the HTML data attributes, e.g. using [data-autocomplete-source="exact"] as the CSS selector. Substitute any of the other sources for "exact" to address them.

Sorting

By default, source sorting is turned on. The default order is exact, prefix, alias, correct. This order can be modified.

If this option is turned off, then the weighted sorting scheme by Danbooru will be used.

sort_weight = source_weight * post_count

The current source weights are:

  • exact: 1
  • prefix: 0.8
  • alias: 0.2
  • correct: 0.1

At some point in the future, I may facilitate the ability to alter the weights and/or the equation as an alternate means to sort the data.

Pushed Version 20 which facilitates using an alternate weighting scheme for tag sources. Note that if the source grouping added in Version 19 is enabled, then this option will have no effect.

The default source weights are the same as those noted in the previous post. Valid values are any number between 0 and 1. Three different scaling functions are available to control how much effect the post count has on the sort weight.

  • Linear: tag_weight = source_weight * post_count
  • Square root: tag_weight = source_weight * Sqrt( post_count )
  • Logarithmic: tag_weight = source_weight * Log( post_count )

The scaler really only functions when two sources have different weights. Sources with the same weight will be grouped purely by post count.

Square root has a medium effect at reducing the weight of the post count, whereas logarithmic has a huge effect.
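To show how the three scalers change the outcome, here is a small sketch using the default source weights and the post counts quoted earlier in the thread (an alias at 490K versus a prefix match at 65K). The function and object names are assumptions for illustration, and the natural logarithm is assumed since the base isn't stated above.

```javascript
const SOURCE_WEIGHTS = { exact: 1, prefix: 0.8, alias: 0.2, correct: 0.1 };

// Scaling functions applied to the post count before weighting.
// Assumption: "Log" means the natural logarithm.
const SCALERS = {
    linear: (count) => count,
    sqrt: (count) => Math.sqrt(count),
    log: (count) => Math.log(count),
};

function tagWeight(source, postCount, scaler = 'linear') {
    return SOURCE_WEIGHTS[source] * SCALERS[scaler](postCount);
}

for (const scaler of ['linear', 'sqrt', 'log']) {
    const alias = tagWeight('alias', 490000, scaler);
    const prefix = tagWeight('prefix', 65000, scaler);
    console.log(scaler, alias > prefix ? 'alias first' : 'prefix first');
}
// linear: the alias wins (about 98000 vs 52000)
// sqrt:   the prefix wins (about 140 vs 204)
// log:    the prefix wins (about 2.6 vs 8.9)
```

In other words, with linear scaling a large enough post count can overcome a low source weight, whereas the square root and logarithmic scalers let the source weight dominate.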

Minor versions
  • (2018-12-31)
    • 20.1 - Fixed incorrect pruning parameter.
  • (2019-01-03)
    • 20.2
      • Updated library version
        • ~20% reduction in code size
      • Refactored code to standard
  • (2019-01-13)
    • 20.3
      • Fixed related tags cache
      • Fixed source data cache
      • Added compatibility with RecentTagsCalc

Updated

Pushed Version 21 which primarily adds autocomplete to the new bulk update request form.

.iac-tag-bur > div::before {
    color: #000;
}
.iac-tag-highlight .tag-type-400:link {
    color: #888;
}
.iac-tag-highlight .tag-type-400:hover {
    color: #ccc;
}

Besides that, there were some issues with the artist autocomplete which were also fixed.

Minor versions
  • (2019-01-24)
    • 21.1
      • Added autocomplete to BUR edit form
      • Removed superfluous print statements

Updated

Pushed Version 22 which adds a program/cache data editor to the menu. This is being prototyped on this script before rolling it out to the other applicable scripts. Currently, values can be looked at and/or deleted. Also, the cache data is editable whereas the program data is not, as there is currently no centralized mechanism for validating all program data.

So basically, this adds the scalpel approach besides the already existing hammer approach, i.e. purge cache data / factory reset.

Minor versions
  • (2019-01-27)
    • 22.1
      • Fixed wrong key being used for related data storage
      • Related data caching was effectively disabled
  • (2019-01-27)
    • 22.2
      • Exported autocomplete for use by other scripts

Updated

Pushed Version 23 which primarily integrates the new library code that was finalized today. This is coming out as a major version since a large majority of the code was changed while updating to the new library version and standardizing all of the affected userscripts.

Additionally, all of the menus were standardized, and fully working cache editors were incorporated. Also, a new control was added under cache settings which shows the size of all the userscript data across all of the various data stores.

Minor versions
  • (2019-02-16)
    • 23.1 - Fixed bug with incorrect function name on uploads page
  • (2019-02-16)
    • 23.2
      • Add in recheck mechanism that requeries data expiring within X number of days
      • Fixed race condition with the autocomplete render function
      • Fixed @ mention autocomplete and usage data
  • (2019-02-16)
    • 23.3 - Fixed the choice data not being saved for metatag data
  • (2019-04-02)
    • 23.4 - Fixed choice data selection on tag edit inputs
  • (2019-05-13)
    • 23.5 - Make use of only parameter
  • (2019-05-22)
    • 23.6 - Fix saving user settings
  • (2019-05-30)
    • 23.7 - Library fix for expirations
  • (2019-07-09)
    • 23.8 - Fixed wiki link function
  • (2019-07-15)
    • 23.9 - Allow for spaces where applicable

Updated

Pushed Version 24 which adds an "already used" mechanic to the tag edit box, highlighting any lines with already used tags yellow. The program uses a CSS class for this, and the following shows its default setting.

.iac-already-used {
    background-color: #FFFFAA;
}

Besides the above, there have been several other additions:

  • Prefix/acronym matching on choice usage data
    • i.e. "gup" would bring up "girls_und_panzer" if that was a previously selected item
    • Usage mechanism must be enabled for this to be in effect
    • This setting is enabled by default, but it can be turned off in the settings menu
  • The number of results returned is now settable in the settings menu (5 - 20)
    • The tag autocomplete does not currently support variable results returned
  • Added an input to the comments page which facilitates searching comments by post
  • Combined choice data internally so that it can be edited from the cache editor
  • Several other code refactors
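For anyone curious how the prefix/acronym matching might work, here is a minimal sketch. It assumes an acronym is simply the first character of each underscore-separated word; the function names are made up, and the script's actual matching rules may be more involved.

```javascript
// Build an acronym from the first character of each underscore-separated word.
function tagAcronym(tag) {
    return tag.split('_').filter(Boolean).map((word) => word[0]).join('');
}

// Keep previously chosen tags that match the typed term by prefix or by acronym.
function matchChoices(term, chosenTags) {
    return chosenTags.filter(
        (tag) => tag.startsWith(term) || tagAcronym(tag).startsWith(term)
    );
}

const chosen = ['girls_und_panzer', 'gun', 'dark_skin'];
console.log(tagAcronym('girls_und_panzer')); // → gup
console.log(matchChoices('gup', chosen));    // → [ 'girls_und_panzer' ]
console.log(matchChoices('gu', chosen));     // → [ 'girls_und_panzer', 'gun' ]
```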

Updated

Pushed Version 25 which adds metatags to the tag autocomplete for both query and edit inputs. For metatags with autocomplete themselves (order, status, rating, locked, child, parent, filetype, disapproval), once those tags have been selected it will automatically activate the autocomplete for them. The following is the CSS class and default setting for these.

.iac-tag-metatag > div::before {
    color: #000;
}
.iac-tag-highlight .tag-type-500:link {
    color: #000;
}
.iac-tag-highlight .tag-type-500:hover {
    color: #444;
}

The order for these can be set in the settings menu. Additionally, they can also be disabled there.

Besides the above, there have been several other additions/fixes:

  • Additions:
    • Made BUR autocomplete optional
    • Autocomplete in more places
  • Fixes:
    • Fixed items not rendering due to recent updates (forum #159833)
    • Trim autocomplete on inputs that are space sensitive
    • Fixed the render of autocomplete not being done correctly sometimes

Pushed Version 26 which primarily includes an update to the library code plus a few fixes:

  • Additions
    • Add new menu library elements/functions for cache editor
      • The raw program data can be extracted using the cache editor, which facilitates transferring program data to another browser/domain
      • Controls which are not applicable are hidden, for example Local storage must be selected before the raw data option becomes visible
  • Changes
    • Settings menu now adapts to the color changes of the chosen theme (light/dark)
  • Fixes
    • Fix the render function not being set in some cases
      • This caused the source/usage highlights to not be applied
  • Other
    • Multiple internal code changes and refactors

Pushed Version 27 which adds several new features:

  • Configurable limit for related tags results
    • The default is 25 results
    • This is only the tags column and not the recent/frequent/wiki columns
  • Alternate methods of related tags comparison/ordering
    • Default: This uses the default comparisons/orderings based upon whether a category is used or not
    • Frequent: Orders the tags by their frequency on posts
      • This is currently the default when searching by category
    • Similar: Uses a cosine similarity to order the tags by how similar they are to each other
      • This is currently the default when searching by no category
    • Like: Does a wildcard search using "*" at the front and end of the tag
  • Configurable default comparison/ordering
    • This is the value that is always selected upon initial page load
  • Show percentage statistics for the tag results
    • This is the percentage of posts the two tags have in common based upon the sample size used
  • Expandable related tags section
    • All tag columns are shown in a single expandable row instead of wrapping to multiple rows
    • Section navigation can be done by using the arrow keys or the scroll bar at the top or bottom
  • Facilitate the addition of tag autocomplete on non-autocomplete text fields
    • Alt+A activates/deactivates this mode while in the text field
    • This mode is deactivated once the autocomplete item is selected
    • It renders as a textile link with customizable title like the following: [[some_tag|insert text]]
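For reference on the "Similar" ordering mentioned above, cosine similarity between two tags can be computed from the sets of posts each tag appears on: the number of shared posts divided by the square root of the product of the two set sizes. The sketch below illustrates that formula only. The ordering itself is presumably computed on Danbooru's side over a sample of posts, and the function name and post IDs here are made up.

```javascript
// Cosine similarity of two tags treated as binary vectors over posts:
// |A ∩ B| / sqrt(|A| * |B|), where A and B are sets of post IDs.
function cosineSimilarity(postsA, postsB) {
    if (postsA.size === 0 || postsB.size === 0) return 0;
    let shared = 0;
    for (const id of postsA) {
        if (postsB.has(id)) shared++;
    }
    return shared / Math.sqrt(postsA.size * postsB.size);
}

const tagA = new Set([1, 2, 3, 4]);
const tagB = new Set([3, 4, 5, 6]);
console.log(cosineSimilarity(tagA, tagB)); // → 0.5 (2 shared posts / sqrt(4 * 4))
```

Unlike a plain frequency count, this penalizes very common tags that co-occur with nearly everything.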

Updated

Pushed Version 28 which primarily updates the text autocomplete feature, along with a major code refactor.

  • Additions
    • Text autocomplete updates
      • Source: (Alt+1)
        • Can be either tags or wiki pages now.
      • Mode: (Alt+2)
        • tag - Includes the underscores.
        • normal - Does not include the underscores.
        • pipe - Normal mode with a pipe "|" at the end.
          • This removes the final qualifier in parentheses if it exists.
          • Example: [[Arthur Pendragon (Fate)|]] -> Arthur Pendragon
        • custom - Allows user to enter the display text.
          • Example: [[arthur_pendragon_(fate)|Saber-chan]] -> Saber-chan
      • Capitalization: (Alt+3)
        • lowercase
        • uppercase
        • titlecase - First letter of tag/wiki capitalized.
        • propercase - First letter of each word capitalized.
        • exceptcase - Propercase except for "a", "an", "of", "the", "is".
        • romancase - Exceptcase, plus all letters in Roman numerals are capitalized.
      • Current options are shown when turning autocomplete on (Alt+A).
    • Cache editor options
      • Added additional controls.
        • list - Shows all of the available program/data keys.
        • refresh - Refreshes all of the available keys.
  • Fixes
    • Fixed several CSS stylings in the settings menu.
  • Other
    • Updated external libraries.
    • Major internal code changes and refactors.
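As a rough sketch of how the mode and capitalization options above combine into a wiki link, consider the following. Everything here is hypothetical (function names, option names, defaults), it assumes the first word is always capitalized in exceptcase and that tag mode leaves the tag untouched, and romancase is left out for brevity.

```javascript
const EXCEPTIONS = new Set(['a', 'an', 'of', 'the', 'is']);

// Capitalize the first word character, skipping leading punctuation like "(".
const capitalize = (word) =>
    word.replace(/^(\W*)(\w)/, (match, lead, first) => lead + first.toUpperCase());

function applyCase(text, style) {
    const words = text.split(' ');
    switch (style) {
        case 'lowercase': return text.toLowerCase();
        case 'uppercase': return text.toUpperCase();
        case 'titlecase': return capitalize(text);
        case 'propercase': return words.map(capitalize).join(' ');
        case 'exceptcase':
            // Assumption: the first word is capitalized even if it is an exception.
            return words
                .map((word, i) => (i > 0 && EXCEPTIONS.has(word) ? word : capitalize(word)))
                .join(' ');
        default: return text;
    }
}

function wikiLink(tag, { mode = 'normal', style = 'propercase', customText = '' } = {}) {
    if (mode === 'tag') return `[[${tag}]]`;                      // underscores kept
    if (mode === 'custom') return `[[${tag}|${customText}]]`;     // user-entered display text
    const title = applyCase(tag.replace(/_/g, ' '), style);
    return mode === 'pipe' ? `[[${title}|]]` : `[[${title}]]`;    // trailing pipe hides the qualifier
}

console.log(wikiLink('arthur_pendragon_(fate)', { mode: 'pipe' }));
// → [[Arthur Pendragon (Fate)|]]
console.log(wikiLink('arthur_pendragon_(fate)', { mode: 'custom', customText: 'Saber-chan' }));
// → [[arthur_pendragon_(fate)|Saber-chan]]
```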