Note Assist, automatic note generation & resize.

Posted under General

Note Assist is a userscript designed to make translating easier and faster by automatically resizing or creating notes for you.

This userscript changes basic note editing, so it's highly recommended to at least read "Main changes". The more you know, the easier it is to use :).

Main changes

Automatic resize to text:

When you make a note by dragging, it will automatically look for text inside the note you created, and resize itself.
The success rate is high, but not 100%; see "Tips & details" for more info.
If you hold shift while releasing the mouse button, the note will not resize. There is a setting to make the script only resize notes while shift is pressed.

Ghost notes:

A quick way to get rid of notes you don't need: all unsaved notes (red striped background) can be deleted with a right-click.

Additional features

These features are found in the noteAssist menu, which pops up when you press the 'Add note' link or hotkey for the first time.

Generate all notes:

As it says, this button (attempts to) generate notes for all the text on the image.
Text color (override): 'Generate all notes' will detect the color of the text in the image and bold the text 'black' or 'white' to show what it selected. If that is incorrect, the radio buttons can be used to override it.
Note: on large images (2500*2500 & up), this function can take some time & memory.

Text decoration functions:

A set of buttons is available to apply styles like 'bold', 'larger font', 'font color', etc.
An Eyedropper function is also activated when pressing the (font)'color' or 'Background-color' button. Drag the image to select a color.
A zoomed image will be shown in the Note Assist menu. You can click it to select a specific color as well.
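
As a rough illustration of what an eyedropper has to do once the pixel data is available, here is a minimal sketch. It assumes the RGBA layout that the canvas `getImageData` call returns (four bytes per pixel); `pixelToHex` is a hypothetical helper name, not something taken from the script.

```javascript
// Hypothetical helper: read one pixel from RGBA data and format it as #rrggbb.
// `data` is laid out like ImageData.data: 4 bytes (R, G, B, A) per pixel, row by row.
function pixelToHex(data, width, x, y) {
  const i = (y * width + x) * 4;
  const hex = (v) => v.toString(16).padStart(2, '0');
  return '#' + hex(data[i]) + hex(data[i + 1]) + hex(data[i + 2]);
}

// Example: a 2x1 image with a red pixel and an off-white pixel.
const samplePixels = new Uint8ClampedArray([255, 0, 0, 255, 250, 250, 245, 255]);
console.log(pixelToHex(samplePixels, 2, 1, 0)); // "#fafaf5"
```

The alpha byte is ignored here, which seems reasonable for picking a font or background color.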

User settings:

A custom menu to edit the script settings without having to go in the actual code itself -> 'settings' link in the noteAssist menu.
(Note for people that prefer editing the code: once settings are saved, it will ignore the defaults from the code)

Performance details

JavaScript isn't great at text detection, so there are some limitations.
Calculation speed: Generate-all takes 0.5~2 seconds on an average sample comic (freezing your browser in the meantime).
On exceptionally large images, if the code takes more than 20 seconds to run, it will pause and ask if it should continue. This allows for canceling if it takes too long.
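
The pause-and-ask behaviour described above could look roughly like the sketch below. This is not the script's actual code: `runWithBudget`, `step`, `ask`, and `now` are hypothetical names. In a browser, `ask` would be something like `window.confirm` and `now` would be `Date.now`.

```javascript
// Hypothetical sketch: process work items and, once `budgetMs` has elapsed,
// ask whether to continue. `ask` and `now` are injected so the loop can be
// tried outside a browser.
function runWithBudget(items, step, budgetMs, ask, now) {
  let deadline = now() + budgetMs;
  let done = 0;
  for (const item of items) {
    if (now() > deadline) {
      if (!ask('This is taking a long time. Continue?')) break;
      deadline = now() + budgetMs; // user agreed: grant another budget
    }
    step(item);
    done++;
  }
  return done; // number of items actually processed
}
```

With a 20 second budget this would be called as `runWithBudget(work, processOne, 20000, window.confirm, Date.now)`, where `work` and `processOne` are again placeholders.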

Memory usage (RAM): depends on image dimensions. The script may silently break if you don't have enough free RAM. Some numbers: a 1k*1k image takes up to 8 MB of RAM, 2k*2k about 30 MB, and 4k*4k about 125 MB.
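
These figures are roughly what you get from about 8 bytes per pixel, for example two full-size RGBA buffers at 4 bytes per pixel each. That breakdown is an assumption made for illustration, not something stated by the script; `estimateMiB` is a hypothetical helper.

```javascript
// Rough estimate, assuming each full-size RGBA buffer costs 4 bytes per pixel
// and that `copies` such buffers are alive at once (both are assumptions).
function estimateMiB(width, height, copies) {
  return (width * height * 4 * copies) / (1024 * 1024);
}

for (const side of [1000, 2000, 4000]) {
  console.log(side + 'x' + side, estimateMiB(side, side, 2).toFixed(1) + ' MiB');
}
```

This prints about 7.6, 30.5, and 122.1 MiB, close to the numbers quoted above.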

Known issues

Bugs:
- For the text decoration functions, upon clicking outside of the textarea, Chrome and Firefox no longer highlight any selected text (clicking outside the textarea does not deselect). -browser bug-

Text detection issues:

  • Texts with more than 2 paragraphs might not be detected as a single text
  • Words that have 0 space between them, flow into each other, or are connected by a line. -cannot fix-
  • Text on a transparent background -won't fix, too rare-
  • Light colored text on a light background / same for dark -cannot fix-
  • Shading by dot pattern instead of gradient messes up detection, and slows the script drastically -trying to fix-

Found a new bug or interested in coding? Check the Github page

Tips & details:

  • Auto resize code - It's better to make your note a bit too big rather than too small: anything touching the border of the note is immediately treated as not a letter.
  • Letter recognition is impossible for handwritten Japanese, especially as a userscript.
  • Text decoration functions - pressing 'bold' again while having "<b>Text</b>" selected will remove the "<b>" tags again. This only works for the outer tags, not for any nested tags. Similarly, applying color -> fontsize will merge those into the same <span> tag. Applying color -> bold -> fontsize will generate 2 <span> tags (bad).
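
To show why a slightly oversized note works better, here is a minimal sketch of the border rule from the first tip. It is an illustration only, not the script's code: `shrinkToText` is a hypothetical name, and the note's contents are assumed to already be reduced to a grid of dark (1) and background (0) pixels.

```javascript
// Hypothetical sketch of the border rule: dark pixels connected (in 8 directions)
// to the edge of the note are discarded, and the note shrinks to what remains.
// `mask` is an array of rows; 1 = dark pixel, 0 = background.
function shrinkToText(mask) {
  const h = mask.length, w = mask[0].length;
  const grid = mask.map((row) => row.slice()); // work on a copy
  const stack = [];
  // Seed the fill with every dark pixel that lies on the border.
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      const onBorder = x === 0 || y === 0 || x === w - 1 || y === h - 1;
      if (onBorder && grid[y][x] === 1) { grid[y][x] = 0; stack.push([x, y]); }
    }
  }
  // Iterative 8-direction flood fill that erases everything reachable.
  while (stack.length > 0) {
    const [cx, cy] = stack.pop();
    for (let dy = -1; dy <= 1; dy++) {
      for (let dx = -1; dx <= 1; dx++) {
        const nx = cx + dx, ny = cy + dy;
        if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
        if (grid[ny][nx] === 1) { grid[ny][nx] = 0; stack.push([nx, ny]); }
      }
    }
  }
  // Bounding box of the dark pixels that survived, or null if none did.
  let box = null;
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      if (grid[y][x] !== 1) continue;
      if (box === null) box = { left: x, top: y, right: x, bottom: y };
      box.left = Math.min(box.left, x);
      box.right = Math.max(box.right, x);
      box.bottom = Math.max(box.bottom, y);
    }
  }
  return box;
}
```

If the note is drawn too small, the letters themselves touch the border and get erased along with the speech bubble outline, which is why a bit too big is the safer mistake.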

To-do / planned features

  • Automatic detection of square textboxes

Changelog:

2015-07-10 - v1.1:
- New button: "save all non-ghost notes" to make it easier to reposition a lot of notes.
- Notes resized by the script now are properly marked as "unsaved"

2015-05-15 - v1.0:
- New feature: Ctrl-click* an existing note to resize it (per type-kun's idea)
- Most of the logic & code re-written so people can actually read what is happening.
- Accuracy in detecting text remains about the same, I've pretty much hit a dead end trying to improve it.
- speed increase on everything!
- memory usage remains the same
- moved script objects to their own namespace, improves cross-browser support & maintainability.
- much less likely to break when danbooru updates

2013-08-21 - v0.12:
- Updated to follow danbooru changes.

v0.11:
- Fix typo.

v0.10:
- Updated to follow danbooru changes.

2013-05-28 - v0.9:
- Added settings menu

2013-05-22 - v0.8:
- Moved eyedropper preview to below the buttons
- Added preview text to the eyedropper function
- Fixed rare bug where note could be 1-2px out of the image
- Added 'x' button to close the Note Assist window
- Changed "instantNote" setting to default "true"

2013-05-12 - v0.6:
- added 'instantNote' setting (default:false)
- generate-all now attempts to avoid text already covered by notes.
- Added text decoration buttons. (includes eyedropper/colorpicker function)

2013-05-09 - v0.4/v0.5
- Fixed wrong scaling when "Fit images to window" is enabled

2013-05-09 - v0.3
- Decreased memory usage by at least 66% (thanks Type-kun).
- Increased script speed by 2 to 4 times.
- Fixed chrome "back" bug.
- Fixed wrong dimensions bug after pressing "view original".
- Added info button linking back to this thread.

2013-05-07 - v0.2
- Increased script speed by up to 10 times.

2013-05-06 - v0.1
- First release, only has core function.

Woohoo, finally version 0.1 is out. When I had the idea of "something to make note resizing easier", I'd never have thought it'd take almost 200 hours of coding just to get the very first version out.

The code's far from complete; it's an endless process of teaching the script how to deal with certain situations.
What's making it even harder is that it's really hard to find parts / discussions to base my code on... this may very well be the first JavaScript-based text detection script.

itsonlyaname said:

Click & drag support:

This is indeed quite awesome.

itsonlyaname said:

Automatic resize to text:

This works surprisingly well. I've only tested it briefly so far, but I can imagine it saving me a lot of time trying to get the exact positioning right.

itsonlyaname said:

Generate all notes:

Other than the performance problems, the main issue seems to be that it creates a lot of excess notes. For example, on character's faces/mouths, duplicate notes on top of other notes, etc. At a rough estimate, it usually seems to make as many incorrect ones as correct ones.

However, it does seem to successfully create correct notes for the majority of text boxes in your average comic. So this is pretty impressive, if not 100% practical yet.

Type-kun said:

Well, I can tell that Javascript OCR sounds like a developer's nightmare, so you have my respect.

Thanks :)
Having to loop over an array of several million values over 100 times is pretty hellish indeed. (canvas.. 8-direction floodfill..)

Type-kun said:

As for memory requirements - after a quick glance I see that you load entire image in the canvas, twice. http://stackoverflow.com/questions/8751020/how-to-get-a-pixels-color-of-an-image suggests a way to cut out a portion of an element into a canvas, unless I'm mistaken - maybe that would help?

One of those canvases helps keep down the amount of calculations by holding a black/white copy of the image. Turning each note black/white at the moment it's created could be a good memory/time tradeoff, i'll definitely be testing that.

Toks said:

the main issue seems to be that it creates a lot of excess notes. For example, on character's faces/mouths, duplicate notes on top of other notes, etc. At a rough estimate, it usually seems to make as many incorrect ones as correct ones.

However, it does seem to successfully create notes for the majority of text boxes in your average comic. So this is pretty impressive, if not 100% practical yet.

It shouldn't create many overlapping notes (unless you mean already-existing notes? perhaps i can do something with detecting already-saved notes), but yes, the 'generate all notes' has a lot of false positives.
Shape recognition on hand-drawn Japanese text in JPG sample images is pretty much impossible, so I'm stuck with "lines".

For now, just ignoring or right-clicking the notes that are in your way is the only option.
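
The "detecting already-saved notes" idea could be sketched as a plain rectangle overlap filter. This is only an illustration under assumed names: `intersects` and `skipCovered` are hypothetical, and rectangles are taken to be `{x, y, w, h}` in image pixels.

```javascript
// Hypothetical sketch: drop generated candidates that overlap an existing note.
// Rectangles are {x, y, w, h}; rectangles that only share an edge do not count.
function intersects(a, b) {
  return a.x < b.x + b.w && b.x < a.x + a.w &&
         a.y < b.y + b.h && b.y < a.y + a.h;
}

function skipCovered(candidates, existingNotes) {
  return candidates.filter((c) => !existingNotes.some((n) => intersects(c, n)));
}
```

Any overlap at all removes the candidate here; a threshold on the overlapping area would be a gentler variant.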

itsonlyaname said:

It shouldn't create many overlapping notes

At least once I saw it generate two small notes next to each other and one larger note fully encompassing both of them, although it's not that frequent.

Also, an idea: What about an option to resize all existing notes? (Either growing or shrinking them to the nearby text depending on what's necessary.) Sometimes notes aren't placed well by other users, and a script that fit those to the text would be nice.

This is very cool.

I'm looking through the source trying to figure out how it works. Did you base your text detection on anything specific?

Maybe you could use a scaled-down version to skip pure-white/pure-black areas of the image, if that would help with the speed.

For the next version, please also read and translate the text, tia.

Fixed there being no "install" button on userscripts.org
Also commented out the style changes (background & such, line 102-120), sorry about that.

Toks said:

At least once I saw it generate two small notes next to each other and one larger note fully encompassing both of them, although it's not that frequent.

A note 100% covered by another note sounds like a bug, image ID?

Also, an idea: What about an option to resize all existing notes? (Either growing or shrinking them to the nearby text depending on what's necessary.) Sometimes notes aren't placed well by other users, and a script that fit those to the text would be nice.

Making notes larger is troublesome, it's hard to find out how much to expand. A button (or bookmarklet) to resize all existing notes is possible though. Also try the "noteAssist_noteButtons" option in advanced settings, it looks horrible but it adds buttons to resize already-existing notes.

zatchii said:

This is very cool.

I'm looking through the source trying to figure out how it works. Did you base your text detection on anything specific?

Thanks :)

The program flow is something like this (pastebin'd wall of text) http://pastebin.com/raw.php?i=tiG6p7d7
If you mean based on other code, nope ~ I could only find basic snippets, like the convert_grey function (which I already completely rewrote in the meantime), so every function is completely self-written.

Maybe you could use a scaled-down version to skip pure-white/pure-black areas of the image, if that would help with the speed.

Not sure what you mean, any scaling can only be applied to the whole image. I do have a few more ideas to improve both speed & memory, but I'm gonna add features first. (of course, any suggestions/ideas are always welcome)

For the next version, please also read and translate the text, tia.

Hah xD
If it wasn't Japanese i would have considered a read ;).

Edit; while chatting with kittey about some code, i got a nice idea ~ on most images, the script is now up to 10x faster.

This is really amazingly useful. Excellent work.

A few minor things:

  • I really prefer only having to hit the hotkey once to enter/exit note creation mode, rather than having to hit it once for every note, but I've also seen people saying they prefer the opposite -- is it possible to make this configurable?
  • It would be nice to be able to set the existing configurables (auto-resizing with shift and UI position) via the UI itself rather than having to edit the script manually, if any cross-platform solution for storing configuration data like that exists. (Also there's a small error in the existing configuration -- setting noteAssist_uiPositionLeft to false moves the UI to the right as the variable name suggests, not to the left as the comment suggests.)
  • A tooltip or small help dialogue in the UI itself describing the non-obvious feature additions (right-click to delete unsaved notes and shift-drag to stop auto-resizing) would be an additional source of documentation much more accessible to the end user, and may become outright necessary if the feature list (and attendant hotkeys to activate said features) becomes much bigger.
  • It looks like the script currently doesn't consider the positions of existing notes when generating all notes, which means ghost notes get placed on top of saved notes on partially translated posts -- not a big deal, but undesirable and relatively straightforward to prevent. (EDIT: Oops, you did already mention this one.)
  • Out of the unimplemented features you mention here and in the code, the 'Mark as Translated' button is, IMO, the one that would be the most helpful with the least investment of coding effort -- it saves the end user time on every post, and should only require a regex and an AJAX request.
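
For the 'Mark as Translated' idea in the last point, the regex half might look like the sketch below. `markTranslated` is a hypothetical name, the tag names `translation_request` and `translated` are assumed to be the ones involved, and the AJAX request that would send the new tag string to the site is deliberately left out.

```javascript
// Hypothetical sketch of the regex half: swap the request tag for the done tag
// in a space-separated tag string. Sending the result to the site is left out.
function markTranslated(tagString) {
  const tags = tagString
    .replace(/(^|\s)translation_request(?=\s|$)/g, '$1')
    .split(/\s+/)
    .filter((t) => t.length > 0);
  if (!tags.includes('translated')) tags.push('translated');
  return tags.join(' ');
}

console.log(markTranslated('1girl comic translation_request')); // "1girl comic translated"
```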

EDIT: While the script is enabled, navigating pools via the 'prev' and 'next' buttons doesn't add to the browser's forward and back buttons properly; that is, if I click an image from the Comments index, move 10 images through a pool it's in via the pool bar, then hit Back, it'll take me back to the Comments index. Disabling the script fixes the problem. This is on Chrome 26, Windows 7 64-bit.

itsonlyaname said:

A note 100% covered by another note sounds like a bug, image ID?

I've seen it twice now, but can't find either of the images again...

itsonlyaname said:

Edit; while chatting with kittey about some code, i got a nice idea ~ on most images, the script is now up to 10x faster.

You're a wizard.

glasnost said:

if any cross-platform solution for storing configuration data like that exists.

GM_setValue/GM_getValue seem like the way to do this.
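
A minimal sketch of how that could work, with defaults in the code and saved values taking over once they exist. The helper names and the default values are assumptions for illustration; the storage functions are passed in so the sketch also runs outside a userscript manager, where `GM_getValue`/`GM_setValue` would normally be used (with the matching `@grant` lines).

```javascript
// Hypothetical sketch: defaults live in the code, saved values win once present.
// The setting names come from this thread; the default values are assumptions.
const DEFAULTS = { instantNote: true, noteAssist_uiPositionLeft: true };

function loadSettings(getValue) {
  const settings = {};
  for (const key of Object.keys(DEFAULTS)) {
    // Like GM_getValue(name, defaultValue): the default is returned when nothing is stored.
    settings[key] = getValue(key, DEFAULTS[key]);
  }
  return settings;
}

function saveSettings(settings, setValue) {
  for (const key of Object.keys(DEFAULTS)) setValue(key, settings[key]);
}

// Stand-in storage so the sketch can be tried without a userscript manager.
const fakeStore = new Map();
const fakeGet = (key, fallback) => (fakeStore.has(key) ? fakeStore.get(key) : fallback);
const fakeSet = (key, value) => { fakeStore.set(key, value); };
```

In the userscript itself the calls would be `loadSettings(GM_getValue)` and `saveSettings(settings, GM_setValue)`.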

glasnost said:

  • I really prefer only having to hit the hotkey once to enter/exit note creation mode, rather than having to hit it once for every note, but I've also seen people saying they prefer the opposite -- is it possible to make this configurable?
  • Out of the unimplemented features you mention here and in the code, the 'Mark as Translated' button is, IMO, the one that would be the most helpful with the least investment of coding effort -- it saves the end user time on every post, and should only require a regex and an AJAX request.

These could be added to the site itself at some point in the near future as well. issue #1496, issue #1495

glasnost said:

This is really amazingly useful. Excellent work.

Glad people like it :)

  • I really prefer only having to hit the hotkey once to enter/exit note creation mode, rather than having to hit it once for every note, but I've also seen people saying they prefer the opposite -- is it possible to make this configurable?

As Toks said, it looks like this will be on the official danbooru in the near future (possibly replacing the current behavior). Input of users that prefer the current behavior would be helpful ~ issue #1496.

  • It would be nice to be able to set the existing configurables (auto-resizing with shift and UI position) via the UI itself rather than having to edit the script manually, if any cross-platform solution for storing configuration data like that exists. (Also there's a small error in the existing configuration -- setting noteAssist_uiPositionLeft to false moves the UI to the right as the variable name suggests, not to the left as the comment suggests.)

My other script already has this (image); once the number of settings grows a bit more, it will be added.
Typo in the comments fixed in next version.

  • A tooltip or small help dialogue in the UI itself describing the non-obvious feature additions (right-click to delete unsaved notes and shift-drag to stop auto-resizing) would be an additional source of documentation much more accessible to the end user, and may become outright necessary if the feature list (and attendant hotkeys to activate said features) becomes much bigger.

I'm not sure where i should put such documentation, the checkboxes already have a title (mouseover) with some info. A link to the first post here should be enough.

A little idea, continuing on memory usage.

You are basically loading the entire image in the canvas (height x width x sizeof(int) x 4 bytes), then store a black-and-white copy in another canvas (same size), unless I'm missing something. Then you use the original canvas only to cut out a block under note.

So, suggestions:
1) Since you're using black&white, and pretty much work with pixel data directly, maybe switching to an array of boolean primitives, one per pixel, would be a better idea? For example, true for black, false for white. Or better yet, true for black and undefined for white, as it should use no memory for white elements that way. Assuming JavaScript uses the same size for boolean as Java does, 1 byte, you gain at least 16x memory usage reduction for the second canvas. 128x should be achievable by doing bitwise operations, but that involves complicated bit addressing :3 Well, not so complicated actually, you just predefine 32 integers representing Nth bit, then do | for set and & for check, but that can still be a headache. Then again, with bitwise operators, reversing the b/w is as easy as doing (~ x) for every element.

2) Continuing with previous point, I doubt that it's really necessary to keep the entire colored copy in memory. I still don't know if that's achievable, but if you can cut out a portion of image (x-y-w-h) in the canvas, you can use "window" method to fill b/w array - that is, set some fixed canvas size, like 100x100, then sequentially load parts of image inside it and convert/store them. Later, if you need part of colored image, just resize the canvas appropriately, load that exact part and get data out of it. This should further reduce the memory requirements.
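
The bitwise variant from suggestion 1 could be sketched like this. It is an illustration, not code from the script: `BitMask` is a hypothetical name, and it computes each bit with a shift rather than using 32 predefined integers.

```javascript
// Hypothetical sketch of suggestion 1: one bit per pixel in a Uint32Array.
// 1 = black, 0 = white.
class BitMask {
  constructor(width, height) {
    this.width = width;
    this.height = height;
    this.words = new Uint32Array(Math.ceil((width * height) / 32));
  }
  set(x, y) {
    const i = y * this.width + x;
    this.words[i >>> 5] |= 1 << (i & 31); // | to set the bit
  }
  get(x, y) {
    const i = y * this.width + x;
    return (this.words[i >>> 5] & (1 << (i & 31))) !== 0; // & to check it
  }
  invert() {
    // Reversing black and white is one ~ per 32 pixels.
    for (let w = 0; w < this.words.length; w++) this.words[w] = ~this.words[w];
  }
}

const mask = new BitMask(2048, 2048);
console.log(mask.words.byteLength); // 524288 bytes, i.e. 0.5 MiB
```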

Type-kun said:

2) Continuing with previous point, I doubt that it's really necessary to keep the entire colored copy in memory. I still don't know if that's achievable, but if you can cut out a portion of image (x-y-w-h)

This :)
I was already planning on scrapping the color copy, but by doing that i can scrap both.
Currently, for a note it creates a new temporary canvas based on the black/white data. During a generate-all that's 3 complete copies of the image in memory...
Normal image -> black/white canvas conversion takes but a fraction of the time it used to (less than half a second (Opera, slowest browser) for a 2k*1.5k image) so any caching is no longer needed.
Without caching, memory usage during a generate-all will be 1/3 of current, and just a few Mb when drag-creating notes. (0 while idle)

Without caching, the entire point of optimizing the array becomes rather meaningless. Memory spikes are what kill scripts: since the original Uint8Array cannot be resized, any size change involves making a new array -> spiking memory even higher.
(Keeping the current format also has benefits, like space for storing image-optimization data, and the ability to draw a visual canvas for debugging)

Ps. i don't think there is a way to store canvases out of memory.
All data must be stored uncompressed, even the simplest math becomes slow when executed millions of times.

Yep, storing the entire image in memory unless you use generate-all is just a waste of kilobytes :3
But "generate-all" option still should benefit from optimizing array format, especially for large images. 0.5Mb(bitwise) or 4Mb(bytes) vs 64Mb(full) for 2K x 2K image is pretty serious difference. True, this way it loses the benefit of drawing canvas for debug, but it's just a matter of writing a few additional lines of code to transfer optimized array to canvas. Moreover, bit data storage is the most plain storage there is - in theory, bitwise and/or should be faster than regular int math. That depends on how it's implemented in JS, though.

The only way i know to initially get the data is only available in full (64Mb) though.
And since the array is unloaded/thrown away again after the notes are created, what's the difference between keeping the full 64Mb in memory for 2-10 seconds, or spending additional time to optimize the array?

Unless, storing the full image (compressed) after a generate-all is done to speed up following drag-notes. (not worth it)

The only way i know to initially get the data is only available in full (64Mb) though.

I already mentioned that :3 Implying you can directly load a part of image in a canvas, you can do that with loading image in blocks. E.g. get 100x100 canvas, then load a (x,y,w,h) block of image at (0,0,100,100), convert it, load (100,0,100,100), convert it, when done with first row load (0,100,100,100) etc, etc, or make 5x(width) canvas and do the same.
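
The block-by-block loading described here could be planned with something like the sketch below. `tileRects` is a hypothetical helper that only computes the blocks; the canvas calls that would consume them are shown as comments because they need a browser.

```javascript
// Hypothetical sketch: list the (x, y, w, h) blocks that cover an image,
// so each block can be drawn into one small reusable canvas in turn.
function tileRects(width, height, tile) {
  const rects = [];
  for (let y = 0; y < height; y += tile) {
    for (let x = 0; x < width; x += tile) {
      rects.push({
        x: x,
        y: y,
        w: Math.min(tile, width - x),  // edge blocks may be narrower
        h: Math.min(tile, height - y), // or shorter
      });
    }
  }
  return rects;
}

// In a browser, each block `r` could then be copied with the 9-argument form:
//   ctx.drawImage(img, r.x, r.y, r.w, r.h, 0, 0, r.w, r.h);
//   const data = ctx.getImageData(0, 0, r.w, r.h).data;
```

Only one block of color data is alive at a time this way, so peak memory depends on the block size rather than on the image size.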

And since the array is unloaded/thrown away again after the notes are created, what's the difference between keeping the full 64Mb in memory for 2-10 seconds, or spending additional time to optimize the array?

2K x 2K image is a spherical cow, so to speak. For 10K x 1K (not that uncommon for long dialogue-heavy strips), it becomes 160MB. Maybe it's just me being outdated, but I still try to reduce memory consumption in things I write, at least when it comes to megabytes :3 Another thing is that GC can be stupid, and these 160MB will reside in memory until it decides that it's time to do its job and clean it.
As for additional time, there should be none. Converting to "plain" bit or byte array is done in black/white conversion code anyway, so it wouldn't require more time than it does now. Basically, where you do

luma = luma > cut_off ? 0 : 255;
pixels[i] = luma;
pixels[i+1] = 0;
pixels[i+2] = 0;

it'll be

bwarray[i] = (luma > cut_off);
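
Spelled out as a complete function, that could look like the sketch below. It is an illustration only: `toBlackWhite` is a hypothetical name, and the Rec. 601 luma weights are an assumption, since the script's own luma formula isn't shown in this thread.

```javascript
// Hypothetical sketch: turn RGBA pixel data into one byte per pixel.
// The luma weights (Rec. 601) are an assumption, not taken from the script.
function toBlackWhite(rgba, cutOff) {
  const bw = new Uint8Array(rgba.length / 4);
  for (let i = 0, p = 0; i < rgba.length; i += 4, p++) {
    const luma = 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
    bw[p] = luma > cutOff ? 1 : 0; // same comparison as (luma > cut_off)
  }
  return bw;
}
```

Here 1 marks a pixel brighter than the cut-off. The result is a quarter of the size of the RGBA input, and the conversion is still a single pass over the pixels.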

I see, that's very interesting. (i misunderstood the part about storing 100x100 blocks)
Besides saving memory, it should also be faster and more customizable.

Not having the array locked opens up a lot more paths to improve speed.

itsonlyaname said:
I'm not sure where i should put such documentation, the checkboxes already have a title (mouseover) with some info. A link to the first post here should be enough.

I was thinking a little '(?)' link in the corner of the UI popup that spawned a help dialog, or maybe just a link back to this thread like you said. It's not a huge practical concern, but at present the script has no way to lead people who get it through a non-official channel back here; there's not even a link to the userscripts.org page in the source.

Also, repasting this bug report since I edited my first post too late for it to be seen: while the script is enabled, navigating pools via the 'prev' and 'next' buttons in the pool bar doesn't add to the browser's forward and back buttons properly; that is, if I click an image from the Comments index, move 10 images through a pool it's in via the pool bar, then hit Back, it'll take me back to the Comments index. Disabling the script fixes the problem. This is on Chrome 26, Windows 7 64-bit.

glasnost said:

Also, repasting this bug report since I edited my first post too late for it to be seen: while the script is enabled, navigating pools via the 'prev' and 'next' buttons in the pool bar doesn't add to the browser's forward and back buttons properly

Fixed (bug with location.replace).

Added info button linking back to this thread.

Got rid of the stored canvases, decreasing memory usage by at least 66%, and optimized other parts of the code giving another massive boost to speed.
Generate-all on an average sample comic takes about 1 second for me now.

Current priority list:
- bug reports (if any)
- skip text that already has a note
- eye dropper tool
- html insertion buttons
- settings menu
- detection of square textboxes
- more speed/memory increases

itsonlyaname said:

- eye dropper tool

THIS. Should be incredibly useful.

Also, a feature request :3 Could you make something like a, uh, "callable" box with multiple textareas, allowing to edit all the notes / easily delete some of them? This is necessary sometimes, as "go to note - edit - save - repeat again" routine is slow and tedious.
