Update for Onetastic: Select Text from Image

May 05, 2012

OneNote can recognize the text in images in your notes, and you can even search it. It also has a little hidden feature that allows you to copy the text from an image: just right-click on an image and choose Copy Text From Picture. This doesn't always do what you want though. You may want to copy just an address or a tracking number from a screenshot. Copying the whole text, pasting it somewhere, and then finding what you want is cumbersome. Wouldn't it be nice if you could select and copy the text from the image, similar to copying from the web? An update to Onetastic which adds this new feature is now available: Select Text from Image.

Let's say you have the following screenshot you captured from a web site and want to copy some text in it.

Image with text

You can access the Select Text from Image feature from the right-click menu, along with two other Onetastic features, Rotate Printout and Crop.

Select text from image

You can see here that the selectable text is highlighted in yellow. If some text you see on the image is not highlighted, it won't be selectable, because the OCR engine did not recognize it. You can also see that the text is arranged in two regions here (purple boxes). Depending on how the text is arranged, you may see a single region or several regions. Moving your mouse over the text and dragging while pressing your left mouse button will start selecting the text, just like you are used to doing pretty much anywhere else:

Text selected

Note that a single drag can only select text within a single region. It was too complicated to find a way to select across regions with a single drag. However, once you select some text from a region, you can press and hold the Ctrl key and select more text from other regions. Clicking anywhere on the window will clear the selection.

You may also have noticed that the button below the image reads "Copy All Text and Close", and once you make a selection it turns into "Copy Selection and Close". The button does what it says. If you didn't select anything, it will copy all the text on the image; if you selected some text, it will copy just the selection. Either way, it then closes the dialog. You can also use the universal copy shortcut: Ctrl + C.
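
The selection rules described above can be modeled roughly as follows. This is only an illustrative sketch in Python, not Onetastic's actual code: the SelectionModel class, its method names, and the sample words are all hypothetical, and joining the text of different regions with a newline is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SelectionModel:
    # Each region is a list of recognized words (hypothetical layout).
    regions: list[list[str]]
    # Maps region index -> (first, last) selected word index, inclusive.
    selected: dict[int, tuple[int, int]] = field(default_factory=dict)

    def drag(self, region: int, start: int, end: int, ctrl: bool = False) -> None:
        """A drag selects words inside one region; holding Ctrl keeps
        selections already made in other regions."""
        if not ctrl:
            self.selected.clear()
        lo, hi = sorted((start, end))
        self.selected[region] = (lo, hi)

    def click(self) -> None:
        """Clicking anywhere on the window clears the selection."""
        self.selected.clear()

    def button_label(self) -> str:
        if self.selected:
            return "Copy Selection and Close"
        return "Copy All Text and Close"

    def text_to_copy(self) -> str:
        """Selected words if there is a selection, otherwise all the text."""
        if self.selected:
            parts = [
                " ".join(self.regions[r][lo:hi + 1])
                for r, (lo, hi) in sorted(self.selected.items())
            ]
        else:
            parts = [" ".join(words) for words in self.regions]
        # Assumption: regions are separated by a newline in the copied text.
        return "\n".join(parts)


# Made-up sample data standing in for two OCR regions.
model = SelectionModel([
    ["Ship", "to:", "1", "Main", "St"],
    ["Tracking", "number:", "12345"],
])
model.drag(0, 2, 4)              # drag across the address in the first region
model.drag(1, 2, 2, ctrl=True)   # Ctrl + drag adds the number from the second
print(model.button_label())
print(model.text_to_copy())
```

With nothing selected, the same text_to_copy call returns the text of every region, which matches the "Copy All Text and Close" behavior.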

As simple as that. Hopefully this will save you a bunch of time. One thing worth noting is that OCR is not a perfect technology and may not detect all the words 100% correctly. So you may end up having to fix the copied text, which is usually easier than re-typing the whole thing.

One minor issue is also fixed with this update. Custom styles with font names containing spaces (like "Times New Roman") now work properly.


Philippe J. Bruno - 2018-08-16
Omer, I did some extra tests and you are right, it is more complicated than I expected! In fact, I just realized that when I connect my Surface Pro to my Surface Dock, my 28" screen is at a different scaling factor (150%), which complicates things even further when I switch OneNote from my Surface screen to my large screen! Wow, what a nightmare for developers. I looked at the page XML and I believe the coordinates of the OCRed text are saved along with the extracted text, right? This leads me to another idea though… How is OneNote able to properly maintain the position of ink on a page relative to an inserted image (like a PowerPoint slide inserted via Print to OneNote and annotated using ink in OneNote)? Can't you use the same technique to calculate the position of the extracted text relative to the image?
Philippe J. Bruno - 2018-08-16
Thanks Omer for your reply. I know it is not a trivial task. However, I once encountered a related problem with an AutoHotkey script, and I solved it by reading the "HKEY_CURRENT_USER, Control Panel\Desktop\WindowMetrics" registry key, dividing the read value by 96 and storing it in a variable called ScreenScaling. Using that calculated value, I was able to solve 90%+ of the weird scaling problems.
I see your point that it can be different when using multiple computers, BUT in the case I reported, I can assure you that the problem is very real on an image inserted on my Surface Pro (200% scaling), OCRed on the same computer and text extracted on the same computer. If your dialog box could multiply the position and dimensions of the yellow selection rectangles by ScreenScaling (as described above, i.e. 2 on my Surface Pro), I believe it would solve the issue, at least when everything is happening on the same computer.
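
The arithmetic in the comment above can be sketched as follows. This is a minimal illustration in Python rather than AutoHotkey: the DPI value is passed in as a plain number instead of being read from the registry, and the Rect class and the screen_scaling and scale_rect helpers are hypothetical names, not part of Onetastic or OneNote. As the reply below points out, a single factor like this can only help when the image was inserted, OCRed, and selected at the same DPI.

```python
from dataclasses import dataclass

BASELINE_DPI = 96  # Windows treats 96 DPI as 100% scaling


@dataclass(frozen=True)
class Rect:
    """A hypothetical selection rectangle for one recognized word."""
    left: float
    top: float
    width: float
    height: float


def screen_scaling(dpi: int) -> float:
    """Divide the DPI value by 96, e.g. 144 -> 1.5 and 192 -> 2.0."""
    if dpi <= 0:
        raise ValueError("DPI must be positive")
    return dpi / BASELINE_DPI


def scale_rect(rect: Rect, factor: float) -> Rect:
    """Multiply both the position and the dimensions by the factor."""
    return Rect(
        rect.left * factor,
        rect.top * factor,
        rect.width * factor,
        rect.height * factor,
    )


# Made-up word box, scaled for a 200% display (192 DPI).
word_box = Rect(left=10, top=20, width=50, height=12)
factor = screen_scaling(192)
print(factor, scale_rect(word_box, factor))
```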
Omer Atay - 2018-08-16
Philippe: This is complicated. It depends on the DPI of the computer which inserted the image, as well as the computer which ran OCR on the image. These two can be different and they can both be different than the computer you are using to run Select Text From Image. Things should work if they are all the same.
