Windows 10 OCR
Hi,
I am looking for a starting point for using the built-in OCR function in Windows 10.
Any ideas or pointers would be helpful.
Thank you.
M
Re: Windows 10 OCR
OCR stands for Optical Character Recognition.
Info:
https://en.wikipedia.org/wiki/Optical_c ... ecognition
https://docs.microsoft.com/en-us/uwp/ap ... inrt-20348
It seems to be possible, but you need to deal with a lot of nonsense!
Here are the interfaces (if I got it right):
Code: Select all
Interface IInspectable Extends IUnknown
  GetIids.i(*Count,*iids)
  GetRuntimeClassName.i(*ClassName)
  GetTrustLevel.i(*TrustLevel)
EndInterface
Interface IOcrEngine Extends IInspectable
  RecognizeAsync.i(*Bitmap,*OcrResult);Windows.Graphics.Imaging.SoftwareBitmap // Windows.Foundation.IAsyncOperation<Windows.Media.Ocr.OcrResult>
  RecognizerLanguage.i(*Value);Windows.Globalization.Language
EndInterface
Interface IOcrEngineStatics Extends IInspectable
  MaxImageDimension.i(*Value)
  AvailableRecognizerLanguages.i(*Value);Windows.Foundation.Collections.IVectorView<Windows.Globalization.Language>
  IsLanguageSupported.i(*Language,*Result);Windows.Globalization.Language
  TryCreateFromLanguage.i(*Language,*Result);Windows.Globalization.Language
  TryCreateFromUserProfileLanguages.i(*Result);Windows.Media.Ocr.OcrEngine
EndInterface
Interface IOcrLine Extends IInspectable
  Words.i(*Value);Windows.Foundation.Collections.IVectorView<Windows.Media.Ocr.OcrWord>
  Text.i(*Value)
EndInterface
Interface IOcrResult Extends IInspectable
  Lines.i(*Value);Windows.Foundation.Collections.IVectorView<Windows.Media.Ocr.OcrLine>
  TextAngle.i(*Value);Windows.Foundation.IReference<DOUBLE>
  Text.i(*Value)
EndInterface
Interface IOcrWord Extends IInspectable
  BoundingRect.i(*Value);Windows.Foundation.Rect
  Text.i(*Value)
EndInterface
As you might notice, most of the calls return other nonsense that needs to be implemented as well (the language stuff)!
I'm currently too busy, otherwise I would try to write a small OCR lib with that nonsense.
How it (should) work - first steps:
To make it work with PB you need to make the thread use the Windows Runtime nonsense!
-> RoInitialize() https://docs.microsoft.com/en-us/window ... initialize
Now request/create the classes (interfaces) using WindowsCreateString() and RoGetActivationFactory().
-> WindowsCreateString() https://docs.microsoft.com/en-us/window ... eatestring
-> RoGetActivationFactory() https://docs.microsoft.com/en-us/window ... ionfactory
Prepare to do some digging.
Good luck!
Re: Windows 10 OCR
I had some time, so I wrote a small lib to at least help with that Windows Runtime nonsense.
Download wrt.lib (x64) - written in fasm:
https://www.dropbox.com/s/ggxc2mlpp9wxn ... 2.zip?dl=0
Example how to get access to IOcrEngineStatics:
Code: Select all
EnableExplicit
; wrt.lib
; Version: Alpha 2
; Author: Mijikai
; Copyright 2021 by Mijikai all rights reserved
; License: Attribution-NonCommercial-NoDerivatives 4.0 International
; https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode
Import "wrt.lib"
wrtOpen.i()
wrtInterface.i(Name.s,Id.s,*Interface)
wrtStringCreate.i(String.s,*String)
wrtStringBuffer.i(*String,*Unicode = #Null,*Length = #Null)
wrtStringDelete.i(*String)
wrtClose.i()
wrtVersion.i()
EndImport
Interface IInspectable Extends IUnknown
GetIids.i(*Count,*iids)
GetRuntimeClassName.i(*ClassName) ; returns an HRESULT, the class name comes back through *ClassName
GetTrustLevel.i(*TrustLevel)
EndInterface
Interface IOcrEngineStatics Extends IInspectable
MaxDimensions.i(*Value)
AvailableRecognizerLanguages.i(*Value);Windows.Foundation.Collections.IVectorView<Windows.Globalization.Language>
IsLanguageSupported.i(*Language,*Result);Windows.Globalization.Language
TryCreateFromLanguage.i(*Language,*Result);Windows.Globalization.Language
TryCreateFromUserProfileLanguages.i(*Result);Windows.Media.Ocr.OcrEngine
EndInterface
Procedure.i Main()
Protected *OcrEngineStatics.IOcrEngineStatics
Protected hstring.i
If wrtOpen()
Debug wrtStringCreate("Hello World!",@hstring)
If hstring
Debug PeekS(wrtStringBuffer(hstring))
wrtStringDelete(hstring)
EndIf
Debug wrtInterface("Windows.Media.Ocr.OcrEngine","{5BFFA85A-3384-3540-9940-699120D428A8}",@*OcrEngineStatics)
Debug *OcrEngineStatics
If *OcrEngineStatics ; only release the interface if it was actually obtained
*OcrEngineStatics\Release()
EndIf
wrtClose()
EndIf
ProcedureReturn #Null
EndProcedure
Main()
End
Hope this helps a bit
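For anyone wondering what the IID string passed to wrtInterface() encodes: it is the text form of a 16-byte GUID (one 32-bit field, two 16-bit fields, then eight single bytes). Below is a minimal C++ sketch, not part of wrt.lib, that splits such a string into those fields and prints them as PureBasic Data.l / Data.w / Data.b lines. The names Guid, parse_guid and to_data_section are made up for this illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Same field layout as the Windows GUID structure.
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

// Hypothetical helper: parses "{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}".
// Returns false if the text does not have exactly that shape.
bool parse_guid(const std::string& text, Guid& out) {
    if (text.size() != 38 || text.front() != '{' || text.back() != '}') return false;
    for (std::size_t i = 1; i < 37; ++i) {
        const bool dash = (i == 9 || i == 14 || i == 19 || i == 24);
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (dash ? (c != '-') : !std::isxdigit(c)) return false;
    }
    auto hex = [&text](std::size_t pos, std::size_t len) {
        return std::stoul(text.substr(pos, len), nullptr, 16);
    };
    out.data1 = static_cast<std::uint32_t>(hex(1, 8));
    out.data2 = static_cast<std::uint16_t>(hex(10, 4));
    out.data3 = static_cast<std::uint16_t>(hex(15, 4));
    out.data4[0] = static_cast<std::uint8_t>(hex(20, 2));
    out.data4[1] = static_cast<std::uint8_t>(hex(22, 2));
    for (std::size_t i = 0; i < 6; ++i) {
        out.data4[2 + i] = static_cast<std::uint8_t>(hex(25 + 2 * i, 2));
    }
    return true;
}

// Hypothetical helper: renders the GUID as a labelled PureBasic DataSection entry.
std::string to_data_section(const std::string& label, const Guid& g) {
    char buf[128];
    const int written = std::snprintf(
        buf, sizeof buf,
        "Data.l $%08X\nData.w $%04X, $%04X\n"
        "Data.b $%02X, $%02X, $%02X, $%02X, $%02X, $%02X, $%02X, $%02X\n",
        static_cast<unsigned>(g.data1), static_cast<unsigned>(g.data2),
        static_cast<unsigned>(g.data3),
        static_cast<unsigned>(g.data4[0]), static_cast<unsigned>(g.data4[1]),
        static_cast<unsigned>(g.data4[2]), static_cast<unsigned>(g.data4[3]),
        static_cast<unsigned>(g.data4[4]), static_cast<unsigned>(g.data4[5]),
        static_cast<unsigned>(g.data4[6]), static_cast<unsigned>(g.data4[7]));
    assert(written > 0 && static_cast<std::size_t>(written) < sizeof buf);
    return label + ":\n" + buf;
}
```

Calling parse_guid() with the IOcrEngineStatics IID from the example and passing the result to to_data_section("IID_IOcrEngineStatics", ...) yields a label followed by one Data.l, one Data.w and one Data.b line, which can be pasted into a DataSection and referenced with ?IID_IOcrEngineStatics instead of converting the string at runtime.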
Re: Windows 10 OCR
It can be done in AutoHotkey. Surely it can be translated to PureBasic.
Code: Select all
HBitmapFromScreen(X, Y, W, H) {
HDC := DllCall("GetDC", "Ptr", 0, "UPtr")
HBM := DllCall("CreateCompatibleBitmap", "Ptr", HDC, "Int", W, "Int", H, "UPtr")
PDC := DllCall("CreateCompatibleDC", "Ptr", HDC, "UPtr")
DllCall("SelectObject", "Ptr", PDC, "Ptr", HBM)
DllCall("BitBlt", "Ptr", PDC, "Int", 0, "Int", 0, "Int", W, "Int", H
, "Ptr", HDC, "Int", X, "Int", Y, "UInt", 0x00CC0020)
DllCall("DeleteDC", "Ptr", PDC)
DllCall("ReleaseDC", "Ptr", 0, "Ptr", HDC)
Return HBM
}
HBitmapToRandomAccessStream(hBitmap) {
static IID_IRandomAccessStream := "{905A0FE1-BC53-11DF-8C49-001E4FC686DA}"
, IID_IPicture := "{7BF80980-BF32-101A-8BBB-00AA00300CAB}"
, PICTYPE_BITMAP := 1
, BSOS_DEFAULT := 0
DllCall("Ole32\CreateStreamOnHGlobal", "Ptr", 0, "UInt", true, "PtrP", pIStream, "UInt")
VarSetCapacity(PICTDESC, sz := 8 + A_PtrSize*2, 0)
NumPut(sz, PICTDESC)
NumPut(PICTYPE_BITMAP, PICTDESC, 4)
NumPut(hBitmap, PICTDESC, 8)
riid := CLSIDFromString(IID_IPicture, GUID1)
DllCall("OleAut32\OleCreatePictureIndirect", "Ptr", &PICTDESC, "Ptr", riid, "UInt", false, "PtrP", pIPicture, "UInt")
; IPicture::SaveAsFile
DllCall(NumGet(NumGet(pIPicture+0) + A_PtrSize*15), "Ptr", pIPicture, "Ptr", pIStream, "UInt", true, "UIntP", size, "UInt")
riid := CLSIDFromString(IID_IRandomAccessStream, GUID2)
DllCall("ShCore\CreateRandomAccessStreamOverStream", "Ptr", pIStream, "UInt", BSOS_DEFAULT, "Ptr", riid, "PtrP", pIRandomAccessStream, "UInt")
ObjRelease(pIPicture)
ObjRelease(pIStream)
Return pIRandomAccessStream
}
CLSIDFromString(IID, ByRef CLSID) {
VarSetCapacity(CLSID, 16, 0)
if res := DllCall("ole32\CLSIDFromString", "WStr", IID, "Ptr", &CLSID, "UInt")
throw Exception("CLSIDFromString failed. Error: " . Format("{:#x}", res))
Return &CLSID
}
ocr(file, lang := "FirstFromAvailableLanguages")
{
static OcrEngineStatics, OcrEngine, MaxDimension, LanguageFactory, Language, CurrentLanguage, BitmapDecoderStatics, GlobalizationPreferencesStatics
if (OcrEngineStatics = "")
{
CreateClass("Windows.Globalization.Language", ILanguageFactory := "{9B0252AC-0C27-44F8-B792-9793FB66C63E}", LanguageFactory)
CreateClass("Windows.Graphics.Imaging.BitmapDecoder", IBitmapDecoderStatics := "{438CCB26-BCEF-4E95-BAD6-23A822E58D01}", BitmapDecoderStatics)
CreateClass("Windows.Media.Ocr.OcrEngine", IOcrEngineStatics := "{5BFFA85A-3384-3540-9940-699120D428A8}", OcrEngineStatics)
DllCall(NumGet(NumGet(OcrEngineStatics+0)+6*A_PtrSize), "ptr", OcrEngineStatics, "uint*", MaxDimension) ; MaxImageDimension
}
if (file = "ShowAvailableLanguages")
{
if (GlobalizationPreferencesStatics = "")
CreateClass("Windows.System.UserProfile.GlobalizationPreferences", IGlobalizationPreferencesStatics := "{01BF4326-ED37-4E96-B0E9-C1340D1EA158}", GlobalizationPreferencesStatics)
DllCall(NumGet(NumGet(GlobalizationPreferencesStatics+0)+9*A_PtrSize), "ptr", GlobalizationPreferencesStatics, "ptr*", LanguageList) ; get_Languages
DllCall(NumGet(NumGet(LanguageList+0)+7*A_PtrSize), "ptr", LanguageList, "int*", count) ; count
loop % count
{
DllCall(NumGet(NumGet(LanguageList+0)+6*A_PtrSize), "ptr", LanguageList, "int", A_Index-1, "ptr*", hString) ; get_Item
DllCall(NumGet(NumGet(LanguageFactory+0)+6*A_PtrSize), "ptr", LanguageFactory, "ptr", hString, "ptr*", LanguageTest) ; CreateLanguage
DllCall(NumGet(NumGet(OcrEngineStatics+0)+8*A_PtrSize), "ptr", OcrEngineStatics, "ptr", LanguageTest, "int*", bool) ; IsLanguageSupported
if (bool = 1)
{
DllCall(NumGet(NumGet(LanguageTest+0)+6*A_PtrSize), "ptr", LanguageTest, "ptr*", hText)
buffer := DllCall("Combase.dll\WindowsGetStringRawBuffer", "ptr", hText, "uint*", length, "ptr")
text .= StrGet(buffer, "UTF-16") "`n"
}
ObjRelease(LanguageTest)
}
ObjRelease(LanguageList)
return text
}
if (lang != CurrentLanguage) or (lang = "FirstFromAvailableLanguages")
{
if (OcrEngine != "")
{
ObjRelease(OcrEngine)
if (CurrentLanguage != "FirstFromAvailableLanguages")
ObjRelease(Language)
}
if (lang = "FirstFromAvailableLanguages")
DllCall(NumGet(NumGet(OcrEngineStatics+0)+10*A_PtrSize), "ptr", OcrEngineStatics, "ptr*", OcrEngine) ; TryCreateFromUserProfileLanguages
else
{
CreateHString(lang, hString)
DllCall(NumGet(NumGet(LanguageFactory+0)+6*A_PtrSize), "ptr", LanguageFactory, "ptr", hString, "ptr*", Language) ; CreateLanguage
DeleteHString(hString)
DllCall(NumGet(NumGet(OcrEngineStatics+0)+9*A_PtrSize), "ptr", OcrEngineStatics, ptr, Language, "ptr*", OcrEngine) ; TryCreateFromLanguage
}
if (OcrEngine = 0)
{
msgbox Can not use language "%lang%" for OCR, please install language pack.
ExitApp
}
CurrentLanguage := lang
}
IRandomAccessStream := file
DllCall(NumGet(NumGet(BitmapDecoderStatics+0)+14*A_PtrSize), "ptr", BitmapDecoderStatics, "ptr", IRandomAccessStream, "ptr*", BitmapDecoder) ; CreateAsync
WaitForAsync(BitmapDecoder)
BitmapFrame := ComObjQuery(BitmapDecoder, IBitmapFrame := "{72A49A1C-8081-438D-91BC-94ECFC8185C6}")
DllCall(NumGet(NumGet(BitmapFrame+0)+12*A_PtrSize), "ptr", BitmapFrame, "uint*", width) ; get_PixelWidth
DllCall(NumGet(NumGet(BitmapFrame+0)+13*A_PtrSize), "ptr", BitmapFrame, "uint*", height) ; get_PixelHeight
if (width > MaxDimension) or (height > MaxDimension)
{
msgbox Image is too big - %width%x%height%.`nThe maximum is %MaxDimension% pixels
ExitApp
}
BitmapFrameWithSoftwareBitmap := ComObjQuery(BitmapDecoder, IBitmapFrameWithSoftwareBitmap := "{FE287C9A-420C-4963-87AD-691436E08383}")
DllCall(NumGet(NumGet(BitmapFrameWithSoftwareBitmap+0)+6*A_PtrSize), "ptr", BitmapFrameWithSoftwareBitmap, "ptr*", SoftwareBitmap) ; GetSoftwareBitmapAsync
WaitForAsync(SoftwareBitmap)
DllCall(NumGet(NumGet(OcrEngine+0)+6*A_PtrSize), "ptr", OcrEngine, ptr, SoftwareBitmap, "ptr*", OcrResult) ; RecognizeAsync
WaitForAsync(OcrResult)
DllCall(NumGet(NumGet(OcrResult+0)+6*A_PtrSize), "ptr", OcrResult, "ptr*", LinesList) ; get_Lines
DllCall(NumGet(NumGet(LinesList+0)+7*A_PtrSize), "ptr", LinesList, "int*", count) ; count
loop % count
{
DllCall(NumGet(NumGet(LinesList+0)+6*A_PtrSize), "ptr", LinesList, "int", A_Index-1, "ptr*", OcrLine)
DllCall(NumGet(NumGet(OcrLine+0)+7*A_PtrSize), "ptr", OcrLine, "ptr*", hText)
buffer := DllCall("Combase.dll\WindowsGetStringRawBuffer", "ptr", hText, "uint*", length, "ptr")
text .= StrGet(buffer, "UTF-16") "`n"
ObjRelease(OcrLine)
}
Close := ComObjQuery(IRandomAccessStream, IClosable := "{30D5A829-7FA4-4026-83BB-D75BAE4EA99E}")
DllCall(NumGet(NumGet(Close+0)+6*A_PtrSize), "ptr", Close) ; Close
ObjRelease(Close)
Close := ComObjQuery(SoftwareBitmap, IClosable := "{30D5A829-7FA4-4026-83BB-D75BAE4EA99E}")
DllCall(NumGet(NumGet(Close+0)+6*A_PtrSize), "ptr", Close) ; Close
ObjRelease(Close)
ObjRelease(IRandomAccessStream)
ObjRelease(BitmapDecoder)
ObjRelease(BitmapFrame)
ObjRelease(BitmapFrameWithSoftwareBitmap)
ObjRelease(SoftwareBitmap)
ObjRelease(OcrResult)
ObjRelease(LinesList)
return text
}
CreateClass(string, interface, ByRef Class)
{
CreateHString(string, hString)
VarSetCapacity(GUID, 16)
DllCall("ole32\CLSIDFromString", "wstr", interface, "ptr", &GUID)
result := DllCall("Combase.dll\RoGetActivationFactory", "ptr", hString, "ptr", &GUID, "ptr*", Class)
if (result != 0)
{
if (result = 0x80004002)
msgbox No such interface supported
else if (result = 0x80040154)
msgbox Class not registered
else
msgbox error: %result%
ExitApp
}
DeleteHString(hString)
}
CreateHString(string, ByRef hString)
{
DllCall("Combase.dll\WindowsCreateString", "wstr", string, "uint", StrLen(string), "ptr*", hString)
}
DeleteHString(hString)
{
DllCall("Combase.dll\WindowsDeleteString", "ptr", hString)
}
WaitForAsync(ByRef Object)
{
AsyncInfo := ComObjQuery(Object, IAsyncInfo := "{00000036-0000-0000-C000-000000000046}")
loop
{
DllCall(NumGet(NumGet(AsyncInfo+0)+7*A_PtrSize), "ptr", AsyncInfo, "uint*", status) ; IAsyncInfo.Status
if (status != 0)
{
if (status != 1)
{
DllCall(NumGet(NumGet(AsyncInfo+0)+8*A_PtrSize), "ptr", AsyncInfo, "uint*", ErrorCode) ; IAsyncInfo.ErrorCode
msgbox AsyncInfo status error: %ErrorCode%
ExitApp
}
ObjRelease(AsyncInfo)
break
}
sleep 10
}
DllCall(NumGet(NumGet(Object+0)+8*A_PtrSize), "ptr", Object, "ptr*", ObjectResult) ; GetResults
ObjRelease(Object)
Object := ObjectResult
}
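For the translation it helps to know where the numbers in NumGet(NumGet(obj+0)+N*A_PtrSize) come from. Every WinRT interface starts with the three IUnknown methods followed by the three IInspectable methods, so the first method an interface declares itself sits in vtable slot 6. Here is a small portable C++ sketch of that lookup; FakeObject, Getter, winrt_slot and call_slot are names invented for this illustration, and the object is a dummy, not a real COM object.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kIUnknownMethods = 3;      // QueryInterface, AddRef, Release
constexpr std::size_t kIInspectableMethods = 3;  // GetIids, GetRuntimeClassName, GetTrustLevel

// Vtable slot of the n-th method (0-based) that a WinRT interface declares itself.
constexpr std::size_t winrt_slot(std::size_t method_index) {
    return kIUnknownMethods + kIInspectableMethods + method_index;
}

// Shape of a simple property getter: object pointer plus one out-parameter.
using Getter = int (*)(void* self, unsigned* out);

// Dummy object: like a COM object, its first field points to a table of function pointers.
struct FakeObject {
    const Getter* vtable;
    unsigned max_dimension;
};

// Dummy getter that plays the role of get_MaxImageDimension.
int get_max_dimension(void* self, unsigned* out) {
    *out = static_cast<FakeObject*>(self)->max_dimension;
    return 0;  // S_OK
}

// Same steps as DllCall(NumGet(NumGet(obj+0) + slot*A_PtrSize), "ptr", obj, "uint*", value)
int call_slot(void* object, std::size_t slot, unsigned* out) {
    const Getter* vtable = *static_cast<const Getter* const*>(object);  // NumGet(obj+0)
    const Getter method = vtable[slot];                                 // + slot * pointer size
    assert(method != nullptr);
    return method(object, out);
}
```

With this numbering, the slots 6, 8, 9 and 10 that the AutoHotkey code calls on OcrEngineStatics are MaxImageDimension, IsLanguageSupported, TryCreateFromLanguage and TryCreateFromUserProfileLanguages. It is also the reason a PureBasic Interface block has to list its methods in exactly the vtable order: the method names are free, only the position counts.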
Re: Windows 10 OCR
The Windows Runtime is mostly asynchronous, which means you should never call it from your main thread.
Code: Select all
EnableExplicit
DeclareModule WinOCR
Declare.s get_Languages()
Declare.s get_TextFromFile(sFile.s, sLanguage.s = "")
Declare.s get_TextFromImageID(ImageID, sLanguage.s = "")
EndDeclareModule
Module WinOCR
EnableExplicit
#MyRoDebug = 0
Interface IInspectable Extends IUnknown
GetIids(*iidCount, *iids)
GetRuntimeClassName(*className)
GetTrustLevel(*trustLevel)
EndInterface
Interface IClosable Extends IInspectable
Close()
EndInterface
Interface ILanguageFactory Extends IInspectable
createLanguage(*string, *out)
EndInterface
Interface ILanguage Extends IInspectable
get_LanguageTag(*value)
get_DisplayName(*value)
get_NativeName(*value)
get_Script(*value)
EndInterface
Interface IBitmapDecoderStatics Extends IInspectable
BmpDecoderId(*value.guid)
JpegDecoderId(*value.guid)
PngDecoderId(*value.guid)
TiffDecoderId(*value.guid)
GifDecoderId(*value.guid)
JpegXRDecoderId(*value.guid)
IcoDecoderId(*value.guid)
GetDecoderInformationEnumerator(*out)
CreateAsync(*in, *out)
CreateWithIdAsync(*decoderId.guid, *in, *out)
EndInterface
Interface IBitmapDecoder Extends IInspectable
BitmapContainerProperties(*value)
DecoderInformation(*value)
FrameCount(*value)
GetPreviewAsync(*value)
GetFrameAsync(frameIndex, *value)
EndInterface
Interface IBitmapFrameWithSoftwareBitmap Extends IInspectable
GetSoftwareBitmapAsync(*value)
EndInterface
Interface IBitmapFrame Extends IInspectable
GetThumbnailAsync(*asyncInfo)
BitmapProperties(*value)
BitmapPixelFormat(*value)
BitmapAlphaMode(*value)
DpiX(*value)
DpiY(*value)
PixelWidth(*value)
PixelHeight(*value)
OrientedPixelWidth(*value)
OrientedPixelHeight(*value)
GetPixelDataAsync(*asyncInfo)
GetPixelDataTransformedAsync(pixelFormat, alphaMode, transform, exifOrientationMode, colorManagementMode, *asyncInfo)
EndInterface
Interface ISoftwareBitmap Extends IInspectable
get_BitmapPixelFormat(*value)
get_BitmapAlphaMode(*value)
get_PixelWidth(*value)
get_PixelHeight(*value)
get_IsReadOnly(*value)
put_DpiX(*value)
get_DpiX(*value)
put_DpiY(*value)
get_DpiY(*value)
LockBuffer(mode, *value)
CopyTo(*bitmap)
CopyFromBuffer(*buffer)
GetReadOnlyView(*value)
EndInterface
Interface IOcrEngineStatics Extends IInspectable
MaxDimensions(*value)
AvailableRecognizerLanguages(*value)
IsLanguageSupported(*Language,*Result)
TryCreateFromLanguage(*Language,*Result)
TryCreateFromUserProfileLanguages(*Result)
EndInterface
Interface IOcrEngine Extends IInspectable
RecognizeAsync(*bitmap, *result)
RecognizerLanguage(*value)
EndInterface
Interface IOcrResult Extends IInspectable
Lines(*value)
TextAngle(*value)
Text(*value)
EndInterface
Interface IRandomAccessStream Extends IInspectable
get_Size(*value)
put_Size(*value)
GetInputStreamAt(position.q, *stream)
GetOutputStreamAt(position.q, *stream)
get_Position(*value)
Seek(position.q)
CloneStream(*stream)
get_CanRead(*value)
get_CanWrite(*value)
EndInterface
Interface IAsyncInfo Extends IInspectable
Id(*id)
Status(*status)
ErrorCode(*errorCode)
Cancel()
Close()
EndInterface
Interface IAsyncOperationWithProgress Extends IInspectable
Completed(asyncInfo, asyncStatus)
Progress(asyncInfo, progressInfo)
GetResults(*Object)
EndInterface
Interface IGlobalizationPreferencesStatics Extends IInspectable
get_Calendars(*value)
get_Clocks(*value)
get_Currencies(*value)
get_Languages(*value)
get_HomeGeographicRegion(*value)
get_WeekStartsOn(*value)
EndInterface
Interface IReadOnlyList Extends IInspectable
get_Item(index, *hString)
count(*value)
EndInterface
;- DataSection
DataSection
IID_IBitmapFrame:
Data.l $72A49A1C
Data.w $8081, $438D
Data.b $91, $BC, $94, $EC, $FC, $81, $85, $C6
IID_IRandomAccessStream:
Data.l $905A0FE1
Data.w $BC53, $11DF
Data.b $8C, $49, $0, $1E, $4F, $C6, $86, $DA
IID_IClosable:
Data.l $30D5A829
Data.w $7FA4, $4026
Data.b $83, $BB, $D7, $5B, $AE, $4E, $A9, $9E
IID_IBitmapFrameWithSoftwareBitmap:
Data.l $FE287C9A
Data.w $420C, $4963
Data.b $87, $AD, $69, $14, $36, $E0, $83, $83
IID_IAsyncInfo:
Data.l $00000036
Data.w $00, $00
Data.b $C0, $00, $00, $00, $00, $00, $00, $46
IID_IPicture:
Data.l $7BF80980
Data.w $BF32, $101A
Data.b $8B, $BB, $00, $AA, $0, $30, $0C, $AB
IID_ILanguageFactory:
Data.l $9B0252AC
Data.w $C27, $44F8
Data.b $B7, $92, $97, $93, $FB, $66, $C6, $3E
IID_IBitmapDecoderStatics:
Data.l $438CCB26
Data.w $BCEF, $4E95
Data.b $BA, $D6, $23, $A8, $22, $E5, $8D, $01
IID_IOcrEngineStatics:
Data.l $5BFFA85A
Data.w $3384, $3540
Data.b $99, $40, $69, $91, $20, $D4, $28, $A8
IID_IGlobalizationPreferencesStatics:
Data.l $1BF4326
Data.w $ED37, $4E96
Data.b $B0, $E9, $C1, $34, $0D, $1E, $A1, $58
EndDataSection
Prototype pRoInitialize(initType)
Prototype pWindowsCreateString(a, b, c)
Prototype pRoGetActivationFactory(a, b, c)
Prototype pWindowsDeleteString(a)
Prototype pWindowsGetStringRawBuffer(a, length)
Prototype pCreateRandomAccessStreamOnFile(file, iRead, *GUID, *out)
Prototype pCreateRandomAccessStreamOverStream(*stream, options, *riid, *ppv)
Structure DynamicRuntimeFuncs
WindowsCreateString.pWindowsCreateString
WindowsDeleteString.pWindowsDeleteString
RoGetActivationFactory.pRoGetActivationFactory
WindowsGetStringRawBuffer.pWindowsGetStringRawBuffer
CreateRandomAccessStreamOnFile.pCreateRandomAccessStreamOnFile
CreateRandomAccessStreamOverStream.pCreateRandomAccessStreamOverStream
RoInitialize.pRoInitialize
EndStructure
Global RT_Funcs.DynamicRuntimeFuncs
Global RT_INIT_DONE
Structure MyRTIntArray
i.i[0]
EndStructure
Procedure RT_Init()
Protected hr.l
#RO_INIT_SINGLETHREADED = 0
#RO_INIT_MULTITHREADED = 1
Protected hCombase = OpenLibrary(#PB_Any, "combase.dll")
If Not hCombase
ProcedureReturn 0
EndIf
RT_Funcs\WindowsCreateString = GetFunction(hCombase, "WindowsCreateString")
RT_Funcs\WindowsDeleteString = GetFunction(hCombase, "WindowsDeleteString")
RT_Funcs\RoGetActivationFactory = GetFunction(hCombase, "RoGetActivationFactory")
RT_Funcs\WindowsGetStringRawBuffer = GetFunction(hCombase, "WindowsGetStringRawBuffer")
RT_Funcs\RoInitialize = GetFunction(hCombase, "RoInitialize")
hCombase = OpenLibrary(#PB_Any, "SHCore.dll")
If Not hCombase
ProcedureReturn 0
EndIf
RT_Funcs\CreateRandomAccessStreamOnFile = GetFunction(hCombase, "CreateRandomAccessStreamOnFile")
RT_Funcs\CreateRandomAccessStreamOverStream = GetFunction(hCombase, "CreateRandomAccessStreamOverStream")
RT_INIT_DONE = 1
Protected *i.MyRTIntArray = @RT_Funcs, i
For i = 0 To (SizeOf(RT_Funcs) / SizeOf(Integer)) - 1
If Not *i\i[i]
RT_INIT_DONE = 0
Break
EndIf
Next
If RT_INIT_DONE
hr = RT_Funcs\RoInitialize(#RO_INIT_MULTITHREADED)
CompilerIf #MyRoDebug : Debug "RoInitialize = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
;RPC_E_CHANGED_MODE = $80010106
EndIf
ProcedureReturn RT_INIT_DONE
EndProcedure
Procedure CreateHString(sString.s, *hString)
ProcedureReturn RT_Funcs\WindowsCreateString(@sString, Len(sString), *hString)
EndProcedure
Procedure DeleteHString(*hString)
ProcedureReturn RT_Funcs\WindowsDeleteString(*hString)
EndProcedure
Procedure CreateClass(sString.s, sGUID.s, *OutClass)
Protected hString, GUID.GUID, iReturn
CreateHString(sString, @hString)
CLSIDFromString_(@sGUID, @GUID)
iReturn = RT_Funcs\RoGetActivationFactory(hString, @GUID, *OutClass)
RT_Funcs\WindowsDeleteString(hString)
ProcedureReturn iReturn
EndProcedure
Procedure WaitForAsync(*InOut.Integer)
Protected *Object.IAsyncOperationWithProgress = *InOut\i
Protected IAsyncInfo.IAsyncInfo, status.l, ErrorCode.l, hr.l
hr = *Object\QueryInterface(?IID_IAsyncInfo, @IAsyncInfo)
;Debug "WaitForAsync QueryInterface = 0x" + Hex(hr)
If Not IAsyncInfo
Debug "ER QueryInterface IID_IAsyncInfo failed!"
ProcedureReturn 0
EndIf
While 1
hr = IAsyncInfo\Status(@status)
; Debug "IAsyncInfo::Status = 0x" + Hex(hr) + " - " + Hex(status)
If (status <> 0)
If (status <> 1)
IAsyncInfo\ErrorCode(@ErrorCode)
Debug "ER IAsyncInfo status error 0x" + Hex(ErrorCode, #PB_Long)
End
EndIf
IAsyncInfo\Release()
Break
EndIf
Sleep_(10)
Wend
Protected ObjectResult
*Object\GetResults(@ObjectResult)
If ObjectResult
*Object\Release()
*InOut\i = ObjectResult
EndIf
ProcedureReturn ObjectResult
EndProcedure
Structure PICTDESC_bmp
hbitmap.i
hpal.i
EndStructure
Structure PICTDESC_wmf
hmete.i
xExt.l
yExt.l
EndStructure
Structure PICTDESC_icon
hicon.i
EndStructure
Structure PICTDESC_emf
hemf.i
EndStructure
Structure PICTDESC
cbSizeofstruct.l
picType.l
StructureUnion
bmp.PICTDESC_bmp
wmf.PICTDESC_wmf
icon.PICTDESC_icon
emf.PICTDESC_emf
EndStructureUnion
EndStructure
Procedure HBitmapToRandomAccessStream(hBitmap)
Protected pIStream.IStream, pIPicture.IPicture, hr.l
hr = CreateStreamOnHGlobal_(0, #True, @pIStream)
If pIStream
#PICTYPE_BITMAP = 1
Protected PD.PICTDESC
PD\cbSizeofstruct = SizeOf(PICTDESC)
PD\picType = #PICTYPE_BITMAP
PD\bmp\hbitmap = hBitmap
hr = OleCreatePictureIndirect_(@PD, ?IID_IPicture, #False, @pIPicture)
If pIPicture
Protected cbSize
hr = pIPicture\SaveAsFile(pIStream, #True, @cbSize) ; cbSize receives the number of bytes written
Protected pIRandomAccessStream.IRandomAccessStream
#BSOS_DEFAULT = 0
hr = RT_Funcs\CreateRandomAccessStreamOverStream(pIStream, #BSOS_DEFAULT, ?IID_IRandomAccessStream, @pIRandomAccessStream)
pIPicture\Release()
EndIf
pIStream\Release()
EndIf
ProcedureReturn pIRandomAccessStream
EndProcedure
Procedure.s GetOCRFromImage(sFile.s, hBitmap = 0, sLanguage.s = "", bReturnLanguages = 0)
If Not RT_INIT_DONE
RT_Init()
If Not RT_INIT_DONE
ProcedureReturn ""
EndIf
EndIf
Static ILanguageFactory.ILanguageFactory
Static IBitmapDecoderStatics.IBitmapDecoderStatics
Static IOcrEngineStatics.IOcrEngineStatics
Static IGlobalizationPreferencesStatics.IGlobalizationPreferencesStatics
Static MaxDimension
Protected hString, hr.l, bSupported, i, sOCRText.s
Protected hText, length, *p, count, hLanguage, iFrames, width, height, iUsedLanguage
Protected IOcrEngine.IOcrEngine
Protected ILanguageList.IReadOnlyList
Protected ILanguageTest.ILanguage
Protected IRandomAccessStream.IRandomAccessStream
Protected IBitmapDecoder.IBitmapDecoder
Protected IBitmapFrame.IBitmapFrame
Protected IClosable.IClosable
Protected IOcrResult.IOcrResult
Protected ILinesList.IReadOnlyList
Protected IBitmapFrameWithSoftwareBitmap.IBitmapFrameWithSoftwareBitmap
Protected ISoftwareBitmap.ISoftwareBitmap
Protected IOCRLine.ILanguage
If ILanguageFactory = 0 Or
IBitmapDecoderStatics = 0 Or
IOcrEngineStatics = 0 Or
IGlobalizationPreferencesStatics = 0
hr = CreateClass("Windows.Globalization.Language", "{9B0252AC-0C27-44F8-B792-9793FB66C63E}", @ILanguageFactory)
CompilerIf #MyRoDebug : Debug "CreateClass Windows.Globalization.Language = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
hr = CreateClass("Windows.Graphics.Imaging.BitmapDecoder", "{438CCB26-BCEF-4E95-BAD6-23A822E58D01}", @IBitmapDecoderStatics)
CompilerIf #MyRoDebug : Debug "CreateClass Windows.Graphics.Imaging.BitmapDecoder = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
hr = CreateClass("Windows.Media.Ocr.OcrEngine", "{5BFFA85A-3384-3540-9940-699120D428A8}", @IOcrEngineStatics)
CompilerIf #MyRoDebug : Debug "CreateClass Windows.Media.Ocr.OcrEngine = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
hr = CreateClass("Windows.System.UserProfile.GlobalizationPreferences", "{01BF4326-ED37-4E96-B0E9-C1340D1EA158}", @IGlobalizationPreferencesStatics)
CompilerIf #MyRoDebug : Debug "CreateClass Windows.System.UserProfile.GlobalizationPreferences = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
EndIf
If ILanguageFactory = 0 Or
IBitmapDecoderStatics = 0 Or
IOcrEngineStatics = 0 Or
IGlobalizationPreferencesStatics = 0
Goto Release
EndIf
If Not MaxDimension
IOcrEngineStatics\MaxDimensions(@MaxDimension)
EndIf
If sLanguage = ""
hr = IGlobalizationPreferencesStatics\get_Languages(@ILanguageList)
CompilerIf #MyRoDebug : Debug "IGlobalizationPreferencesStatics::get_Languages = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If ILanguageList
hr = ILanguageList\count(@count)
CompilerIf #MyRoDebug : Debug "ILanguageList::count = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
For i = 0 To count -1
hr = ILanguageList\get_Item(i, @hString) ; use the loop index, not always item 0
CompilerIf #MyRoDebug : Debug "ILanguageList::get_Item = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
hr = ILanguageFactory\createLanguage(hString, @ILanguageTest)
CompilerIf #MyRoDebug : Debug "ILanguageFactory::createLanguage = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If ILanguageTest
hr = IOcrEngineStatics\IsLanguageSupported(ILanguageTest, @bSupported)
CompilerIf #MyRoDebug : Debug "IOcrEngineStatics::IsLanguageSupported = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If bSupported
hText = 0
hr = ILanguageTest\get_LanguageTag(@hText)
CompilerIf #MyRoDebug : Debug "ILanguageTest::get_LanguageTag = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If hText
*p = RT_Funcs\WindowsGetStringRawBuffer(hText, @length)
If *p
sLanguage = PeekS(*p)
If bReturnLanguages
sOCRText + sLanguage + #CRLF$
EndIf
;Debug sLanguage
EndIf
EndIf
EndIf
ILanguageTest\Release()
EndIf
Next
ILanguageList\Release()
EndIf
EndIf
If bReturnLanguages
Goto Release
EndIf
hString = 0
CreateHString(sLanguage, @hString)
If hString
hr = ILanguageFactory\createLanguage(hString, @iUsedLanguage)
CompilerIf #MyRoDebug : Debug "ILanguageFactory::createLanguage = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
hr = RT_Funcs\WindowsDeleteString(hString)
CompilerIf #MyRoDebug : Debug "WindowsDeleteString = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
hr = IOcrEngineStatics\TryCreateFromLanguage(iUsedLanguage, @IOcrEngine)
CompilerIf #MyRoDebug : Debug "IOcrEngineStatics::TryCreateFromLanguage = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If IOcrEngine
If hBitmap
IRandomAccessStream = HBitmapToRandomAccessStream(hBitmap)
Else
hr = RT_Funcs\CreateRandomAccessStreamOnFile(@sFile, 0, ?IID_IRandomAccessStream, @IRandomAccessStream)
CompilerIf #MyRoDebug : Debug "CreateRandomAccessStreamOnFile = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
EndIf
If IRandomAccessStream
hr = IBitmapDecoderStatics\CreateAsync(IRandomAccessStream, @IBitmapDecoder)
CompilerIf #MyRoDebug : Debug "IBitmapDecoderStatics::CreateAsync = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If IBitmapDecoder
WaitForAsync(@IBitmapDecoder)
hr = IBitmapDecoder\FrameCount(@iFrames)
CompilerIf #MyRoDebug : Debug "IBitmapDecoder::FrameCount = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
hr = IBitmapDecoder\QueryInterface(?IID_IBitmapFrame, @IBitmapFrame)
CompilerIf #MyRoDebug : Debug "IBitmapDecoder::QueryInterface = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If IBitmapFrame
hr = IBitmapFrame\PixelWidth(@width)
CompilerIf #MyRoDebug : Debug "IBitmapFrame::PixelWidth = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
hr = IBitmapFrame\PixelHeight(@height)
CompilerIf #MyRoDebug : Debug "IBitmapFrame::PixelHeight = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If width > MaxDimension Or height > MaxDimension
Debug "ER: Image is too big"
Else
hr = IBitmapDecoder\QueryInterface(?IID_IBitmapFrameWithSoftwareBitmap, @IBitmapFrameWithSoftwareBitmap)
CompilerIf #MyRoDebug : Debug "IBitmapDecoder::QueryInterface = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If IBitmapFrameWithSoftwareBitmap
hr = IBitmapFrameWithSoftwareBitmap\GetSoftwareBitmapAsync(@ISoftwareBitmap)
CompilerIf #MyRoDebug : Debug "IBitmapFrameWithSoftwareBitmap::GetSoftwareBitmapAsync = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If ISoftwareBitmap
WaitForAsync(@ISoftwareBitmap)
hr = IOcrEngine\RecognizeAsync(ISoftwareBitmap, @IOcrResult)
CompilerIf #MyRoDebug : Debug "IOcrEngine::RecognizeAsync = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If IOcrResult
WaitForAsync(@IOcrResult)
hr = IOcrResult\Lines(@ILinesList)
CompilerIf #MyRoDebug : Debug "IOcrResult::Lines = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If ILinesList
count = 0
hr = ILinesList\count(@count)
CompilerIf #MyRoDebug : Debug "ILinesList::count = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
For i = 0 To count -1
hText = 0
hr = ILinesList\get_Item(i, @IOCRLine)
CompilerIf #MyRoDebug : Debug "ILinesList::get_Item = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
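hr = IOCRLine\get_DisplayName(@hText) ; IOCRLine is declared as ILanguage: its slot 7 (get_DisplayName) is also slot 7 of IOcrLine (Text)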
CompilerIf #MyRoDebug : Debug "IOCRLine::get_DisplayName = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If hText
*p = RT_Funcs\WindowsGetStringRawBuffer(hText, 0)
If *p
sOCRText + PeekS(*p) + #CRLF$
EndIf
EndIf
IOCRLine\Release()
Next
EndIf
EndIf
EndIf
EndIf
EndIf
EndIf
EndIf
hr = IRandomAccessStream\QueryInterface(?IID_IClosable, @IClosable)
CompilerIf #MyRoDebug : Debug "IRandomAccessStream::QueryInterface = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If IClosable
IClosable\Close()
IClosable\Release()
IClosable = 0
EndIf
IRandomAccessStream\Release()
If ISoftwareBitmap
hr = ISoftwareBitmap\QueryInterface(?IID_IClosable, @IClosable)
CompilerIf #MyRoDebug : Debug "ISoftwareBitmap::QueryInterface = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
If IClosable
IClosable\Close()
IClosable\Release()
IClosable = 0
EndIf
EndIf
EndIf
IOcrEngine\Release()
EndIf
EndIf
Release:
If IBitmapDecoder : IBitmapDecoder\Release() : EndIf
If IBitmapFrame : IBitmapFrame\Release() : EndIf
If IBitmapFrameWithSoftwareBitmap : IBitmapFrameWithSoftwareBitmap\Release() : EndIf
If ISoftwareBitmap : ISoftwareBitmap\Release() : EndIf
If IOcrResult : IOcrResult\Release() : EndIf
If ILinesList : ILinesList\Release() : EndIf
CompilerIf #MyRoDebug : Debug "### Func End ###" + #CRLF$ : CompilerEndIf
ProcedureReturn sOCRText
EndProcedure
;-
Procedure.s get_Languages()
ProcedureReturn GetOCRFromImage("", 0, "", 1)
EndProcedure
Procedure.s get_TextFromFile(sFile.s, sLanguage.s = "")
ProcedureReturn GetOCRFromImage(sFile, 0, sLanguage, 0)
EndProcedure
Procedure.s get_TextFromImageID(ImageID, sLanguage.s = "")
ProcedureReturn GetOCRFromImage("", ImageID, sLanguage, 0)
EndProcedure
EndModule
Structure tOCR_DEMO
sFile.s
ImageID.i
;
sLanguages.s
sFromFile.s
sFromImage.s
EndStructure
Procedure OCR_DEMO(*OCR_DEMO.tOCR_DEMO)
*OCR_DEMO\sLanguages = WinOCR::get_Languages()
If *OCR_DEMO\sFile <> ""
*OCR_DEMO\sFromFile = WinOCR::get_TextFromFile(*OCR_DEMO\sFile)
EndIf
If *OCR_DEMO\ImageID
*OCR_DEMO\sFromImage = WinOCR::get_TextFromImageID(*OCR_DEMO\ImageID)
EndIf
EndProcedure
Procedure HBitmapFromScreen(X, Y, W, H)
Protected HDC = GetDC_(0)
Protected HBM = CreateCompatibleBitmap_(HDC, W, H)
Protected PDC = CreateCompatibleDC_(HDC)
SelectObject_(PDC, HBM)
BitBlt_(PDC, 0, 0, W, H, HDC, X, Y, #SRCCOPY)
DeleteDC_(PDC)
ReleaseDC_(0, HDC)
ProcedureReturn HBM
EndProcedure
UsePNGImageDecoder()
UseJPEGImageDecoder()
Define tOCR_DEMO.tOCR_DEMO
tOCR_DEMO\sFile = OpenFileRequester("Choose Image File", "Image.png", "ImageFiles (*.png;*.jpg;*.gif;*.bmp)|*.png;*.jpg;*.gif;*.bmp", 0)
;LoadImage(0, tOCR_DEMO\sFile)
;tOCR_DEMO\ImageID = ImageID(0)
tOCR_DEMO\ImageID = HBitmapFromScreen(0, 0, 800, 600)
Define thread = CreateThread(@OCR_DEMO(), @tOCR_DEMO)
If thread
WaitThread(thread)
MessageRequester("Available Languages", tOCR_DEMO\sLanguages)
MessageRequester("TextFromFile", tOCR_DEMO\sFromFile)
MessageRequester("TextFromImageID", tOCR_DEMO\sFromImage)
EndIf
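With #MyRoDebug enabled the module only prints raw HRESULT values. As a reading aid, here is a small portable C++ sketch that names the codes mentioned in this thread (E_NOINTERFACE and REGDB_E_CLASSNOTREG from the AutoHotkey CreateClass, RPC_E_CHANGED_MODE from the comment in RT_Init) plus S_OK and S_FALSE. The functions failed and describe_hresult are made up for this illustration and are not part of any Windows header.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Same test as the FAILED() macro: the top bit of an HRESULT marks an error.
constexpr bool failed(std::uint32_t hr) {
    return (hr & 0x80000000u) != 0;
}

// Returns a symbolic name for the few codes seen in this thread,
// otherwise the value as 8-digit hex.
std::string describe_hresult(std::uint32_t hr) {
    switch (hr) {
        case 0x00000000u: return "S_OK";
        case 0x00000001u: return "S_FALSE";
        case 0x80004002u: return "E_NOINTERFACE";        // "No such interface supported"
        case 0x80040154u: return "REGDB_E_CLASSNOTREG";  // "Class not registered"
        case 0x80010106u: return "RPC_E_CHANGED_MODE";   // thread already uses another apartment mode
        default: break;
    }
    char buf[16];
    const int written = std::snprintf(buf, sizeof buf, "0x%08lX",
                                      static_cast<unsigned long>(hr));
    assert(written == 10);
    return buf;
}
```

In the PureBasic module the same idea would be a check for a negative hr.l after each call. Note that S_FALSE (1) is not a failure, so testing for hr <> 0 alone would be too strict.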
Re: Windows 10 OCR
Beautiful! I tested it on a screenshot of a file properties dialog. A few characters are missing, but the result is superb and perfectly readable for all three: Languages, the screenshot image and BitmapFromScreen.
It could be in the Tricks 'n' Tips section.
Re: Windows 10 OCR
fryquez, that code works great! Thanks!
Re: Windows 10 OCR
Yes, I'll post a separate Tricks 'n' Tips topic.
I just found something that should be added to the code first.
Re: Windows 10 OCR
Awesome!
Re: Windows 10 OCR
WaitForAsync could be implemented without Sleep, but I don't know how to convert it to PB.
In C++ it looks like this, according to https://kennykerr.ca/2018/03/08/cppwinr ... ompletion/:
Code: Select all
template <typename T>
auto get(T const& async)
{
    if (async.Status() != AsyncStatus::Completed)
    {
        handle signal = CreateEvent(nullptr, true, false, nullptr);
        async.Completed([&](auto&&, auto&&)
        {
            SetEvent(signal.get());
        });
        WaitForSingleObject(signal.get(), INFINITE);
    }
    return async.GetResults();
}
Re: Windows 10 OCR
If these 10 ms bother you, try setting up a completion event handler.
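To show the idea without any Windows headers, here is a rough portable C++ sketch of "block until the completion handler fires" instead of polling with Sleep. Everything in it is an assumption for illustration: FakeAsyncOperation is a stand-in for a WinRT async operation, and a mutex plus condition variable replace CreateEvent / SetEvent / WaitForSingleObject. A real PB version would still have to implement the WinRT completed-handler COM object, which this sketch does not attempt.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Stand-in for a WinRT async operation: a worker thread produces the result
// after a delay and then calls the registered completion handler.
class FakeAsyncOperation {
public:
    using Handler = std::function<void()>;

    FakeAsyncOperation(int result, std::chrono::milliseconds delay)
        : worker_([this, result, delay] {
              std::this_thread::sleep_for(delay);
              Handler handler;
              {
                  std::lock_guard<std::mutex> lock(mutex_);
                  result_ = result;
                  done_ = true;
                  handler = handler_;
              }
              if (handler) handler();
          }) {}

    ~FakeAsyncOperation() { worker_.join(); }

    // A handler registered after completion is invoked right away,
    // so a late registration cannot miss the notification.
    void Completed(Handler handler) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (!done_) {
                handler_ = std::move(handler);
                return;
            }
        }
        handler();
    }

    int GetResults() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(done_);
        return result_;
    }

private:
    std::mutex mutex_;
    Handler handler_;
    int result_ = 0;
    bool done_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};

// Plays the role of the Win32 event in the snippet above.
struct Signal {
    std::mutex mutex;
    std::condition_variable cv;
    bool set = false;
};

// Blocks without polling until the operation reports completion.
int wait_for_completion(FakeAsyncOperation& op) {
    auto signal = std::make_shared<Signal>();  // shared so the handler keeps it alive
    op.Completed([signal] {
        std::lock_guard<std::mutex> lock(signal->mutex);
        signal->set = true;
        signal->cv.notify_one();
    });
    std::unique_lock<std::mutex> lock(signal->mutex);
    signal->cv.wait(lock, [&signal] { return signal->set; });
    return op.GetResults();
}
```

The waiting thread sleeps inside cv.wait() and is woken exactly once by the handler, so there is no 10 ms granularity. The shared_ptr is a design choice of this sketch: it keeps the signal valid even if the handler is still running when the waiter returns.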