Windows 10 OCR

Post by MBall »

Hi,

I am looking for a starting point for using the built-in OCR function in Windows 10.

Any ideas or pointers would be helpful.

Thank you.

M

Post by Mijikai »

OCR stands for Optical Character Recognition.

Info:
https://en.wikipedia.org/wiki/Optical_c ... ecognition
https://docs.microsoft.com/en-us/uwp/ap ... inrt-20348

It seems to be possible, but you need to deal with a lot of nonsense! :P

Here are the interfaces (if I got it right):

Code: Select all

Interface IInspectable Extends IUnknown
  GetIids.i(*Count,*iids)
  GetRuntimeClassName.i(*ClassName);returns an HRESULT, the HSTRING comes back through *ClassName
  GetTrustLevel.i(*TrustLevel)
EndInterface

Interface IOcrEngine Extends IInspectable
  RecognizeAsync.i(*Bitmap,*OcrResult);Windows.Graphics.Imaging.SoftwareBitmap // Windows.Foundation.IAsyncOperation<Windows.Media.Ocr.OcrResult>
  RecognizerLanguage.i(*Value);Windows.Globalization.Language
EndInterface

Interface IOcrEngineStatics Extends IInspectable
  MaxDimensions.i(*Value);get_MaxImageDimension (UINT32)
  AvailableRecognizerLanguages.i(*Value);Windows.Foundation.Collections.IVectorView<Windows.Globalization.Language>
  IsLanguageSupported.i(*Language,*Result);Windows.Globalization.Language
  TryCreateFromLanguage.i(*Language,*Result);Windows.Globalization.Language
  TryCreateFromUserProfileLanguages.i(*Result);Windows.Media.Ocr.OcrEngine
EndInterface

Interface IOcrLine Extends IInspectable
  Words.i(*Value);Windows.Foundation.Collections.IVectorView<Windows.Media.Ocr.OcrWord>
  Text.i(*Value)
EndInterface

Interface IOcrResult Extends IInspectable
  Lines.i(*Value);Windows.Foundation.Collections.IVectorView<Windows.Media.Ocr.OcrLine>
  TextAngle.i(*Value);Windows.Foundation.IReference<DOUBLE>
  Text.i(*Value)
EndInterface

Interface IOcrWord Extends IInspectable
  BoundingRect.i(*Value);Windows.Foundation.Rect
  Text.i(*Value)
EndInterface

As you might notice, most of the calls return other nonsense that needs to be implemented as well (the language stuff)!
I'm currently too busy, otherwise I would try to write a small OCR lib with that nonsense.

How it (should) work - first steps:

To make it work with PB you need to make the thread use the Windows Runtime nonsense!
-> RoInitialize() https://docs.microsoft.com/en-us/window ... initialize

Now request/create the Classes (Interfaces) using WindowsCreateString() and RoGetActivationFactory().
-> WindowsCreateString() https://docs.microsoft.com/en-us/window ... eatestring
-> RoGetActivationFactory() https://docs.microsoft.com/en-us/window ... ionfactory
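
A minimal sketch of those first steps, assuming a Unicode build of PureBasic (5.50 or later) and Windows 8 or newer, where combase.dll exports these functions. The prototype and variable names are made up for illustration, and the IID string is the IOcrEngineStatics one quoted further down in this thread. Treat it as a starting point, not a finished implementation:

```purebasic
EnableExplicit

#RO_INIT_MULTITHREADED = 1

; Hypothetical prototype names; the exported function names are the real ones.
Prototype.l pRoInitialize(initType.l)
Prototype.l pWindowsCreateString(*source, length.l, *hstring)
Prototype.l pWindowsDeleteString(hstring.i)
Prototype.l pRoGetActivationFactory(hstring.i, *iid, *factory)

Define combase = OpenLibrary(#PB_Any, "combase.dll")
If combase
  Define RoInitialize.pRoInitialize = GetFunction(combase, "RoInitialize")
  Define WindowsCreateString.pWindowsCreateString = GetFunction(combase, "WindowsCreateString")
  Define WindowsDeleteString.pWindowsDeleteString = GetFunction(combase, "WindowsDeleteString")
  Define RoGetActivationFactory.pRoGetActivationFactory = GetFunction(combase, "RoGetActivationFactory")

  If RoInitialize And WindowsCreateString And WindowsDeleteString And RoGetActivationFactory
    ; Step 1: make the calling thread use the Windows Runtime.
    Define hr.l = RoInitialize(#RO_INIT_MULTITHREADED)
    Debug "RoInitialize = 0x" + Hex(hr, #PB_Long)

    ; Step 2: the class name has to be passed as an HSTRING.
    Define className.s = "Windows.Media.Ocr.OcrEngine"
    Define hClassName.i
    WindowsCreateString(@className, Len(className), @hClassName)

    ; Step 3: ask for the activation factory of that class.
    Define iidText.s = "{5BFFA85A-3384-3540-9940-699120D428A8}" ; IOcrEngineStatics
    Define iid.GUID
    Define *factory.IUnknown
    If hClassName And CLSIDFromString_(@iidText, @iid) = 0
      hr = RoGetActivationFactory(hClassName, @iid, @*factory)
      Debug "RoGetActivationFactory = 0x" + Hex(hr, #PB_Long)
    EndIf

    ; Clean up: the HSTRING and the factory reference are ours to release.
    If hClassName
      WindowsDeleteString(hClassName)
    EndIf
    If *factory
      *factory\Release()
    EndIf
  EndIf
EndIf
```

If the factory pointer comes back non-zero, it can be assigned to an IOcrEngineStatics interface variable like the one declared above instead of being released right away.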

Prepare to do some digging :mrgreen:

Good luck :)

Post by Mijikai »

I had some time, so I wrote a small lib to at least help with that Windows Runtime nonsense.

Example of how to get access to IOcrEngineStatics:

Code: Select all

EnableExplicit

; wrt.lib
; Version: Alpha 2
; Author: Mijikai
; Copyright 2021 by Mijikai all rights reserved
; License: Attribution-NonCommercial-NoDerivatives 4.0 International
; https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode

Import "wrt.lib"
  wrtOpen.i()
  wrtInterface.i(Name.s,Id.s,*Interface)
  wrtStringCreate.i(String.s,*String)
  wrtStringBuffer.i(*String,*Unicode = #Null,*Length = #Null)
  wrtStringDelete.i(*String)
  wrtClose.i()
  wrtVersion.i()
EndImport

Interface IInspectable Extends IUnknown
  GetIids.i(*Count,*iids)
  GetRuntimeClassName.i(*ClassName);returns an HRESULT, the HSTRING comes back through *ClassName
  GetTrustLevel.i(*TrustLevel)
EndInterface

Interface IOcrEngineStatics Extends IInspectable
  MaxDimensions.i(*Value)
  AvailableRecognizerLanguages.i(*Value);Windows.Foundation.Collections.IVectorView<Windows.Globalization.Language>
  IsLanguageSupported.i(*Language,*Result);Windows.Globalization.Language
  TryCreateFromLanguage.i(*Language,*Result);Windows.Globalization.Language
  TryCreateFromUserProfileLanguages.i(*Result);Windows.Media.Ocr.OcrEngine
EndInterface

Procedure.i Main()
  Protected *OcrEngineStatics.IOcrEngineStatics
  Protected hstring.i
  If wrtOpen()
    Debug wrtStringCreate("Hello World!",@hstring)
    If hstring
      Debug PeekS(wrtStringBuffer(hstring))
      wrtStringDelete(hstring)
    EndIf
    Debug wrtInterface("Windows.Media.Ocr.OcrEngine","{5BFFA85A-3384-3540-9940-699120D428A8}",@*OcrEngineStatics)
    Debug *OcrEngineStatics
    If *OcrEngineStatics ; only release the factory if it was actually obtained
      *OcrEngineStatics\Release()
    EndIf
    wrtClose()
  EndIf
  ProcedureReturn #Null
EndProcedure

Main()

End
Download wrt.lib (x64) - written in fasm:
https://www.dropbox.com/s/ggxc2mlpp9wxn ... 2.zip?dl=0

Hope this helps a bit :)
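
For anyone who wants to try the string part without wrt.lib: the HSTRING round trip can presumably be done directly against combase.dll. This is only a sketch. It assumes the wrt string helpers wrap WindowsCreateString, WindowsGetStringRawBuffer and WindowsDeleteString (a guess, since the lib source is not shown here), the prototype names are invented, and it needs a Unicode build of PureBasic:

```purebasic
EnableExplicit

; Hypothetical prototype names; the exported function names are the real ones.
Prototype.l pWindowsCreateString(*source, length.l, *hstring)
Prototype.i pWindowsGetStringRawBuffer(hstring.i, *length)
Prototype.l pWindowsDeleteString(hstring.i)

Define combase = OpenLibrary(#PB_Any, "combase.dll")
If combase
  Define WindowsCreateString.pWindowsCreateString = GetFunction(combase, "WindowsCreateString")
  Define WindowsGetStringRawBuffer.pWindowsGetStringRawBuffer = GetFunction(combase, "WindowsGetStringRawBuffer")
  Define WindowsDeleteString.pWindowsDeleteString = GetFunction(combase, "WindowsDeleteString")

  If WindowsCreateString And WindowsGetStringRawBuffer And WindowsDeleteString
    Define text.s = "Hello World!"
    Define hstring.i, length.l, *buffer

    ; Create the HSTRING from a UTF-16 PureBasic string.
    If WindowsCreateString(@text, Len(text), @hstring) = 0 And hstring
      ; The raw buffer is owned by the HSTRING, so copy it out with PeekS().
      *buffer = WindowsGetStringRawBuffer(hstring, @length)
      If *buffer
        Debug PeekS(*buffer, length, #PB_Unicode)
      EndIf
      ; Every created HSTRING has to be deleted again.
      WindowsDeleteString(hstring)
    EndIf
  EndIf
  CloseLibrary(combase)
EndIf
```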

Post by acreis »

It can be done in AutoHotkey. It can surely be translated to PureBasic.

Code: Select all

HBitmapFromScreen(X, Y, W, H) {
   HDC := DllCall("GetDC", "Ptr", 0, "UPtr")
   HBM := DllCall("CreateCompatibleBitmap", "Ptr", HDC, "Int", W, "Int", H, "UPtr")
   PDC := DllCall("CreateCompatibleDC", "Ptr", HDC, "UPtr")
   DllCall("SelectObject", "Ptr", PDC, "Ptr", HBM)
   DllCall("BitBlt", "Ptr", PDC, "Int", 0, "Int", 0, "Int", W, "Int", H
                   , "Ptr", HDC, "Int", X, "Int", Y, "UInt", 0x00CC0020)
   DllCall("DeleteDC", "Ptr", PDC)
   DllCall("ReleaseDC", "Ptr", 0, "Ptr", HDC)
   Return HBM
}

HBitmapToRandomAccessStream(hBitmap) {
   static IID_IRandomAccessStream := "{905A0FE1-BC53-11DF-8C49-001E4FC686DA}"
        , IID_IPicture            := "{7BF80980-BF32-101A-8BBB-00AA00300CAB}"
        , PICTYPE_BITMAP := 1
        , BSOS_DEFAULT   := 0
        
   DllCall("Ole32\CreateStreamOnHGlobal", "Ptr", 0, "UInt", true, "PtrP", pIStream, "UInt")
   
   VarSetCapacity(PICTDESC, sz := 8 + A_PtrSize*2, 0)
   NumPut(sz, PICTDESC)
   NumPut(PICTYPE_BITMAP, PICTDESC, 4)
   NumPut(hBitmap, PICTDESC, 8)
   riid := CLSIDFromString(IID_IPicture, GUID1)
   DllCall("OleAut32\OleCreatePictureIndirect", "Ptr", &PICTDESC, "Ptr", riid, "UInt", false, "PtrP", pIPicture, "UInt")
   ; IPicture::SaveAsFile
   DllCall(NumGet(NumGet(pIPicture+0) + A_PtrSize*15), "Ptr", pIPicture, "Ptr", pIStream, "UInt", true, "UIntP", size, "UInt")
   riid := CLSIDFromString(IID_IRandomAccessStream, GUID2)
   DllCall("ShCore\CreateRandomAccessStreamOverStream", "Ptr", pIStream, "UInt", BSOS_DEFAULT, "Ptr", riid, "PtrP", pIRandomAccessStream, "UInt")
   ObjRelease(pIPicture)
   ObjRelease(pIStream)
   Return pIRandomAccessStream
}

CLSIDFromString(IID, ByRef CLSID) {
   VarSetCapacity(CLSID, 16, 0)
   if res := DllCall("ole32\CLSIDFromString", "WStr", IID, "Ptr", &CLSID, "UInt")
      throw Exception("CLSIDFromString failed. Error: " . Format("{:#x}", res))
   Return &CLSID
}


ocr(file, lang := "FirstFromAvailableLanguages")
{
   static OcrEngineStatics, OcrEngine, MaxDimension, LanguageFactory, Language, CurrentLanguage, BitmapDecoderStatics, GlobalizationPreferencesStatics
   if (OcrEngineStatics = "")
   {
      CreateClass("Windows.Globalization.Language", ILanguageFactory := "{9B0252AC-0C27-44F8-B792-9793FB66C63E}", LanguageFactory)
      CreateClass("Windows.Graphics.Imaging.BitmapDecoder", IBitmapDecoderStatics := "{438CCB26-BCEF-4E95-BAD6-23A822E58D01}", BitmapDecoderStatics)
      CreateClass("Windows.Media.Ocr.OcrEngine", IOcrEngineStatics := "{5BFFA85A-3384-3540-9940-699120D428A8}", OcrEngineStatics)
      DllCall(NumGet(NumGet(OcrEngineStatics+0)+6*A_PtrSize), "ptr", OcrEngineStatics, "uint*", MaxDimension)   ; MaxImageDimension
   }
   if (file = "ShowAvailableLanguages")
   {
      if (GlobalizationPreferencesStatics = "")
         CreateClass("Windows.System.UserProfile.GlobalizationPreferences", IGlobalizationPreferencesStatics := "{01BF4326-ED37-4E96-B0E9-C1340D1EA158}", GlobalizationPreferencesStatics)
      DllCall(NumGet(NumGet(GlobalizationPreferencesStatics+0)+9*A_PtrSize), "ptr", GlobalizationPreferencesStatics, "ptr*", LanguageList)   ; get_Languages
      DllCall(NumGet(NumGet(LanguageList+0)+7*A_PtrSize), "ptr", LanguageList, "int*", count)   ; count
      loop % count
      {
         DllCall(NumGet(NumGet(LanguageList+0)+6*A_PtrSize), "ptr", LanguageList, "int", A_Index-1, "ptr*", hString)   ; get_Item
         DllCall(NumGet(NumGet(LanguageFactory+0)+6*A_PtrSize), "ptr", LanguageFactory, "ptr", hString, "ptr*", LanguageTest)   ; CreateLanguage
         DllCall(NumGet(NumGet(OcrEngineStatics+0)+8*A_PtrSize), "ptr", OcrEngineStatics, "ptr", LanguageTest, "int*", bool)   ; IsLanguageSupported
         if (bool = 1)
         {
            DllCall(NumGet(NumGet(LanguageTest+0)+6*A_PtrSize), "ptr", LanguageTest, "ptr*", hText)
            buffer := DllCall("Combase.dll\WindowsGetStringRawBuffer", "ptr", hText, "uint*", length, "ptr")
            text .= StrGet(buffer, "UTF-16") "`n"
         }
         ObjRelease(LanguageTest)
      }
      ObjRelease(LanguageList)
      return text
   }
   if (lang != CurrentLanguage) or (lang = "FirstFromAvailableLanguages")
   {
      if (OcrEngine != "")
      {
         ObjRelease(OcrEngine)
         if (CurrentLanguage != "FirstFromAvailableLanguages")
            ObjRelease(Language)
      }
      if (lang = "FirstFromAvailableLanguages")
         DllCall(NumGet(NumGet(OcrEngineStatics+0)+10*A_PtrSize), "ptr", OcrEngineStatics, "ptr*", OcrEngine)   ; TryCreateFromUserProfileLanguages
      else
      {
         CreateHString(lang, hString)
         DllCall(NumGet(NumGet(LanguageFactory+0)+6*A_PtrSize), "ptr", LanguageFactory, "ptr", hString, "ptr*", Language)   ; CreateLanguage
         DeleteHString(hString)
         DllCall(NumGet(NumGet(OcrEngineStatics+0)+9*A_PtrSize), "ptr", OcrEngineStatics, ptr, Language, "ptr*", OcrEngine)   ; TryCreateFromLanguage
      }
      if (OcrEngine = 0)
      {
         msgbox Can not use language "%lang%" for OCR, please install language pack.
         ExitApp
      }
      CurrentLanguage := lang
   }
   IRandomAccessStream := file
   DllCall(NumGet(NumGet(BitmapDecoderStatics+0)+14*A_PtrSize), "ptr", BitmapDecoderStatics, "ptr", IRandomAccessStream, "ptr*", BitmapDecoder)   ; CreateAsync
   WaitForAsync(BitmapDecoder)
   BitmapFrame := ComObjQuery(BitmapDecoder, IBitmapFrame := "{72A49A1C-8081-438D-91BC-94ECFC8185C6}")
   DllCall(NumGet(NumGet(BitmapFrame+0)+12*A_PtrSize), "ptr", BitmapFrame, "uint*", width)   ; get_PixelWidth
   DllCall(NumGet(NumGet(BitmapFrame+0)+13*A_PtrSize), "ptr", BitmapFrame, "uint*", height)   ; get_PixelHeight
   if (width > MaxDimension) or (height > MaxDimension)
   {
      msgbox Image is too big - %width%x%height%.`nThe maximum dimension is %MaxDimension% pixels
      ExitApp
   }
   BitmapFrameWithSoftwareBitmap := ComObjQuery(BitmapDecoder, IBitmapFrameWithSoftwareBitmap := "{FE287C9A-420C-4963-87AD-691436E08383}")
   DllCall(NumGet(NumGet(BitmapFrameWithSoftwareBitmap+0)+6*A_PtrSize), "ptr", BitmapFrameWithSoftwareBitmap, "ptr*", SoftwareBitmap)   ; GetSoftwareBitmapAsync
   WaitForAsync(SoftwareBitmap)
   DllCall(NumGet(NumGet(OcrEngine+0)+6*A_PtrSize), "ptr", OcrEngine, ptr, SoftwareBitmap, "ptr*", OcrResult)   ; RecognizeAsync
   WaitForAsync(OcrResult)
   DllCall(NumGet(NumGet(OcrResult+0)+6*A_PtrSize), "ptr", OcrResult, "ptr*", LinesList)   ; get_Lines
   DllCall(NumGet(NumGet(LinesList+0)+7*A_PtrSize), "ptr", LinesList, "int*", count)   ; count
   loop % count
   {
      DllCall(NumGet(NumGet(LinesList+0)+6*A_PtrSize), "ptr", LinesList, "int", A_Index-1, "ptr*", OcrLine)
      DllCall(NumGet(NumGet(OcrLine+0)+7*A_PtrSize), "ptr", OcrLine, "ptr*", hText) 
      buffer := DllCall("Combase.dll\WindowsGetStringRawBuffer", "ptr", hText, "uint*", length, "ptr")
      text .= StrGet(buffer, "UTF-16") "`n"
      ObjRelease(OcrLine)
   }
   Close := ComObjQuery(IRandomAccessStream, IClosable := "{30D5A829-7FA4-4026-83BB-D75BAE4EA99E}")
   DllCall(NumGet(NumGet(Close+0)+6*A_PtrSize), "ptr", Close)   ; Close
   ObjRelease(Close)
   Close := ComObjQuery(SoftwareBitmap, IClosable := "{30D5A829-7FA4-4026-83BB-D75BAE4EA99E}")
   DllCall(NumGet(NumGet(Close+0)+6*A_PtrSize), "ptr", Close)   ; Close
   ObjRelease(Close)
   ObjRelease(IRandomAccessStream)
   ObjRelease(BitmapDecoder)
   ObjRelease(BitmapFrame)
   ObjRelease(BitmapFrameWithSoftwareBitmap)
   ObjRelease(SoftwareBitmap)
   ObjRelease(OcrResult)
   ObjRelease(LinesList)
   return text
}



CreateClass(string, interface, ByRef Class)
{
   CreateHString(string, hString)
   VarSetCapacity(GUID, 16)
   DllCall("ole32\CLSIDFromString", "wstr", interface, "ptr", &GUID)
   result := DllCall("Combase.dll\RoGetActivationFactory", "ptr", hString, "ptr", &GUID, "ptr*", Class)
   if (result != 0)
   {
      if (result = 0x80004002)
         msgbox No such interface supported
      else if (result = 0x80040154)
         msgbox Class not registered
      else
         msgbox error: %result%
      ExitApp
   }
   DeleteHString(hString)
}

CreateHString(string, ByRef hString)
{
    DllCall("Combase.dll\WindowsCreateString", "wstr", string, "uint", StrLen(string), "ptr*", hString)
}

DeleteHString(hString)
{
   DllCall("Combase.dll\WindowsDeleteString", "ptr", hString)
}

WaitForAsync(ByRef Object)
{
   AsyncInfo := ComObjQuery(Object, IAsyncInfo := "{00000036-0000-0000-C000-000000000046}")
   loop
   {
      DllCall(NumGet(NumGet(AsyncInfo+0)+7*A_PtrSize), "ptr", AsyncInfo, "uint*", status)   ; IAsyncInfo.Status
      if (status != 0)
      {
         if (status != 1)
         {
            DllCall(NumGet(NumGet(AsyncInfo+0)+8*A_PtrSize), "ptr", AsyncInfo, "uint*", ErrorCode)   ; IAsyncInfo.ErrorCode
            msgbox AsyncInfo status error: %ErrorCode%
            ExitApp
         }
         ObjRelease(AsyncInfo)
         break
      }
      sleep 10
   }
   DllCall(NumGet(NumGet(Object+0)+8*A_PtrSize), "ptr", Object, "ptr*", ObjectResult)   ; GetResults
   ObjRelease(Object)
   Object := ObjectResult
}
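
A note for translating the DllCall lines: the numbers in NumGet(NumGet(obj+0)+N*A_PtrSize) are vtable slot indexes. IUnknown occupies slots 0 to 2 and IInspectable slots 3 to 5, so the first method of a Windows Runtime interface is slot 6. In PureBasic an Interface declaration does that counting for you. The sketch below shows the equivalence on a made-up object (IDemo, DemoObject and the Demo_ procedures are hypothetical and exist only for this example), so it runs without any Windows Runtime class:

```purebasic
EnableExplicit

; Hypothetical interface: IUnknown fills slots 0-2, GetAnswer() lands in slot 3.
Interface IDemo Extends IUnknown
  GetAnswer()
EndInterface

Structure DemoObject
  *vtable
EndStructure

Procedure.i Demo_QueryInterface(*this, *iid, *out)
  ProcedureReturn $80004002 ; E_NOINTERFACE, not needed for this demo
EndProcedure

Procedure.i Demo_AddRef(*this)
  ProcedureReturn 1
EndProcedure

Procedure.i Demo_Release(*this)
  ProcedureReturn 1
EndProcedure

Procedure.i Demo_GetAnswer(*this)
  ProcedureReturn 42
EndProcedure

DataSection
  DemoVTable:
  Data.i @Demo_QueryInterface(), @Demo_AddRef(), @Demo_Release(), @Demo_GetAnswer()
EndDataSection

Define obj.DemoObject
obj\vtable = ?DemoVTable
Define *demo.IDemo = @obj

; PureBasic way: the Interface declaration picks the slot.
Debug *demo\GetAnswer()

; AutoHotkey way: read the vtable pointer, then the function pointer in slot 3.
Define *vtbl = PeekI(*demo)
Define *method = PeekI(*vtbl + 3 * SizeOf(Integer))
Debug CallFunctionFast(*method, *demo) ; same procedure, reached by index
```

So a line such as NumGet(NumGet(OcrEngineStatics+0)+8*A_PtrSize) simply becomes the third method declared in the matching PureBasic Interface, as long as the methods are declared in vtable order.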

Post by fryquez »

The Windows Runtime is mostly asynchronous, which means you should never run it in your main thread.
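
A minimal sketch of that advice, under the assumption that the blocking OCR call is moved into a worker procedure. OcrJob and OcrWorker are hypothetical names, and the Delay() merely stands in for the real work (for example WinOCR::get_TextFromFile() from the module below). Because the worker touches strings, the program should be compiled with the threadsafe option:

```purebasic
EnableExplicit

; Hypothetical job record shared between the main thread and the worker.
Structure OcrJob
  File.s
  Result.s
EndStructure

Procedure OcrWorker(*job.OcrJob)
  ; Placeholder for the blocking call, e.g. WinOCR::get_TextFromFile(*job\File).
  Delay(200)
  *job\Result = "text recognized in " + *job\File
EndProcedure

Define job.OcrJob
job\File = "scan.png" ; hypothetical file name

Define thread = CreateThread(@OcrWorker(), @job)
If thread
  ; The main thread stays responsive; a GUI program would process
  ; window events here instead of just sleeping.
  While IsThread(thread)
    Delay(10)
  Wend
  Debug job\Result
EndIf
```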

Code: Select all

EnableExplicit

DeclareModule WinOCR
  Declare.s get_Languages()  
  Declare.s get_TextFromFile(sFile.s, sLanguage.s = "")
  Declare.s get_TextFromImageID(ImageID, sLanguage.s = "")
EndDeclareModule

Module WinOCR
  
  EnableExplicit
  
  
  #MyRoDebug = 0
  
  Interface IInspectable Extends IUnknown
    GetIids(*iidCount, *iids)
    GetRuntimeClassName(*className)
    GetTrustLevel(*trustLevel)
  EndInterface
  
  Interface IClosable Extends IInspectable
    Close()
  EndInterface
  
  Interface ILanguageFactory Extends IInspectable
    createLanguage(*string, *out)
  EndInterface
  
  Interface ILanguage Extends IInspectable
    get_LanguageTag(*value)
    get_DisplayName(*value)
    get_NativeName(*value)
    get_Script(*value)
  EndInterface
  
  Interface IBitmapDecoderStatics Extends IInspectable
    BmpDecoderId(*value.guid)
    JpegDecoderId(*value.guid)
    PngDecoderId(*value.guid)  
    TiffDecoderId(*value.guid)
    GifDecoderId(*value.guid)
    JpegXRDecoderId(*value.guid)
    IcoDecoderId(*value.guid)  
    GetDecoderInformationEnumerator(*out)
    CreateAsync(*in, *out)
    CreateWithIdAsync(*decoderId.guid, *in, *out)
  EndInterface
  
  Interface IBitmapDecoder Extends IInspectable
    BitmapContainerProperties(*value)
    DecoderInformation(*value)
    FrameCount(*value)
    GetPreviewAsync(*value)
    GetFrameAsync(frameIndex, *value)
  EndInterface
  
  Interface IBitmapFrameWithSoftwareBitmap Extends IInspectable
    GetSoftwareBitmapAsync(*value)
  EndInterface
  
  Interface IBitmapFrame Extends IInspectable
    GetThumbnailAsync(*asyncInfo)
    BitmapProperties(*value)
    BitmapPixelFormat(*value)
    BitmapAlphaMode(*value)
    DpiX(*value)
    DpiY(*value)
    PixelWidth(*value)
    PixelHeight(*value)
    OrientedPixelWidth(*value)
    OrientedPixelHeight(*value)
    GetPixelDataAsync(*asyncInfo)
    GetPixelDataTransformedAsync(pixelFormat, alphaMode, transform, exifOrientationMode, colorManagementMode, *asyncInfo)
  EndInterface
  
  Interface ISoftwareBitmap Extends IInspectable
    get_BitmapPixelFormat(*value)
    get_BitmapAlphaMode(*value)
    get_PixelWidth(*value)
    get_PixelHeight(*value)
    get_IsReadOnly(*value)
    put_DpiX(*value)
    get_DpiX(*value)
    put_DpiY(*value)
    get_DpiY(*value)
    LockBuffer(mode, *value)
    CopyTo(*bitmap)
    CopyFromBuffer(*buffer)
    GetReadOnlyView(*value)
  EndInterface
  
  Interface IOcrEngineStatics Extends IInspectable
    MaxDimensions(*value)
    AvailableRecognizerLanguages(*value)
    IsLanguageSupported(*Language,*Result)
    TryCreateFromLanguage(*Language,*Result)
    TryCreateFromUserProfileLanguages(*Result)
  EndInterface
  
  Interface IOcrEngine Extends IInspectable
    RecognizeAsync(*bitmap, *result)
    RecognizerLanguage(*value)
  EndInterface
  
  Interface IOcrResult Extends IInspectable
    Lines(*value)
    TextAngle(*value)
    Text(*value)
  EndInterface
  
  Interface IRandomAccessStream Extends IInspectable 
    get_Size(*value)
    put_Size(*value)
    GetInputStreamAt(position.q, *stream)
    GetOutputStreamAt(position.q, *stream)
    get_Position(*value)
    Seek(position.q)
    CloneStream(*stream)
    get_CanRead(*value)
    get_CanWrite(*value)
  EndInterface
  
  Interface IAsyncInfo Extends IInspectable 
    Id(*id)
    Status(*status)
    ErrorCode(*errorCode)
    Cancel()
    Close()
  EndInterface
  
  Interface IAsyncOperationWithProgress Extends IInspectable 
    Completed(asyncInfo, asyncStatus)
    Progress(asyncInfo, progressInfo)
    GetResults(*Object)
  EndInterface
  
  
  Interface IGlobalizationPreferencesStatics Extends IInspectable
    get_Calendars(*value)
    get_Clocks(*value)
    get_Currencies(*value)
    get_Languages(*value)
    get_HomeGeographicRegion(*value)
    get_WeekStartsOn(*value)
  EndInterface
  
  Interface IReadOnlyList Extends IInspectable
    get_Item(index, *hString)
    count(*value)  
  EndInterface
  
  ;- DataSection
  
  DataSection
    IID_IBitmapFrame:
    Data.l $72A49A1C
    Data.w $8081, $438D
    Data.b $91, $BC, $94, $EC, $FC, $81, $85, $C6
    
    IID_IRandomAccessStream:
    Data.l $905A0FE1
    Data.w $BC53, $11DF
    Data.b $8C, $49, $0, $1E, $4F, $C6, $86, $DA
    
    IID_IClosable:
    Data.l $30D5A829
    Data.w $7FA4, $4026
    Data.b $83, $BB, $D7, $5B, $AE, $4E, $A9, $9E
    
    IID_IBitmapFrameWithSoftwareBitmap:
    Data.l $FE287C9A
    Data.w $420C, $4963
    Data.b $87, $AD, $69, $14, $36, $E0, $83, $83
    
    IID_IAsyncInfo:
    Data.l $00000036
    Data.w $00, $00
    Data.b $C0, $00, $00, $00, $00, $00, $00, $46
    
    IID_IPicture:
    Data.l $7BF80980
    Data.w $BF32, $101A
    Data.b $8B, $BB, $00, $AA, $0, $30, $0C, $AB
    
    IID_ILanguageFactory:
    Data.l $9B0252AC
    Data.w $C27, $44F8
    Data.b $B7, $92, $97, $93, $FB, $66, $C6, $3E
    
    IID_IBitmapDecoderStatics:
    Data.l $438CCB26
    Data.w $BCEF, $4E95
    Data.b $BA, $D6, $23, $A8, $22, $E5, $8D, $01
    
    IID_IOcrEngineStatics:
    Data.l $5BFFA85A
    Data.w $3384, $3540
    Data.b $99, $40, $69, $91, $20, $D4, $28, $A8
    
    IID_IGlobalizationPreferencesStatics:
    Data.l $1BF4326
    Data.w $ED37, $4E96
    Data.b $B0, $E9, $C1, $34, $0D, $1E, $A1, $58
    
  EndDataSection
  
  
  Prototype pRoInitialize(initType)
  Prototype pWindowsCreateString(a, b, c)
  Prototype pRoGetActivationFactory(a, b, c)
  Prototype pWindowsDeleteString(a)
  Prototype pWindowsGetStringRawBuffer(a, length)  
  Prototype pCreateRandomAccessStreamOnFile(file, iRead, *GUID, *out)
  Prototype pCreateRandomAccessStreamOverStream(*stream, options, *riid, *ppv)
  
  Structure DynamicRuntimeFuncs
    WindowsCreateString.pWindowsCreateString
    WindowsDeleteString.pWindowsDeleteString
    RoGetActivationFactory.pRoGetActivationFactory
    WindowsGetStringRawBuffer.pWindowsGetStringRawBuffer
    CreateRandomAccessStreamOnFile.pCreateRandomAccessStreamOnFile
    CreateRandomAccessStreamOverStream.pCreateRandomAccessStreamOverStream
    RoInitialize.pRoInitialize
  EndStructure
  
  Global RT_Funcs.DynamicRuntimeFuncs
  Global RT_INIT_DONE
  
  Structure MyRTIntArray
    i.i[0]
  EndStructure
  
  
  
  Procedure RT_Init()
    
    Protected hr.l
    
    #RO_INIT_SINGLETHREADED = 0
    #RO_INIT_MULTITHREADED = 1
    
    Protected hCombase = OpenLibrary(#PB_Any, "combase.dll")
    If Not hCombase
      ProcedureReturn 0
    EndIf
    
    RT_Funcs\WindowsCreateString = GetFunction(hCombase, "WindowsCreateString")
    RT_Funcs\WindowsDeleteString = GetFunction(hCombase, "WindowsDeleteString")
    RT_Funcs\RoGetActivationFactory = GetFunction(hCombase, "RoGetActivationFactory")
    RT_Funcs\WindowsGetStringRawBuffer = GetFunction(hCombase, "WindowsGetStringRawBuffer")
    RT_Funcs\RoInitialize = GetFunction(hCombase, "RoInitialize")
    
    hCombase = OpenLibrary(#PB_Any, "SHCore.dll")
    If Not hCombase
      ProcedureReturn 0
    EndIf
    
    RT_Funcs\CreateRandomAccessStreamOnFile = GetFunction(hCombase, "CreateRandomAccessStreamOnFile")
    RT_Funcs\CreateRandomAccessStreamOverStream = GetFunction(hCombase, "CreateRandomAccessStreamOverStream")
    
    
    RT_INIT_DONE = 1
    
    Protected *i.MyRTIntArray = @RT_Funcs, i
    For i = 0 To (SizeOf(RT_Funcs) / SizeOf(Integer)) - 1
      If Not *i\i[i]
        RT_INIT_DONE = 0
        Break
      EndIf
    Next
    
    If RT_INIT_DONE
      hr = RT_Funcs\RoInitialize(#RO_INIT_MULTITHREADED)
      CompilerIf #MyRoDebug : Debug "RoInitialize = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
      ;RPC_E_CHANGED_MODE = $80010106
    EndIf
    
    ProcedureReturn RT_INIT_DONE
  EndProcedure
  
  
  
  Procedure CreateHString(sString.s, *hString)
    ProcedureReturn RT_Funcs\WindowsCreateString(@sString, Len(sString), *hString)
  EndProcedure
  
  Procedure DeleteHString(*hString)
    ProcedureReturn RT_Funcs\WindowsDeleteString(*hString)
  EndProcedure
  
  Procedure CreateClass(sString.s, sGUID.s, *OutClass)
    
    Protected hString, GUID.GUID, iReturn
    CreateHString(sString, @hString)
    CLSIDFromString_(@sGUID, @GUID)
    
    iReturn = RT_Funcs\RoGetActivationFactory(hString, @GUID, *OutClass)
    RT_Funcs\WindowsDeleteString(hString)
    ProcedureReturn iReturn
    
  EndProcedure
  
  
  
  Procedure WaitForAsync(*InOut.Integer)
    
    Protected *Object.IAsyncOperationWithProgress = *InOut\i  
    Protected IAsyncInfo.IAsyncInfo, status.l, ErrorCode.l, hr.l
    
    hr = *Object\QueryInterface(?IID_IAsyncInfo, @IAsyncInfo)
    ;Debug "WaitForAsync QueryInterface = 0x" + Hex(hr)
    
    If Not IAsyncInfo
      Debug "ER QueryInterface IID_IAsyncInfo failed!"
      ProcedureReturn 0
    EndIf
    
    While 1    
      hr = IAsyncInfo\Status(@status)
      ;     Debug "IAsyncInfo::Status = 0x" + Hex(hr) + " - " + Hex(status)
      
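      If (status <> 0)   ; AsyncStatus: 0 = Started (still running)
        If (status <> 1) ; 1 = Completed, so 2 = Canceled and 3 = Error end up here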
          IAsyncInfo\ErrorCode(@ErrorCode)
          Debug "ER IAsyncInfo status error 0x" + Hex(ErrorCode, #PB_Long)
          End
        EndIf      
        IAsyncInfo\Release()
        Break
      EndIf
      Sleep_(10)
    Wend
    
    Protected ObjectResult
    
    *Object\GetResults(@ObjectResult)
    If ObjectResult
      *Object\Release()
      
      *InOut\i = ObjectResult
    EndIf
    ProcedureReturn ObjectResult
    
  EndProcedure
  
  
  
  Structure PICTDESC_bmp
    hbitmap.i
    hpal.i
  EndStructure
  
  Structure PICTDESC_wmf
    hmete.i
    xExt.l
    yExt.l
  EndStructure
  
  Structure PICTDESC_icon
    hicon.i
  EndStructure
  
  Structure PICTDESC_emf
    hemf.i
  EndStructure
  
  Structure PICTDESC
    cbSizeofstruct.l
    picType.l
    StructureUnion
      bmp.PICTDESC_bmp
      wmf.PICTDESC_wmf
      icon.PICTDESC_icon
      emf.PICTDESC_emf    
    EndStructureUnion
  EndStructure
  
  
  Procedure HBitmapToRandomAccessStream(hBitmap)
    
    Protected pIStream.IStream, pIPicture.IPicture, hr.l
    hr = CreateStreamOnHGlobal_(0, #True, @pIStream)
    If pIStream
      
      #PICTYPE_BITMAP = 1
      
      Protected PD.PICTDESC
      PD\cbSizeofstruct = SizeOf(PICTDESC)
      PD\picType = #PICTYPE_BITMAP
      PD\bmp\hbitmap = hBitmap
      
      hr = OleCreatePictureIndirect_(@PD, ?IID_IPicture, #False, @pIPicture)
      If pIPicture
        
        Protected cbSize.l
        hr = pIPicture\SaveAsFile(pIStream, #True, @cbSize) ; cbSize receives the number of bytes written
        
        Protected pIRandomAccessStream.IRandomAccessStream
        #BSOS_DEFAULT = 0
        hr = RT_Funcs\CreateRandomAccessStreamOverStream(pIStream, #BSOS_DEFAULT, ?IID_IRandomAccessStream, @pIRandomAccessStream)
        pIPicture\Release()
      EndIf
      
      pIStream\Release()
    EndIf
    
    ProcedureReturn pIRandomAccessStream
    
  EndProcedure
  
  
  
  Procedure.s GetOCRFromImage(sFile.s, hBitmap = 0, sLanguage.s = "", bReturnLanguages = 0)
    
    If Not RT_INIT_DONE
      RT_Init()
      If Not RT_INIT_DONE
        ProcedureReturn ""
      EndIf
    EndIf
    
    Static ILanguageFactory.ILanguageFactory
    Static IBitmapDecoderStatics.IBitmapDecoderStatics
    Static IOcrEngineStatics.IOcrEngineStatics
    Static IGlobalizationPreferencesStatics.IGlobalizationPreferencesStatics
    Static MaxDimension
    
    Protected hString, hr.l, bSupported, i, sOCRText.s
    Protected hText, length, *p, count, hLanguage, iFrames, width, height, iUsedLanguage
    
    Protected IOcrEngine.IOcrEngine
    Protected ILanguageList.IReadOnlyList
    Protected ILanguageTest.ILanguage
    Protected IRandomAccessStream.IRandomAccessStream
    
    Protected IBitmapDecoder.IBitmapDecoder
    Protected IBitmapFrame.IBitmapFrame
    Protected IClosable.IClosable
    Protected IOcrResult.IOcrResult
    
    Protected ILinesList.IReadOnlyList
    Protected IBitmapFrameWithSoftwareBitmap.IBitmapFrameWithSoftwareBitmap
    Protected ISoftwareBitmap.ISoftwareBitmap
    Protected IOCRLine.ILanguage ; stand-in for IOcrLine: ILanguage::get_DisplayName shares vtable slot 7 with IOcrLine::get_Text
    
    If ILanguageFactory = 0 Or
       IBitmapDecoderStatics = 0 Or
       IOcrEngineStatics = 0 Or
       IGlobalizationPreferencesStatics = 0
      hr = CreateClass("Windows.Globalization.Language", "{9B0252AC-0C27-44F8-B792-9793FB66C63E}", @ILanguageFactory)    
      CompilerIf #MyRoDebug : Debug "CreateClass Windows.Globalization.Language = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
      
      hr = CreateClass("Windows.Graphics.Imaging.BitmapDecoder", "{438CCB26-BCEF-4E95-BAD6-23A822E58D01}", @IBitmapDecoderStatics)
      CompilerIf #MyRoDebug : Debug "CreateClass Windows.Graphics.Imaging.BitmapDecoder = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
      
      hr = CreateClass("Windows.Media.Ocr.OcrEngine", "{5BFFA85A-3384-3540-9940-699120D428A8}", @IOcrEngineStatics)
      CompilerIf #MyRoDebug : Debug "CreateClass Windows.Media.Ocr.OcrEngine = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
      
      hr = CreateClass("Windows.System.UserProfile.GlobalizationPreferences", "{01BF4326-ED37-4E96-B0E9-C1340D1EA158}", @IGlobalizationPreferencesStatics)
      CompilerIf #MyRoDebug : Debug "CreateClass Windows.System.UserProfile.GlobalizationPreferences = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
    EndIf
    
    
    If ILanguageFactory = 0 Or
       IBitmapDecoderStatics = 0 Or
       IOcrEngineStatics = 0 Or
       IGlobalizationPreferencesStatics = 0
      Goto Release
    EndIf
    
    If Not MaxDimension
      IOcrEngineStatics\MaxDimensions(@MaxDimension)
    EndIf
    
    
    
    
    
    If sLanguage = ""
      
      hr = IGlobalizationPreferencesStatics\get_Languages(@ILanguageList)
      CompilerIf #MyRoDebug : Debug "IGlobalizationPreferencesStatics::get_Languages = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
      
      
      
      If ILanguageList
        hr = ILanguageList\count(@count)
        CompilerIf #MyRoDebug : Debug "ILanguageList::count = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
        
        For i = 0 To count -1
          hr = ILanguageList\get_Item(i, @hString)
          CompilerIf #MyRoDebug : Debug "ILanguageList::get_Item = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
          
          hr = ILanguageFactory\createLanguage(hString, @ILanguageTest)
          CompilerIf #MyRoDebug : Debug "ILanguageFactory::createLanguage = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
          
          If ILanguageTest
            hr = IOcrEngineStatics\IsLanguageSupported(ILanguageTest, @bSupported)
            CompilerIf #MyRoDebug : Debug "IOcrEngineStatics::IsLanguageSupported = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
            
            If bSupported
              hText = 0
              hr = ILanguageTest\get_LanguageTag(@hText)
              CompilerIf #MyRoDebug : Debug "ILanguageTest::get_LanguageTag = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
              
              If hText
                *p = RT_Funcs\WindowsGetStringRawBuffer(hText, @length)
                If *p
                  sLanguage = PeekS(*p)
                  If bReturnLanguages
                    sOCRText + sLanguage + #CRLF$
                  EndIf
                  ;Debug sLanguage
                EndIf
                
              EndIf
              
            EndIf
            
            ILanguageTest\Release()
          EndIf
        Next
        
        ILanguageList\Release()
      EndIf
      
    EndIf
    
    
    If bReturnLanguages
      Goto Release
    EndIf
    
    hString = 0
    CreateHString(sLanguage, @hString)
    If hString
      hr = ILanguageFactory\createLanguage(hString, @iUsedLanguage)
      CompilerIf #MyRoDebug : Debug "ILanguageFactory::createLanguage = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
      
      hr = RT_Funcs\WindowsDeleteString(hString)
      CompilerIf #MyRoDebug : Debug "WindowsDeleteString = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
      
      hr = IOcrEngineStatics\TryCreateFromLanguage(iUsedLanguage, @IOcrEngine)
      CompilerIf #MyRoDebug : Debug "IOcrEngineStatics::TryCreateFromLanguage = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
      
      If IOcrEngine
        
        If hBitmap
          IRandomAccessStream = HBitmapToRandomAccessStream(hBitmap)
        Else
          hr = RT_Funcs\CreateRandomAccessStreamOnFile(@sFile, 0, ?IID_IRandomAccessStream, @IRandomAccessStream)
          CompilerIf #MyRoDebug : Debug "CreateRandomAccessStreamOnFile = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
        EndIf
        
        If IRandomAccessStream
          hr = IBitmapDecoderStatics\CreateAsync(IRandomAccessStream, @IBitmapDecoder)
          CompilerIf #MyRoDebug : Debug "IBitmapDecoderStatics::CreateAsync = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
          
          If IBitmapDecoder
            WaitForAsync(@IBitmapDecoder)
            
            hr = IBitmapDecoder\FrameCount(@iFrames)
            CompilerIf #MyRoDebug : Debug "IBitmapDecoder::FrameCount = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
            
            hr = IBitmapDecoder\QueryInterface(?IID_IBitmapFrame, @IBitmapFrame)
            CompilerIf #MyRoDebug : Debug "IBitmapDecoder::QueryInterface = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
            
            If IBitmapFrame
              hr = IBitmapFrame\PixelWidth(@width)
              CompilerIf #MyRoDebug : Debug "IBitmapFrame::PixelWidth = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
              
              hr = IBitmapFrame\PixelHeight(@height)
              CompilerIf #MyRoDebug : Debug "IBitmapFrame::PixelHeight = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
              
              If width > MaxDimension Or height > MaxDimension
                Debug "ER: Image is too big"
              Else
                
                
                hr = IBitmapDecoder\QueryInterface(?IID_IBitmapFrameWithSoftwareBitmap, @IBitmapFrameWithSoftwareBitmap)
                CompilerIf #MyRoDebug : Debug "IBitmapDecoder::QueryInterface = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
                
                If IBitmapFrameWithSoftwareBitmap
                  hr = IBitmapFrameWithSoftwareBitmap\GetSoftwareBitmapAsync(@ISoftwareBitmap)
                  CompilerIf #MyRoDebug : Debug "IBitmapFrameWithSoftwareBitmap::GetSoftwareBitmapAsync = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
                  
                  If ISoftwareBitmap
                    WaitForAsync(@ISoftwareBitmap)
                    
                    hr = IOcrEngine\RecognizeAsync(ISoftwareBitmap, @IOcrResult)
                    CompilerIf #MyRoDebug : Debug "IOcrEngine::RecognizeAsync = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
                    
                    If IOcrResult
                      WaitForAsync(@IOcrResult)
                      
                      hr = IOcrResult\Lines(@ILinesList)
                      CompilerIf #MyRoDebug : Debug "IOcrResult::Lines = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
                      
                      If ILinesList
                        count = 0
                        hr = ILinesList\count(@count)
                        CompilerIf #MyRoDebug : Debug "ILinesList::count = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
                        
                        For i = 0 To count -1
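                          hText = 0
                          IOCRLine = 0 ; reset so a failed get_Item cannot leave a stale pointer from the previous pass
                          hr = ILinesList\get_Item(i, @IOCRLine)
                          CompilerIf #MyRoDebug : Debug "ILinesList::get_Item = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
                          
                          If IOCRLine
                            hr = IOCRLine\get_DisplayName(@hText)
                            CompilerIf #MyRoDebug : Debug "IOCRLine::get_DisplayName = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
                            
                            If hText
                              ; hText is an HSTRING: it is only read here and never freed (WindowsDeleteString would do that)
                              *p = RT_Funcs\WindowsGetStringRawBuffer(hText, 0)
                              If *p
                                sOCRText + PeekS(*p) + #CRLF$
                              EndIf
                            EndIf
                            IOCRLine\Release()
                          EndIf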
                          
                        Next
                        
                      EndIf
                    EndIf
                  EndIf
                EndIf
                
                
              EndIf
            EndIf
            
          EndIf
          
          
          
          
          hr = IRandomAccessStream\QueryInterface(?IID_IClosable, @IClosable)
          CompilerIf #MyRoDebug : Debug "IRandomAccessStream::QueryInterface = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
          
          If IClosable
            IClosable\Close()
            IClosable\Release()
            IClosable = 0
          EndIf
          IRandomAccessStream\Release()
          
          If ISoftwareBitmap
            hr = ISoftwareBitmap\QueryInterface(?IID_IClosable, @IClosable)
            CompilerIf #MyRoDebug : Debug "ISoftwareBitmap::QueryInterface = 0x" + Hex(hr, #PB_Long) : CompilerEndIf
            If IClosable
              IClosable\Close()
              IClosable\Release()
              IClosable = 0
            EndIf
          EndIf
          
        EndIf
        
        IOcrEngine\Release()
        
      EndIf
      
    EndIf
    
    Release:
    
    If IBitmapDecoder : IBitmapDecoder\Release() : EndIf
    If IBitmapFrame : IBitmapFrame\Release() : EndIf
    If IBitmapFrameWithSoftwareBitmap : IBitmapFrameWithSoftwareBitmap\Release() : EndIf
    If ISoftwareBitmap : ISoftwareBitmap\Release() : EndIf
    If IOcrResult : IOcrResult\Release() : EndIf
    If ILinesList : ILinesList\Release() : EndIf
    
    CompilerIf #MyRoDebug : Debug "### Func End ###" + #CRLF$ : CompilerEndIf
    
    ProcedureReturn sOCRText
    
  EndProcedure
  
  ;-
  
  Procedure.s get_Languages()
    ProcedureReturn GetOCRFromImage("", 0, "", 1)
  EndProcedure
  
  Procedure.s get_TextFromFile(sFile.s, sLanguage.s = "")
    ProcedureReturn GetOCRFromImage(sFile, 0, sLanguage, 0)
  EndProcedure
  
  Procedure.s get_TextFromImageID(ImageID, sLanguage.s = "")
    ProcedureReturn GetOCRFromImage("", ImageID, sLanguage, 0)
  EndProcedure
  
EndModule



Structure tOCR_DEMO
  sFile.s
  ImageID.i
  ;
  sLanguages.s  
  sFromFile.s
  sFromImage.s
EndStructure



Procedure OCR_DEMO(*OCR_DEMO.tOCR_DEMO)
  *OCR_DEMO\sLanguages = WinOCR::get_Languages()
  If *OCR_DEMO\sFile <> ""
    *OCR_DEMO\sFromFile = WinOCR::get_TextFromFile(*OCR_DEMO\sFile)
  EndIf
  If *OCR_DEMO\ImageID
    *OCR_DEMO\sFromImage = WinOCR::get_TextFromImageID(*OCR_DEMO\ImageID)
  EndIf
EndProcedure



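Procedure HBitmapFromScreen(X, Y, W, H)
  ; Copies a screen rectangle into a new bitmap; the caller owns the returned HBITMAP
  Protected HDC = GetDC_(0)
  Protected HBM = CreateCompatibleBitmap_(HDC, W, H)
  Protected PDC = CreateCompatibleDC_(HDC)
  Protected Old = SelectObject_(PDC, HBM)
  BitBlt_(PDC, 0, 0, W, H, HDC, X, Y, #SRCCOPY)
  SelectObject_(PDC, Old) ; deselect the bitmap before the memory DC is deleted
  DeleteDC_(PDC)
  ReleaseDC_(0, HDC)
  ProcedureReturn HBM
EndProcedure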


UsePNGImageDecoder()
UseJPEGImageDecoder()

Define tOCR_DEMO.tOCR_DEMO

tOCR_DEMO\sFile = OpenFileRequester("Choose Image File", "Image.png", "ImageFiles (*.png;jpg;gif;bmp)|*.png;*.jpg;*.gif;*.bmp", 0)

;LoadImage(0, tOCR_DEMO\sFile)
;tOCR_DEMO\ImageID = ImageID(0)

tOCR_DEMO\ImageID = HBitmapFromScreen(0, 0, 800, 600)

Define thread = CreateThread(@OCR_DEMO(), @tOCR_DEMO)
If thread
  WaitThread(thread)
  
  MessageRequester("Available Languages", tOCR_DEMO\sLanguages)
  
  MessageRequester("TextFromFile", tOCR_DEMO\sFromFile)
  
  MessageRequester("TextFromImageID", tOCR_DEMO\sFromImage)
  
EndIf
ChrisR
Addict
Posts: 1127
Joined: Sun Jan 08, 2017 10:27 pm
Location: France

Re: Windows 10 OCR

Post by ChrisR »

Beautiful! I tested it on a screenshot of a file properties dialog. A few characters are missing, but the result is superb and perfectly readable in all 3 cases: the languages list, the screenshot image file and the HBitmapFromScreen capture :)
It could be in the Tricks 'n' Tips section.
BarryG
Addict
Posts: 3292
Joined: Thu Apr 18, 2019 8:17 am

Re: Windows 10 OCR

Post by BarryG »

fryquez, that code works great! Thanks!
fryquez
Enthusiast
Posts: 362
Joined: Mon Dec 21, 2015 8:12 pm

Re: Windows 10 OCR

Post by fryquez »

Yes, I'll post a separate Tricks 'n' Tips topic.
I just found something that wants to be added to the code :D
Caronte3D
Addict
Posts: 1027
Joined: Fri Jan 22, 2016 5:33 pm
Location: Some Universe

Re: Windows 10 OCR

Post by Caronte3D »

Awesome! :D
Mijikai
Addict
Posts: 1360
Joined: Sun Sep 11, 2016 2:17 pm

Re: Windows 10 OCR

Post by Mijikai »

WaitForAsync could be implemented without sleep.

In C++ it looks like this according to https://kennykerr.ca/2018/03/08/cppwinr ... ompletion/:

Code: Select all

template <typename T>
auto get(T const& async)
{
    if (async.Status() != AsyncStatus::Completed)
    {
        handle signal = CreateEvent(nullptr, true, false, nullptr);
 
        async.Completed([&](auto&&, auto&&)
        {
            SetEvent(signal.get());
        });
 
        WaitForSingleObject(signal.get(), INFINITE);
    }
 
    return async.GetResults();
}
But I don't know how to convert it to PB :?
fryquez
Enthusiast
Posts: 362
Joined: Mon Dec 21, 2015 8:12 pm

Re: Windows 10 OCR

Post by fryquez »

If the 10 ms sleep bothers you, try setting up a completion event handler.
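
The waiting half of that idea can be sketched in PB without any WinRT plumbing. This is only a rough sketch under assumptions: a worker thread stands in for the Completed callback, and SimulatedCompletion() and hSignal are made-up names that are not part of the module above. In real code the SetEvent_() call would presumably sit in the Invoke method of a handler object handed to the async operation's put_Completed, which is not shown here.

```purebasic
; Sketch only: a worker thread stands in for the WinRT completion callback.
; SimulatedCompletion() and hSignal are hypothetical names for illustration.

EnableExplicit

Global hSignal.i

Procedure SimulatedCompletion(*Unused)
  Delay(50)            ; pretend the async operation takes a while
  SetEvent_(hSignal)   ; this is what a real Completed handler would do
EndProcedure

; Manual-reset event, initially non-signaled (same arguments as the C++ snippet)
hSignal = CreateEvent_(0, #True, #False, 0)

If hSignal
  Define thread.i = CreateThread(@SimulatedCompletion(), 0)
  If thread
    ; Blocks until the event is signaled, no Delay() polling loop needed
    If WaitForSingleObject_(hSignal, #INFINITE) = #WAIT_OBJECT_0
      Debug "Operation completed"
    EndIf
    WaitThread(thread)
  EndIf
  CloseHandle_(hSignal)
EndIf
```

The hard part that remains is the handler object itself (a small COM object with QueryInterface/AddRef/Release/Invoke), so whether this is worth it compared to the simple sleep loop is a judgment call.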
Mijikai
Addict
Posts: 1360
Joined: Sun Sep 11, 2016 2:17 pm

Re: Windows 10 OCR

Post by Mijikai »

fryquez wrote: Tue Aug 31, 2021 3:18 pm ...bother you...
...it was just a suggestion to improve the code.