Monitoring a directory for new files being added

Arbitrage
New User
Posts: 4
Joined: Sun Sep 21, 2003 3:21 pm

Post by Arbitrage »

Something cool I just discovered: Office XP and Office 2003 have OCR capabilities, and there is also a COM OCX interface to the OCR. It seems you could monitor a directory for incoming TIFF images, then OCR them and store the text in MySQL by document and page number. You could then write an interface to search for a particular document, or for all documents that have certain words in them, and return a link to the scanned image. Do a search for MODI and Microsoft Office.
dell_jockey
Enthusiast
Posts: 767
Joined: Sat Jan 24, 2004 6:56 pm

Post by dell_jockey »

@ebs: OR-ing them, of course... :oops: Thanks!
cheers,
dell_jockey
________
http://blog.forex-trading-ideas.com
Sparkie
PureBatMan Forever
Posts: 2307
Joined: Tue Feb 10, 2004 3:07 am
Location: Ohio, USA

Post by Sparkie »

Ok, here's what I have so far in regards to ReadDirectoryChangesW. It requires NT4/2000/XP/2003.

Any and all comments and suggestions are welcome, especially about using threads as in this code. :wink:

Code:

;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; App Name: ReadDirectoryChangesW
; Author  : Sparkie
; Date    : April 23, 2005
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Check for NT4/2000/XP/2003
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Select OSVersion()
  Case #PB_OS_Windows_NT_4
    osResult = #True
  Case #PB_OS_Windows_2000
    osResult = #True
  Case #PB_OS_Windows_XP
    osResult = #True
  Case #PB_OS_Windows_Server_2003
    osResult = #True
  Default
    osResult = #False
EndSelect
os$ = "Windows NT4" + #CRLF$ + "Windows 2000" + #CRLF$ + "Windows XP" + #CRLF$ + "Windows Server 2003"
If osResult = #False
  MessageRequester("Incorrect OS", "This program requires one of the following OS's:" + #CRLF$ + os$)
  End
EndIf
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Window Enumerations
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Enumeration
  #Win_Main
EndEnumeration
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Gadget Enumerations
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Enumeration
  #Button_Dir
  #Text_Dir
  #ListIcon_RDCW
EndEnumeration
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Constants for use in ReadDirectoryChangesW function
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Use these flags in the CallFunctionFast(*pRDCW, ...) line
#FILE_NOTIFY_CHANGE_FILE_NAME = 1 
; Any file name change in the watched directory or subtree 
; causes a change notification wait operation to return. 
; Changes include renaming, creating, or deleting a file name. 
#FILE_NOTIFY_CHANGE_DIR_NAME = 2 
; Any directory-name change in the watched directory or 
; subtree causes a change notification wait operation to return. 
; Changes include creating or deleting a directory. 
#FILE_NOTIFY_CHANGE_ATTRIBUTES = 4 
; Any attribute change in the watched directory or subtree causes 
; a change notification wait operation to return. 
#FILE_NOTIFY_CHANGE_SIZE = 8 
; Any file-size change in the watched directory or subtree causes 
; a change notification wait operation to return. The operating 
; system detects a change in file size only when the file is written 
; to the disk. For operating systems that use extensive caching, 
; detection occurs only when the cache is sufficiently flushed. 
#FILE_NOTIFY_CHANGE_LAST_WRITE = $10 
; Any change to the last write-time of files in the watched directory 
; or subtree causes a change notification wait operation to return. The 
; operating system detects a change to the last write-time only when 
; the file is written to the disk. For operating systems that use extensive 
; caching, detection occurs only when the cache is sufficiently flushed. 
#FILE_NOTIFY_CHANGE_LAST_ACCESS = $20
; Any change to the last access time of files in the watched directory or
; subtree causes a change notification wait operation to return.
#FILE_NOTIFY_CHANGE_CREATION = $40
; Any change to the creation time of files in the watched directory or subtree
; causes a change notification wait operation to return.
#FILE_NOTIFY_CHANGE_SECURITY = $100 
; Any security-descriptor change in the watched directory or subtree causes a
; change notification wait operation to return.
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Constants for myFILE_NOTIFY_INFORMATION (*fni) actions
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
#FILE_SHARE_DELETE = 4 ; share-mode flag for CreateFile_(), not a notify action
#FILE_ACTION_ADDED = 1
#FILE_ACTION_REMOVED = 2
#FILE_ACTION_MODIFIED = 3
#FILE_ACTION_RENAMED_OLD_NAME = 4
#FILE_ACTION_RENAMED_NEW_NAME = 5
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Constants for myFILE_NOTIFY_INFORMATION offsets
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; This is how we access information when more than 1 action
; occurs for a file in our watched directory
#ActionOffset = 4
#CharLenOffset = 8
#FileNameOffset = 12
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; FILE_NOTIFY_INFORMATION structure
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Structure myFILE_NOTIFY_INFORMATION
  NextEntryOffset.l
  action.l
  FileNameLength.l
  FileName.w[1]
EndStructure
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Globals
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Pointer to ReadDirectoryChangesW function
Global *pRDCW
; Pointer to myFILE_NOTIFY_INFORMATION structure
Global *fni.myFILE_NOTIFY_INFORMATION
; Handle to our watched directory
Global hDir
; Variable for ending app when #True
Global quit
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Open KERNEL32.DLL for ReadDirectoryChangesW function
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
If OpenLibrary(0, "KERNEL32.DLL")
  ; Get pointer to ReadDirectoryChangesW function
  *pRDCW = IsFunction(0, "ReadDirectoryChangesW")
  If *pRDCW = 0
    ; If function not found, close KERNEL32.DLL and quit
    CloseLibrary(0)
    MessageRequester("Error", "ReadDirectoryChangesW function not found.")
    End
  EndIf
Else
  ; If KERNEL32.DLL not found, quit
  MessageRequester("Error", "KERNEL32.DLL not found.")
  End
EndIf
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Procedure to convert Unicode file name to ANSI
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
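Procedure.s Uni2Ansi(*uni.l, uLen) 
  ; uLen is the number of characters; the name in the buffer is NOT null-terminated
  ansi$ = Space(uLen) 
  ; Pass the character count explicitly instead of -1, which expects a terminating null
  WideCharToMultiByte_(#CP_ACP, 0, *uni, uLen, @ansi$, uLen, #Null, #Null) 
  ProcedureReturn ansi$  
EndProcedure 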
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Procedure to retrieve our directory changes
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; fniOffset is the byte offset of a record in the *fni buffer
; (used when one call returns multiple actions)
Procedure GetRDCWInfo(fniOffset)
  ; Get the action
  action = PeekL(*fni + #ActionOffset + fniOffset)
  ; Get a time stamp for this action
  timeStamp$ = FormatDate("%mm/%dd/%yyyy %hh:%ii:%ss", Date())
  ; Get the length (in bytes) of the Unicode file name
  uniLen = PeekL(*fni + #CharLenOffset + fniOffset)
  ; Get the ANSI version of the file name
  fileName$ = Uni2Ansi(*fni + #FileNameOffset + fniOffset, uniLen/2)
  ; Set the string to be displayed for the action
  Select action
    Case #FILE_ACTION_ADDED
      action$ = "was added to directory."
    Case #FILE_ACTION_REMOVED
      action$ = "was removed from directory."
    Case #FILE_ACTION_MODIFIED
      action$ = "attrubute or time stamp modified."
    Case #FILE_ACTION_RENAMED_OLD_NAME
      action$ = "was renamed."
    Case #FILE_ACTION_RENAMED_NEW_NAME
      action$ = "is the new file name."
  EndSelect
  ; Add the info to the ListIconGadget
  AddGadgetItem(#ListIcon_RDCW, -1, fileName$ + Chr(10) + action$ + Chr(10) + timeStamp$)
EndProcedure
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Procedure (Thread) to set the ReadDirectoryChangesW calls
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Procedure RDCW(dir) 
  ; Allocate memory for our myFILE_NOTIFY_INFORMATION pointer
  *fni = AllocateMemory(1024)
  ; Loop calls to ReadDirectoryChangesW
  ; We're watching for rename, delete, create, write
  While CallFunctionFast(*pRDCW, dir, *fni, 1024, #False, #FILE_NOTIFY_CHANGE_FILE_NAME | #FILE_NOTIFY_CHANGE_DIR_NAME | #FILE_NOTIFY_CHANGE_LAST_WRITE, @size, #Null, #Null) 
    ; Get the info for offset 0
    GetRDCWInfo(0)
    ; See if there are more entries for this call
    nextEntry = *fni\NextEntryOffset
    ; If so, get the info
    While nextEntry > 0
      If previousEntry = nextEntry 
        Break 
      EndIf 
      GetRDCWInfo(nextEntry)
      ; See if there are more entries for this call
      nextEntry = PeekL(*fni + nextEntry)
      previousEntry = nextEntry
    Wend
    ; No more entries, go to next ReadDirectoryChangesW call
  Wend
EndProcedure 
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Main window and gadgets
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
If OpenWindow(0, 0, 0, 545, 350, #PB_Window_SystemMenu | #PB_Window_ScreenCentered, "ReadDirectoryChangesW" ) And CreateGadgetList(WindowID(0)) 
  ButtonGadget(#Button_Dir, 5, 5, 150, 20, "Select Directory to watch") 
  TextGadget(#Text_Dir, 5, 5, 535, 20, "") 
  DisableGadget(#Text_Dir, 1)
  ListIconGadget(#ListIcon_RDCW, 5, 40, 535, 300, "File", 200, #PB_ListIcon_FullRowSelect | #PB_ListIcon_AlwaysShowSelection | #PB_ListIcon_GridLines) 
  AddGadgetColumn(#ListIcon_RDCW, 1, "Action", 200)
  AddGadgetColumn(#ListIcon_RDCW, 2, "Time", 130)
  quit = #False
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; Main event loop
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
  Repeat 
    event = WaitWindowEvent() 
    Select event 
      Case #PB_EventGadget 
        Select EventGadgetID() 
          Case (#Button_Dir) 
            ; Button pressed for getting directory to watch
            myPath$ = PathRequester("Select a Folder to watch", "c:\")
            ; If there is a valid path, start our ReadDirectoryChangesW thread
            If myPath$
              DisableGadget(#Button_Dir, 1)
              ; Get a handle to our watched directory
              hDir = CreateFile_(myPath$, #FILE_LIST_DIRECTORY, #FILE_SHARE_WRITE | #FILE_SHARE_READ | #FILE_SHARE_DELETE, #Null, #OPEN_EXISTING, #FILE_FLAG_BACKUP_SEMANTICS, #Null) 
              ; --> Our thread for change notifications with the directory handle
              myThread = CreateThread(@RDCW(), hDir) 
              DisableGadget(#Text_Dir, 0)
              SetGadgetText(#Text_Dir, myPath$ + " is being watched.")
            EndIf
        EndSelect 
      Case #PB_Event_CloseWindow
        FreeMemory(*fni)
        ; Kill our ReadDirectoryChangesW thread
        KillThread(myThread)
        ; Close handle to our watched directory
        CloseHandle_(hDir)
        ; If KERNEL32.DLL is open, close it
        If IsLibrary(0)
          CloseLibrary(0)
        EndIf
        quit = #True
    EndSelect 
  Until quit  
EndIf 
End
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
; ExecutableFormat=Windows
; Requires=NT4/2000/XP/Server2003
; EOF
;>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
* Edited to add >> FreeMemory(*fni)
* Edited to fix bug found by blueb
Last edited by Sparkie on Sun Apr 24, 2005 4:41 pm, edited 2 times in total.
What goes around comes around.

PB 5.21 LTS (x86) - Windows 8.1
NoahPhense
Addict
Posts: 1999
Joined: Thu Oct 16, 2003 8:30 pm
Location: North Florida

Post by NoahPhense »

sweet!

- np
blueb
Addict
Posts: 1044
Joined: Sat Apr 26, 2003 2:15 pm
Location: Cuernavaca, Mexico

Post by blueb »

:cry:

If I drag a file to the desired folder... everything works

If I copy the file and paste to the new folder... the program goes into an endless loop, displaying:

"attrubute or time stamp modified."


--blueb
Sparkie
PureBatMan Forever
Posts: 2307
Joined: Tue Feb 10, 2004 3:07 am
Location: Ohio, USA

Post by Sparkie »

Thanks blueb. I'll see if I can fix that. ;)
Sparkie
PureBatMan Forever
Posts: 2307
Joined: Tue Feb 10, 2004 3:07 am
Location: Ohio, USA

Post by Sparkie »

I still have to find out what's happening, but in the meantime see if the edited code ^up there^ works any better. I made a slight change in the following section of code:

Code:

    While nextEntry > 0 
      If previousEntry = nextEntry 
        Break 
      EndIf 
      GetRDCWInfo(nextEntry) 
      ; See if there are more entries for this call 
      nextEntry = PeekL(*fni + nextEntry) 
      previousEntry = nextEntry 
    Wend 
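
A note on the loop above, offered as a sketch rather than a confirmed fix: the Windows documentation describes NextEntryOffset as the number of bytes to skip from the current record to reach the next one, with 0 marking the last record. The loop uses the value read by PeekL(*fni + nextEntry) as an offset from the start of the buffer, which may be what produced the endless loop when consecutive records were the same size. The example below accumulates the offset instead. CountRecords() and the hand-built test buffer are hypothetical names used only to illustrate the walk; in the real thread procedure, GetRDCWInfo(offset) would be called where the counter is incremented.

```purebasic
; Sketch only: visit every record in a FILE_NOTIFY_INFORMATION style buffer.
; NextEntryOffset (the first long of each record) is relative to the start
; of the CURRENT record, so the running offset has to be accumulated.
Procedure CountRecords(*buffer)
  offset = 0
  count = 0
  Repeat
    count + 1                             ; GetRDCWInfo(offset) would go here
    nextEntry = PeekL(*buffer + offset)   ; NextEntryOffset of this record
    offset + nextEntry                    ; move to the start of the next record
  Until nextEntry = 0                     ; 0 marks the last record
  ProcedureReturn count
EndProcedure

; Hand-built buffer holding three records, 16 bytes apart (assumed layout)
*buf = AllocateMemory(64)
PokeL(*buf, 16)        ; record 0 -> next record is 16 bytes further on
PokeL(*buf + 16, 16)   ; record 1 -> next record is 16 bytes further on
PokeL(*buf + 32, 0)    ; record 2 is the last one
Debug CountRecords(*buf)
FreeMemory(*buf)
```

With three equally sized records the walk visits all three and then stops on the zero offset, so no previousEntry guard is needed.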
dell_jockey
Enthusiast
Posts: 767
Joined: Sat Jan 24, 2004 6:56 pm

Post by dell_jockey »

Nice Sparkie!

Pasting a file into a monitored directory doesn't create an endless loop anymore (at least not on my XP Pro client); however, the "attribute or timestamp modified." message is emitted three times, printed on three consecutive lines.
cheers,
dell_jockey
blueb
Addict
Posts: 1044
Joined: Sat Apr 26, 2003 2:15 pm
Location: Cuernavaca, Mexico

Post by blueb »

Same here... I'll keep testing :)

--blueb
Sparkie
PureBatMan Forever
Posts: 2307
Joined: Tue Feb 10, 2004 3:07 am
Location: Ohio, USA

Post by Sparkie »

I get anywhere from 1 to 3 "attribute or timestamp modified" messages. Googled around to find that other people are getting the same results. Some seem to think it's due to a timestamp change being signaled during each write operation. Doesn't seem right to me but I'll keep searching. ;)
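
One way to live with the repeated notifications, assuming they really are separate events raised for a single write, is to ignore an entry that names the same file as the previous one within a short interval. A minimal sketch follows; IsDuplicate(), lastName$, lastTime and the 2-second window are all hypothetical choices, not part of the code posted earlier.

```purebasic
; Sketch only: suppress repeated notifications for the same file.
; All names here are illustrative assumptions.
#DuplicateWindow = 2   ; seconds; an arbitrary value

Global lastName$       ; file name of the previous notification
Global lastTime        ; time (in seconds) of the previous notification

; Returns #True when this notification repeats the previous one
Procedure IsDuplicate(fileName$, now)
  If fileName$ = lastName$ And (now - lastTime) < #DuplicateWindow
    result = #True
  Else
    result = #False
  EndIf
  ; Remember this notification for the next comparison
  lastName$ = fileName$
  lastTime = now
  ProcedureReturn result
EndProcedure

Debug IsDuplicate("scan001.tif", 100)   ; first notification, not a duplicate
Debug IsDuplicate("scan001.tif", 100)   ; same file, same second, a duplicate
Debug IsDuplicate("scan001.tif", 105)   ; same file, five seconds later, not a duplicate
```

In GetRDCWInfo() this could be called as IsDuplicate(fileName$, Date()) before AddGadgetItem(), for the #FILE_ACTION_MODIFIED case only, so that adds, removes and renames are never filtered out.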
agent
New User
Posts: 8
Joined: Fri Dec 03, 2004 1:23 pm

Post by agent »

@Sparkie

Really nice code. But if there is a change or modification in a subdirectory, your code won't work, right? So you can't watch C:\ (including subdirs)...
Agent_Sasori
Sparkie
PureBatMan Forever
Posts: 2307
Joined: Tue Feb 10, 2004 3:07 am
Location: Ohio, USA

Post by Sparkie »

@agent: Thank you. :)

Truth be told I haven't even used this code since my original posting of it. My best guess is you would have to watch each directory/subdirectory individually. Maybe someone who has more experience with this will stop by and offer a suggestion. :)

If I find any more info, I'll come back and let you know. :cool:
Fangbeast
PureBasic Protozoa
Posts: 4749
Joined: Fri Apr 25, 2003 3:08 pm
Location: Not Sydney!!! (Bad water, no goats)

Post by Fangbeast »

Sparkie wrote: @agent: Thank you. :)

Truth be told I haven't even used this code since my original posting of it. My best guess is you would have to watch each directory/subdirectory individually. Maybe someone who has more experience with this will stop by and offer a suggestion. :)

If I find any more info, I'll come back and let you know. :cool:

Sparkie, in answer to you and agent, you posted these constants in the code above.


#FILE_NOTIFY_CHANGE_FILE_NAME = 1
; Any file name change in the watched directory or subtree
; causes a change notification wait operation to return.
; Changes include renaming, creating, or deleting a file name.
#FILE_NOTIFY_CHANGE_DIR_NAME = 2
; Any directory-name change in the watched directory or
; subtree causes a change notification wait operation to return.
; Changes include creating or deleting a directory.

Note the wording in the explanation for the constants: the keyword is "subtree". So why wouldn't monitoring "C:" work? The conclusion, based on that keyword, is that it should.

@agent, why don't you just try it?
Amateur Radio, D-STAR/VK3HAF
rsts
Addict
Posts: 2736
Joined: Wed Aug 24, 2005 8:39 am
Location: Southwest OH - USA

Post by rsts »

Need something like this, and as usual, PureBasic already has a solution.

Thanks Sparkie. (You're listed as the main contributor on my latest) :)
cas
Enthusiast
Posts: 597
Joined: Mon Nov 03, 2008 9:56 pm

Post by cas »

agent wrote: But if there is a change or modification in a subdirectory, your code won't work, right? So you can't watch C:\ (including subdirs)...
Sparkie wrote: My best guess is you would have to watch each directory/sub directory individually.

The solution is really simple:
In the RDCW(dir) procedure, change the 5th parameter in CallFunctionFast() (bWatchSubtree, the 4th parameter of ReadDirectoryChangesW itself) from #False to #True:

change this line

Code:

While CallFunctionFast(*pRDCW, dir, *fni, 1024, #False, #FILE_NOTIFY_CHANGE_FILE_NAME | #FILE_NOTIFY_CHANGE_DIR_NAME | #FILE_NOTIFY_CHANGE_LAST_WRITE, @size, #Null, #Null)
to this

Code:

While CallFunctionFast(*pRDCW, dir, *fni, 1024, #True, #FILE_NOTIFY_CHANGE_FILE_NAME | #FILE_NOTIFY_CHANGE_DIR_NAME | #FILE_NOTIFY_CHANGE_LAST_WRITE, @size, #Null, #Null)
For the OS check I modified it to this:

Code:

If OSVersion() < #PB_OS_Windows_NT_4
  MessageRequester("Incorrect OS", "This program requires at least NT4.")
  End
EndIf
And when you close the window without having selected a directory to watch, you get an invalid memory access on FreeMemory() and KillThread(). Simple fix, change:

Code:

FreeMemory(*fni)
; Kill our ReadDirectoryChangesW thread
KillThread(myThread)
to this:

Code:

If myThread<>0
  KillThread(myThread)
  If *fni<>0
    FreeMemory(*fni)
  EndIf
EndIf