Tuesday, 19 March 2019

IFilter troubles

AI work needs indexed documents, so each document first has to be converted to plain text. My company has a nice component that splits .msg files into their individual parts. For other file types I found a project that uses IFilters to parse them:

https://www.codeproject.com/articles/13391/using-ifilter-in-c

However, this alone doesn't do the job (anymore?), mainly because of the move to 64-bit systems: a 64-bit IFilter DLL can only be loaded by a 64-bit process. To get it to use the 64-bit Adobe PDF IFilter I had to:
  •  Ensure my project compiled to 64-bit instead of 32-bit.
  •  Ensure the registry code looked in the 64-bit part of the registry, by using RegistryView:
            using (var hklm = RegistryKey.OpenBaseKey(RegistryHive.LocalMachine,
                                                       RegistryView.Registry64))
            {
                RegistryKey rk = hklm.OpenSubKey(key);
                ....
  •  Finally: the Adobe 11 IFilter doesn't implement all the necessary methods! Very strange. Luckily I found a link where I could download version 9:
    ftp://ftp.adobe.com/pub/adobe/acrobat/win/9.x/
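
The registry part of the steps above can be sketched roughly as follows. This is a minimal sketch and not the code from the CodeProject article: the class name IFilterLookup and its helper methods are made up for illustration. It assumes the filter is registered directly on the extension key through the usual PersistentHandler chain under HKLM\SOFTWARE\Classes. Extensions that are registered through their document class instead are not handled here.

```csharp
using System;
using Microsoft.Win32;

// Hypothetical helper (not part of the CodeProject sample): finds the DLL of
// the IFilter registered for a file extension, using the 64-bit registry view.
public static class IFilterLookup
{
    // IID of the IFilter interface; it is used as a subkey name under
    // PersistentAddinsRegistered.
    public const string IFilterIid = "{89BCB740-6119-101A-BCB7-00DD010655AF}";

    // Pure path builders, kept separate so the lookup chain is easy to read.
    public static string HandlerKey(string extension) =>
        @"SOFTWARE\Classes\" + extension + @"\PersistentHandler";

    public static string AddinKey(string handlerGuid) =>
        @"SOFTWARE\Classes\CLSID\" + handlerGuid +
        @"\PersistentAddinsRegistered\" + IFilterIid;

    public static string ServerKey(string filterClsid) =>
        @"SOFTWARE\Classes\CLSID\" + filterClsid + @"\InprocServer32";

    // Reads the default value of a key, or returns null if the key is missing.
    private static string ReadDefault(RegistryKey hklm, string subKey)
    {
        using (RegistryKey rk = hklm.OpenSubKey(subKey))
        {
            return rk?.GetValue("") as string;
        }
    }

    // Returns the path of the filter DLL for e.g. ".pdf", or null if the
    // chain is incomplete. Only works on Windows.
    public static string FindFilterDll(string extension)
    {
        using (var hklm = RegistryKey.OpenBaseKey(RegistryHive.LocalMachine,
                                                   RegistryView.Registry64))
        {
            string handler = ReadDefault(hklm, HandlerKey(extension));
            if (handler == null) return null;

            string clsid = ReadDefault(hklm, AddinKey(handler));
            if (clsid == null) return null;

            return ReadDefault(hklm, ServerKey(clsid));
        }
    }
}
```

Calling IFilterLookup.FindFilterDll(".pdf") from a process where Environment.Is64BitProcess is true should then give the path of the 64-bit filter DLL, if one is installed. Asking for RegistryView.Registry64 explicitly matters because a 32-bit process that opens HKLM\SOFTWARE\Classes\CLSID the default way is redirected to the 32-bit view and never sees the 64-bit registration.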
