It seems that Nutch indexes only (some of the) parse results. It runs the indexing filters, which determine what is indexed.
These indexing filters get a Parse result as a parameter.
How can I get file names and other file metadata, such as the owner, indexed?
Of course I need to add an indexing filter, but do I also have to add a parser that handles every file type and extracts its metadata?
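
To make the question concrete, here is a minimal sketch in plain Java (no Nutch classes) of the metadata I would like to end up in the index. It assumes the crawled URL is a file: URL on a local filesystem; the class name FileMetadataSketch and the field names "filename" and "owner" are made up for illustration and are not part of Nutch. Whether an indexing filter could obtain these values directly from the URL, without a dedicated parser, is exactly what I am unsure about.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not a Nutch API: collects the fields I would like indexed.
final class FileMetadataSketch {

    // Derives file metadata from a file: URL without looking at the file content.
    static Map<String, String> describe(String fileUrl) {
        Path path = Paths.get(URI.create(fileUrl));
        Map<String, String> fields = new LinkedHashMap<>();

        // getFileName() returns null for a filesystem root, so guard against that.
        Path name = path.getFileName();
        if (name != null) {
            fields.put("filename", name.toString());
        }

        try {
            // The owner comes from the filesystem, not from the file content.
            fields.put("owner", Files.getOwner(path).getName());
        } catch (IOException | UnsupportedOperationException e) {
            // Owner lookup is not available everywhere; leave the field out.
        }
        return fields;
    }

    public static void main(String[] args) throws IOException {
        // Use a temporary file so the sketch runs anywhere.
        Path tmp = Files.createTempFile("nutch-sketch", ".txt");
        try {
            System.out.println(describe(tmp.toUri().toString()));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

If something like describe() can be called from inside an indexing filter using only the URL it is given, a parser for every file type would seem unnecessary for these two fields.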