It seems that Nutch indexes only (some of) the parse results. It runs the indexing filters, which determine what is indexed.
  These indexing filters get a Parse result as a parameter.
  How can I get file names and other file metadata, such as the owner, indexed?
  Of course I need to add an indexing filter, but do I also have to add a parser that handles all file types and extracts their metadata?
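
To make concrete which metadata I mean, here is a minimal sketch in plain Java of the values I would like to end up as index fields. It does not use any Nutch API. The class name `FileMetadataSketch`, the method `collectFileMetadata`, and the field names (`filename`, `owner`, `size`, `lastModified`) are my own placeholders, and it assumes the crawled files sit on a local filesystem that exposes an owner attribute.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: gathers the file metadata I would like indexed.
// It is not a Nutch plugin and the field names are assumptions.
public class FileMetadataSketch {

    // Reads name, owner, size and modification time for one local file.
    static Map<String, String> collectFileMetadata(Path file) throws IOException {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("filename", file.getFileName().toString());
        // Files.getOwner may throw UnsupportedOperationException on
        // filesystems that do not support file ownership.
        fields.put("owner", Files.getOwner(file).getName());
        fields.put("size", Long.toString(Files.size(file)));
        fields.put("lastModified", Files.getLastModifiedTime(file).toString());
        return fields;
    }

    public static void main(String[] args) throws IOException {
        // Use a temporary file so the sketch runs anywhere.
        Path tmp = Files.createTempFile("nutch-sketch", ".txt");
        try {
            collectFileMetadata(tmp)
                .forEach((name, value) -> System.out.println(name + " = " + value));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

What I cannot figure out is where values like these would have to be produced so that an indexing filter can see them: in a parser, or somewhere else.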