Hello friend, based on your question, here is what I can share. I hope it proves helpful to you.
Key ingredients in a file system search solution typically include:
- Capturing all available file-path metadata
- Creating a fully integrated browse then search, or search then browse information discovery environment
- Making full use of search navigation and results sorting opportunities
- Where appropriate, complementing search with entity extraction and categorization technologies
- Fully respecting document-level security systems
- Ensuring that repository crawl procedures are friendly and cause no unacceptable stress to the file system
Some techniques to make Google crawl your File System
Building File System (Spider) Verity Indexes
You can index file systems that are local to the application server. This refers to any file system on the physical server on which your application server domain runs, and it also refers to any drives that are accessible from the application server machine. File systems might include file servers, report repositories, and so on.
The index is compiled by using vspider. The program descends into the directory structure recursively and indexes the file types that you've selected to be indexed. It indexes only files that Verity supports for collections.
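The recursive descent described above can be pictured with a short Python sketch. This is not how vspider is implemented; it only illustrates walking a directory tree and keeping the files whose type was selected. The function name collect_files and the suffix-based selection are assumptions made for the example.

```python
import os

def collect_files(root, allowed_suffixes):
    """Walk root recursively and return files whose suffix is in allowed_suffixes.

    Hypothetical helper for illustration only; vspider itself decides which
    file types Verity supports for collections.
    """
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a predictable order
        for name in sorted(filenames):
            suffix = os.path.splitext(name)[1].lower()
            if suffix in allowed_suffixes:
                matches.append(os.path.join(dirpath, name))
    return matches
```

Calling collect_files with a root directory and a set such as {".doc", ".pdf"} returns the matching paths from that directory and everything beneath it.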
This section discusses how to:
1. Set file system options
2. Define what to index
Setting File System Options
Select PeopleTools, then select Search Engine, then select Filesystem Indexes to access the Design a Search Index page.
Image: Design a Search Index page (File system)
List local filesystem paths to spider
- Specify the network file system path that contains the documents to index. Ensure that the local application server has the proper access to the file systems that you include in the list.
- For Microsoft Windows, this means the drive mappings must be set up on the application server. For UNIX, this means the correct network file system (NFS) mounts must be set up on the application server.
- To add a system path to the list, click the plus button. To remove a file system, click the minus button.
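Before adding paths to the list, it can help to confirm from the application server machine that each one is actually reachable. The following Python sketch is one way to do that check; it is not part of PeopleTools, and the function name unreadable_paths is hypothetical.

```python
import os

def unreadable_paths(paths):
    """Return the paths that are not directories readable by the current user.

    Run this as the same operating system user that the application server
    uses, otherwise the result says little about what the spider can reach.
    """
    problems = []
    for path in paths:
        is_readable_dir = os.path.isdir(path) and os.access(path, os.R_OK | os.X_OK)
        if not is_readable_dir:
            problems.append(path)
    return problems
```

An empty result means every listed path exists and can be traversed; anything returned points to a drive mapping or NFS mount that still needs attention.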
Remap Path to This URL
- Do not use.
Defining What to Index
Select PeopleTools, then select Search Engine, then select Filesystem Indexes, then select What to Index to access the What to Index page.
Image: What to Index page
This example illustrates the fields and controls on the What to Index page. You can find definitions for the fields and controls later on this page.
Index all Mime-types
Select to index all MIME types on a website.
Index only these Mime-types
Select to index only a certain MIME type, and specify the file type in the MIME/Types Allowed list box. Separate multiple MIME types with a space.
Exclude these Mime-types
Select to exclude a set of MIME types, and specify the MIME types to exclude. Separate multiple MIME types with a space.
Add a list of MIME types, separated by spaces, if you selected Index only these Mime-types or Exclude these Mime-types.
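The three MIME type options amount to an allow-all, allow-list, or deny-list filter over a space-separated list. The Python sketch below shows that logic using the standard mimetypes module to guess a type from the file name. The mode names "all", "only", and "exclude" are labels invented for this example, and Verity may determine MIME types differently.

```python
import mimetypes

def mime_allowed(path, mode="all", mime_list=""):
    """Decide whether a file passes a MIME type filter.

    mode is "all", "only", or "exclude" (hypothetical labels for the three
    options); mime_list is a space-separated string of MIME types.
    """
    listed = set(mime_list.split())
    guessed, _encoding = mimetypes.guess_type(path)  # None if unknown
    if mode == "all":
        return True
    if mode == "only":
        return guessed in listed
    if mode == "exclude":
        return guessed not in listed
    raise ValueError(f"unknown mode: {mode!r}")
```

Note that in this sketch a file with an unrecognized type is rejected by "only" and accepted by "exclude", which is a design choice of the example rather than documented product behavior.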
Index all filenames
Select to index all file types.
Index only these filenames
Select to index only a certain file type, and specify the file type in the Pathname Globs List list box.
Exclude these filenames
Select to exclude a set of file types, such as temporary files, but to index all others. Also specify the file types to exclude.
Pathname Globs List
Add the file name patterns that you want to incorporate into your index. Separate the entries with spaces. You can use the asterisk (*) wildcard to match any string and the question mark (?) to match a single character. For example, the string '*.doc 19??.excel' selects all files that end with the “.doc” suffix, as well as Microsoft Excel files whose names start with 19 followed by exactly two characters.
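To see how such a space-separated pattern list behaves, here is a small Python sketch that uses the standard fnmatch module, whose * and ? wildcards have the same meaning as described above. It only approximates the product's matching (Verity's exact glob rules may differ), and the function name matches_globs is hypothetical.

```python
from fnmatch import fnmatchcase

def matches_globs(filename, globs_list):
    """Return True if filename matches any pattern in a space-separated list.

    '*' matches any string and '?' matches exactly one character.
    Matching is case-sensitive and applied to the bare file name.
    """
    return any(fnmatchcase(filename, pattern) for pattern in globs_list.split())

patterns = "*.doc 19??.excel"
for name in ("report.doc", "1999.excel", "19999.excel", "notes.txt"):
    print(name, matches_globs(name, patterns))
```

With the example string, report.doc and 1999.excel match, while 19999.excel fails because ?? requires exactly two characters after 19, and notes.txt matches neither pattern.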