support 14.1 XML product feeds 14.2 RDF, RSD, RSS and Atom feeds 15. Result cache for text and media queries 16. Multiple database support 16.1 Overview 16.2 Definition and configuration 16.3 Activate / disable database 16.4 Backup & Restore of databases 16.5 Copy & Move 16.6 Enhancing functionality of multiple database support 17. Search . . .
. . .
database support 17. Search in categories 17.1 Hierarchical structure 17.2 Parallel structure 18. User suggested sites 19. Vulnerability protection 19.1 Prevent queries from Meta search engines and crawlers known to be evil 19.2 Basic input validation against vulnerability attacks 19.3 Admin backend protection against remote access 19.4 Log file . . .
. . .
to abuse Sphider-plus 20. Bound database 21. Suggest framework 22. Integration of Sphider-plus into existing sites 22.1 Integration into existing site by use of Sphider-plus templates 22.2 Embed the search engine into existing HTML code 22.3 The different style sheet files 23. JSON, XML and RSS result output [ Documentation ] 1. Settings, . . .
. . .
interface. Sphider-plus offers a wide range of settings, separated into different submenus such as:
Sites:
- Add Site
- Index only the new
- Re-index all
- Re-index only preferred URLs
- Erase Re-index (available also for individual URLs)
- Import/export URL list
- Approve sites
- Banned domains
Categories:
- Add, edit, delete
- . . .
. . .
- Copy / Move
- Optimize
Templates: To let customers integrate Sphider-plus into existing sites, HTML templates are provided for the search form, text result listing, media result listing, most popular queries, etc. Three different designs are offered, which may be selected in the submenu 'Settings'. If the layout does not fit the . . .
. . .
database before the re-index process. It will leave the following untouched:
- Categories
- Query log
- Sites and all options: spider depth, last indexed, can leave domain, title, description, URL must include, URL must not include.
If settings have been modified in the Admin section, this mode should be selected to update the database. . . .
. . .
is entered (for English stemming).
2.4 Periodical Re-indexing
This mode offers automatic re-indexing of all sites, or of a specific site, started periodically at a defined time interval. In the Admin backend the time interval can be set to 3 hours, 12 hours, 1 day, 1 week or 1 month. Also the count of periodically performed re-indexing . . .
. . .
Re-indexer can be started and aborted in the Admin backend by selecting the 'Periodical Re-index' submenu in the 'Sites' view. For site-specific re-indexing, the periodical Re-indexer can instead be started and aborted in the 'Options' menu of each site.
2.5 Preferred Re-indexing
Each new URL added to the Admin backend can be supplied . . .
. . .
backend can be supplied with a priority level. This level is used by the option 'Re-index only preferred sites'. Level 1 is interpreted as most important, while level 5 should be used for non-prioritized sites. If a new URL is not manually supplied with a priority, the level is automatically set to '1'. When invoking the option 'Re-index only preferred sites', the admin may select a suitable level for the next index procedure. Thus, only those URLs carrying the selected level will be re-indexed.
2.6 Multithreaded indexing
The Admin setting: Define number of threads allowed for . . .
. . .
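The level selection described for 'Re-index only preferred sites' can be pictured with a small sketch. It is written in Python for illustration only (Sphider-plus itself is a PHP application), and it is not taken from the Sphider-plus code: the names `DEFAULT_LEVEL`, `normalize_level` and `select_preferred`, the sample URLs, and the assumption that valid levels run from 1 to 5 are all hypothetical.

```python
DEFAULT_LEVEL = 1  # per the text: URLs without a manual priority get level 1


def normalize_level(level):
    """Return a priority from 1 (most important) to 5; None falls back to 1."""
    if level is None:
        return DEFAULT_LEVEL
    if not 1 <= level <= 5:  # assumed range, derived from "level 1" and "level 5"
        raise ValueError("priority level must be between 1 and 5")
    return level


def select_preferred(sites, wanted_level):
    """Keep only the URLs whose priority equals the level chosen by the admin."""
    return [url for url, level in sites if normalize_level(level) == wanted_level]


# Hypothetical site list: (URL, priority level or None if not supplied)
sites = [
    ("http://www.example.com/", 1),
    ("http://www.example.org/", 5),
    ("http://www.example.net/", None),
]
preferred = select_preferred(sites, 1)
print(preferred)
```

Because unprioritized URLs default to level 1, they end up in the same group as the sites marked most important, so it may be worth assigning levels explicitly when adding URLs.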
Admin setting: Define number of threads allowed for index procedures (max. 10) activates parallel indexing. For multiple site indexing, this option will speed up the procedure significantly. If this option is activated, browser output of logging data, as well as real-time output in a second browser window (tab), is suppressed. Nevertheless all . . .
. . .
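As a rough illustration of indexing several sites in parallel with a capped number of threads, the following Python sketch uses a thread pool. It does not show how Sphider-plus implements the option; `index_site`, `index_all` and the example URLs are hypothetical placeholders, and only the upper limit of 10 comes from the setting described above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 10  # upper limit named in the Admin setting


def index_site(url):
    """Placeholder for a real index procedure; returns a status string."""
    return "indexed " + url


def index_all(urls, threads):
    """Index several sites in parallel with at most MAX_THREADS workers."""
    workers = max(1, min(threads, MAX_THREADS))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns the results in the order of the input URLs
        return list(pool.map(index_site, urls))


# A request for 25 threads is capped at 10
results = index_all(["http://www.example.com/", "http://www.example.org/"], threads=25)
print(results)
```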
sitemap file and new links are not searched during index / re-index. If a Sitemap index is detected in a sitemap.xml file, and if multiple Sitemap files are available, Sphider-plus will process the secondary Sitemaps and extract all links for index / re-index. Also gzip-compressed files (Index Sitemap files as well as the Sitemap files) will be processed, . . .
. . .
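The handling of Index Sitemap files and gzip-compressed Sitemaps can be sketched as follows. This is a minimal Python illustration based on the public sitemaps.org format (`sitemapindex`, `urlset` and `loc` elements), not the Sphider-plus implementation; the function names, the `fetch` callback and the in-memory sample data are assumptions made for the example.

```python
import gzip
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def decode(raw):
    """Decompress gzip data (recognized by its magic bytes), else pass through."""
    return gzip.decompress(raw) if raw[:2] == b"\x1f\x8b" else raw


def extract_links(raw, fetch):
    """Return all page URLs; follow the secondary Sitemaps listed in an index."""
    root = ET.fromstring(decode(raw))
    locs = [el.text.strip() for el in root.iter(NS + "loc") if el.text]
    if root.tag == NS + "sitemapindex":
        links = []
        for sitemap_url in locs:
            links.extend(extract_links(fetch(sitemap_url), fetch))
        return links
    return locs


# Hypothetical sample data: an uncompressed index pointing to a gzipped Sitemap
index_xml = (
    b'<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<sitemap><loc>http://www.example.com/sitemap1.xml.gz</loc></sitemap>"
    b"</sitemapindex>"
)
child_xml = gzip.compress(
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>http://www.example.com/a.html</loc></url>"
    b"<url><loc>http://www.example.com/b.html</loc></url>"
    b"</urlset>"
)
store = {"http://www.example.com/sitemap1.xml.gz": child_xml}
links = extract_links(index_xml, store.__getitem__)
print(links)
```

Here the dictionary lookup stands in for an HTTP download, which keeps the sketch runnable without network access.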
to reindex a site.
-m <string> Set the string(s) that a URL must include (use \\n as a delimiter between multiple strings).
-n <string> Set the string(s) that a URL must not include (use \\n as a delimiter between multiple strings).
For example, for spidering and indexing http://www.domain.com/test.html to depth 2, use: php . . .
. . .
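How 'must include' and 'must not include' strings might filter URLs is sketched below in Python. Several points are assumptions rather than documented behavior: that the delimiter reaches the program as the two characters backslash and n, that a URL passes when it contains at least one of the 'must include' strings, and that any 'must not include' string rejects it. The function names are hypothetical.

```python
def split_option(value):
    """Split an option value on the two-character delimiter backslash + n."""
    return [part for part in value.split("\\n") if part]


def url_allowed(url, must_include="", must_not_include=""):
    """Apply the assumed rule: one include string required, no exclude string allowed."""
    include = split_option(must_include)
    exclude = split_option(must_not_include)
    if include and not any(text in url for text in include):
        return False
    return not any(text in url for text in exclude)


print(url_allowed("http://www.domain.com/test.html", "test\\nnews", "print"))
print(url_allowed("http://www.domain.com/print/test.html", "test", "print\\npdf"))
```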
in this common list may end with a wildcard, so that 'menu*' will work for ids like menu1, menu2, menu_left, etc. Multiple and nested divs are handled. For even more flexibility, the file /include/common/divs_not.txt may alternatively contain a regexp pattern. The regexp needs to be introduced by */ and must be ended with another slash. . . .
. . .
in this common list may end with a wildcard, so that 'menu*' will work for ids like menu1, menu2, menu_left, etc. Multiple and nested divs are handled. For even more flexibility, the file /include/common/divs_use.txt may alternatively contain a regexp pattern. The regexp needs to be introduced by */ and must be ended with another slash. . . .
. . .
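The trailing wildcard used in both div lists can be illustrated with a short Python sketch. It covers only list entries that end with '*', as in the 'menu*' example; the regexp variant is left out. The function names and the sample entries are hypothetical and do not come from the Sphider-plus code.

```python
def id_matches(div_id, entry):
    """Match a div id against a list entry that may end with a '*' wildcard."""
    if entry.endswith("*"):
        return div_id.startswith(entry[:-1])
    return div_id == entry


def listed(div_id, entries):
    """True if the div id matches at least one entry of the list."""
    return any(id_matches(div_id, entry) for entry in entries)


entries = ["menu*", "footer"]  # hypothetical content of a div list file
for div_id in ["menu1", "menu2", "menu_left", "footer", "content"]:
    print(div_id, listed(div_id, entries))
```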
of result hits for queries with wildcards. To be found in ‘Settings’ = ‘Search Settings’. If you would like to see which of the multiple words found in the database are highlighted: In your editor, open the script /include/searchfuncs.php and find the row containing the text: Multiple words found in the database to be highlighted: '$hi' Uncomment this row. Now, . . .
. . .
'$hi' Uncomment this row. Now, if there is more than one result word, all the multiple words found in the database will be presented on top of the result listing.
Strict search !
This variant is invoked by entering a ! as the first character of the search query. If you search for '!plus', only results for the word 'plus' will be presented in . . .
. . .
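The effect of the leading ! can be pictured with the following Python sketch. Only the strict branch follows the description above; the non-strict branch assumes that a normal query also matches words that merely contain the search term, which is an assumption made for contrast. The function names and the word list are hypothetical.

```python
def parse_query(query):
    """Return (term, strict); strict is True when the query starts with '!'."""
    if query.startswith("!"):
        return query[1:], True
    return query, False


def search(words, query):
    """Filter indexed words by the query, exactly or by substring."""
    term, strict = parse_query(query.strip())
    if strict:
        return [word for word in words if word == term]
    # Assumed non-strict behavior: any word containing the term is a hit
    return [word for word in words if term in word]


words = ["plus", "surplus", "sphider-plus"]  # hypothetical indexed words
print(search(words, "!plus"))
print(search(words, "plus"))
```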
thumbnails and to index ID3 and EXIF information, it is necessary to download the media file. For pages with multiple media content, the time for the index / re-index procedure may increase dramatically. As ID3 information is not available for all audio and video files, a minimum play time as a condition for indexing has not yet been implemented. In . . .