. . . Re-index log file - Server info offering: Server software, environment, MySQL, PDF-converter, image functions, php.ini file, PHP integration, PHP security info. Each item holds lists of details. All text links, media links and thumbnails are active links. As stated in chapter Introduction , this search engine uses some PHP libraries and . . .
. . . 'Tips & Tricks & Mods' 3.1 All options It is possible to spider web pages from the command line, using the syntax: php spider.php <options> where <options> are: -all Reindex everything in the database. -eall Erase database and afterwards re-index all. -new Index all new URLs in database which had not yet been indexed. -erase Erase the. . .
. . . multiple strings). For example, for spidering and indexing http://www.domain.com/test.html to depth 2, use: php spider.php -u http://www.domain.com/test.html -d 2 If you want to reindex the same URL, use: php spider.php -u http://www.domain.com/test.html -r 3.2 Multithreaded indexing For command line operation parallel indexing has no. . .
. . . not yet been indexed <-new> Simply start several threads and add individual IDs to the option parameter like php spider.php -new1 php spider.php -new2 etc. The IDs will be added to the name of the corresponding log files like: db2_100524-21.47.56_ID1.html (log file of first thread) db2_100524-21.48.12_ID2.html (log file of second thread). . .
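The parallel start described above can be sketched with a small launcher. This is only an illustration written in Python, not part of Sphider-plus; the helper names build_new_commands and run_parallel are hypothetical, and it assumes that php is on the PATH and that the script is started from the folder containing spider.php. Only the option form -new1, -new2, etc. is taken from the text above.

```python
import subprocess


def build_new_commands(worker_count, php="php", script="spider.php"):
    """Build one command per worker: php spider.php -new1, -new2, ...

    The numeric suffix is the individual ID that Sphider-plus appends
    to the name of each worker's log file.
    """
    return [[php, script, f"-new{worker_id}"]
            for worker_id in range(1, worker_count + 1)]


def run_parallel(commands):
    """Start all commands at once, then wait and collect their exit codes."""
    processes = [subprocess.Popen(command) for command in commands]
    return [process.wait() for process in processes]


if __name__ == "__main__":
    # Only print the commands here; call run_parallel(...) to start them.
    for command in build_new_commands(2):
        print(" ".join(command))
```

Calling run_parallel(build_new_commands(2)) would start both workers side by side and return their exit codes once both have finished.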
. . . log file will be unreadable. 3.2.2 Re-index all First prepare the database once with the command php spider.php <-preall> This will reset all 'Last indexed' tables to '0000', but will not erase the content of all the other tables. So the check whether the content of a page has changed (MD5sum) is still available for a fast. . .
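The idea behind the MD5 check can be illustrated with a few lines of Python. This is a minimal sketch and not the code used by Sphider-plus; the function names and the stored_md5 parameter are hypothetical. It assumes that a checksum of the page content was stored during an earlier index run and is compared with the checksum of the freshly fetched content.

```python
import hashlib


def md5sum(content: bytes) -> str:
    """Return the MD5 checksum of the page content as a hex string."""
    return hashlib.md5(content).hexdigest()


def needs_reindex(stored_md5: str, content: bytes) -> bool:
    """A page only needs to be re-indexed if its checksum has changed."""
    return stored_md5 != md5sum(content)


if __name__ == "__main__":
    old_checksum = md5sum(b"<html>old content</html>")
    print(needs_reindex(old_checksum, b"<html>old content</html>"))
    print(needs_reindex(old_checksum, b"<html>new content</html>"))
```

Because unchanged pages are recognized by their checksum alone, they can be skipped without parsing their content again.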
Useful links [ Those might be helpful ] Sphider http://www.sphider.eu The original script, developed by Ando Saabas. PHP http://www.php.net/ PHP is a widely-used general-purpose scripting language that is especially suited for Web development and can be embedded into HTML. PHP Documentation http://www.php.net/docs.php The PHP Manual is available . . .
. . . PHP Documentation http://www.php.net/docs.php The PHP Manual is available online in a selection of languages. PHP FAQ http://php.net/manual/en/faq.php The PHP FAQ is your first stop for general information and those questions that seem to be on most people's minds. If you have licensing questions, see the separate License FAQ. Maria DB. . .
. . . manual is for both MySQL Community Server and MySQL Enterprise Server. Phorum http://www.phorum.org Phorum, the PHP and MySQL Forum Software. GNU GPL http://www.gnu.org/copyleft/gpl.html GNU General Public License. ID3 Tag Information http://www.id3.org/ Website of the ID3 consortium. Sitemaps http://www.sitemaps.org/ All important info about. . .
. . . and Legal Info Installation Documentation Change Log [ Installation Summary ] Preconditions Sphider-plus requires PHP 5.3 - 8.x (proven up to version 8.0.1) with installed GD, mbstring, PECL and zlib libraries. Additionally, if RAR compressed files are to be indexed, the RAR extension is required. Also a MySQL (version 5.3) database or MariaDB . . .
. . . Also a MySQL (version 5.3) database or MariaDB (proven up to version 10.4.17) must be available. The following PHP and Apache settings should be adjusted on the server before installing Sphider-plus allow_url_fopen : On allow_url_include : On PHP safe_mode : Off (deprecated since PHP 5.3, removed in PHP 5.4) Webserver mod_rewrite : On display_errors : On. . .
. . . (deprecated since PHP 5.3, removed in PHP 5.4) Webserver mod_rewrite : On display_errors : On register_globals : Off (deprecated since PHP 5.3, removed in PHP 5.4) error_reporting : E_ALL & ~E_DEPRECATED & ~E_WARNING & ~E_NOTICE & ~E_STRICT memory_limit : 256M (minimum) AllowOverride : All (to be found in apache sub folder /conf/httpd.conf) Additional note: The .htaccess files supplied. . .
. . . of the Sphider-plus scripts, before entering into the admin backend the first time. For example with a tool like phpMyAdmin, Plesk, or something similar. During this step you already define - Name of database - Username - Password - Database host which will be required later on in step 5 of the installation process. Please keep in mind that. . .
. . . do not process them. Sometimes the 'Database host' is also called 'Server' or 'Database server', like in the tool phpMyAdmin. 3. Open the Admin interface with your browser by addressing the Admin with something like: http://localhost/public/sphider/admin/admin.php First access to the admin backend is granted without login. Later access to the. . .
. . . be activated in Admin settings. New feature: Index of media files enabled for those servers that do not offer all PHP functions for remote files. Bypassed PHP functions are: fopen(); file_get_contents(); md5_file(); 3 new features for command line operation: - Erase & Re-index all sites ( -eall ) - Index all new URLs in database which had not . . .
. . . ( -erased ) New feature: In order to index XLS files, a converter for Excel files was developed. Implemented as PHP script, the converter needs no adaptation to the Operating System. New Admin setting: Index RAR compressed files and archives. Supports (X)HTML, XML and also compressed PDFs and other document files, as well as all kinds of feeds,. . .
. . . index procedure, they are to be activated individually in Admin settings. New feature: Self test for all required PHP libraries and extensions. If Debug mode is enabled, the corresponding warning messages will be presented on top of the Settings menu. Improved database 'Activate / Disable' menu: If multiple sets of tables are available, because. . .
. . . Thanks to Lionel Geo Mischie. Involved files that have been modified / added for this release: .htaccess search.php /admin/admin.php /admin/configset.php /admin/db_config.php /admin/index_media.php /admin/install_tables.php /admin/messages.php /admin/real-log.php /admin/spider.php /admin/spiderfuncs.php /admin/url_backup.php. . .
. . . modifications have been added: New feature: Index DOCX files. To be activated in Admin settings. Implemented as PHP script, the converter needs no adaptation to the Operating System. New feature: Index XLSX files. To be activated in Admin settings. Implemented as PHP script, the converter needs no adaptation to the Operating System. New feature: . . .
. . . set to 1) UTF-8 support implemented for media titles, file names and ID3 tags. MySQLi connector implemented between PHP and a MySQL database, using the object-oriented interface. Bug fixed in option: Do not index the full text. Bug fixed for URLs containing CP1252 coded paths. Bug fixed in detection of www/non www links. Now preventing double indexing. Bug fixed in. . .
. . . Involved files that have been modified / added for this release: As the MySQLi connector is implemented between PHP and a MySQL database, nearly all scripts are renewed. It is strongly recommended to perform a fresh installation for this version.. . .
. . . Separated for text and media results. MySQLi Improved Extension implemented MySQLi connector implemented between PHP and a MySQL database, using the object-oriented interface; PHP v.5.5 is also supported. Compatible with MySQL and MariaDB Proven up to: - MySQL version 5.7.32 - MariaDB version 10.4.17 Ready to run in PHP8 environment Proven up to PHP version 8.0.1. . .
. . . To be activated in admin backend = Settings = General Settings New feature: - The scripts now are compatible with PHP 8 New feature: - Indexation of .webp images. New feature: - In order to skip the comparison against all bad user agents (UAs), an additional whitelist holding trusted UAs was added. Redesigned sub menu 'Categories': - New form to add top level . . .
. . . duplicate category names. New PDF converter: - Now indexing text and images in unencoded PDFs. Implemented as pure PHP script, the new converter no longer requires its individual path to be defined. New spreadsheet converter scripts for .xls and .xlsx files. New open document text converter script for .odt files. New PowerPoint /Impress. . .
. . . Now preventing duplicate category names. Some small bugs fixed. Attention : In order to become compatible with PHP 8.x, nearly all scripts have been modified. As the database has also been altered, Sphider-plus needs a completely fresh installation for full functionality of this release. Please take special care of chapter 2 and 3 for the new. . .
. . . use a standard browser HTTP_USER_AGENT to connect to the site. New algorithm to delete the content of HTML and PHP tags: No longer using the PHP function strip_tags(); now also unclosed and invalid tags will be observed during index procedure. As a result, also the text following an unclosed or invalid tag will become indexed. This part of the . . .
. . . the text following an unclosed or invalid tag will become indexed. This part of the full text was cut off by the PHP function strip_tags(). Modified index procedure: The instructions 'RESET QUERY CACHE' and 'FLUSH TABLE' will only be used, if the following Admin setting is activated: 'Clean resources during index / re-index and also for search. . .
. . . id='abc'>;" for multiple nested divs. Involved files that have been modified / added for this release: /addurl.php /search_ini.php /admin/admin.php /admin/admin_header.php /admin/admin_search.php /admin/auto_index.php /admin/db_common.php /admin/configset.php /admin/index_media.php /admin/messages.php /admin/spider.php /admin/spiderfuncs.php. . .
. . . Requires MySQL server version 5.5.3 New feature: Compressed transfer on the Internet enabled for page content and PHP scripts. Depending on server environment this feature may not work on all servers. Improved MySQL database support: - Now creating tables in compressed format. - Protection to prevent error 1071: Specified key was too long, max . . .
. . . Improved UP and DOWN buttons in admin 'Settings' menu, and also in result listing. Wrapper added to bypass the PHP bug (error known since PHP v.5.3) gzopen() = gzopen64() and all other gz functions. Bug fixed to store the admin and dispatcher e-mail account in admin backend. Bug fixed in <! sphider_noindex > directive. Bug fixed for. . .
. . . Error messages and Debug mode New item in Admin / Settings / General Settings: - Enable / Disable MySQL and PHP error messages. It is recommended to disable the output of these messages for production systems, as they could reveal sensitive information. For more details, please see chapter Error messages and Debug mode New item in Admin . . .
. . . more details, please see chapter Error messages and Debug mode New item in Admin / Statistics / Server Info: - PHP security Info. Some basic info about current server configuration, presenting the security information status of the PHP environment. Completely rewritten Suggest framework. Based on 'script.aculo.us' and 'prototype' scripts, now. . .
. . . for Microsoft IIS. Thanks to bobyn. Involved files that have been modified / added for this release: /addurl.php /search.php /admin/admin.php /admin/admin_header.php /admin/auth.php /admin/configset.php /admin/confirm.js /admin/dbase.js (file no longer required) /admin/db_backup.php /admin/db_main.php /admin/ext.txt /admin/messages.php. . .