Development Conventions

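[!TIP|style:callout|label:PHPCreeper application workers run in one of two modes|iconVisibility:default|labelVisibility:default|className:block-tip] 1. Single-worker mode: you write only one specific downloader instance, and it covers the whole crawl.
     Advantage: it works out of the box, does not depend on a redis service, and uses PHP's built-in queue. Drawback: it can only handle simple crawling needs.
2. Multi-worker mode: you can freely write any number of business worker instances. This is PHPCreeper's default mode.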

--

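[!WARNING|style:callout|label: The author recommends developing with the PHPCreeper application framework|iconVisibility:default|labelVisibility:default|className:block-warning] Every example in this manual assumes the PHPCreeper application framework as its runtime context. If you want to write everything yourself, see the "FAQ" chapter.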

1. Writing the global start script:

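The global start script is a standalone script that loads multiple business worker instances in one pass. It can be stored anywhere. By default the PHPCreeper application assistant generates it automatically; if you write it by hand, you only need to make sure it correctly includes the following code: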

require_once "/path/to/PHPCreeper-Application/Application/Core/Launcher.php";

2. Writing single start scripts:

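A single start script is the start script of one independent business worker. It is also generated automatically by the PHPCreeper application assistant by default. Unless you write these scripts entirely by hand, they cannot be placed arbitrarily and must live in the following directory: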

/path/to/PHPCreeper-Application/Application/Spider/<ProjectName>/Start/<StartScript1>.php
/path/to/PHPCreeper-Application/Application/Spider/<ProjectName>/Start/<StartScript2>.php
/path/to/PHPCreeper-Application/Application/Spider/<ProjectName>/Start/<StartScriptN>.php
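
One practical reason the location is fixed: the generated start script in this section locates `Launcher.php` relative to its own path with `dirname(__FILE__, 4)`. The sketch below only illustrates that path arithmetic. The function name `launcherPathFor` is hypothetical and not part of PHPCreeper, and the example path is a placeholder.

```php
<?php
// Hypothetical helper (not part of PHPCreeper): mirrors the expression
// dirname(__FILE__, 4) . '/Core/Launcher.php' used by a start script.
function launcherPathFor(string $startScript): string
{
    // Four levels up: Start -> <ProjectName> -> Spider -> Application
    return dirname($startScript, 4) . '/Core/Launcher.php';
}

echo launcherPathFor('/path/to/Application/Spider/News/Start/AppProducer.php'), PHP_EOL;
// prints /path/to/Application/Core/Launcher.php
```

If a generated script is moved to a different depth, the same expression would point at a different directory, so the `require_once` would have to be adjusted by hand.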

Single start script snippet, AppProducer.php:

namespace PHPCreeperApp\Spider\News\Start;

require_once dirname(__FILE__, 4) . '/Core/Launcher.php';

use PHPCreeperApp\Core\Launcher;
use PHPCreeper\Kernel\PHPCreeper;
use PHPCreeper\Producer;

class AppProducer
{
    /**
     *  single instance
     *
     *  @var object 
     */
    static protected $_instance;

    /**
     *  producer instance
     *
     *  @var object
     */
    protected $_producer;

    /**
     * @brief   get single instance 
     *
     * @return  object
     */
    static public function getInstance()
    {
        if(!self::$_instance instanceof self)
        {
            self::$_instance = new self();
        }

        return self::$_instance;
    }

    /**
     * @brief    start entry
     *
     * @param    array $config
     *
     * @return   mixed
     */
    public function start($config)
    {
        //create the producer instance
        $this->_producer = new Producer($config);

        //set process name
        $this->_producer->setName('producer1');

        //set process number
        $this->_producer->setCount(1);

        //set user callback
        $this->_producer->onProducerStart   = array($this, 'onProducerStart');
        $this->_producer->onProducerStop    = array($this, 'onProducerStop');
        $this->_producer->onProducerReload  = array($this, 'onProducerReload');
    }


    /**
     * @brief    onProducerStart  
     *
     * @param    object $producer
     *
     * @return   mixed
     */
    public function onProducerStart($producer)
    {
    }

    /**
     * @brief    onProducerStop
     *
     * @param    object $producer
     *
     * @return   mixed
     */
    public function onProducerStop($producer)
    {
    }

    /**
     * @brief    onProducerReload     
     *
     * @param    object $producer
     *
     * @return   mixed
     */
    public function onProducerReload($producer)
    {
    }
}



//!!! WARN: DON'T CHANGE THE CODES BELOW ALL !!!
//!!! WARN: DON'T CHANGE THE CODES BELOW ALL !!!
//!!! WARN: DON'T CHANGE THE CODES BELOW ALL !!!
if(!defined('GLOBAL_START'))  
{
    $classname = pathinfo(__FILE__, PATHINFO_FILENAME);
    $config = Launcher::getSpiderConfig($spider ?? getSpiderName(), $classname);
    $_classname = __NAMESPACE__ . "\\" . $classname;
    $_classname::getInstance()->start($config);
    PHPCreeper::start();
}
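
The bootstrap block above derives the class to start from the script's own file name, which is why the class name and the file name need to agree. A minimal sketch of that derivation follows; `resolveStartClass` is a hypothetical name used only for illustration, and the path is a placeholder.

```php
<?php
// Hypothetical helper (not part of PHPCreeper): rebuilds the fully
// qualified class name the same way the bootstrap block does.
function resolveStartClass(string $file, string $namespace): string
{
    // pathinfo(..., PATHINFO_FILENAME) strips the directory and the extension
    $classname = pathinfo($file, PATHINFO_FILENAME);

    return $namespace . "\\" . $classname;
}

echo resolveStartClass(
    '/path/to/Application/Spider/News/Start/AppProducer.php',
    'PHPCreeperApp\Spider\News\Start'
), PHP_EOL;
// prints PHPCreeperApp\Spider\News\Start\AppProducer
```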

3. Each business start script must have exactly the same file name as its configuration file:
/path/to/Application/Spider/News/Start/AppProducer.php
/path/to/Application/Spider/News/Config/AppProducer.php
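
A quick way to check this naming rule for a project is to compare the two directories. The following is a minimal sketch, assuming the `Start` and `Config` directories sit side by side as in the paths above; `findStartScriptsWithoutConfig` is a hypothetical helper, not something PHPCreeper provides.

```php
<?php
// Hypothetical helper (not part of PHPCreeper): returns the names of start
// scripts that have no same-named file in the sibling Config directory.
function findStartScriptsWithoutConfig(string $spiderDir): array
{
    $missing = [];

    // glob() can return false on error, so fall back to an empty list
    foreach (glob($spiderDir . '/Start/*.php') ?: [] as $startScript) {
        $name = basename($startScript);

        if (!is_file($spiderDir . '/Config/' . $name)) {
            $missing[] = $name;
        }
    }

    sort($missing);

    return $missing;
}
```

Calling `findStartScriptsWithoutConfig('/path/to/Application/Spider/News')` returns an empty array when every start script has a matching configuration file.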
