扩展 HttpSocket 以用于 Amazon 等其他用途,适用于 1.3.x 和 2.x

在尝试使用 Amazon API 时,我阅读了很多关于 Amazon 规则的信息,该规则只允许每秒进行 1 次请求,否则如果网站经常忽略此规则,将被拒绝访问 Amazon。因此,我开始着手解决这个问题,使用 CakePHP,最终得到了几个对其他人可能也有一定用处的类。本文的源代码可从 Github 下载。主分支适用于 CakePHP1.3.x,现在还有一个 2.x 分支。

对于 Amazon 请求频率问题,显而易见的解决方案是确保您的网站每秒不超过 1 次请求。我决定使用锁定文件方法来实现这一点,该方法会整理正在进行的请求。

这里介绍的系统由 3 个类和一个接口组成。

ISocket 接口简单地概述了 CakePHP 中 HttpSocket 所实现的方法。

interface ISocket {
    public function put($uri = null, $data = array(), $request = array());
    public function delete($uri = null, $data = array(), $request = array());
    public function get($uri=NULL, $query = array(), $request = array());
    public function post($uri = null, $data = array(), $request = array());
}

抽象类 BaseSocket 封装任何实现 ISocket 的类,并负责将调用传递给该类。

abstract class BaseSocket implements ISocket {
    private $_socket;

    public function  __construct( ISocket $socket ) {
        $this->_socket = $socket;
    }

    public function  delete($uri = null, $data = array(), $request = array()) {
        return $this->_socket->delete($uri,$data,$request);
    }

    public function post($uri = null, $data = array(), $request = array()) {
        return $this->_socket->post($uri,$data,$request);
    }

    public function get($uri = NULL, $query = array(), $request = array()) {
        return $this->_socket->get($uri,$query,$request);
    }

    public function put($uri = null, $data = array(), $request = array()) {
        return $this->_socket->put($uri,$data,$request);
    }
}

NormalSocket 类只是扩展了 HttpSocket 并实现了 ISocket,这使得它可以被 BaseSocket 使用。除此之外,它与 CakePHP 提供的 HttpSocket 相同。

class NormalSocket extends HttpSocket implements ISocket {
    public function __construct($config = array() ) {
        parent::__construct($config);
    }
}

您可以像使用 HttpSocket 一样使用 NormalSocket。

/* Create a NormalSocket which has the same usage as HttpSocket */
$ns = new NormalSocket();
/* Get content from a URI */
$response = $ns->get($uri);

ThrottledSocket 会限制另一个实现 ISocket 的对象,该对象在构造函数中传递。

class ThrottledSocket extends BaseSocket implements ISocket {

    /**
     * Filename used for synchronisation
     */
    private $_filename;

    /**
     * Create a Throttled Socket.
     */
    public function __construct(ISocket $socket) {
        parent::__construct($socket);
        $this->_filename = ROOT . DIRECTORY_SEPARATOR . APP_DIR . DIRECTORY_SEPARATOR . 'tmp' . DIRECTORY_SEPARATOR . 'throttle.dat';
        if (!file_exists($this->_filename)) {
            file_put_contents($this->_filename, time());
        }
    }

    /**
     * Issues a GET request to the specified URI, query, and request.
     */
    public function get($uri=NULL, $query = array(), $request = array()) {
        $delay = $this->throttle();
        if ($delay > 0) {
            sleep($delay);
        }
        return parent::get($uri, $query, $request);
    }

    /**
     * Issues a POST request to the specified URI, query, and request.
     */
    public function post($uri = null, $data = array(), $request = array()) {
        $delay = $this->throttle();
        if ($delay > 0) {
            sleep($delay);
        }
        return parent::post($uri, $query, $request);
    }

    /**
     * Introduce a delay. Requests are only allowed to be sent
     * once a second.
     */
    private function throttle() {
        $curtime = time();
        $filetime = $curtime;
        $fp = fopen($this->_filename, "r+");
        if (flock($fp, LOCK_EX)) {
            $nbr = fread($fp, filesize($this->_filename));
            $filetime = intval(trim($nbr));
            $curtime = time();
            if ($curtime > $filetime) {
                $filetime = $curtime;
            } else {
                $filetime++;
            }
            rewind($fp);
            ftruncate($fp, 0);
            fprintf($fp, "%d", $filetime);
            flock($fp, LOCK_UN);
        }
        fclose($fp);
        return $filetime - $curtime;
    }
}

您可以按如下方式使用 ThrottledSocket…

$ts = new ThrottledSocket(new NormalSocket());
/* Requests to Throttled socket now happen only once per second */
$response = $ts->get($uri);

CachedSocket 会缓存另一个实现 ISocket 的对象,该对象在构造函数中传递。该类使用标准的 CakePHP 缓存机制来缓存封装套接字返回的响应。

class CachedSocket extends BaseSocket implements ISocket {

    /**
     * The cache key
     */
    private $_cacheKey;

    /**
     * The duration of the cache in seconds
     */
    private $_cacheDuration;


    /**
     * Create the object and assign a cache key.
     */
    public function __construct(ISocket $socket, $key, $duration=3600) {
        parent::__construct($socket);
        $this->_cacheKey = $key;
        $this->_cacheDuration = $duration;
    }

    /**
     * Set the number of seconds for which responses should be cached.
     */
    public function setCacheDuration($duration) {
        $this->_cacheDuration = $duration;
    }

    /**
     * Set the cache key
     */
    public function setCacheKey($key) {
        $this->_cacheKey = $key;
    }

    /**
     * GET Request a URL.
     */
    public function get($uri=NULL, $query = array(), $request = array()) {
        $response = Cache::read($this->_cacheKey);
        if ($response === false) {
            $response = parent::get($uri, $query, $request);
            if ($response) {
                Cache::set(array('duration' => '+' . $this->_cacheDuration . ' seconds'));
                Cache::write($this->_cacheKey, $response);
            }
        }
        return $response;
    }

    /**
     * POST Request a URL.
     */
    public function post($uri=NULL, $query = array(), $request = array()) {
        $response = Cache::read($this->_cacheKey);
        if ($response === false) {
            $response = parent::post($uri, $query, $request);
            if ($response) {
                Cache::set(array('duration' => '+' . $this->_cacheDuration . ' seconds'));
                Cache::write($this->_cacheKey, $response);
            }
        }
        return $response;
    }

}

您可以使用 CachedSocket 来缓存来自 HttpSocket 的响应

/* Cache the NormalSocket with the key 'CacheKey' for 1 hour */
$cs = new CachedSocket(new NormalSocket(), 'CacheKey', 3600 );
/* Requests to Cached socket now return the cached response for the next hour */
$response = $cs->get($uri);

通过这种封装其他实现 ISocket 对象的系统,我们可以缓存受限请求,这对于访问 Amazon 非常理想。一旦发出请求,响应就会被缓存,因此只有在必要时才会发出 Amazon 请求,并且每秒不会发出超过一次请求。

function GetBookByAsin( $asin ) {
    /* Throttle an HttpSocket to send requests at once a second */
    $ts = new ThrottledSocket( new NormalSocket() );
    /* Cache the throttled socket */
    $amazonSocket = new CachedSocket( $ts, 'ASIN' . $asin, 3600 );
    /* Build amazon request URL in $url */
    $response = $amazonSocket->get($url);
    /* Process XML returned by Amazon or by the Cache */
}

示例函数第一次被调用时,它将发出对 Amazon 的受限请求并缓存响应。在接下来的一个小时内,对同一项目的任何请求都将返回缓存的响应,而不会向 Amazon 发出任何调用。

源代码可从上面显示的 GitHub URL 获取,您只需将这些类复制到您的 app/libs(或 2.x 的 app/Lib)目录中并使用它们即可。我已经记录了每个类,并提供了一些测试用例。

在测试用例中可以看到将套接字类封装到其他类中的另一个好处。创建一个实现 ISocket 的虚拟类并将其提供给 CachedSocket 或 ThrottledSocket 很容易,这样就可以测试这些类,而无需进行任何实际请求。