AnyEvent-HTTP

README  view on Meta::CPAN

        original "Status" and "Reason" values from the header are available
        as "OrigStatus" and "OrigReason".

        The pseudo-header "URL" contains the actual URL (which can differ
        from the requested URL when following redirects - for example, you
        might get an error that your URL scheme is not supported even though
        your URL is a valid http URL because it redirected to an ftp URL, in
        which case you can look at the URL pseudo header).

        The pseudo-header "Redirect" only exists when the request was a
        result of an internal redirect. In that case it is an array
        reference with the "($data, $headers)" from the redirect response.
        Note that this response could in turn be the result of a redirect
        itself, and "$headers->{Redirect}[1]{Redirect}" will then contain
        the original response, and so on.
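
        As a minimal sketch of walking that chain: the helper below (the
        name "redirect_chain" is illustrative and not part of this module)
        collects the "URL" pseudo-header of every response, final response
        first, by following "Redirect" back to the original response. It
        assumes each response in the chain carries a "URL" pseudo-header.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of AnyEvent::HTTP: returns the URL
# pseudo-header of the final response, followed by those of each earlier
# redirect response, ending with the original response.
sub redirect_chain {
   my ($hdr) = @_;
   my @urls;

   while ($hdr) {
      push @urls, $hdr->{URL};

      # Redirect is [$data, $headers] of the previous response, if any
      $hdr = $hdr->{Redirect} ? $hdr->{Redirect}[1] : undef;
   }

   @urls
}
```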

        If the server sends a header multiple times, then their contents
        will be joined together with a comma (","), as per the HTTP spec.

        If an internal error occurs, such as not being able to resolve a
        hostname, then $data will be "undef", "$headers->{Status}" will be
        in the range 590-599 and the "Reason" pseudo-header will contain an
        error message. Currently the following status codes are used:

        595 - errors during connection establishment, proxy handshake.
        596 - errors during TLS negotiation, request sending and header
        processing.
        597 - errors during body receiving or processing.
        598 - user aborted request via "on_header" or "on_body".
        599 - other, usually nonretryable, errors (garbled URL etc.).

        A typical callback might look like this:

           sub {
              my ($body, $hdr) = @_;

              if ($hdr->{Status} =~ /^2/) {
                 # ... everything should be ok
              } else {
                 print "error, $hdr->{Status} $hdr->{Reason}\n";
              }
           }
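
        If a program needs to react differently to the internal status
        codes listed above, they can be mapped to coarse categories first.
        The following is only a sketch: the function name and the category
        names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: maps a Status value to a coarse category,
# following the list of internal status codes above.
sub status_category {
   my ($status) = @_;

   return "ok"       if $status =~ /^2/;
   return "connect"  if $status == 595; # connection establishment, proxy
   return "request"  if $status == 596; # TLS, request sending, headers
   return "body"     if $status == 597; # body receiving or processing
   return "aborted"  if $status == 598; # aborted via on_header/on_body
   return "other"    if $status == 599; # usually nonretryable
   return "internal" if $status >= 590 && $status <= 594; # currently unused

   "http" # any remaining status was sent by the server
}
```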

        Additional parameters are key-value pairs, and are fully optional.
        They include:

        recurse => $count (default: $MAX_RECURSE)
            Whether to recurse requests (e.g. on redirects, authentication
            and other retries), and how many times to do so.

            Only redirects to http and https URLs are supported. While most
            common redirection forms are handled entirely within this
            module, some require the use of the optional URI module. If it
            is required but missing, then the request will fail with an
            error.

        headers => hashref
            The request headers to use. Currently, "http_request" may
            provide its own "Host:", "Content-Length:", "Connection:" and
            "Cookie:" headers and will provide defaults at least for "TE:",
            "Referer:" and "User-Agent:" (this can be suppressed by using
            "undef" for these headers in which case they won't be sent at
            all).

            You really should provide your own "User-Agent:" header value
            that is appropriate for your program - I wouldn't be surprised
            if the default AnyEvent string gets blocked by webservers sooner
            or later.

            Also, make sure that your header names and values do not
            contain any embedded newlines.

        timeout => $seconds
            The time-out to use for various stages - each connect attempt
            will reset the timeout, as will read or write activity, i.e.
            this is not an overall timeout.

            Default timeout is 5 minutes.
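
            The "headers" and "timeout" parameters might be combined as in
            the following sketch. The wrapper name "fetch_with_options" and
            the "MyProgram/1.0" string are placeholders, not something this
            module provides.

```perl
use strict;
use warnings;
use AnyEvent::HTTP;

# Hypothetical wrapper: starts a GET request with a program-specific
# User-Agent, no Referer header, and a 30 second per-stage timeout.
sub fetch_with_options {
   my ($url, $cb) = @_;

   http_get $url,
      headers => {
         "User-Agent" => "MyProgram/1.0", # placeholder value
         "Referer"    => undef,           # undef: do not send this header
      },
      timeout => 30, # reset by each connect attempt and by I/O activity
      $cb;

   return; # http_get runs in void context above, so nothing is returned
}
```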

        proxy => [$host, $port[, $scheme]] or undef
            Use the given http proxy for all requests, or no proxy if
            "undef" is used.

            $scheme must be either missing or must be "http" for HTTP.

            If not specified, then the default proxy is used (see
            "AnyEvent::HTTP::set_proxy").

            Currently, if your proxy requires authorization, you have to
            specify an appropriate "Proxy-Authorization" header in every
            request.

            Note that this module will prefer an existing persistent
            connection, even if that connection was made using another
            proxy. If you need to ensure that a new connection is made in
            this case, you can either force "persistent" to false or e.g.
            use the proxy address in your "sessionid".

        body => $string
            The request body, usually empty. Will be sent as-is (future
            versions of this module might offer more options).

        cookie_jar => $hash_ref
            Passing this parameter enables (simplified) cookie-processing,
            loosely based on the original netscape specification.

            The $hash_ref must be an (initially empty) hash reference which
            will get updated automatically. It is possible to save the
            cookie jar to persistent storage with something like JSON or
            Storable - see the "AnyEvent::HTTP::cookie_jar_expire" function
            if you wish to remove expired or session-only cookies, and also
            for documentation on the format of the cookie jar.

            Note that this cookie implementation is not meant to be
            complete. If you want complete cookie management you have to do
            that on your own. "cookie_jar" is meant as a quick fix to get
            most cookie-using sites working. Cookies are a privacy disaster,
            do not use them unless required to.

            When cookie processing is enabled, the "Cookie:" and
            "Set-Cookie:" headers will be set and handled by this module,
            otherwise they will be left untouched.

        tls_ctx => $scheme | $tls_ctx
            Specifies the AnyEvent::TLS context to be used for https
            connections. This parameter follows the same rules as the
            "tls_ctx" parameter to AnyEvent::Handle, but additionally, the
            two strings "low" or "high" can be specified, which give you a
            predefined low-security (no verification, highest compatibility)
            and high-security (CA and common-name verification) TLS context.

            The default for this option is "low", which could be interpreted
            as "give me the page, no matter what".

            See also the "sessionid" parameter.

        sessionid => $string
            The module might reuse connections to the same host internally
            (regardless of other settings, such as "tcp_connect" or
            "proxy"). Sometimes (e.g. when using TLS or a specific proxy),
            you do not want to reuse connections from other sessions. This
            can be achieved by setting this parameter to some unique ID
            (such as the address of an object storing your state data or the
            TLS context, or the proxy IP) - only connections using the same
            unique ID will be reused.
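
            As a sketch of how "tls_ctx" and "sessionid" can work together:
            the wrapper below (its name and the way the session ID is built
            are assumptions made for illustration) requests verification
            and keeps its connections apart from those of other sessions.

```perl
use strict;
use warnings;
use AnyEvent::HTTP;

# Hypothetical wrapper: fetches $url with the predefined "high" TLS
# context and only reuses connections made with the same session ID.
sub fetch_verified {
   my ($url, $session, $cb) = @_;

   http_get $url,
      tls_ctx   => "high",         # CA and common-name verification
      sessionid => "tls-$session", # illustrative ID, any unique string works
      $cb;

   return; # http_get runs in void context above, so nothing is returned
}
```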

        on_prepare => $callback->($fh)
            In rare cases you need to "tune" the socket before it is used to
            connect (for example, to bind it on a given IP address). This
            parameter overrides the prepare callback passed to
            "AnyEvent::Socket::tcp_connect" and behaves exactly the same way
            (e.g. it has to provide a timeout). See the description for the
            $prepare_cb argument of "AnyEvent::Socket::tcp_connect" for
            details.

        tcp_connect => $callback->($host, $service, $connect_cb,
        $prepare_cb)
            In even rarer cases you want total control over how
            AnyEvent::HTTP establishes connections. Normally it uses
            AnyEvent::Socket::tcp_connect to do this, but you can provide
            your own "tcp_connect" function - obviously, it has to follow
            the same calling conventions, except that it may always return a
            connection guard object.

            The connections made by this hook will be treated as equivalent
            to connections made the built-in way; specifically, they will be
            put into and taken from the persistent connection cache. If your
            $tcp_connect function is incompatible with this kind of re-use,
            consider switching off "persistent" connections and/or providing
            a "sessionid" identifier.

            There are probably lots of weird uses for this function, ranging
            from tracing the hosts "http_request" actually tries to connect
            to, through (inexact but fast) host => IP address caching, to
            socks protocol support.
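
            The tracing use case might look like the sketch below. The
            function name and the @CONNECT_LOG array are hypothetical; the
            only library call used is "AnyEvent::Socket::tcp_connect", to
            which all arguments are passed through unchanged.

```perl
use strict;
use warnings;
use AnyEvent::Socket ();

our @CONNECT_LOG; # hypothetical: "host:service" of each connect attempt

# Records each connection attempt, then delegates to the default
# AnyEvent::Socket::tcp_connect with the same arguments. The return value
# of tcp_connect is passed back to the caller.
sub tracing_connect {
   my ($host, $service, $connect_cb, $prepare_cb) = @_;

   push @CONNECT_LOG, "$host:$service";

   AnyEvent::Socket::tcp_connect $host, $service, $connect_cb, $prepare_cb;
}

# Usage (not executed here):
#    http_get $url, tcp_connect => \&tracing_connect, sub { ... };
```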

        on_header => $callback->($headers)
            When specified, this callback will be called with the header
            hash as soon as headers have been successfully received from the
            remote server (not on locally-generated errors).

README  view on Meta::CPAN

    its own. If you want DNS caching, you currently have to provide your own
    default resolver (by storing a suitable resolver object in
    $AnyEvent::DNS::RESOLVER) or your own "tcp_connect" callback.

  GLOBAL FUNCTIONS AND VARIABLES
    AnyEvent::HTTP::set_proxy "proxy-url"
        Sets the default proxy server to use. The proxy-url must begin with
        a string of the form "http://host:port", croaks otherwise.

        To clear an already-set proxy, use "undef".
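
        A short sketch of setting, rejecting and clearing the default
        proxy. The proxy address is a placeholder.

```perl
use strict;
use warnings;
use AnyEvent::HTTP;

# proxy.example.net:3128 is a placeholder address
AnyEvent::HTTP::set_proxy "http://proxy.example.net:3128";

# a string that does not start with http://host:port makes set_proxy croak
my $ok = eval { AnyEvent::HTTP::set_proxy "not a proxy url"; 1 };
warn "rejected: $@" unless $ok;

# clear the default proxy again
AnyEvent::HTTP::set_proxy undef;
```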

        When AnyEvent::HTTP is loaded for the first time it will query the
        default proxy from the operating system, currently by looking at
        "$ENV{http_proxy}".

    AnyEvent::HTTP::cookie_jar_expire $jar[, $session_end]
        Remove all cookies from the cookie jar that have expired. If
        $session_end is given and true, then additionally remove all session
        cookies.

        You should call this function (with a true $session_end) before you
        save cookies to disk, and you should call this function after
        loading them again. If you have a long-running program you can
        additionally call this function from time to time.

        A cookie jar is initially an empty hash-reference that is managed by
        this module. Its format is subject to change, but currently it is as
        follows:

        The key "version" has to contain 2, otherwise the hash gets cleared.
        All other keys are hostnames or IP addresses pointing to
        hash-references. The key for these inner hash references is the
        server path for which this cookie is meant, and the values are again
        hash-references. Each key of those hash-references is a cookie name,
        and the value, you guessed it, is another hash-reference, this time
        with the key-value pairs from the cookie, except for "expires" and
        "max-age", which have been replaced by a "_expires" key that
        contains the cookie expiry timestamp. Session cookies are indicated
        by not having an "_expires" key.

        Here is an example of a cookie jar with a single cookie, so you have
        a chance of understanding the above paragraph:

           {
              version    => 2,
              "10.0.0.1" => {
                 "/" => {
                    "mythweb_id" => {
                      _expires => 1293917923,
                      value    => "ooRung9dThee3ooyXooM1Ohm",
                    },
                 },
              },
           }
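
        Given that layout, a jar can be inspected with plain hash
        traversal. The helper below is a hypothetical sketch (it is not
        part of this module, and it relies on the current format, which is
        subject to change): it lists every session cookie, i.e. every
        cookie without an "_expires" key.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "host path name" for every session cookie
# (a cookie without an "_expires" key) in a version-2 cookie jar.
sub session_cookies {
   my ($jar) = @_;
   my @found;

   # every key except "version" is a hostname or IP address
   for my $host (sort grep $_ ne "version", keys %$jar) {
      my $paths = $jar->{$host};

      for my $path (sort keys %$paths) {
         my $cookies = $paths->{$path};

         for my $name (sort keys %$cookies) {
            push @found, "$host $path $name"
               unless exists $cookies->{$name}{_expires};
         }
      }
   }

   @found
}
```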

    $date = AnyEvent::HTTP::format_date $timestamp
        Takes a POSIX timestamp (seconds since the epoch) and formats it as
        an HTTP Date (RFC 2616).

    $timestamp = AnyEvent::HTTP::parse_date $date
        Takes an HTTP Date (RFC 2616) or a Cookie date (netscape cookie spec)
        or a bunch of minor variations of those, and returns the
        corresponding POSIX timestamp, or "undef" if the date cannot be
        parsed.
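
        The two functions are meant to be inverses of each other for
        well-formed dates, which the following sketch checks for the
        current time. The variable names are illustrative.

```perl
use strict;
use warnings;
use AnyEvent::HTTP;

my $now  = time;
my $date = AnyEvent::HTTP::format_date $now;
my $back = AnyEvent::HTTP::parse_date $date;

print "$date\n";
print defined $back && $back == $now
   ? "round trip ok\n"
   : "round trip failed\n";

# input that cannot be parsed yields undef
my $bad = AnyEvent::HTTP::parse_date "not a date";
print "unparsable\n" unless defined $bad;
```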

    $AnyEvent::HTTP::MAX_RECURSE
        The default value for the "recurse" request parameter (default: 10).

    $AnyEvent::HTTP::TIMEOUT
        The default timeout for connection operations, in seconds (default:
        300).

    $AnyEvent::HTTP::USERAGENT
        The default value for the "User-Agent" header (the default is
        "Mozilla/5.0 (compatible; U; AnyEvent-HTTP/$VERSION;
        +http://software.schmorp.de/pkg/AnyEvent)").

    $AnyEvent::HTTP::MAX_PER_HOST
        The maximum number of concurrent connections to the same host
        (identified by the hostname). If the limit is exceeded, then
        additional requests are queued until previous connections are
        closed. Both persistent and non-persistent connections are counted
        in this limit.

        The default value for this is 4, and it is highly advisable to not
        increase it much.

        For comparison: the RFCs recommend 4 non-persistent or 2 persistent
        connections, older browsers used 2, newer ones (such as Firefox 3)
        typically use 6, and Opera uses 8 because like, they have the
        fastest browser and give a shit for everybody else on the planet.

    $AnyEvent::HTTP::PERSISTENT_TIMEOUT
        The time, in seconds, after which idle persistent connections get
        closed by AnyEvent::HTTP (default: 3).

    $AnyEvent::HTTP::ACTIVE
        The number of active connections. This is not the number of
        currently running requests, but the number of currently open and
        non-idle TCP connections. This number can be useful for
        load-leveling.
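
    The variables above are ordinary package variables and can be assigned
    directly. The values and the "may_start_request" helper in this sketch
    are assumptions chosen for illustration, not recommendations.

```perl
use strict;
use warnings;
use AnyEvent::HTTP;

# be gentle to servers: at most 2 concurrent connections per host
$AnyEvent::HTTP::MAX_PER_HOST = 2;

# default timeout of 60 seconds instead of 300
$AnyEvent::HTTP::TIMEOUT = 60;

# identify the program (placeholder string)
$AnyEvent::HTTP::USERAGENT = "MyProgram/1.0";

# Hypothetical load-leveling check: true while fewer than $limit
# connections are open and non-idle.
sub may_start_request {
   my ($limit) = @_;

   $AnyEvent::HTTP::ACTIVE < $limit
}
```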

  SHOWCASE
    This section contains some more elaborate "real-world" examples or code
    snippets.

  HTTP/1.1 FILE DOWNLOAD
    Downloading files with HTTP can be quite tricky, especially when
    something goes wrong and you want to resume.

    Here is a function that initiates and resumes a download. It uses the
    last modified time to check for file content changes. It also works
    with many HTTP/1.0 servers, and usually falls back to a complete
    re-download on older servers.

    It calls the completion callback with "undef" if a nonretryable error
    occurred, 0 if the download was partial and should be retried, and 1 if
    it was successful.

       use AnyEvent::HTTP;

       sub download($$$) {


