for any info/changes follow me: @nickmilon

twtPyCurl.py.requests module

module:requests (pyCurl based Requests)

a lightweight, small-footprint interface to pyCurl that provides the base for twtCurl

Warning

although the classes defined here can possibly be used for generic http(s) requests, they have only been tested for requests to the twitter REST and streaming APIs

exception twtPyCurl.py.requests.ErrorRq[source]

Bases: exceptions.Exception

Exceptions base

exception twtPyCurl.py.requests.ErrorRqMissingKeys[source]

Bases: twtPyCurl.py.requests.ErrorRq

exception twtPyCurl.py.requests.ErrorRqCredentialsNotValid[source]

Bases: twtPyCurl.py.requests.ErrorRq

exception twtPyCurl.py.requests.ErrorRqHttp(http_code, msg='')[source]

Bases: twtPyCurl.py.requests.ErrorRq

HTTP error

__init__(http_code, msg='')[source]
exception twtPyCurl.py.requests.ErrorRqCurl(err_number, msg)[source]

Bases: twtPyCurl.py.requests.ErrorRq

Exceptions raised by Curl

__init__(err_number, msg)[source]
class twtPyCurl.py.requests.CredentialsProvider[source]

Bases: object

Generic OAuth credentials provider class

appl_keys = ['id_appl', 'consumer_access_token']
user_keys = ['id_user', 'user_name', 'consumer_key', 'consumer_secret', 'access_token_key', 'access_token_secret']
classmethod get_credentials(id_appl, id_user=None)[source]

must return a dictionary with all appl_keys and user_keys; classes inheriting from this base class must implement this method

classmethod on_revoke_credentials(appl_id, user_id)[source]

inherited classes should handle this to inform the application

classmethod validate(credentials_dict)[source]

validates credentials dictionary against missing keys
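
A minimal sketch of the kind of check validate() describes, assuming it simply compares the dictionary's keys with a list of required keys. The helper name find_missing_keys is hypothetical and not part of the module; the key list is copied from user_keys above.

```python
# Hypothetical helper, not part of twtPyCurl: illustrates checking a
# credentials dictionary for missing keys.
USER_KEYS = ['id_user', 'user_name', 'consumer_key', 'consumer_secret',
             'access_token_key', 'access_token_secret']


def find_missing_keys(credentials_dict, required_keys):
    """Return the required keys absent from credentials_dict, in order."""
    return [key for key in required_keys if key not in credentials_dict]


# A dictionary that only carries half of the user keys.
partial = {'id_user': 1, 'user_name': 'example', 'consumer_key': 'ck'}
missing = find_missing_keys(partial, USER_KEYS)
print(missing)
```

A provider built this way could raise an error whenever the returned list is not empty; whether the module does exactly that is not stated here.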

class twtPyCurl.py.requests.CredentialsProviderFile[source]

Bases: twtPyCurl.py.requests.CredentialsProvider

simple file-based credentials provider; reads credentials from the contents of a json file

See also

  • a sample file with user credentials at: twt_data/sample_credentials_user.json
  • a sample file with application credentials at: twt_data/sample_credentials_application.json
classmethod get_credentials(file_path=None)[source]
Parameters:file_path (str) – full path name to a file, defaults to credentials.json in user’s home directory
Returns:a validated credentials dictionary
Raises:IOError on file error
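
The behaviour described above (read a json file, default to credentials.json in the home directory) can be sketched with the standard library alone. The function name load_credentials and the sample values are assumptions made for illustration; the real class also validates the result, which this sketch leaves out.

```python
import json
import os
import tempfile


def load_credentials(file_path=None):
    """Read a credentials dictionary from a JSON file.

    Falls back to credentials.json in the user's home directory,
    mirroring the default described above.
    """
    if file_path is None:
        file_path = os.path.join(os.path.expanduser('~'), 'credentials.json')
    with open(file_path) as fin:   # raises IOError/OSError on file errors
        return json.load(fin)


# Usage: write a throwaway file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'credentials.json')
    with open(path, 'w') as fout:
        json.dump({'id_appl': 'my_app', 'consumer_access_token': 'token'}, fout)
    loaded = load_credentials(path)
```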
class twtPyCurl.py.requests.Credentials(**kwargs)[source]

Bases: object

stores OAuth1 or OAuth2 credentials and provides OAuth headers

__init__(**kwargs)[source]
is_appl()[source]
Returns:Boolean: True if credentials belong to an application, False if they belong to an application user
id
Returns:tuple: (application id, user id)
id_str
Returns:(str) representation of instance’s id
get_oath_header(*args)[source]
Returns:str: the OAuth header to be used by a request
on_revoke_credantials()[source]

descendants can override to handle revoking credentials

class twtPyCurl.py.requests.Response[source]

Bases: object

a lightweight HTTP response class; it handles only basic things since we want it to be fast

__init__()[source]
reset()[source]

we reset the properties between requests so we don’t have to create a new instance for each request

write_headers(headers_data)[source]
headers

sets (on demand and only once) and returns the headers dictionary; constructing the dictionary is not cheap, so we only do it when it is actually requested

Returns:the headers dictionary
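
The reset / write_headers / headers trio can be illustrated with a small stand-in class. This is only a sketch of the lazy-parsing idea, assuming raw header lines are stored as they arrive and parsed on first access; the class name LazyHeaders and the parse_count counter are hypothetical, and the real Response may parse differently.

```python
class LazyHeaders(object):
    """Stand-in for the Response described above (hypothetical, simplified)."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Clear state so one instance can serve many requests.
        self._raw_lines = []
        self._headers = None
        self.parse_count = 0

    def write_headers(self, header_line):
        # Cheap: just remember the raw line.
        self._raw_lines.append(header_line)

    @property
    def headers(self):
        # Build the dictionary on first access only.
        if self._headers is None:
            self.parse_count += 1
            self._headers = {}
            for line in self._raw_lines:
                if ':' in line:
                    key, value = line.split(':', 1)
                    self._headers[key.strip().lower()] = value.strip()
        return self._headers


resp = LazyHeaders()
resp.write_headers('HTTP/1.1 200 OK\r\n')
resp.write_headers('Content-Type: application/json\r\n')
first = resp.headers    # parsed here
second = resp.headers   # served from the cached dictionary
```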

class twtPyCurl.py.requests.Client(request=None, credentials=None, on_data_cb=None, user_agent=None, name=None, allow_retries=True, verbose=0, allow_redirects=False)[source]

Bases: object

this is a minimal class to execute HTTP requests via curl/pycurl; for efficiency, urls are NOT url-encoded since that is not necessary for our use case. all arguments are optional

Parameters:
  • request (tuple) – (url, method, parms); if specified, the request will be executed right after instance creation, see request()
  • credentials (Credentials) – an instance of Credentials
  • on_data_cb (function) – a callback with a single parameter to execute when data from the request is ready; if missing or None, the instance’s on_data_default() will be called instead
  • user_agent (str) – a user agent string to use in the request header (defaults to class name + ‘v ‘+ __version)
  • name (str) – name for this instance; if missing, a default based on the instance’s id is provided, see: name()
  • allow_retries (bool) – if True allows the instance to perform retries to recover from an error if possible (defaults to True)
  • allow_redirects (bool) – if True allows automatic redirects (defaults to False)
  • verbose (int) – set to 0 for silent mode, 1 to turn curl verbose and progress on, 2 to turn curl debug mode on (defaults to 0)
Example:
>>> client = Client()
>>> response = client.request(url="https://www.yandex.com/", method='GET')
>>> response.data
'<!DOCTYPE html><html class="i-ua_js_no i-ua_css_standart i-ua_browser_unknown" lang="en">.......'
>>> response.status_http
200
format_progress = '|progress |download:{:6.2f}%| upload:{:6.2f}%|'
__init__(request=None, credentials=None, on_data_cb=None, user_agent=None, name=None, allow_retries=True, verbose=0, allow_redirects=False)[source]
name
Returns:instance’s name
request_headers
Returns:current request headers
credentials
_handle_init()[source]

initializes the pycurl handle; override for any special set up. for options details see

_handle_init_end()[source]

modify in descendants if there are additional initialization requirements

_raise(err_class, *args)[source]

use this mechanism to raise critical exceptions. it notifies the application before the exception is raised, so the application can try a remedy at its own level; this is especially useful in a threading environment, where the main thread can be notified first. it calls _on_exception and raises the exception only if that call returns True

_on_exception(err_class, *args)[source]

descendants can specify any special handling
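
The notify-then-raise pattern behind _raise and _on_exception could look roughly like the sketch below. The classes here are stand-ins written for illustration (the notified list in particular is an assumption); only the rule "raise only if _on_exception returns True" comes from the description above.

```python
class ErrorRq(Exception):
    """Stand-in for the module's base exception."""


class NotifyingClient(object):
    """Hypothetical class showing the notify-then-raise pattern."""

    def __init__(self):
        self.notified = []

    def _on_exception(self, err_class, *args):
        # Hook: tell the application (e.g. the main thread) first.
        self.notified.append((err_class.__name__, args))
        return True   # True means "go ahead and raise"

    def _raise(self, err_class, *args):
        # Raise only when the hook agrees.
        if self._on_exception(err_class, *args):
            raise err_class(*args)


client = NotifyingClient()
try:
    client._raise(ErrorRq, 'credentials revoked')
    raised = False
except ErrorRq:
    raised = True
```

A descendant that returns False from _on_exception would swallow the error after handling it.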

handle_set(url, method, request_parms, multipart=False)[source]
Parameters:
  • url (str) – url to be used by request
  • method (str) – method to be used by request
  • request_parms (dict) – request’s parameters
  • multipart (boolean) – defaults to False, specify True for a multipart request
Raises:

KeyError: if method is not one of GET, POST or HEAD

curl_set_option(option, value)[source]

used for general options like verbose, noprogress etc.; we store values internally so we can query an option’s status. for options details see

curl_get_option(option)[source]
curl_noprogress
curl_verbose
curl_low_speed
request_abort
request_abort_set(reason_num=None, reason_msg=None)[source]

raises or resets the _request_abort property. if reason_num is not None, the current request is aborted by returning -1 while accepting data or headers; the server effectively sees a (104 Connection reset by peer) or (32 broken pipe). that is the only way to drop a connection, so its use makes most sense for streaming data connections

Parameters:
  • reason_num (int) – None or an integer that defines the reason we want to abort current request
  • reason_msg (str) – a string that describes the reason we want to abort current request
Usage:

set it to a non-None value to abort the current request; the main purpose is a controlled exit from a streaming request
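
A rough sketch of the abort mechanism, without pycurl, assuming the abort reason is kept as a (reason_num, reason_msg) pair and checked inside the write callback. The class AbortableReceiver and its received list are hypothetical. With pycurl, a write callback that returns a number different from the size of the data it was given makes libcurl abort the transfer, which is why -1 works.

```python
class AbortableReceiver(object):
    """Hypothetical stand-in; simulates the write callback only."""

    def __init__(self):
        self.request_abort = (None, None)
        self.received = []

    def request_abort_set(self, reason_num=None, reason_msg=None):
        # A non-None reason_num raises the flag, None resets it.
        self.request_abort = (reason_num, reason_msg)

    def handle_on_write(self, data):
        if self.request_abort[0] is not None:
            return -1            # size mismatch: the transfer gets aborted
        self.received.append(data)
        return None              # None: all bytes were consumed


receiver = AbortableReceiver()
results = []
for number, chunk in enumerate([b'a', b'b', b'c']):
    if number == 2:
        receiver.request_abort_set(1, 'enough data')
    results.append(receiver.handle_on_write(chunk))
```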

on_progress(*args)[source]

pycurl on_progress callback gives progress statistics

on_progress_change(download_t, download_d, upload_t, upload_d)[source]

called by on_progress() if it senses a change in progress (to avoid endless progress reports)
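
One way to report progress only when it changes is to remember the last set of values. The sketch below assumes the callback receives the four numbers named in on_progress_change and reuses the format_progress string shown earlier; the class ProgressReporter and its reports list are made up for the example.

```python
FORMAT_PROGRESS = '|progress |download:{:6.2f}%| upload:{:6.2f}%|'


class ProgressReporter(object):
    """Hypothetical stand-in for the two progress callbacks."""

    def __init__(self):
        self._last = None
        self.reports = []

    def on_progress(self, *args):
        # Called very often; forward only when the numbers changed.
        if args != self._last:
            self._last = args
            self.on_progress_change(*args)

    def on_progress_change(self, download_t, download_d, upload_t, upload_d):
        # Totals can be 0 when unknown, so guard the divisions.
        down = 100.0 * download_d / download_t if download_t else 0.0
        up = 100.0 * upload_d / upload_t if upload_t else 0.0
        self.reports.append(FORMAT_PROGRESS.format(down, up))


reporter = ProgressReporter()
reporter.on_progress(200, 50, 0, 0)
reporter.on_progress(200, 50, 0, 0)    # unchanged, no report
reporter.on_progress(200, 100, 0, 0)
```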

on_request_start()[source]

called when a request starts; override in descendants as needed

on_request_end()[source]

called when a request ends; override in descendants as needed

on_request_error_curl(err)[source]

default error handling for curl (connection) errors; override the method for any special handling (see libcurl error codes). return True to auto-retry the request; raise an exception or return False to abort

on_request_error_http(err)[source]

default error handling for HTTP errors; override the method for any special handling. return True to auto-retry the request; raise an exception or return False to abort
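
The "return True to retry, False to abort" contract of the two error handlers can be pictured as a small retry loop. This is a sketch under assumptions: the function run_with_retries, the max_attempts limit and the use of IOError are all invented for illustration and are not taken from the module.

```python
def run_with_retries(do_request, on_error, max_attempts=3):
    """Call do_request(); on failure ask on_error(err) whether to retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return do_request()
        except IOError as err:
            # Handler returns True to retry, False to abort.
            if attempt == max_attempts or not on_error(err):
                return None


attempts = []


def flaky_request():
    # Fails twice, then succeeds.
    attempts.append(1)
    if len(attempts) < 3:
        raise IOError('connection reset')
    return 'ok'


result = run_with_retries(flaky_request, on_error=lambda err: True)
aborted = run_with_retries(lambda: flaky_request() and 1 / 0 or None,
                           on_error=lambda err: False) if False else None
```

A handler can also raise an exception of its own instead of returning False, which ends the loop immediately.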

request(url, method, parms={}, multipart=False)[source]

Warning

  • Currently we don’t url-encode the url, clients should encode it if needed before making a call.
  • The Response object returned is hot, i.e. it is a reference to client.response and will be invalid after the next request. Clients should copy it if they intend to reuse it later.
Parameters:
  • url (str) – request’s url
  • method (str) – request’s method
  • parms (dict) – parameters dictionary to pass to twitter
Returns:

an instance of Response

Raises:

proper HTTP or pyCurl errors
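
Both warnings above can be handled on the caller's side with the standard library. The sketch below is written for Python 3 and uses a placeholder url (https://example.com/search) and a stand-in class ReusedResponse, both of which are assumptions for illustration; it shows encoding a query before the call and copying a reused response object before the next request overwrites it.

```python
import copy
from urllib.parse import urlencode

# 1. Encode the query part yourself, since the client does not.
base_url = 'https://example.com/search'      # placeholder url
query = urlencode({'q': 'pycurl & twitter', 'count': 10})
url = base_url + '?' + query


# 2. Copy a "hot" response before it is reused.
class ReusedResponse(object):
    """Stand-in for a response object that is reset between requests."""

    def __init__(self):
        self.data = None


shared = ReusedResponse()
shared.data = 'first'
kept = copy.copy(shared)     # snapshot of the first result
shared.data = 'second'       # the next request overwrites the shared object
```

copy.copy makes a shallow copy, which is enough for string attributes; copy.deepcopy is the safer choice if the response carries mutable containers.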

del_request(url, method, parms={}, multipart=False)[source]
request_repeat()[source]

repeat last request, override in subclasses to yield cursor results by modifying parts of pycurl options

get(url, request_parms={})[source]

shortcut to a GET request

post(url, request_parms={})[source]

shortcut to a POST request

head(url, request_parms={})[source]

shortcut to a HEAD request

set_user_agent(user_agent_str=None)[source]

sets the user agent header string

Parameters:user_agent_str (str) – user agent string, defaults to class name + version

handle_on_headers(header_data)[source]
handle_on_write(data)[source]

this must return None or the number of bytes received, else the connection terminates

on_data_default(data)[source]

default function to process data, i.e. return json.loads(data); override it or provide an on_data_cb function on init

handle_on_ioctl(ioctl, cmd)[source]
handle_on_debug(msg_type, msg_str)[source]

pyCurl’s handle on debug call back

handle_reset()[source]
handle_close()[source]
class twtPyCurl.py.requests.ClientStream(data_separator='\r\n', stats_every=10000, **kwargs)[source]

Bases: twtPyCurl.py.requests.Client

Parameters:
  • data_separator (str) – string used by server to separate data
  • stats_every (int) – report statistics every n data packets (specify 0 to suppress stats)
  • kwargs (dict) – any other argument(s) as specified in Client
format_stream_stats = '|{name:8s}|{DHMS:12s}|{chunks:15,d}|{data:14,d}|{avg_per_sec:12,.2f}|'
format_stream_stats_header = '...................................................................\n| name | DHMS | chunks | data |avg_per_sec |\n...................................................................'
__init__(data_separator='\r\n', stats_every=10000, **kwargs)[source]
handle_on_write(data_chunk)[source]

data callback; receives chunks of data from the server. it must return None or the number of bytes received, else the connection terminates

on_data_default(data)[source]

this is where actual data arrives after the data chunks have been cleansed, if you don’t specify an on_data_cb function on init. override it in descendants for your use case or specify an on_data_cb function
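
The relationship between handle_on_write (raw chunks) and on_data_default (complete items) can be sketched as a buffer that is split on data_separator. The class ChunkSplitter is a hypothetical stand-in, it works on str rather than the bytes pycurl delivers under Python 3, and skipping empty items is an assumption about how blank lines between items might be treated.

```python
import json


class ChunkSplitter(object):
    """Hypothetical stand-in: reassembles items from arbitrary chunks."""

    def __init__(self, on_data_cb, data_separator='\r\n'):
        self._sep = data_separator
        self._buffer = ''
        self._on_data = on_data_cb

    def handle_on_write(self, data_chunk):
        self._buffer += data_chunk
        parts = self._buffer.split(self._sep)
        # The last part is incomplete (or empty); keep it for the next chunk.
        self._buffer = parts.pop()
        for item in parts:
            if item:                 # skip empty items between separators
                self._on_data(item)
        return None                  # all bytes were consumed


messages = []
splitter = ChunkSplitter(lambda item: messages.append(json.loads(item)))
for chunk in ['{"id": 1}\r\n{"id"', ': 2}\r\n\r\n', '{"id": 3}']:
    splitter.handle_on_write(chunk)
```

The third item stays in the buffer because its separator has not arrived yet, so only two messages are delivered.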

on_request_start()[source]
on_request_end()[source]
time_since_start()[source]
stats_str()[source]
Returns:a string containing operation(s) statistics
print_stats()[source]

prints a string containing operation(s) statistics
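
To show how the format_stream_stats template lines up into fixed-width columns, here is a small sketch that fills it with sample numbers. The helper seconds_to_dhms and its output layout are assumptions (the module's own DHMS formatting is not shown here), and the sample values are invented.

```python
FORMAT_STREAM_STATS = ('|{name:8s}|{DHMS:12s}|{chunks:15,d}|'
                       '{data:14,d}|{avg_per_sec:12,.2f}|')


def seconds_to_dhms(seconds):
    """Format seconds as days-HH:MM:SS (layout chosen for this example)."""
    days, rem = divmod(int(seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return '{:d}-{:02d}:{:02d}:{:02d}'.format(days, hours, minutes, secs)


elapsed = 3725                      # sample: 1 hour, 2 minutes, 5 seconds
line = FORMAT_STREAM_STATS.format(
    name='stream1',
    DHMS=seconds_to_dhms(elapsed),
    chunks=12000,
    data=11950,
    avg_per_sec=11950 / float(elapsed),
)
print(line)
```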