bitpit Reference

Date:

2019-10-13

Version:

1.2.0

Authors:

Event-driven HTTP download library with automatic resume and other features. The goal of this module is to ease the process of downloading files and resuming interrupted downloads. The library is written in an event-driven style similar to GTK.

The module defines the class Downloader. Instances of this class download a file from an HTTP server and call callback functions whenever an event related to the download happens. Examples of events are a download state change (start, pause, complete, error) and a download speed change. The following is a typical usage example:

import bitpit

#will download this
url = 'https://www.python.org/static/img/python-logo.png'
d = bitpit.Downloader(url) #downloader instance

#listen to download events and call a function whenever an event happens
#print state when state changes
d.listen('state-changed', lambda var: print('download state:', var.state))

#print speed in human readable format whenever speed changes
#speed is updated and callback is called every 1 second by default
d.listen('speed-changed', lambda var: print('download speed:', *var.human_speed))

#register another callback function to the speed change signal
#print percentage downloaded whenever speed changes
d.listen('speed-changed', lambda var: print(int(var.percentage), '%'))

#print total file size in human readable format when the downloader knows the file size
d.listen('size-changed', lambda var: print('total file size:', *var.human_size))

#done registering callbacks. lets start our download
#the following call will not block. it will start a new download thread
d.start()

#do some other work while download is taking place...

#wait for download completion or error
d.join()

This module can also be run as a main python script to download a file. You can have a look at the main function for another usage example.

commandline syntax:

python -m bitpit [-r rate_limit] [-m max_running] url [url ...]
args:
  • url: one or more urls to download.
  • -r rate_limit: total rate limit for all running downloads.
  • -m max_running: maximum number of running downloads at any single time.
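The syntax above maps naturally onto argparse. The following is a minimal sketch, not the module's actual parser: the function name parse_args is hypothetical, and the defaults are assumed from the documented signature of bitpit.main.

```python
import argparse

def parse_args(argv):
    """Parse arguments shaped like the documented command line (illustrative only)."""
    parser = argparse.ArgumentParser(prog='bitpit')
    # defaults are assumptions taken from the documented signature of bitpit.main
    parser.add_argument('-r', dest='rate_limit', default='0',
                        help='total rate limit for all running downloads')
    parser.add_argument('-m', dest='max_running', type=int, default=5,
                        help='maximum number of running downloads at any single time')
    parser.add_argument('urls', metavar='url', nargs='+',
                        help='one or more urls to download')
    return parser.parse_args(argv)

# example: limit to 2 simultaneous downloads, keep the default rate limit
args = parse_args(['-m', '2', 'https://www.python.org/static/img/python-logo.png'])
print(args.urls, args.rate_limit, args.max_running)
```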
class bitpit.Downloader(url, path=None, dir_path=False, rate_limit=0, timeout=10, update_period=1, restart_wait=-1, chunk_size=4096)

downloader class. instances of this class are able to download files from an http or https server in a dedicated thread, pause download and resume download. it subclasses Emitter.

in addition to listen and unlisten, you probably want to use the following methods:

  • self.start()
  • self.stop()
  • self.join()
  • self.bar()

properties:

name | type | access | description
url | str | RW | url to download. cannot be set if is_alive is True.
path | pathlib.Path or io.BufferedIOBase | RW | path to download to. if it is an instance of pathlib.Path, the file will be opened and content will be written to it. the file is closed whenever the download stops (completion, pause or error). if it is an instance of io.BufferedIOBase, content is written to the object and the object is never closed. cannot be set if is_alive is True.
restart_wait | int | RW | number of seconds to wait before restarting the download in case of error. setting it when a restart thread is active will restart the thread again.
restart_time | datetime.datetime or None | R | the time when the download will be restarted. None if there is no scheduled restart.
chunk_size | int | RW | number of bytes to write in a single write operation. ok to keep default value. when set, the new value takes effect the next time the download is started.
update_period | int | RW | interval in seconds between emissions of the speed-changed signal.
timeout | int | RW | the download is interrupted when no bytes are received for this number of seconds. when set, the new value takes effect the next time the download is started.
rate_limit | int | RW | speed limit of the download in bytes per second. may not work well with small files.
human_rate_limit | tuple | R | same as rate_limit but as a human readable tuple, e.g. (100.0, ‘KB/s’).
size | int | R | total size of the file being downloaded in bytes. -1 if unknown.
human_size | tuple | R | same as size but as a human readable tuple.
downloaded | int | R | bytes downloaded so far.
human_downloaded | tuple | R | same as downloaded but as a human readable tuple.
remaining | int | R | bytes remaining to complete the download.
human_remaining | tuple | R | same as remaining but as a human readable tuple.
speed | int | R | download speed in bytes per second.
human_speed | tuple | R | same as speed but as a human readable tuple.
ratio | float | R | downloaded / size. -1.0 if unknown.
percentage | float | R | 100 * ratio
eta | datetime.timedelta | R | estimated time remaining to complete the download.
state | str | R | one of the following:
  • start: trying to connect.
  • download: downloading now.
  • pause: stopped.
  • error: stopped because of an error.
  • complete: completed.
is_alive | bool | R | True if the download thread is running. False otherwise.
is_restarting | bool | R | True if the restart thread is running. False otherwise.
last_exception | BaseException or None | R | last exception that occurred during download. None if no exception has occurred yet.
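The derived read-only properties listed above (remaining, ratio, percentage, eta) follow from size, downloaded and speed. The sketch below shows one way the arithmetic could work. It is not the library's code: the function name derived is hypothetical, and the values returned for remaining and eta when the size or speed is unknown are assumptions.

```python
import datetime

def derived(size, downloaded, speed):
    """Compute (remaining, ratio, percentage, eta) from raw counters (illustrative only)."""
    known = size >= 0                                   # the table uses -1 for an unknown size
    remaining = size - downloaded if known else -1      # assumption: -1 when size is unknown
    ratio = downloaded / size if size > 0 else -1.0     # documented: -1.0 if unknown
    percentage = 100 * ratio                            # documented: 100 * ratio
    # assumption: eta is the remaining bytes divided by the current speed,
    # and None when it cannot be computed
    if known and speed > 0:
        eta = datetime.timedelta(seconds=remaining / speed)
    else:
        eta = None
    return remaining, ratio, percentage, eta

# 250 of 1000 bytes downloaded at 50 bytes per second
print(derived(1000, 250, 50))
```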
signals:
  • state-changed: emitted when state property changes. its callback
    takes 2 positional arguments, the Downloader instance which emitted the signal and the old state the Downloader was in.
  • size-changed: emitted when size property changes. its callback takes
    1 positional argument, the Downloader instance which emitted the signal.
  • speed-changed: emitted when speed property changes. its callback
    takes 1 positional argument, the Downloader instance which emitted the signal.
  • url-changed: emitted when url property changes. its callback takes 1
    positional argument, the Downloader instance which emitted the signal.
  • path-changed: emitted when path property changes. its callback takes
    1 positional argument, the Downloader instance which emitted the signal.
  • restart-time-changed: emitted when restart_time property changes.
    its callback takes 1 positional argument, the Downloader instance which emitted the signal.
  • rate-limit-changed: emitted when rate_limit property changes. its
    callback takes 1 positional argument, the Downloader instance which emitted the signal.
bar(width=30, char='=', unknown='?')

returns a string of width width representing a progress bar. the string is filled with char and spaces. the number of char represents the part of the file downloaded (e.g., if half of the file is downloaded, half of the string will be filled with char). the rest of the string will be filled with spaces. if the ratio of downloaded data is not known, returns a string of width width filled with the unknown argument.

args:
  • width (int): number of characters in the bar.
  • char (str): character to fill the bar with.
  • unknown (str): character to fill the bar if the ratio downloaded is
    unknown.
returns:

a string containing width characters filled with char and spaces to show the ratio of the downloaded bytes to the total file size.

examples:

if the width is 8 and 25% of the file is downloaded, the returned string will be ‘==      ’ (2 characters of char followed by 6 spaces)

if the width is 8 and the ratio downloaded is not known, the returned string will be ‘????????’
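The two examples above can be reproduced with a few lines of Python. This is a standalone sketch of the described behaviour, not the method's source: bar_sketch is a hypothetical free function that takes the ratio explicitly, and rounding the filled part down is an assumption.

```python
def bar_sketch(ratio, width=30, char='=', unknown='?'):
    """Return a progress bar string of exactly `width` characters (illustrative only)."""
    if ratio < 0:
        # a negative ratio stands for "unknown", as with the ratio property
        return unknown * width
    filled = int(min(ratio, 1.0) * width)   # assumption: round down, never overflow the bar
    return char * filled + ' ' * (width - filled)

print(repr(bar_sketch(0.25, width=8)))
print(repr(bar_sketch(-1.0, width=8)))
```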

join(timeout=None)

waits until the downloading thread terminates for any reason (download completion, error or pause). check self.state after join if you want to know the state of the download.

args:
  • timeout (None or int): the timeout for the join operation. defaults to None, meaning no timeout.
restart(wait=None)

schedules a download restart and returns. it is called when an error occurs during download and the self.restart_wait property is >= 0.

args:
  • wait (float or None): seconds to wait before the restart. if None,
    uses self.restart_wait.
start()

starts a downloading thread. if self.path has data, the download will resume and bytes will be appended to the end of the file. does nothing if the downloader is already started. if there is a scheduled restart, it will be cancelled.
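Resuming by appending to an existing file is usually done with the standard HTTP Range header. The sketch below shows that general technique. Whether bitpit builds its request this way is an assumption, and resume_headers is a hypothetical helper, not part of the module.

```python
import os
import tempfile

def resume_headers(path):
    """Return request headers that ask the server to skip bytes already on disk."""
    try:
        existing = os.path.getsize(path)
    except OSError:
        existing = 0            # no file yet: download from the beginning
    # 'bytes=N-' requests everything from offset N to the end of the resource
    return {'Range': 'bytes=%d-' % existing} if existing > 0 else {}

# demonstration with a temporary partial file of 1024 bytes
with tempfile.TemporaryDirectory() as tmp:
    partial = os.path.join(tmp, 'python-logo.png')
    with open(partial, 'wb') as f:
        f.write(b'x' * 1024)
    partial_headers = resume_headers(partial)
    missing_headers = resume_headers(os.path.join(tmp, 'missing'))

print(partial_headers, missing_headers)
```

a server that honours the range replies with status 206 and only the missing bytes, which can then be appended to the file.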

stop()

stops downloading thread. does nothing if the downloader is already stopped. if there is a scheduled restart, it will be cancelled.

update_size()

sends a HEAD request to get the size of the file and updates self.size.

class bitpit.Emitter

a base class for classes that implement event driven programming. a derived class should define the class attribute __signals__ which is a sequence of its valid signals.

emit(signal, *args)

calls all callback functions previously registered for the signal by previous calls to self.listen(). emitting a signal not present in __signals__ class property raises KeyError. exceptions raised by the callback function are printed to stderr and ignored.

args:
  • signal (str): the signal to call its callbacks.
  • args: positional arguments to be passed to the callbacks. args
    that were passed to self.listen() will be after args that are passed to this method.
listen(signal, func, *args, **kwargs)

registers the callback function func for the signal signal. whenever the signal is emitted, the callback function will be called with the object that emitted the signal as its first argument, followed by any arguments passed to emit() and then the args and kwargs given here. listening to a signal not present in the class attribute __signals__ raises KeyError. registering a callback function multiple times calls the function that number of times when the signal is emitted.

args:
  • signal (str): the signal to listen to.
  • func (a callable): the callback function.
  • args: positional arguments to be passed to the callback.
  • kwargs: keyword arguments to be passed to the callback.
unlisten(signal, func, *args, **kwargs)

unregisters the callback function func for the signal signal. unlistening from an unknown signal raises KeyError. unlistening a callback which was not previously passed to the listen method raises ValueError. unlistening a callback removes it from the callback list only once. if the callback was passed to self.listen() multiple times, it must be unlistened that number of times to be completely removed from the callback list.

args:
  • signal (str): the signal to unlisten from.
  • func (a callable): the callback function.
  • args: args that were passed to self.listen().
  • kwargs: kwargs that were passed to self.listen().
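The listen, emit and unlisten rules described for Emitter can be illustrated with a small stand-in class. This is a sketch written from the descriptions in this reference, not the library's implementation: EmitterSketch, Counter and on_tick are hypothetical names, and storing callbacks in a dict of lists is an assumption.

```python
import sys
import traceback

class EmitterSketch:
    """Minimal stand-in that follows the documented Emitter rules (illustrative only)."""
    __signals__ = ()    # derived classes list their valid signals here

    def __init__(self):
        self._callbacks = {signal: [] for signal in self.__signals__}

    def listen(self, signal, func, *args, **kwargs):
        # an unknown signal raises KeyError from the dict lookup
        self._callbacks[signal].append((func, args, kwargs))

    def unlisten(self, signal, func, *args, **kwargs):
        # list.remove drops one matching entry and raises ValueError if there is none
        self._callbacks[signal].remove((func, args, kwargs))

    def emit(self, signal, *args):
        for func, largs, lkwargs in list(self._callbacks[signal]):
            try:
                # emitter first, then emit() args, then listen() args
                func(self, *args, *largs, **lkwargs)
            except Exception:
                traceback.print_exc(file=sys.stderr)   # printed and ignored

class Counter(EmitterSketch):
    __signals__ = ('tick',)

seen = []

def on_tick(emitter, value, label):
    seen.append((label, value))

c = Counter()
c.listen('tick', on_tick, 'a')
c.listen('tick', on_tick, 'a')     # registered twice, so called twice per emit
c.emit('tick', 1)
c.unlisten('tick', on_tick, 'a')   # removes one of the two registrations
c.emit('tick', 2)
print(seen)
```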
class bitpit.Manager(max_running=0, rate_limit=0, restart_wait=30, **kwargs)

download manager class. multiple urls can be added to it. you can specify the maximum number of downloads that run at a single time and the manager will start or stop downloads to reach and not exceed this number. you can also specify the total download rate limit and the manager class will equally divide the speed over the running downloads.
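The equal division of the total rate limit can be pictured with simple arithmetic. The sketch below is an illustration under assumptions, not the manager's code: per_download_limit is a hypothetical helper, integer division is assumed, and 0 is assumed to mean "no limit" for an individual download.

```python
def per_download_limit(total_limit, running):
    """Split a total rate limit (bytes per second) equally over running downloads."""
    if total_limit <= 0 or running <= 0:
        return 0                            # a value <= 0 means no rate limit
    # keep at least 1 byte per second so a tiny share is not mistaken for "no limit"
    return max(1, total_limit // running)

# 300000 bytes per second shared by 3 running downloads
print(per_download_limit(300000, 3))
```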

the Manager class subclasses Emitter and emits signals when a download is added or removed.

properties:

name | type | access | description
rate_limit | int | RW | rate limit for all running downloads. it will be divided equally over them. a value <= 0 means no rate limit.
max_running | int | RW | maximum number of running downloads at a single time. if the number of started downloads exceeds this number, the manager will stop some downloads. if the number is less than this number, the manager will start some downloads. a value <= 0 means no limit.
restart_wait | float | RW | minimum time in seconds before the manager starts the same download again. even if max_running is not reached, if restart_wait has not passed since the download last stopped, the download is not started immediately. the manager will wait until this number of seconds has passed, then start the download. this is to prevent frequent restarts in case of network failure.
kwargs | dict | RW | keyword arguments passed to Downloader when self.add() creates a new instance.
downloads | list | R | downloads added to this manager. a list containing Downloader instances.
signals:
  • add: emitted when a new Downloader is added. the signal’s callbacks
    take 2 positional arguments, the Manager instance that emitted the signal and the Downloader that was just added. the added Downloader can be found in self.downloads.
  • remove: emitted when a Downloader is removed. the signal’s callbacks
    take 2 positional arguments, the Manager instance that emitted the signal and the Downloader that was just removed. the removed Downloader can no longer be found in self.downloads.
  • property-changed: emitted when rate_limit, max_running,
    restart_wait or kwargs property is changed.
add(d)

add a new download to the manager.

args:
  • d (str or Downloader): the url or Downloader instance to
    add. if d type is str, a new Downloader instance is created with arguments taken from self.kwargs property.
returns:
the Downloader instance added.
remove(d)

remove a previously added download then emits remove signal. if the download is running, it is not stopped.

args:
  • d (Downloader): the downloader to remove.
start()

start download manager thread. after a call to this method, the manager will start checking added downloads to start, stop and change rate limit when necessary.

stop()

stop the manager thread.

stop_all()

pause all currently running downloads. the manager thread is not stopped. if you want to stop the manager and all downloads, call self.stop() first.

update()

tell the manager thread to check pending downloads to see if there is need to start, stop or change rate limit to some of them. this is called automatically when the state of any added download changes and when manager properties are changed. you do not need to call it.

bitpit.human_readable(n, digits=3)

return a human readable number of bytes.

args:
  • n (float): the number to return as human readable.
  • digits (int): the number of digits before the decimal point.
returns:
tuple:
  1. (float) human readable number or None if n is None.
  2. (str) suffix or None if n is None.
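The return shape described above can be illustrated with a short standalone function. This is only a sketch: the name human_readable_sketch, the 1024 step, the suffix names and the reading of digits as "at most this many digits before the decimal point" are all assumptions, and the real function may differ in each of them.

```python
def human_readable_sketch(n, digits=3):
    """Return (number, suffix) for a byte count, or (None, None) if n is None."""
    if n is None:
        return None, None
    suffixes = ['B', 'KB', 'MB', 'GB', 'TB']   # assumption: binary steps with these labels
    value = float(n)
    index = 0
    # scale down until at most `digits` digits remain before the decimal point
    while abs(value) >= 10 ** digits and index < len(suffixes) - 1:
        value /= 1024
        index += 1
    return value, suffixes[index]

print(human_readable_sketch(102400))
print(human_readable_sketch(None))
```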
bitpit.main(urls, rate_limit='0', max_running=5)

downloads the given urls until done downloading them all. displays statistics about downloads in the following format: s | speed | downloaded | percent | eta | name

in the above format, the first item s is the first letter of the state of the download. for example, for complete downloads, that would be the letter c. Similarly, e would be for error and f for fatal error. speed is the download speed in human readable format. downloaded is the number of downloaded bytes in human readable format. percent is percentage downloaded. eta is estimated time to complete the download. name is the name of the file being downloaded or part of the name if the name is very long.

args:
  • urls: the urls to download.
  • rate_limit: total rate limit for all downloads
  • max_running: maximum running downloads at any given time