Welcome to EbookLib’s documentation!

EbookLib is a Python library for managing EPUB2/EPUB3 and Kindle files. It’s capable of reading and writing EPUB files programmatically (Kindle support is under development).

The API is designed to be as simple as possible, while at the same time making complex things possible too. It has support for covers, table of contents, spine, guide, metadata and more. EbookLib works with Python 2.7 and Python 3.3.

Homepage: https://github.com/aerkalov/ebooklib/

ebooklib Package

epub Module

class ebooklib.epub.EpubBook[source]

Bases: object

add_author(author, file_as=None, role=None, uid='creator')[source]

Add author for this document

add_item(item)[source]

Add additional item to the book. If the item has no media type or id defined, they will be assigned to the item.

Args:
  • item: Item instance
add_metadata(namespace, name, value, others=None)[source]

Add metadata

add_prefix(name, uri)[source]

Appends custom prefix to be added to the content.opf document

>>> epub_book.add_prefix('bkterms', 'http://booktype.org/')
Args:
  • name: namespace name
  • uri: URI for the namespace
get_item_with_href(href)[source]

Returns item for defined HREF.

>>> book.get_item_with_href('EPUB/document.xhtml')
Args:
  • href: HREF for the item we are searching for
Returns:

Returns item object. Returns None if nothing was found.

get_item_with_id(uid)[source]

Returns item for defined UID.

>>> book.get_item_with_id('image_001')
Args:
  • uid: UID for the item
Returns:

Returns item object. Returns None if nothing was found.

get_items()[source]

Returns all items attached to this book.

Returns:

Returns all items as tuple.

get_items_of_media_type(media_type)[source]

Returns all items of specified media type.

Args:
  • media_type: Media type for items we are searching for
Returns:

Returns found items as tuple.

get_items_of_type(item_type)[source]

Returns all items of specified type.

>>> book.get_items_of_type(epub.ITEM_IMAGE)
Args:
  • item_type: Type for items we are searching for
Returns:

Returns found items as tuple.

get_metadata(namespace, name)[source]

Retrieve metadata

get_template(name)[source]

Returns value for the template.

Args:
  • name: template name
Returns:

Value of the template.

reset()[source]

Initialises all needed variables to default values

set_cover(file_name, content, create_page=True)[source]

Set cover and create cover document if needed.

Args:
  • file_name: file name of the cover page
  • content: Content for the cover image
  • create_page: Whether a cover page should also be created (optional bool). Default value is True.
set_direction(direction)[source]
Args:
  • direction: Options are “ltr”, “rtl” and “default”
set_identifier(uid)[source]

Sets unique id for this epub

Args:
  • uid: Value of unique identifier for this book
set_language(lang)[source]

Set language for this epub. You can set multiple languages. Specific items in the book can have different language settings.

Args:
  • lang: Language code
set_template(name, value)[source]

Defines templates which are used to generate certain types of pages. When defining new value for the template we have to use content of type ‘str’ (Python 2) or ‘bytes’ (Python 3).

At the moment we use these templates:
  • ncx
  • nav
  • chapter
  • cover
Args:
  • name: Name for the template
  • value: Content for the template
set_title(title)[source]

Set title. You can set multiple titles.

Args:
  • title: Title value
set_unique_metadata(namespace, name, value, others=None)[source]

Add metadata if metadata with this identifier does not already exist, otherwise update existing metadata.

class ebooklib.epub.EpubCover(uid='cover-img', file_name='')[source]

Bases: ebooklib.epub.EpubItem

Represents Cover image in the EPUB file.

get_type()[source]

Guess type according to the file extension. Might not be the best way to do it, but it works for now.

Items can be of type:
  • ITEM_UNKNOWN = 0
  • ITEM_IMAGE = 1
  • ITEM_STYLE = 2
  • ITEM_SCRIPT = 3
  • ITEM_NAVIGATION = 4
  • ITEM_VECTOR = 5
  • ITEM_FONT = 6
  • ITEM_VIDEO = 7
  • ITEM_AUDIO = 8
  • ITEM_DOCUMENT = 9
  • ITEM_COVER = 10

We map type according to the extensions which are defined in ebooklib.EXTENSIONS.

Returns:

Returns type of the item as number.

class ebooklib.epub.EpubCoverHtml(uid='cover', file_name='cover.xhtml', image_name='', title='Cover')[source]

Bases: ebooklib.epub.EpubHtml

Represents Cover page in the EPUB file.

get_content()[source]

Returns content for cover page as HTML string. Content will be of type ‘str’ (Python 2) or ‘bytes’ (Python 3).

Returns:

Returns content of this document.

is_chapter()[source]

Returns whether this document is a chapter.

Returns:

Returns bool value.

exception ebooklib.epub.EpubException(code, msg)[source]

Bases: exceptions.Exception

class ebooklib.epub.EpubHtml(uid=None, file_name='', media_type='', content=None, title='', lang=None, direction=None)[source]

Bases: ebooklib.epub.EpubItem

Represents HTML document in the EPUB file.

add_item(item)[source]

Add other item to this document. It will create additional links according to the item type.

Args:
  • item: item we want to add defined as instance of EpubItem

add_link(**kwgs)[source]

Add additional link to the document. Links will be embedded only inside of this document.

>>> add_link(href='styles.css', rel='stylesheet', type='text/css')
get_body_content()[source]

Returns content of BODY element for this HTML document. Content will be of type ‘str’ (Python 2) or ‘bytes’ (Python 3).

Returns:

Returns content of this document.

get_content(default=None)[source]

Returns content for this document as HTML string. Content will be of type ‘str’ (Python 2) or ‘bytes’ (Python 3).

Args:
  • default: Default value for the content if it is not defined.
Returns:

Returns content of this document.

get_language()[source]

Get language code for this book item. Language of the book item can be different from the language settings defined globally for the book.

Returns:

Returns language code as string.

get_links()[source]

Returns list of additional links defined for this document.

Returns:

Returns list of links as tuple.

get_links_of_type(link_type)[source]

Returns list of additional links of specific type.

Returns:

Returns list of links as tuple.

get_type()[source]

Always returns ebooklib.ITEM_DOCUMENT as type of this document.

Returns:

Always returns ebooklib.ITEM_DOCUMENT.

is_chapter()[source]

Returns whether this document is a chapter.

Returns:

Returns bool value.

set_language(lang)[source]

Sets language for this book item. By default it will use language of the book but it can be overwritten with this call.

class ebooklib.epub.EpubImage[source]

Bases: ebooklib.epub.EpubItem

Represents Image in the EPUB file.

get_type()[source]

Guess type according to the file extension. Might not be the best way to do it, but it works for now.

Items can be of type:
  • ITEM_UNKNOWN = 0
  • ITEM_IMAGE = 1
  • ITEM_STYLE = 2
  • ITEM_SCRIPT = 3
  • ITEM_NAVIGATION = 4
  • ITEM_VECTOR = 5
  • ITEM_FONT = 6
  • ITEM_VIDEO = 7
  • ITEM_AUDIO = 8
  • ITEM_DOCUMENT = 9
  • ITEM_COVER = 10

We map type according to the extensions which are defined in ebooklib.EXTENSIONS.

Returns:

Returns type of the item as number.

class ebooklib.epub.EpubItem(uid=None, file_name='', media_type='', content='', manifest=True)[source]

Bases: object

Base class for the items in a book.

get_content(default='')[source]

Returns content of the item. Content should be of type ‘str’ (Python 2) or ‘bytes’ (Python 3).

Args:
  • default: Default value for the content if it is not already defined.
Returns:

Returns content of the item.

get_id()[source]

Returns unique identifier for this item.

Returns:

Returns uid number as string.

get_name()[source]

Returns name for this item. By default it is always file name but it does not have to be.

Returns:

Returns file name for this item.

get_type()[source]

Guess type according to the file extension. Might not be the best way to do it, but it works for now.

Items can be of type:
  • ITEM_UNKNOWN = 0
  • ITEM_IMAGE = 1
  • ITEM_STYLE = 2
  • ITEM_SCRIPT = 3
  • ITEM_NAVIGATION = 4
  • ITEM_VECTOR = 5
  • ITEM_FONT = 6
  • ITEM_VIDEO = 7
  • ITEM_AUDIO = 8
  • ITEM_DOCUMENT = 9
  • ITEM_COVER = 10

We map type according to the extensions which are defined in ebooklib.EXTENSIONS.

Returns:

Returns type of the item as number.

set_content(content)[source]

Sets content value for this item.

Args:
  • content: Content value
class ebooklib.epub.EpubNav(uid='nav', file_name='nav.xhtml', media_type='application/xhtml+xml')[source]

Bases: ebooklib.epub.EpubHtml

Represents Navigation Document in the EPUB file.

is_chapter()[source]

Returns whether this document is a chapter.

Returns:

Returns bool value.

class ebooklib.epub.EpubNcx(uid='ncx', file_name='toc.ncx')[source]

Bases: ebooklib.epub.EpubItem

Represents Navigation Control File (NCX) in the EPUB.

class ebooklib.epub.EpubReader(epub_file_name, options=None)[source]

Bases: object

DEFAULT_OPTIONS = {}
load()[source]
process()[source]
read_file(name)[source]
class ebooklib.epub.EpubWriter(name, book, options=None)[source]

Bases: object

DEFAULT_OPTIONS = {'epub2_guide': True, 'epub3_landmark': True, 'landmark_title': 'Guide', 'package_direction': False, 'play_order': {'start_from': 1, 'enabled': False}, 'spine_direction': True}
process()[source]
write()[source]

class ebooklib.epub.Link(href, title, uid=None)[source]

Bases: object

class ebooklib.epub.Section(title, href='')[source]

Bases: object

ebooklib.epub.read_epub(name, options=None)[source]

Creates new instance of EpubBook with the content defined in the input file.

>>> book = epub.read_epub('book.epub')
Args:
  • name: full path to the input file
  • options: extra options as dictionary (optional)
Returns:

Instance of EpubBook.

ebooklib.epub.write_epub(name, book, options=None)[source]

Creates epub file with the content defined in EpubBook.

>>> epub.write_epub('book.epub', book)
Args:
  • name: file name for the output file
  • book: instance of EpubBook
  • options: extra options as dictionary (optional)

utils Module

ebooklib.utils.debug(obj)[source]
ebooklib.utils.guess_type(extenstion)[source]
ebooklib.utils.parse_html_string(s)[source]
ebooklib.utils.parse_string(s)[source]

Subpackages

plugins Package

base Module
class ebooklib.plugins.base.BasePlugin[source]

Bases: object

after_read(book)[source]

Processing after read

after_write(book)[source]

Processing after save

before_read(book)[source]

Processing before read

before_write(book)[source]

Processing before save

html_after_read(book, chapter)[source]

Processing HTML after read.

html_before_write(book, chapter)[source]

Processing HTML before save.

item_after_read(book, item)[source]

Process general item after read.

item_before_write(book, item)[source]

Process general item before write.

booktype Module
class ebooklib.plugins.booktype.BooktypeFootnotes(booktype_book)[source]

Bases: ebooklib.plugins.base.BasePlugin

NAME = 'Booktype Footnotes'
html_before_write(book, chapter)[source]

Processing HTML before save.

class ebooklib.plugins.booktype.BooktypeLinks(booktype_book)[source]

Bases: ebooklib.plugins.base.BasePlugin

NAME = 'Booktype Links'
html_before_write(book, chapter)[source]

Processing HTML before save.

sourcecode Module
class ebooklib.plugins.sourcecode.SourceHighlighter[source]

Bases: ebooklib.plugins.base.BasePlugin

html_before_write(book, chapter)[source]

Processing HTML before save.

standard Module
class ebooklib.plugins.standard.SyntaxPlugin[source]

Bases: ebooklib.plugins.base.BasePlugin

NAME = 'Check HTML syntax'
html_before_write(book, chapter)[source]

Processing HTML before save.

ebooklib.plugins.standard.leave_only(item, tag_list)[source]
tidyhtml Module
class ebooklib.plugins.tidyhtml.TidyPlugin(extra={})[source]

Bases: ebooklib.plugins.base.BasePlugin

NAME = 'Tidy HTML'
OPTIONS = {'char-encoding': 'utf8', 'tidy-mark': 'no'}
html_after_read(book, chapter)[source]

Processing HTML after read.

html_before_write(book, chapter)[source]

Processing HTML before save.

ebooklib.plugins.tidyhtml.tidy_cleanup(content, **extra)[source]

Usage

Reading

from ebooklib import epub
import ebooklib

book = epub.read_epub('test.epub')

for image in book.get_items_of_type(ebooklib.ITEM_IMAGE):
    print(image)

Writing

from ebooklib import epub

book = epub.EpubBook()

# set metadata
book.set_identifier('id123456')
book.set_title('Sample book')
book.set_language('en')

book.add_author('Author Authorowski')
book.add_author('Danko Bananko', file_as='Gospodin Danko Bananko', role='ill', uid='coauthor')

# create chapter
c1 = epub.EpubHtml(title='Intro', file_name='chap_01.xhtml', lang='hr')
c1.content = u'<h1>Intro heading</h1><p>Žaba je skočila u baru.</p>'

# add chapter
book.add_item(c1)

# define Table Of Contents
book.toc = (epub.Link('chap_01.xhtml', 'Introduction', 'intro'),
            (epub.Section('Simple book'),
             (c1, ))
            )

# add default NCX and Nav file
book.add_item(epub.EpubNcx())
book.add_item(epub.EpubNav())

# define CSS style
style = 'BODY {color: white;}'
nav_css = epub.EpubItem(uid="style_nav", file_name="style/nav.css", media_type="text/css", content=style)

# add CSS file
book.add_item(nav_css)

# basic spine
book.spine = ['nav', c1]

# write to the file
epub.write_epub('test.epub', book, {})

Further examples are available at https://github.com/aerkalov/ebooklib/tree/master/samples
