Monadical's implementation of python bindings for libzim using the "class blend" approach #3

Closed
36 commits
f67e296
Initial version python 2.7
jdcaballerov Mar 31, 2020
537759a
Change Cython language version
jdcaballerov Mar 31, 2020
20c32c8
Python 3 corrections
jdcaballerov Mar 31, 2020
8353a40
Revert "Change Cython language version"
jdcaballerov Mar 31, 2020
78480be
Rename README.rst -> README.md
jdcaballerov Mar 31, 2020
1b054f0
Change Readme.rst -> Readme.md in setup.py
jdcaballerov Apr 1, 2020
9f74ba7
Refactor to use Cython properties
jdcaballerov Apr 2, 2020
7a7d442
Delete test.py
jdcaballerov Apr 2, 2020
fe7452c
Update all the examples cookbook
jdcaballerov Apr 2, 2020
57d1665
Add get_metadata dict to ZimReader
jdcaballerov Apr 2, 2020
60faf37
Delete tests files at start and ending
jdcaballerov Apr 2, 2020
d835898
fix README formatting
pirate Apr 6, 2020
73a7a64
Delete add_art from example
jdcaballerov Apr 6, 2020
a75bf02
Remove u identifier from example strings
jdcaballerov Apr 6, 2020
92a9c26
Remove u identifier in some examples.py strings
jdcaballerov Apr 6, 2020
842e09c
Cython class blend initial version
jdcaballerov Apr 6, 2020
3283c3b
Add Cython Generated pyzim.cpp
jdcaballerov Apr 7, 2020
ef5faba
Refactor libzim
jdcaballerov Apr 16, 2020
d1057fb
Add content reference
jdcaballerov Apr 16, 2020
b2e4643
Allow to transparently construct ZimBlob with bytes or str
jdcaballerov Apr 16, 2020
b0a5b4f
Correct example content
jdcaballerov Apr 18, 2020
867a170
Remove delete this from ZimCreatorWrapper.finalize()
jdcaballerov Apr 18, 2020
e1e4ac7
Simplify long url check
jdcaballerov Apr 18, 2020
a8c5baf
Remove empty bytes object from public api
jdcaballerov Apr 18, 2020
cc31d31
Remove duplicate code in public api
jdcaballerov Apr 18, 2020
def7029
Add enforce mandatory metadata feature
jdcaballerov Apr 18, 2020
c60a6b8
Remove check metadata check from finalize
jdcaballerov Apr 19, 2020
cede983
Update finalize docstring
jdcaballerov Apr 19, 2020
24a3480
Add write metadata function
jdcaballerov Apr 19, 2020
0e4e32d
Fix html adding head closing in test
jdcaballerov Apr 19, 2020
63d5721
Fix docstring
jdcaballerov Apr 20, 2020
9068761
Add content reference to article
jdcaballerov Apr 20, 2020
5265e48
Write metadata in finalize
jdcaballerov Apr 20, 2020
20c76fc
Remove manual metadata write from examples.py
jdcaballerov Apr 20, 2020
ae41a0a
Add ZimFileReader and ZimFileArticle implementation and tests
jdcaballerov Apr 21, 2020
412006c
Fix docstirngs ZimFileReader and ZimFileArticle
jdcaballerov Apr 21, 2020
51 changes: 51 additions & 0 deletions Dockerfile
@@ -0,0 +1,51 @@
FROM ubuntu:bionic

# Update system
RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections

# Configure locales
RUN apt-get update -y && \
    apt-get install -y --no-install-recommends locales && \
    apt-get clean -y && \
    rm -rf /var/lib/apt/lists/*
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8
RUN locale-gen en_US.UTF-8

# Install necessary packages
RUN apt-get update -y && \
    apt-get install -y --no-install-recommends git pkg-config libtool automake autoconf make g++ liblzma-dev coreutils meson ninja-build wget zlib1g-dev libicu-dev libgumbo-dev libmagic-dev ca-certificates && \
    apt-get clean -y && \
    rm -rf /var/lib/apt/lists/*

# Update CA certificates
RUN update-ca-certificates

# Install Xapian (wget zlib1g-dev)
RUN wget https://oligarchy.co.uk/xapian/1.4.14/xapian-core-1.4.14.tar.xz
RUN tar xvf xapian-core-1.4.14.tar.xz
RUN cd xapian-core-1.4.14 && ./configure
RUN cd xapian-core-1.4.14 && make all install
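# The tarball extracts to xapian-core-1.4.14, so remove that directory and the archive
RUN rm -rf xapian-core-1.4.14 xapian-core-1.4.14.tar.xz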

# Install zimlib (libicu-dev)
RUN git clone https://github.com/openzim/libzim.git
RUN cd libzim && git checkout 6.0.2
RUN cd libzim && meson . build
RUN cd libzim && ninja -C build install
RUN rm -rf libzim

RUN ldconfig
ENV LD_LIBRARY_PATH /usr/local/lib/x86_64-linux-gnu/

# Install Python dependencies

RUN apt-get update -y && \
    apt-get install -y --no-install-recommends python-dev python3-dev python3-pip && \
    apt-get clean -y && \
    rm -rf /var/lib/apt/lists/*

# Install Cython

RUN pip3 install Cython
82 changes: 82 additions & 0 deletions README.md
@@ -0,0 +1,82 @@

# Setup

```bash
docker-compose build
docker-compose run libzim /bin/bash
```
```bash
python setup.py build_ext -i
python tests/test_libzim.py

# or

./rebuild.sh
./run_tests
```

Example:

```python3
from libzim import ZimArticle, ZimBlob, ZimCreator

class ZimTestArticle(ZimArticle):
    content = '''<!DOCTYPE html>
<html class="client-js">
<head><meta charset="UTF-8">
<title>Monadical</title>
</head>
<h1> ñññ Hello, it works ñññ </h1></html>'''

    def __init__(self):
        ZimArticle.__init__(self)

    def is_redirect(self):
        return False

    def get_url(self):
        return "A/Monadical_SAS"

    def get_title(self):
        return "Monadical SAS"

    def get_mime_type(self):
        return "text/html"

    def get_filename(self):
        return ""

    def should_compress(self):
        return True

    def should_index(self):
        return True

    def get_data(self):
        return ZimBlob(self.content.encode('UTF-8'))

# Create a ZimTestArticle article

article = ZimTestArticle()
print(article.content)

# Write the articles
import uuid
rnd_str = str(uuid.uuid1())

test_zim_file_path = "/opt/python-libzim/tests/kiwix-test"

zim_creator = ZimCreator(test_zim_file_path + '-' + rnd_str + '.zim', main_page="welcome", index_language="eng", min_chunk_size=2048)

# Add article to zim file
zim_creator.add_article(article)


# Set mandatory metadata
if not zim_creator.mandatory_metadata_ok():
    zim_creator.update_metadata(creator='python-libzim', description='Created in python', name='Hola', publisher='Monadical', title='Test Zim')

# Write article to zim file
zim_creator.finalize()

```
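
The example above overrides eight methods on `ZimArticle`. As a rough illustration of that contract, the sketch below lists the same method names on a plain Python abstract base class, so that a subclass missing an override fails at instantiation. `ArticleInterface`, `HtmlArticle`, and `IncompleteArticle` are hypothetical names used only here; this is not how libzim's `ZimArticle` is implemented, and `get_data` returns raw `bytes` instead of a `ZimBlob` so the sketch runs without the compiled extension.

```python
from abc import ABC, abstractmethod


class ArticleInterface(ABC):
    """Hypothetical stand-in mirroring the method names the example overrides.

    This is NOT libzim's ZimArticle; it only illustrates the contract.
    """

    @abstractmethod
    def is_redirect(self) -> bool: ...

    @abstractmethod
    def get_url(self) -> str: ...

    @abstractmethod
    def get_title(self) -> str: ...

    @abstractmethod
    def get_mime_type(self) -> str: ...

    @abstractmethod
    def get_filename(self) -> str: ...

    @abstractmethod
    def should_compress(self) -> bool: ...

    @abstractmethod
    def should_index(self) -> bool: ...

    @abstractmethod
    def get_data(self) -> bytes: ...


class HtmlArticle(ArticleInterface):
    """Concrete article that keeps its URL, title, and HTML body in memory."""

    def __init__(self, url: str, title: str, html: str):
        self.url = url
        self.title = title
        self.html = html

    def is_redirect(self) -> bool:
        return False

    def get_url(self) -> str:
        # The README example places articles under the "A/" prefix.
        return f"A/{self.url}"

    def get_title(self) -> str:
        return self.title

    def get_mime_type(self) -> str:
        return "text/html"

    def get_filename(self) -> str:
        return ""

    def should_compress(self) -> bool:
        return True

    def should_index(self) -> bool:
        return True

    def get_data(self) -> bytes:
        # Assumption: raw UTF-8 bytes stand in for ZimBlob in this sketch.
        return self.html.encode("UTF-8")


class IncompleteArticle(ArticleInterface):
    """Overrides only one method, so it cannot be instantiated."""

    def get_url(self) -> str:
        return "A/Broken"
```

Instantiating `HtmlArticle("Monadical_SAS", "Monadical SAS", "<h1>Hello</h1>")` succeeds, while `IncompleteArticle()` raises `TypeError` because seven abstract methods are still missing.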
13 changes: 13 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,13 @@
version: '3'

services:
  libzim:
    build:
      context: .
      dockerfile: ./Dockerfile
    image: kiwix:python-libzim
    working_dir: /opt/python-libzim
    stdin_open: true
    tty: true
    volumes:
      - .:/opt/python-libzim
75 changes: 75 additions & 0 deletions libzim/examples.py
@@ -0,0 +1,75 @@
from libzim import ZimArticle, ZimBlob, ZimCreator

class ZimTestArticle(ZimArticle):

    def __init__(self, url, title, content):
        ZimArticle.__init__(self)
        self.url = url
        self.title = title
        self.content = content

    def is_redirect(self):
        return False

    def get_url(self):
        return f"A/{self.url}"

    def get_title(self):
        return f"{self.title}"

    def get_mime_type(self):
        return "text/html"

    def get_filename(self):
        return ""

    def should_compress(self):
        return True

    def should_index(self):
        return True

    def get_data(self):
        return ZimBlob(self.content)

# Create a ZimTestArticle article

content = '''<!DOCTYPE html>
<html class="client-js">
<head><meta charset="UTF-8">
<title>Monadical</title>
</head>
<h1> ñññ Hello, it works ñññ </h1></html>'''

content2 = '''<!DOCTYPE html>
<html class="client-js">
<head><meta charset="UTF-8">
<title>Monadical 2</title>
</head>
<h1> ñññ Hello, it works 2 ñññ </h1></html>'''

article = ZimTestArticle("Monadical_SAS", "Monadical", content)
article2 = ZimTestArticle("Monadical_2", "Monadical 2", content2)

print(article.content)

# Write the article
import uuid
rnd_str = str(uuid.uuid1())

test_zim_file_path = "/opt/python-libzim/tests/kiwix-test"

zim_creator = ZimCreator(test_zim_file_path + '-' + rnd_str + '.zim', main_page="Monadical", index_language="eng", min_chunk_size=2048)

# Add articles to zim file
zim_creator.add_article(article)
zim_creator.add_article(article2)

# Set mandatory metadata
if not zim_creator.mandatory_metadata_ok():
    zim_creator.update_metadata(creator='python-libzim', description='Created in python', name='Hola', publisher='Monadical', title='Test Zim')

print(zim_creator._get_metadata())

# Write articles to zim file
zim_creator.finalize()