Just a cookbook-style recipe for getting a YAML file when you have a JSON one.
python -c 'import sys, yaml, json; yaml.safe_dump(json.load(sys.stdin), sys.stdout, default_flow_style=False)' < file.json > file.yaml
Reading time: < 1 minute
The Python logging library is really flexible and powerful, but you usually need some time to set up the basics; even for logging in a simple script, some commands and settings are required. Daiquiri is a library that wraps the Python logging library and offers a simple interface to start enjoying logging features in Python. Below is a hello-world example, extracted from the Daiquiri documentation, which shows how easy it is to get nice console output when you're writing simple scripts.
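A minimal sketch along the lines of that hello world (the exact message text is illustrative):

import logging

import daiquiri

# one call sets up the handler, colors and a sane default format
daiquiri.setup(level=logging.INFO)

logger = daiquiri.getLogger(__name__)
logger.info("It works and logs to stderr by default, with colors")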
Reading time: 3 – 5 minutes
Firstly, let me introduce a Windows service called "Windows Remote Management", or "WinRM". This is the Windows feature that allows remote control of Windows machines and many other remote functionalities. In my case I have a Windows 7 laptop with SP1 and PowerShell v3 installed.
Secondly, don't forget that Ansible is developed in Python, so a Python library has to manage the WinRM protocol; I'm talking about "pywinrm". Using this library it's easy to create simple scripts like this one:
#!/usr/bin/env python
import winrm

s = winrm.Session('10.2.0.42', auth=('the_username', 'the_password'))
r = s.run_cmd('ipconfig', ['/all'])
print r.status_code
print r.std_out
print r.std_err
This is a remote call to the command "ipconfig /all" to see the Windows machine's network configuration. The output is something like:
$ ./winrm_ipconfig.py
0

Windows IP Configuration

   Host Name . . . . . . . . . . . . : mini7w
   Primary Dns Suffix  . . . . . . . :
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : ymbi.net

Ethernet adapter GigaBit + HUB USB:

   Connection-specific DNS Suffix  . : ymbi.net
   Description . . . . . . . . . . . : ASIX AX88179 USB 3.0 to Gigabit Ethernet Adapter
   Physical Address. . . . . . . . . : 00-23-56-1C-XX-XX
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::47e:c2c:8c25:xxxx%103(Preferred)
   IPv4 Address. . . . . . . . . . . : 10.2.0.42(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.192
   Lease Obtained. . . . . . . . . . : miércoles, 28 de enero de 2015 12:41:41
   Lease Expires . . . . . . . . . . : miércoles, 28 de enero de 2015 19:17:56
   Default Gateway . . . . . . . . . : 10.2.0.1
   DHCP Server . . . . . . . . . . . : 10.2.0.1
   DHCPv6 IAID . . . . . . . . . . . : 2063606614
   DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-15-F7-BF-36-xx-C5-xx-03-xx-xx
   DNS Servers . . . . . . . . . . . : 10.2.0.27
                                       10.2.0.1
   NetBIOS over Tcpip. . . . . . . . : Enabled
...
Of course, it's also possible to run PowerShell scripts, like the next one, which shows the system memory:
$strComputer = $Host
Clear
$RAM = Get-WmiObject Win32_ComputerSystem
$MB = 1048576
"Installed Memory: " + [int]($RAM.TotalPhysicalMemory / $MB) + " MB"
The Python code to run that script is:
#!/usr/bin/env python
import winrm

ps_script = open('scripts/mem.ps1', 'r').read()
s = winrm.Session('10.2.0.42', auth=('the_username', 'the_password'))
r = s.run_ps(ps_script)
print r.status_code
print r.std_out
print r.std_err
and the output:
$ ./winrm_mem.py
0
Installed Memory: 2217 MB
Finally it's time to talk about how to create an Ansible playbook to deploy anything on a Windows machine. As always, the first thing we need is a hosts file. In the next example there are several Ansible variables needed to run Ansible Windows modules over WinRM; all of them are self-explanatory:
[all]
10.2.0.42

[all:vars]
ansible_ssh_user=the_username
ansible_ssh_pass=the_password
ansible_ssh_port=5985 #winrm (non-ssl) port
ansible_connection=winrm
The first basic example could be a simple playbook that runs the 'ipconfig' command and registers the output in an Ansible variable, to be shown later as debug information:
- name: test raw module
  hosts: all
  tasks:
    - name: run ipconfig
      raw: ipconfig
      register: ipconfig
    - debug: var=ipconfig
The command to run this playbook, and its output:
$ ansible-playbook -i hosts ipconfig.yml

PLAY [test raw module] ********************************************************

GATHERING FACTS ***************************************************************
ok: [10.2.0.42]

TASK: [run ipconfig] **********************************************************
ok: [10.2.0.42]

TASK: [debug var=ipconfig] ****************************************************
ok: [10.2.0.42] => {
    "ipconfig": {
        "invocation": {
            "module_args": "ipconfig",
            "module_name": "raw"
        },
        "rc": 0,
        "stderr": "",
        "stdout": "\r\nWindows IP Configuration\r\n\r\n\r\nEthernet adapter GigaBit ... ]
    }
}

PLAY RECAP ********************************************************************
10.2.0.42                  : ok=3    changed=0    unreachable=0    failed=0
As always, Ansible has several modules, not only the 'raw' module. I committed two examples to my GitHub account: one using a module to download URLs and another that runs PowerShell scripts.
My examples were done using Ansible 1.8.2 installed on Fedora 20. The main problems I found were in configuring Windows 7 to accept WinRM connections. Below I attach some references that helped me a lot:
If you want to use my test code you can find it on my GitHub: Basic Ansible playbooks for Windows.
Reading time: 2 – 4 minutes
Ansible is a very powerful tool. Using playbooks, something like a cookbook, it is very easy to automate system maintenance tasks. I have used Puppet and other tools like that, but IMHO Ansible is the best one.
In some cases you need to manage dynamic systems, and taking advantage of Ansible as a Python library is a very good complement for your scripts. That was my latest requirement, and because of it I decided to share some simple Python snippets that help you understand how to use Ansible as a Python library.
Firstly, an example of how to call an Ansible module with just one host in the inventory (test_modules.py):
#!/usr/bin/python
import ansible.runner
import ansible.playbook
import ansible.inventory
from ansible import callbacks
from ansible import utils
import json

# the fastest way to set up the inventory

# hosts list
hosts = ["10.11.12.66"]
# set up the inventory, if no group is defined then 'all' group is used by default
example_inventory = ansible.inventory.Inventory(hosts)

pm = ansible.runner.Runner(
    module_name = 'command',
    module_args = 'uname -a',
    timeout = 5,
    inventory = example_inventory,
    subset = 'all' # name of the hosts group
)

out = pm.run()
print json.dumps(out, sort_keys=True, indent=4, separators=(',', ': '))
As a second example, we’re going to use a simple Ansible Playbook with that code (test.yml):
- hosts: sample_group_name
  tasks:
    - name: just an uname
      command: uname -a
The Python code which uses that playbook is (test_playbook.py):
#!/usr/bin/python
import ansible.runner
import ansible.playbook
import ansible.inventory
from ansible import callbacks
from ansible import utils
import json

### setting up the inventory

## first of all, set up a host (or more)
example_host = ansible.inventory.host.Host(
    name = '10.11.12.66',
    port = 22
)
# with its variables to modify the playbook
example_host.set_variable('var', 'foo')

## secondly set up the group where the host(s) has to be added
example_group = ansible.inventory.group.Group(
    name = 'sample_group_name'
)
example_group.add_host(example_host)

## the last step is to set up the inventory itself
example_inventory = ansible.inventory.Inventory()
example_inventory.add_group(example_group)
example_inventory.subset('sample_group_name')

# setting callbacks
stats = callbacks.AggregateStats()
playbook_cb = callbacks.PlaybookCallbacks(verbose=utils.VERBOSITY)
runner_cb = callbacks.PlaybookRunnerCallbacks(stats, verbose=utils.VERBOSITY)

# creating the playbook instance to run, based on "test.yml" file
pb = ansible.playbook.PlayBook(
    playbook = "test.yml",
    stats = stats,
    callbacks = playbook_cb,
    runner_callbacks = runner_cb,
    inventory = example_inventory,
    check = True
)

# running the playbook
pr = pb.run()

# print the summary of results for each host
print json.dumps(pr, sort_keys=True, indent=4, separators=(',', ': '))
If you want to download the example files you can go to my GitHub account: github.com/oriolrius/programming-ansible-basics
I hope it was useful for you.
Reading time: 2 – 2 minutes
Celery logs are colorized by default, so the first big idea is to disable colored logs. It's as easy as setting 'CELERYD_LOG_COLOR' to 'False' in the Celery config. The code could be something like this:
celery.conf.update(CELERYD_LOG_COLOR=False)
Secondly, we need a function that sets up a new handler and other settings for the Celery logging system. For example, the code could be:
from __future__ import absolute_import
from logging import BASIC_FORMAT, Formatter
from logging.handlers import SysLogHandler
from celery.log import redirect_stdouts_to_logger

def setup_log(**args):
    # redirect stdout and stderr to logger
    redirect_stdouts_to_logger(args['logger'])
    # logs to local syslog
    hl = SysLogHandler('/dev/log')
    # setting log level
    hl.setLevel(args['loglevel'])
    # setting log format
    formatter = Formatter(BASIC_FORMAT)
    hl.setFormatter(formatter)
    # add new handler to logger
    args['logger'].addHandler(hl)
Pay attention to 'redirect_stdouts_to_logger': it's used to send all output, like print statements, to syslog.
Thirdly, to use those settings in our Celery tasks we have to connect the 'setup_log' function to some Celery signals. Those signals are fired when the 'task_logger' and the 'logger' are configured. To connect the signals:
from celery.signals import after_setup_task_logger, after_setup_logger

after_setup_logger.connect(setup_log)
after_setup_task_logger.connect(setup_log)
Fourthly, we have to get the 'logger'; we can have more than one, if we are interested in records both with and without task context. For example:
from celery.utils.log import get_logger, get_task_logger

logger = get_logger('just_a_name_for_internal_use')
logger_with_task_context = get_task_logger('name_of_the_task_to_be_recorded_in_logs')
Finally, we only have to use those loggers with the common methods: debug, info, warning, error and critical:
@celery.task
def the_task():
    logger.info('this is a message without task context')
    logger_with_task_context.debug('this record will have the prefix "name_of_the_task_to_be_recorded_in_logs" in syslog')
Reading time: 5 – 8 minutes
In the next lines I'll describe in more detail the properties and features of the AMQP elements. It won't be an exhaustive description, but in my opinion it is more than enough to start playing with AMQP queues.
When producers and consumers connect to the broker using a TCP socket, after authenticating the connection they establish a channel where AMQP commands are sent. A channel is a virtual path inside the TCP connection; this is very useful because there can be multiple channels inside one TCP connection, each identified by a unique ID.
An interesting parameter of a channel is confirmation mode. If this is set to true, when a message delivered to an exchange finally reaches its queues, the producer receives an acknowledgement message with the UID of the message. These acknowledgements are asynchronous and permit a producer to send the next message while it is still waiting for the ACK. Of course, if the message cannot be stored and is lost, the producer receives a NACK (not acknowledged) message instead.
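A sketch of confirmation mode using the pika Python client (pika is an assumption here, not something these posts cover, and the queue name is illustrative; with pika >= 1.0 an unroutable or NACKed publish raises an exception):

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='jobs', durable=True)

# put the channel in confirmation mode
channel.confirm_delivery()

try:
    # the broker ACKs the publish once the message reaches its queue(s)
    channel.basic_publish(exchange='', routing_key='jobs', body='hello',
                          mandatory=True)
    print('message confirmed by the broker')
except pika.exceptions.UnroutableError:
    # the message could not be routed to any queue
    print('message returned as unroutable')

connection.close()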
Producers
Maybe this is the simplest part of the system. Producers only need to negotiate authentication across a TCP connection, create a channel, and then publish all the messages they want with their corresponding routing keys. Of course, producers can also create exchanges and queues and then bind them, but usually this is not a good idea; it is much safer to do this from consumers, because when a producer tries to send a message to a broker and the needed exchange doesn't exist, the message will be lost. Usually consumers are connected all the time and subscribed to queues, while producers only connect to brokers when they need to send messages.
Consumers
When a consumer connects to a queue it usually uses a command called basic.consume to subscribe the channel to the queue; then, every time the subscribed queue has a new message, it is sent to the consumer once the previous message has been consumed or rejected.
If a consumer only wants to receive one message, without a subscription, it can use the command basic.get. This works like a poll method: the consumer only gets a message each time it sends the command.
You get the best throughput using the basic.consume command, because it is more efficient than polling every time the consumer wants another message.
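A sketch of both styles with the pika client (pika, the queue name and the messages are assumptions for illustration):

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='jobs')

# poll style: basic.get fetches at most one message per call
method, properties, body = channel.basic_get(queue='jobs', auto_ack=False)
if method is not None:
    print('polled: %s' % body)
    channel.basic_ack(delivery_tag=method.delivery_tag)

# push style: basic.consume subscribes the channel to the queue
def on_message(ch, method, properties, body):
    print('pushed: %s' % body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue='jobs', on_message_callback=on_message)
channel.start_consuming()  # blocks and receives messages as they arrive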
When more than one consumer is connected to a queue, messages are distributed in a round-robin fashion. After a message is delivered to a consumer, the consumer sends an acknowledgement and the queue sends another message to the next consumer. If the consumer sends a reject message, the same message is sent to the next consumer.
There are two types of acknowledgements: basic.ack, which confirms the message was processed, and basic.reject, described next.
The basic.reject message is sent when the consumer wants to reject a received message. By default this discards the message and it is lost. If we want to requeue the message, we can set the parameter requeue=true when sending the reject.
When the queue is created it can have a dead-letter destination configured; then, when a consumer rejects a message with the parameter requeue=false, the message is queued to a new queue, the dead-letter queue. This is very useful because afterwards we can go to that queue and inspect why the message was rejected.
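A sketch of rejecting into a dead-letter queue with pika; note that in RabbitMQ this is configured with the 'x-dead-letter-exchange' argument rather than a boolean flag, and all names here are illustrative:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# the exchange and queue where rejected messages end up
channel.exchange_declare(exchange='dead_letters', exchange_type='fanout')
channel.queue_declare(queue='dead_letter_queue')
channel.queue_bind(exchange='dead_letters', queue='dead_letter_queue')

# a work queue whose rejected messages are re-routed to 'dead_letters'
channel.queue_declare(queue='jobs',
                      arguments={'x-dead-letter-exchange': 'dead_letters'})

def on_message(ch, method, properties, body):
    # requeue=True returns the message to the queue;
    # requeue=False sends it to the dead letter exchange
    ch.basic_reject(delivery_tag=method.delivery_tag, requeue=False)

channel.basic_consume(queue='jobs', on_message_callback=on_message)
channel.start_consuming()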
Both consumers and producers can create a queue using the queue.declare command. The most natural way is to create queues from consumers and then bind them to an exchange. A consumer needs a free channel to create a queue; if a channel is subscribed to a queue, the channel is busy and cannot create new queues. When a queue is created we usually give it a name to identify it; if the name is not specified, it's randomly generated, which is useful for creating temporary and anonymous queues for RPC-over-AMQP.
Parameters we can set when creating a new queue include: durable (the queue definition survives a broker restart), exclusive (the queue can only be used by the connection that created it and is deleted when that connection closes) and auto-delete (the queue is deleted when the last consumer unsubscribes), as the sketch below shows.
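For example, with pika (the queue name is illustrative):

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

channel.queue_declare(
    queue='jobs',       # an empty string would get a broker-generated random name
    durable=True,       # the queue definition survives a broker restart
    exclusive=False,    # True would tie the queue to this connection only
    auto_delete=False,  # True would delete the queue when the last consumer leaves
)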
Exchange and binding
In the first post of the series we talked about the different exchange types; as you may remember, these types are direct, fanout and topic. The most important parameter to set when a producer sends a message is the routing key, which is used to route the message to a queue.
Once we have declared an exchange, it can be related to a queue using a binding command: queue_bind. The relation between them is made using the routing key, or a pattern based on the routing key. When the exchange is of type fanout, routing keys or patterns are not needed.
Some pattern examples: log.*, message.* and #.
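A binding sketch with pika showing a topic pattern (the exchange and queue names are illustrative):

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# '*' matches exactly one word, '#' matches zero or more words
channel.exchange_declare(exchange='events', exchange_type='topic')
channel.queue_declare(queue='log_messages')
channel.queue_bind(exchange='events', queue='log_messages', routing_key='log.*')

# routed to 'log_messages' because 'log.error' matches the 'log.*' pattern
channel.basic_publish(exchange='events', routing_key='log.error', body='disk full')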
The most important exchange parameters are its type (direct, fanout or topic), durable (the exchange survives a broker restart) and auto-delete (the exchange is removed when no queues remain bound to it).
A broker is a container where exchanges, bindings and queues are created. Usually we can define more than one virtual broker on the same server; virtual brokers are also called virtual hosts. The users, permissions and everything else related to one broker cannot be used from another one. This is very useful because we can create multiple brokers on the same physical server, like a multi-domain web server, and when one of these virtual hosts grows too big it can be migrated to another physical server, or clustered if required.
An AMQP message is a binary payload without a fixed size or format; each application can define its own messages. The AMQP broker only adds small headers so the message can be routed among the different queues as fast as possible.
Messages are not persistent inside a broker unless the producer sets the parameter persistent=true. In addition, the messages need to be stored in durable exchanges and durable queues to survive a broker restart. Of course, persistent messages must be written to disk, so the throughput will fall; sometimes making messages persistent is not a good idea.
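A persistence sketch with pika; in that client the producer marks a message persistent through its delivery mode (queue name and payload are illustrative):

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# the queue has to be durable AND the message persistent to survive a restart
channel.queue_declare(queue='jobs', durable=True)
channel.basic_publish(
    exchange='',
    routing_key='jobs',
    body='important payload',
    properties=pika.BasicProperties(delivery_mode=2),  # 2 = persistent
)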
Reading time: 3 – 4 minutes
When two applications need to communicate there are a lot of solutions, like IPC; if these applications are remote we can use RPC. When two or more applications communicate with each other we can use an ESB, and there are many more solutions. But when more than two applications communicate and the system needs to be scalable, the problem is a bit more complicated. In fact, when we need to send a call to a remote process or distribute object processing among different servers, we start to think about queues.
Typical examples are rendering farms, massive mail sending, and publish/subscribe solutions like news systems. At that point we start to consider a queue-based solution. In my case the first approach to this type of solution was Gearman: a very simple queue system where workers connect to a central service and producers call the methods published by the workers; the messages are queued and delivered to the workers in a simple queue.
Another interesting solution can be to use Redis as a queue service via its publish/subscribe features, and you can always develop your own queue system. There are a lot of solutions like that, but when you are interested in developing in a standard way and want a long-term solution with scalability and high availability, then you need to think about using AMQP-based solutions.
The simplest definition of AMQP is: "message-oriented middleware". Behind this simple definition there are a lot of features available. Before AMQP there were other message-oriented middlewares, for example JMS, but AMQP has become the standard protocol to keep in mind when you choose a queue-based solution.
AMQP has features like queuing, routing, reliability and security, and most AMQP implementations offer really scalable architectures and high-availability solutions.
The basic architecture is simple: client applications called producers create messages and deliver them to an AMQP server, also called a broker. Inside the broker the messages are routed and filtered until they arrive at queues, where other applications called consumers are connected and get the messages to be processed.
Once we have understood this, it's time to look deep inside the broker, where the AMQP magic happens. The broker has three parts: the exchanges, where producers publish their messages; the queues, where messages wait for their consumers; and the bindings, which relate exchanges to queues.
When an exchange receives a message, it uses the message's routing key and one of three different exchange methods to choose where the message goes: direct (the routing key must match the binding key exactly), fanout (the message is copied to every queue bound to the exchange) and topic (the routing key is matched against the binding pattern).
This is the internal schema of a broker:
Reading time: 1 – 2 minutes
After reading the book 'RabbitMQ in Action' I'm working on a series of posts that will cover the following subjects:
Please let me know if you are interested in this series of posts; in my opinion this is a very interesting subject, and it always comes in handy to know whether someone else has been working on it.