Tag: programming

Deep inside AMQP

2012/03/23 2 comentaris

Reading time: 5 – 8 minutes

In the next lines I’ll describe with more details the properties and features of AMQP elements. It won’t be an exhaustive description but in my opinion more than enough to start playing with AMQP queues.

Channels

When producers and consumers connects to the broker using a TCP socket after authenticating the connection they establish a channel where AMQP commands are sent. The channel is a virtual path inside a TCP connection between this is very useful because there can be multiple channels inside the TCP connection each channels is identified using an unique ID.

An interesting parameter of a channel is confirmation mode if this is set to true when messages delivered to a exchange finally gets their queues the producer receives an acknowledge message with an UID of the message. This kind of messages are asynchronous and permits to a producer send the next message when it is still waiting the ACK message. Of course if the message cannot be stored and it is lost the producer receives a NACK (not acknowledged) message.

Producers

Maybe this is the most simple part of the system. Producers only need to negotiate the authentication across a TCP connection create a channel and then publish all messages that want with its corresponding routing key. Of course, producers can create exchanges, queues and then bind them. But usually this is not a good idea is much more secure do this from consumers. Because when a producers try to send a message to a broker and doesn’t have the needed exchange then message will be lost. Usually consumers are connected all time and subscribed to queues and producers only connect to brokers when they need to send messages.

Consumers

When a consumer connects to a queue usually uses a command called basic.consume to subscribe the channel to a queue, then every time subscribed queue has a new message it is sent to consumer after last message is consumed, or rejected.

If consumer only want to receive one message without a subscription it can use the command basic.get.This is like a poll method. In fact, the consumer only gets a message each time it sends the command.

You can get the best throughput using basic.consume command because is more efficient than poll every time the consumer wants another message.

When more than one consumer was connected to a queue, messages are distributed in a round-robin. After the message is delivered to a consumer this send an acknowledge message and then queue send another message to next consumer. If the consumer sends a reject message the same message is sent to next consumer.

There are two types of acknowledgements:

basic.ack: this is the message that sends consumer to queue to acknowledge the reception of a message
auto_ack: this is a parameter we can set when consumer subscribes to a queue. The setting assumes ACK message from consumer and then queue sends next message without waiting the ACK message.

The message basic.reject is sent when the consumer wants to reject a received message. This message discards the message and it is lost. If we want to requeue the message we can set the parameter requeue=true when sent a reject message.

When the queue is created there can be a parameter called dead letter set to true, then consumer rejects a message with the parameter requeue=false the message is queued to a new queue called dead letter. This is very useful because after all we can go tho that queue an inspect the message rejection reason.

Queues

Both consumers and producers can create a queue using queue.declare command. The most natural way is create queues from consumers and then bind it to an exchange. The consumers needs a free channel to create a queue, if a channel is subscribed to a queue, the channel is busy and cannot create new queues. When a queue is created usually we use a name to identify the queue, if the name is not specified it’s randomly generated. This is useful when create temporary and anonymous queues for RPC-over-AMQP.

Parameters we can set when create a new queue:

exclusive – this setting makes a queue private and is only accessible from your application. Only one consumer can connect to a queue.
auto-delete – when last consumer unsubscribes from queue the queue is removed.
passive – when create a queue that exists the server returns successfully or returns fail if parameters don’t match. If passive parameter is set and we create a queue that exists always returns success but if the queue doesn’t exist it is not created.
durable – the queue can persist when the services reboots.

Exchange and binding

In the first post of the serie we talked about different exchange types as you can remember these types are: direct, fanout and topic. And the most important parameter to set when producer sends a message is the routing key this is used to route the message to a queue.

Once we have declared an exchange this can be related with a queue using a binding command: queue_bind. The relation between them is made using the routing key or a pattern based in routing key. When exchange has type fanout the routing key or patterns are not needed.

Some pattern examples can be: log.*, message.* and #.

The most important exchange parameters are:

type: direct, fanout and topic.
durable: makes an exchange persistent to reboots.

Broker and virtual hosts

A broker is a container where exhanges, bindings and queues are created. Usually we can define more than one virtual brokers in the same server. Virtual brokers are also called virtual hosts. The users, permissions and something else related to a Broker cannot be used from another one. This is very useful because we can create multiple brokers in the same physical server like multi-domain web server and when some of this virtual hosts is too big it can be migrated to another physical server and it can be clustered if it is required.

Messages

An AMQP message is a binary without a fixed size and format. Each application can set it’s own messages. The AMQP broker only will add small headers to be routed among different queues as fast as possible.

Messages are not persistent inside a broker unless the producer sets the parameter persistent=true. In the other way the messages needs to be stored in durable exchanges and durable queues to persist in the broker when it is restarted. Of course when the messages are persistent these must be wrote to disk and the throughput will fall down. Then maybe sometimes create persistent messages is not a good idea.

What is AMQP? and the architecture

2012/03/15 1 comentari

Reading time: 3 – 4 minutes

What is AMQP? (Advanced Message Queuing Protocol)

When two applications need to communicate there are a lot of solutions like IPC, if these applications are remote we can use RPC. When two or more applications communicate with each other we can use ESB. And there are many more solutions. But when more than two applications communicate and the systems need to be scalable the problem is a bit more complicated. In fact, when we need to send a call to a remote process or distribute object processing among different servers we start to think about queues.

Typical examples are rendering farms, massive mail sending, publish/subscriptions solutions like news systems. At that time we start to consider a queue-based solution. In my case the first approach to these types of solutions was Gearman; that is a very simple queue system where workers connect to a central service where producers have to call the methods published by workers; the messages are queued and delivered to workers in a simple queue.

Another interesting solution can be use Redis like a queue service using their features like publish/subscribe. Anyway always you can develop your own queue system. Maybe there a lot of solutions like that but when you are interested in develop in standard way and want a long-run solution with scalability and high availability then you need to think in use AMQP-based solutions.

The most simple definition of AMQP is: “message-oriented middleware”. Behind this simple definition there are a lot of features available. Before AMQP there was some message-oriented middlewares, for example, JMS. But AMQP is the standard protocol to keep when you choice a queue-based solution.

AMQP have features like queuing, routing, reliability and security. And most of the implementations of AMQP have a really scalable architectures and high availability solutions.

The architecture

The basic architecture is simple, there are a client applications called producers that create messages and deliver it to a AMQP server also called broker. Inside the broker the messages are routed and filtered until arrive to queues where another applications called consumers are connected and get the messages to be processed.

When we have understood this maybe is the time to deep inside the broker where there are AMQP magic. The broker has three parts:

Exchange: where the producer applications delivers the messages, messages have a routing key and exchange uses it to route messages.
Queues: where messages are stored and then consumers get the messages from queues.
Bindings: makes relations between exchanges and queues.

When exchange have a message uses their routing key and three different exchange methods to choose where the message goes:

Direct Exchange: routing key matches the queue name.
Fanout Exchange: the message is cloned and sent to all queues connected to this exchange.
Topic Exchange: using wildcards the message can be routed to some of connected queues.

This is the internal schema of a broker:

AMQP and RabbitMQ [TOC]

2012/03/09 7 comentaris

Reading time: 1 – 2 minutes

After reading the book ‘RabbitMQ in action‘ I’m working on series of posts that will include the following subjects:

What is AMQP? and the architecure
Deep inside AMQP
RabbitMQ CLI quick reference
Hello World using ‘kombu’ library and python
Parallel programming
Events example
RPC
Clustering fundamentals
Managing RabbitMQ from administration web interface
Managing RabbitMQ from REST API

Please let me know if you are interested in this series of posts. Because in my opinion this is very interesting and it always comes in handy to know if someone has been working on those subjects.

Autenticació PAM/OTP via Apache

2010/09/17 No hi ha comentaris

Reading time: 4 – 6 minutes

L’autenticació d’Apache coneguda com a AuthBasic malgrat la seva inseguretat és una de les més usades, ja que ens permet de forma senzilla i ràpida protegir un directori o simplement un fitxer. La protecció d’aquest tipus d’autenticació és molt relativa perquè el password viatge en clar a través de la xarxa, a més, al viatjar a les capçaleres HTTP o fins hi tot en la propia URL de la pàgina pot arribar a ser indexat per un navegador.

Seguint la línia dels anteriors articles:

Ara el que toca és afegir a l’Apache aquest tipus d’autenticació de forma que podem donar les credencials d’accés a un directori/fitxer a algú però aquestes caducaran al cap del temps (uns 3 minuts) i els buscadors o d’altres agents amb més mala idea no podran accedir al recurs passat aquest temps. Com sempre la idea és la d’aprofitar-se del PAM/OTP.

En primera instància per aconseguir això ho he intentat amb el mòdul libapache2-mod-auth-pam però no n’he tret l’entrellat i he estat incapaç de completar la configuració. Així doncs, el que he fet és provar amb el mòdul libapache2-mod-authnz-external ambdós disponibles en l’Ubuntu 8.04 (Hardy). Aquest segon mòdul l’he combinat amb el checkpassword-pam.

Així doncs, la idea és ben senzilla configurem l’Apache perquè usi el libapache2-mod-authnz-external, en escència el que fa aquest mòdul és recolzar-se en agents externs per fer l’autenticació. Aquest agent extern és el checkpassword-pam que comentava i que com diu el seu nom el que fa és validar el password contra PAM, així doncs, malgrat no és una solució massa eficient a nivell de recursos ja que ha de carregar un segon programa per validar cada usuari considero que és una solució suficienment bona pel meu cas.

La configuració

Instal·lem el mòdul d’apache i l’activem:

apt-get install libapache2-mod-authnz-external
a2enmod authnz_external
/etc/init.d/apache2 restart

Instal·lació del checkpassword-pam

cd /var/tmp
wget "http://downloads.sourceforge.net/project/checkpasswd-pam/checkpasswd-pam/0.99/checkpassword-pam-0.99.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fcheckpasswd-pam%2F&ts=1284649515&use_mirror=ignum"
tar xvfz checkpassword-pam-0.99.tar.gz
cd checkpassword-pam-0.99
./configure
make
mkdir -p /opt/checkpassword-pam
cp checkpassword-pam /opt/checkpassword-pam

Fitxer de configuració pel servei apache2 del PAM, /etc/pam.d/apache2:

auth    sufficient      pam_script.so onerr=fail dir=/etc/pam-script.d/
account  required       pam_script.so onerr=fail dir=/etc/pam-script.d/

Si us hi fixeu aquest fitxer és igual al que vaig usar per fer l’autenticació en l’article de PHP i PAM/OTP.

Ara anem a un dels fitxers de configuració d’Apache per un virtualhost i afegim el següent codi per protegir un directori servit per aquest virtualhost:

<Directory /var/www/virtualhost/directori_a_protegir>
  AuthType Basic
  AuthName "Restricted area for My Server"
  require user nom_usuari
  AuthBasicProvider external
  AuthExternal autenticador
</Directory>
AddExternalAuth autenticador "/opt/checkpassword-pam/checkpassword-pam -H --noenv -s php -- /bin/true"
SetExternalAuthMethod autenticador checkpassword

Coses importants a destacar en aquest configuració, fixeu-vos que em cal afegir ‘require user nom_usuari’, això és perquè no disposo de cap fitxer d’usuaris on Apache pugui validar quins són els meus usuaris vàlids i la directiva ‘require valid-users’ no funcionaria, així doncs, hem d’especificar quins usuaris podran fer login a través d’aquesta comanda, o bé, afegir una altre directiva que li permiti a Apache trobar un llistat d’usuaris en algún lloc.

Un altre detall important són els paràmetres que he d’usar al checkpasswords-pam perquè aquest funcioni bé:

-H no intenta fer un ‘chdir-home’ ja que potser l’usuari no disposa d’aquest ‘home directory’.
–noenv el checkpassword-pam busca certes variables d’entorn per funcionar, amb aquest paràmetre no les busca i agafa el que li cal del stdin.
-s apache2 especifica el nom del servei que buscarà dins de /etc/pam.d/apache2
— /bin/true simplement serveix perquè el checkpassword-pam no faci res després de validar un usuari, ja que podriem executar algún programa si volguessim després de fer la validació.

Referències

checkpassword-pam, recomanable consultar la ‘man page’.
mod-auth-external, especialment útil aquesta informació de configuració.

Autenticació PAM/OTP via PHP

2010/09/16 No hi ha comentaris

Reading time: 2 – 3 minutes

Una bona forma de continuar aprofindint amb el tema OTP i també amb PAM, després de l’article: Shellinabox i OTP, és explicar-vos com m’ho he fet per afegir suport OTP al PHP, de forma que quan programem amb PHP es pugui delegar l’autenticació al sistema PAM del linux. Obviament això té certes restriccions perquè PHP corre amb els permisos de l’usuari de l’Apache, o bé, de l’usuari del FSGI, etc. L’important de fer notar és que no és habitual llençar codi PHP amb permisos de ‘root’. Així doncs, depèn de quina acció li fem fer al PAM aquest no la podrà dur a terme, per exemple, no podrà accedir al fitxer /etc/shadow per validar el password de sistema. De fet, com que el que jo vull és treballar amb OTP això no és rellevant.

El primer que s’ha de fer és instal·lar el paquet php5-auth-apm i reiniciar l’Apache:

apt-get install php5-auth-pam
/etc/init.d/apache2 restart

Ara anem al directori /etc/pam-script.d i fem:

cd /etc/pam-script.d
rm pam_script_acct
# creem fitxer pam_script_acct
echo '#!/bin/sh' > pam_script_acct
echo 'exit 0' >> pam_script_acct
chmod 755 pam_script_acct

Ara toca crear el fitxer /etc/pam.d/php, amb el següent contingut:

auth    sufficient      pam_script.so onerr=fail dir=/etc/pam-script.d/
account  required       pam_script.so onerr=fail dir=/etc/pam-script.d/

Amb això ja en tenim prou per anar a jugar amb el php, el primer és amb un phpinfo(); validar que apareix algo així:

i després podem fer un codi tan senzill com aquest:

<?php
    $result=pam_auth('user','password-otp',&$error);
    var_dump($result);
    var_dump($error);
?>

La comanda clau com podeu veure és pam_auth, passeu com a paràmetre el nom de l’usuari, el password que ús ha donat la vostre aplicació generadora de passwords OTP i la variable que voleu que reculli els errors la passeu per referencia. En cas d’error de l’autenticació aquesta comanda contindrà la descripció de l’error. Aquesta mateixa funció retorna un codi boleà amb el resultat de l’autenticació.

wiki: Notes sobre entorns de programació

2010/08/30 No hi ha comentaris

Reading time: < 1 minute En un .txt tenia unes notes que havia pres sobre entorns de programació, escencialment la conferència de la que són les notes parlava sobre PHP i diferents formens de fer el desplegament dels projectes. Aquesta informació l'he passat a una pàgina del wiki amb la idea d'anar-la actualitzant per altres llenguatges especialment amb la idea de afegir-hi notes de python. Així doncs l'enllaç del wiki és a: Notes about programming environments and deployment. Tal com podeu deduir amb el títol les notes són amb anglès, sento no recordar la conferència per afegir-hi la presentació.

A continuació enganxo el contingut de la wiki de forma dinàmica, així quan actualitzi la wiki s’actualitzarà l’article:

Què és un WebHook?

2010/08/21 No hi ha comentaris

Reading time: < 1 minute WebHook logo
Un WebHook és una HTTP callback, o sigui, un HTTP POST que es dona quan algo passa (un event); o sigui, que podem definir WebHook com a sistema de notificació d’events per HTTP POST.

El valor més important dels WebHooks és la possibilitat de poder rebre events quan aquest passen no a través dels ineficaços mecanismes de polling.

Si voleu més informació sobre aquest tema: WebHooks site.

Jornades #decharlas sobre #symfony

2010/07/08 No hi ha comentaris

Reading time: 4 – 6 minutes

symfony logo

Com deia a l’anterior article, el dilluns i dimarts vaig ser per la zona de Castelló per assistir amb el Benja a les jornades de Symfony organitzades per la Universitat de Jaume I de Castelló. Doncs bé, comentar que les jornades em van sorprendre molt positivament, realment hi havia gent amb força o fins hi tot molt nivell en la materia i això sempre és d’agraïr en aquest tipus d’events.

El programa de les xerrades era molt interessant i tret d’algunes xerrades puntuals totes eren del màxim interés per mi. Com molts esteu cansats de sentir jo no sóc programador, però en aquesta vida he picat força codi i concretament amb Symfony vaig començar-ho a fer en la versió 0.6.4, o sigui, molt abans que fos estable. Però de la mà de l’Oriol M. varem fer un projecte amb uns quants megues de codi font per fer recàrregues de mòbil a una cadena de botigues de fotografia, un dels codis dels que estic més orgullós.

Tornant a les jornades, volia destacar el meu ranking particular de ponents, es tracta d’un petit TOP 3 que des de la meva més extrema modèstia preten classificar en base a uns criteris totalment subjectius les persones que van saber transmetre els valors més importants que jo busco en una xerrada:

Javier Eguíluz (Symfony 2) – a destacar la gran capacitat d’enfocar la conferència cap als items més importants a resaltar, sense oblidar els detalls rellevants en cada moment, tot això sense perdre un fil conductor clar en tota la xarrada. També s’ha de dir que el tema li donava molt de joc per fer embadalir a l’audiència; però va saber com treure-li el suc i tenir-nos a tots ben enganxats malgrat estar molt cansats després de dos dies de conferències sense parar, aquesta va ser l’última conferència.
José Antonio Pío (Vistas y sfForm) – cal deixar-ho clar, una màquina té tota la pinta de ser un programador com la copa d’un pi i els típics defectes i virtuts de ser un professor (ho és?). Malgrat sembla que és molt bo i sap explicar-se la densitat del seu contingut i complexitat del mateix, feien complex seguir-lo tota l’estona ja que exigia molt a la concentració. Cosa gens senzilla en una marató de conferències com el que ens ocupa.
Jordi Llonch (Plugins) – aquest representant de les nostres terres diria que també ha fet una molt bona feina explicant-nos la potència dels plug-ins de Symfony, potser a nivell personal no m’ha aportat tan com les altres dues xerrades però la qualitat de les transparències, l’oratòria i la bona organització dels continguts diria que l’han convertit en una referència de com fer les coses, sota el meu punt de vista, és clar.

Comentar que la resta de ponències també han estat molt bé, obviament algunes amb més qualitat que d’altres i amb temes que per mi tenien un major interés que altres. Però en conjunt dono una nota altíssima al contingut de les xerrades. Pel que fa a la organització i les instal·lacions un 9, més que res per donar-los marge a millorar. Impresionant el complexe universitari, res a envejar a d’altres que he vist.

Abans de tancar aquest article tan poc ortodoxe per descriure les jornades comentar en forma de punts algunes notes mentals que me n’he emportat:

Hauria de posar-me amb Doctrine, però al coneixer Propel mai trobo el moment.
Mirar-me a fons referències sobre integració continua amb PHP i eines de gestió de projectes
Dona gust adonar-se que moltes pràctiques que has adoptat unilateralment també es comparteixen en la comunitat: temes d’sfForm, vistes, pugins, web escalables, cloud computing, etc.
Pel que fa al tema de la web escalable, m’agradaria agafar els ‘slides’ del ponent Asier Marqués i fer el meu propi screencast del tema, comparteixo quasi tot el que va dir a la conferència però em quedo amb les ganes d’aportar-hi el meu granet de sorra.
Mirar-me el tema d’ESI que ja l’havia oblidat.
MongoDB, com dic cada dia… una passada! i si a sobre es soporta a Doctrine això ja no té preu. Interessant el tema de Mondongo.
i l’estrella de tot plegat!!!! Symofny2… estic impacient!!!!

UPDATE: presentacions de les jornades.

dbus+python: emetent i rebent senyals

2010/06/30 No hi ha comentaris

Reading time: 2 – 2 minutes

Feia massa temps que no jugava amb DBUS i les he passat una mica negres aquesta tarda intentant recordar com funcionava tot plegat. La qüestió de base és molt senzilla, com que el codi parla per si mateix. Simplement adjuntaré els dos codis.

Receptor de senyals DBUS, rep la senyal amb format ‘string’ i la mostra:

#!/usr/bin/env python
#--encoding: UTF-8--
"""
entra en un loop esperant senyals emeses a:
  dbus_interface = cat.oriolrius.prova
  object_path = "/cat/oriolrius/prova/senyal"
amb nom de senyal: 'estat'
quan es rep la senyal la mostrem
"""
import gobject
import dbus
import dbus.mainloop.glib

def mostra(m):
    print m

dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
bus = dbus.SessionBus()
bus.add_signal_receiver(
                 mostra,
                 path="/cat/oriolrius/prova/senyal",
                 dbus_interface="cat.oriolrius.prova",
                 signal_name = "estat"
                )
loop = gobject.MainLoop()
loop.run()

Emisor de senyals DBUS, envia una senyal de tipus ‘string’ amb el contingut ‘hola’:

#!/usr/bin/env python
#--encoding: UTF-8--
"""
Emet una senyal a dbus, al bus 'session' amb destí:
  dbus_interface = cat.oriolrius.prova
  object_path = "/cat/oriolrius/prova/senyal"
amb nom de senyal: 'estat'
"""
import gobject
import dbus
from dbus.service import signal,Object
import dbus.mainloop.glib

class EmetSenyal(Object):
    def __init__(self, conn, object_path='/'):
        Object.__init__(self, conn, object_path)

    @signal('cat.oriolrius.prova')
    def estat(self,m):
        global loop
        print("senyal emesa: %s" % m)
        gobject.timeout_add(2000, loop.quit)

dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
loop = gobject.MainLoop()
bus = dbus.SessionBus()
o = EmetSenyal(bus,object_path='/cat/oriolrius/prova/senyal')
o.estat('hola')
loop.run()

Usant el ‘dbus-monitor’ es pot veure la traça del missatge:

signal sender=:1.634 -> dest=(null destination) serial=2 path=/cat/oriolrius/prova/senyal; interface=cat.oriolrius.prova; member=estat
   string "hola"

Cheetah – the python powered template engine

2010/06/28 No hi ha comentaris

Reading time: 2 – 3 minutes

Un article ‘fast-n-dirty’ sobre potser la millor llibreria que he trobat per treballar amb templates i python: Cheetah. Es tracta de poder generar fitxers de texte de forma senzilla: fitxers de configuració, pàgines web, emails, etc. a partir de plantilles. Realment útil en molts entorns.

Les funcionalitats (copy-paste de la web):

is supported by every major Python web framework.
is fully documented and is supported by an active user community.
can output/generate any text-based format.
compiles templates into optimized, yet readable, Python code.
blends the power and flexibility of Python with a simple template language that non-programmers can understand.
gives template authors full access to any Python data structure, module, function, object, or method in their templates. Meanwhile, it provides a way for administrators to selectively restrict access to Python when needed.
makes code reuse easy by providing an object-oriented interface to templates that is accessible from Python code or other Cheetah templates. One template can subclass another and selectively reimplement sections of it. Cheetah templates can be subclasses of any Python class and vice-versa.
provides a simple, yet powerful, caching mechanism that can dramatically improve the performance of a dynamic website.
encourages clean separation of content, graphic design, and program code. This leads to highly modular, flexible, and reusable site architectures, shorter development time, and HTML and program code that is easier to understand and maintain. It is particularly well suited for team efforts.
can be used to generate static html via its command-line tool.

a qui va orientat (copy-paste de la web):

for programmers to create reusable components and functions that are accessible and understandable to designers.
for designers to mark out placeholders for content and dynamic components in their templates.
for designers to soft-code aspects of their design that are either repeated in several places or are subject to change.
for designers to reuse and extend existing templates and thus minimize duplication of effort and code.
and, of course, for content writers to use the templates that designers have created.