Jul 06

MIT Data Science course: Data to Insights

Reading time: 3 – 4 minutes

This week I finished this course from MIT. After my previous experience with MIT professional courses, I decided to get involved in a new one. I know it sounds strange after the really bad feedback I gave on my previous course about IoT, but I decided to give this kind of course a second chance.

My general opinion of this course is far better than of the previous one, so I’m happy to have completed it. I learned a lot, and now it’s time to put that knowledge into practice, which is not a minor thing to do. As for difficulty, I found the first two modules especially hard: they contain a lot of mathematics, and the explanations were very difficult to follow because of the complexity of the concepts and of the formulas that describe them. In my case, some code in any programming language, using libraries that abstract away the mathematical complexity, would have been ideal.

About those two initial modules, “Making sense of unstructured data” and “Regression and Prediction”: the subjects sound good, but in the field where I want to apply the knowledge, IIoT, it’s not easy to figure out how to apply it to time series data. Maybe the best thing I got from them is an overview of the main algorithms and the theoretical basis that I have to apply in real-world projects.

The third module covers “Classification, Hypothesis and Deep Learning” and is closely linked with the previous ones. Here I found it easier to understand the related mathematics and how to apply that knowledge. I found the deep learning chapter especially easy to understand and interesting to apply in IIoT: some concepts and basic ideas about neural networks are described in a very simple way, and the diagrams and animations help a lot in following the concepts.

The last two modules, “Recommendation systems” and “Networks and Graphical Models”, are presented in a very useful way, very applied to the real world and with a lot of examples, which I appreciate. Apart from that, the teachers did a very good job explaining together and increasing the complexity progressively from the bottom up.

If I had to suggest an improvement, it would be in the practical part. I consider Python a programming language with a better future in the data science world than R. R may have a very good base and history as a language for scientists, but I think tools like Jupyter have a better future than legacy tools like RStudio. So my recommendation would be to give more details and references about how to work with Python-based tools and libraries rather than focusing on R.

Another point to improve, in my opinion, would be to add some videos dedicated to using the tools and applying them in case studies. At the end of the day, screencast videos are super useful when you’re not familiar with a technology.

Summarizing, I recommend the course, but don’t expect to apply the knowledge quickly: it is a very theoretical course that gives you the basics, and you have to get the practical skills later on your own, with case studies or other references.

Oct 04

Round table about Industry 4.0 at the Catalan Telecommunications Day

Reading time: 3 – 4 minutes

Last Thursday I participated in a round table about Industry 4.0 as part of the Catalan Telecommunications Day, a really interesting event in a very nice place. I hadn’t been to Cosmo Caixa since it was called “Museu de la Ciència”, a very long time ago, and I have to say that the place is very trendy and awesome.

diada-telecom

Coming back to the event, I met some good friends, which is always a pleasure, but I also met very interesting new people with whom I’ll be happy to keep talking and to look deeper for synergies. Among them are the i2cat people: guys, we have to find the proper way to collaborate, because again and again we meet each other with very compatible points of view.

taula8

About the content of my talk, I want to highlight two things:

  • Firstly, I think we have a duty to lead the fourth industrial revolution and to help all those companies that have not yet gone through the third one to catch up. Catalonia has very powerful minds and a lot of entrepreneurs; now it’s time to work together and demonstrate what we can do.
  • Secondly, a summary of the Fernando Trías de Bes article in “La Vanguardia”:
    • In the 90s they said that the Internet was going to be like another TV channel on our TVs; companies only needed to create a web page and they were ready for the future. But in the end it changed the ‘P’ of product in the marketing strategy.
    • In the early 2000s e-commerce became real and they said that it was only another distribution channel, but in the end it changed two ‘P’s, point of sale and price; both of them became obsolete.
    • In 2006 the revolution came through the social networks. They said these were only personal websites instead of corporate ones; just create some accounts on those social networks and that’s all. But the ‘P’ of promotion has been redefined with new market segmentation.
    • Since 2010 smartphone sales have increased dramatically and they said this was just like a mini PC: just adapt your web pages and everything is done. But a lot of markets disappeared or changed deeply: photo cameras, music CDs, telephony, etc. So the ‘P’ of point of sale and the ‘P’ of product were totally redefined. Virtual and physical experiences were unified.
    • During the first decade of the new century Internet 2.0 was consolidated. They said this was just a web where people could participate; companies only needed to add a corner to their web pages where people could discuss. But the ‘P’ of price changed, with digital money and a lot of new business models.
    • Currently we talk about IoT and they say this is about adding electronics to the physical world. Instead, what happens is that every product in a digital environment tends to be converted into a service. Again the ‘P’ of product is obsolete and has to be totally redefined.

diadatelecos2016123

With that in mind, IMHO we have huge opportunities within reach.

Aug 12

Secure download URLs with expiration time

Reading time: 4 – 6 minutes

Requirements

Imagine an HTTP server with these restrictions:

  • only specific files can be downloaded
  • for a limited time (expiration date)
  • an ID makes it possible to trace who downloads the files
  • with minimal maintenance and dependencies (no databases or things like that)

The basis of the solution I designed is the URL format:

http://URL_HOST/<signature>/<customer_id>/<expire_date>/<path_n_file>
  • signature: calculated with the following formula, given a “seed”:
    • seed = “This is just a random text.”
    • str = customer_id + expire_date + path_n_file
    • signature = encode_base64( hmac_sha1( seed, str))
  • customer_id: an arbitrary identifier used to distinguish who uses the URL
  • expire_date: when the generated URL stops working
  • path_n_file: relative path in your private repository and the file to share
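
The formula above can be sketched in a few lines of Python. This is only an illustration, not the Lua code of the solution: the function names `sign` and `build_url` are made up for the example, and the URL-safe Base64 variant is my assumption, chosen so that the signature can never contain a “/” and break the URL layout.

```python
import base64
import hashlib
import hmac

# Shared secret ("seed"), taken from the example in the list above.
SEED = b"This is just a random text."


def sign(customer_id, expire_date, path_n_file, seed=SEED):
    """Return the signature for the three variable parts of the URL."""
    # str = customer_id + expire_date + path_n_file
    message = (customer_id + expire_date + path_n_file).encode("utf-8")
    digest = hmac.new(seed, message, hashlib.sha1).digest()
    # Assumption: URL-safe Base64, so "/" and "+" never appear in the path.
    return base64.urlsafe_b64encode(digest).decode("ascii")


def build_url(host, customer_id, expire_date, path_n_file):
    """Assemble http://URL_HOST/<signature>/<customer_id>/<expire_date>/<path_n_file>."""
    signature = sign(customer_id, expire_date, path_n_file)
    return "http://%s/%s/%s/%s/%s" % (
        host, signature, customer_id, expire_date, path_n_file)


if __name__ == "__main__":
    print(build_url("downloads.local", "acme", "2015-08-15T20:30", "dir1/example1.txt"))
```

Because the three fields are part of the signed string, changing the customer, the date or the file in the URL makes the signature invalid.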

I think the ideas explained above are enough to understand the goal of the solution. I developed it using NGINX and Lua. The NGINX used is not the stock version but a heavily patched distribution called OpenResty, which is especially famous because some important Chinese websites run on it, for instance Taobao.com.

Expiration URL solution Architecture schema

In the schema above there is an admin who wants to share a file stored in the internal private repository, but the file has a time restriction and the URL is only for that customer. Using the command line, the admin creates a unique URL with the desired constraints (expiration date, customer and file to share). The next step is to send the URL to the customer’s user. When the URL is requested, the NGINX server evaluates it and returns the desired file only if the URL is valid: it has not expired, the file exists, the customer identification is valid and the signature has not been modified.
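
The checks described above could look like the following Python sketch. It is not the real get_file.lua: the function names, the returned status codes, the URL-safe Base64 encoding and the extra protection against “../” in the path are all assumptions made for the example. Only the date format (YYYY-MM-DDTHH:MM) and the “Link expired” message come from the tool itself.

```python
import base64
import hashlib
import hmac
import os
from datetime import datetime

SECRET = b"This is just a long string to set a seed"  # same role as lib.secret
BASE_DIR = "/tmp/downloads/"                          # same role as lib.base_dir


def expected_signature(customer_id, expire_date, path_n_file, secret=SECRET):
    message = (customer_id + expire_date + path_n_file).encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha1).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def check_request(signature, customer_id, expire_date, path_n_file,
                  now=None, base_dir=BASE_DIR, secret=SECRET):
    """Return (status, detail); status 200 means the file may be served."""
    # 1. The signature must match; compare in constant time.
    good = expected_signature(customer_id, expire_date, path_n_file, secret)
    if not hmac.compare_digest(signature.encode("utf-8"), good.encode("ascii")):
        return 403, "Invalid signature"
    # 2. The URL must not be expired.
    try:
        expires = datetime.strptime(expire_date, "%Y-%m-%dT%H:%M")
    except ValueError:
        return 403, "Invalid expiration date"
    if (now or datetime.now()) >= expires:
        return 403, "Link expired"
    # 3. The file must exist and stay inside the private repository.
    root = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(root, path_n_file))
    if os.path.commonpath([root, target]) != root:
        return 403, "Invalid path"
    if not os.path.isfile(target):
        return 404, "File not found"
    return 200, target
```

Checking the signature first means that a tampered URL is rejected before the date or the file system are even looked at.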

NGINX Configuration

server {
 server_name downloads.local;

 location ~ ^/(?<signature>[^/]+)/(?<customer_id>[^/]+)/(?<expire_date>[^/]+)/(?<path_n_file>.*)$ {
 content_by_lua_file "lua/get_file.lua";
 }

 location / {
 return 403;
 }
}

This is the server part of the NGINX configuration file; the rest of the file can be as you want. Understanding it is really simple: “server_name” works as always, so only the “location” directives are relevant. The first “location” is a regular expression which captures the relevant variables of the URL and passes them to the Lua script. All other URLs that don’t match the URI pattern fall into the path “/” and the response is always “Forbidden” (HTTP 403 code). All the magic happens in the Lua code.
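
If you want to play with the regular expression outside NGINX, here is a small Python sketch of the same pattern. The only difference is the syntax of the named groups, which Python writes as (?P<name>...); the helper `parse_path` is just a name invented for this example.

```python
import re

# Same pattern as the NGINX "location", with Python-style named groups.
URL_PATTERN = re.compile(
    r"^/(?P<signature>[^/]+)/(?P<customer_id>[^/]+)"
    r"/(?P<expire_date>[^/]+)/(?P<path_n_file>.*)$"
)


def parse_path(path):
    """Return the captured variables as a dict, or None if the path does not match."""
    match = URL_PATTERN.match(path)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_path("/YjZhNDAzZDY0/acme/2015-08-15T20:30/dir1/example1.txt"))
    # A path without the four parts does not match: NGINX would answer 403.
    print(parse_path("/favicon.ico"))
```

Only the last group accepts “/”, which is why the file can live in any subdirectory while the first three fields are always single path segments.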

LUA scripts

Some Lua files are required:

  • create_secure_link.lua: creates secure URLs
  • get_file.lua: evaluates URLs and serves the content of the requested file
  • lib.lua: module developed to share code between the other Lua files
  • sha1.lua: SHA-1 secure hash computation, and HMAC-SHA1 signature computation in Lua (taken from https://github.com/kikito/sha.lua)

The “lib.lua” file has to be configured; at the beginning of the file there are three variables to set up:

lib.secret = "This is just a long string to set a seed"
lib.base_url = "http://downloads.local/"
lib.base_dir = "/tmp/downloads/"

Creating secure URLs is really simple; take a look at the command parameters:

$ ./create_secure_link.lua 

 ./create_secure_link.lua <customer_id> <expiration_date> <relative_path/filename>

Create URLs with expiration date.

 customer_id: any string identifying the customer who wants the URL
 expiration_date: when URL has to expire, format: YYYY-MM-DDTHH:MM
 relative_path/filename: relative path to file to transfer, base path is: /tmp/downloads/

Run example:

$ mkdir -p /tmp/downloads/dir1
$ echo hello > /tmp/downloads/dir1/example1.txt
$ ./create_secure_link.lua acme 2015-08-15T20:30 dir1/example1.txt
http://downloads.local/YjZhNDAzZDY0/acme/2015-08-15T20:30/dir1/example1.txt
$ date
Wed Aug 12 20:27:14 CEST 2015
$ curl http://downloads.local:55080/YjZhNDAzZDY0/acme/2015-08-15T20:30/dir1/example1.txt
hello
$ date
Wed Aug 12 20:31:40 CEST 2015
$ curl http://downloads.local:55080/YjZhNDAzZDY0/acme/2015-08-15T20:30/dir1/example1.txt
Link expired

Little video demonstration

Resources

Disclaimer and gratefulness

 

 

Sep 30

Hello World using ‘kombu’ library and python

This entry is part 4 of 4 in the series AMQP and RabbitMQ

Reading time: 4 – 7 minutes

Sometimes schemas and snippets don’t need long descriptions. If you think they are not enough in this case, tell me and I’ll add explanations.

Using a Python library called kombu as an abstraction to talk to an AMQP broker, we are going to build different message routes, one for each type of exchange. As the backend I used RabbitMQ with its default configuration.

AMQP schema using an exchange of type direct

kombu-direct

Queue definition:

from kombu import Exchange, Queue

task_exchange = Exchange("msgs", type="direct")
queue_msg_1 = Queue("messages_1", task_exchange, routing_key = 'message_1')
queue_msg_2 = Queue("messages_2", task_exchange, routing_key = 'message_2')

The producer:

from __future__ import with_statement
from queues import task_exchange

from kombu.common import maybe_declare
from kombu.pools import producers


if __name__ == "__main__":
    from kombu import BrokerConnection

    connection = BrokerConnection("amqp://guest:guest@localhost:5672//")

    with producers[connection].acquire(block=True) as producer:
        maybe_declare(task_exchange, producer.channel)
        
        payload = {"type": "handshake", "content": "hello #1"}
        producer.publish(payload, exchange = 'msgs', serializer="pickle", routing_key = 'message_1')
        
        payload = {"type": "handshake", "content": "hello #2"}
        producer.publish(payload, exchange = 'msgs', serializer="pickle", routing_key = 'message_2')

One consumer:

from queues import queue_msg_1
from kombu.mixins import ConsumerMixin

class C(ConsumerMixin):
    def __init__(self, connection):
        self.connection = connection
        return
    
    def get_consumers(self, Consumer, channel):
        return [Consumer( queue_msg_1, callbacks = [ self.on_message ])]
    
    def on_message(self, body, message):
        print ("RECEIVED MSG - body: %r" % (body,))
        print ("RECEIVED MSG - message: %r" % (message,))
        message.ack()
        return
    

if __name__ == "__main__":
    from kombu import BrokerConnection
    from kombu.utils.debug import setup_logging
    
    setup_logging(loglevel="DEBUG")

    with BrokerConnection("amqp://guest:guest@localhost:5672//") as connection:
        try:
            C(connection).run()
        except KeyboardInterrupt:
            print("bye bye")


AMQP schema using an exchange of type fanout

kombu-fanout

Queue definition:

from kombu import Exchange, Queue

task_exchange = Exchange("ce", type="fanout")
queue_events_db = Queue("events.db", task_exchange)
queue_events_notify = Queue("events.notify", task_exchange)


The producer:

from __future__ import with_statement
from queues import task_exchange

from kombu.common import maybe_declare
from kombu.pools import producers


if __name__ == "__main__":
    from kombu import BrokerConnection

    connection = BrokerConnection("amqp://guest:guest@localhost:5672//")

    with producers[connection].acquire(block=True) as producer:
        maybe_declare(task_exchange, producer.channel)
        
        payload = {"operation": "create", "content": "the object"}
        producer.publish(payload, exchange = 'ce', serializer="pickle", routing_key = 'user.write')

        payload = {"operation": "update", "content": "updated fields", "id": "id of the object"}
        producer.publish(payload, exchange = 'ce', serializer="pickle", routing_key = 'user.write')

One consumer:

from queues import queue_events_db
from kombu.mixins import ConsumerMixin

class C(ConsumerMixin):
    def __init__(self, connection):
        self.connection = connection
        return
    
    def get_consumers(self, Consumer, channel):
        return [Consumer( queue_events_db, callbacks = [self.on_message])]
    
    def on_message(self, body, message):
        print ("save_db: RECEIVED MSG - body: %r" % (body,))
        print ("save_db: RECEIVED MSG - message: %r" % (message,))
        message.ack()
        return


if __name__ == "__main__":
    from kombu import BrokerConnection
    from kombu.utils.debug import setup_logging
    
    setup_logging(loglevel="DEBUG")

    with BrokerConnection("amqp://guest:guest@localhost:5672//") as connection:
        try:
            C(connection).run()
        except KeyboardInterrupt:
            print("bye bye")


AMQP schema using an exchange of type topic

kombu-topic

Queue definition:

from kombu import Exchange, Queue

task_exchange = Exchange("user", type="topic")
queue_user_write = Queue("user.write", task_exchange, routing_key = 'user.write')
queue_user_read = Queue("user.read", task_exchange, routing_key = 'user.read')
queue_notify = Queue("notify", task_exchange, routing_key = 'user.#')
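
To see why the ‘notify’ queue receives every message while the other two receive only their own, here is a small pure-Python sketch of the topic matching rule (‘*’ stands for exactly one word, ‘#’ for zero or more words). It does not use kombu or RabbitMQ; `topic_matches` and `BINDINGS` are names invented for this illustration, and the real matching is done by the broker.

```python
def topic_matches(binding_key, routing_key):
    """AMQP topic rule: '*' is exactly one word, '#' is zero or more words."""
    def match(pattern, words):
        if not pattern:
            return not words
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # '#' may absorb any number of words, including none.
            return any(match(rest, words[i:]) for i in range(len(words) + 1))
        if not words:
            return False
        if head == "*" or head == words[0]:
            return match(rest, words[1:])
        return False

    return match(binding_key.split("."), routing_key.split("."))


# Queue name -> binding key, copied from the queue definition above.
BINDINGS = {"user.write": "user.write", "user.read": "user.read", "notify": "user.#"}


def queues_for(routing_key):
    """Names of the queues that would get a message with this routing key."""
    return sorted(q for q, key in BINDINGS.items() if topic_matches(key, routing_key))


if __name__ == "__main__":
    print(queues_for("user.write"))
    print(queues_for("user.read"))
```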


The producer:

from __future__ import with_statement
from queues import task_exchange

from kombu.common import maybe_declare
from kombu.pools import producers


if __name__ == "__main__":
    from kombu import BrokerConnection

    connection = BrokerConnection("amqp://guest:guest@localhost:5672//")

    with producers[connection].acquire(block=True) as producer:
        maybe_declare(task_exchange, producer.channel)
        
        payload = {"operation": "create", "content": "the object"}
        producer.publish(payload, exchange = 'user', serializer="pickle", routing_key = 'user.write')

        payload = {"operation": "update", "content": "updated fields", "id": "id of the object"}
        producer.publish(payload, exchange = 'user', serializer="pickle", routing_key = 'user.write')

        payload = {"operation": "delete", "id": "id of the object"}
        producer.publish(payload, exchange = 'user', serializer="pickle", routing_key = 'user.write')

        payload = {"operation": "read", "id": "id of the object"}
        producer.publish(payload, exchange = 'user', serializer="pickle", routing_key = 'user.read')

One consumer:

from queues import queue_user_write
from kombu.mixins import ConsumerMixin

class C(ConsumerMixin):
    def __init__(self, connection):
        self.connection = connection
        return
    
    def get_consumers(self, Consumer, channel):
        # consume from one of the queues declared in the topic queue definition
        return [Consumer( queue_user_write, callbacks = [self.on_message])]
    
    def on_message(self, body, message):
        print ("save_db: RECEIVED MSG - body: %r" % (body,))
        print ("save_db: RECEIVED MSG - message: %r" % (message,))
        message.ack()
        return


if __name__ == "__main__":
    from kombu import BrokerConnection
    from kombu.utils.debug import setup_logging
    
    setup_logging(loglevel="DEBUG")

    with BrokerConnection("amqp://guest:guest@localhost:5672//") as connection:
        try:
            C(connection).run()
        except KeyboardInterrupt:
            print("bye bye")


Simple queues

Kombu implements SimpleQueue and SimpleBuffer as a simple solution for queues bound to an exchange of type ‘direct’, where the exchange name, the routing key and the queue name are all the same.

Pusher:

from kombu import BrokerConnection

connection = BrokerConnection("amqp://guest:guest@localhost:5672//")
queue = connection.SimpleQueue("logs")

payload = { "severity":"info", "message":"this is just a log", "ts":"2013/09/30T15:10:23" }
queue.put(payload, serializer='pickle')

queue.close()

Getter:

from kombu import BrokerConnection
try:
    from queue import Empty   # Python 3
except ImportError:
    from Queue import Empty   # Python 2

connection = BrokerConnection("amqp://guest:guest@localhost:5672//")

queue = connection.SimpleQueue("logs")

while True:
    try:
        # wait up to one second for the next message
        message = queue.get(block=True, timeout=1)
        print(message.payload)
        message.ack()
    except Empty:
        # nothing arrived within the timeout, keep polling
        pass
    except KeyboardInterrupt:
        break

queue.close()

The files

Download all example files: kombu-tests.tar.gz

Sep 25

Server send push notifications to client browser without polling

Reading time: 5 – 8 minutes

Nowadays the latest versions of browsers support WebSockets, and it’s a good idea to use them to keep a permanent channel open to the server and receive push notifications from it. In this case I’m going to use a Mosquitto (MQTT) server behind lighttpd with mod_websocket as the notification server. Mosquitto is a lightweight MQTT server written in C and very easy to set up. The main advantage of MQTT is the possibility of creating publish/subscribe queues, which is very useful when you want more than one notification channel. As usual in pub/sub services, we can subscribe the client to a well-defined topic or use a pattern to subscribe to more than one topic. If you’re not familiar with MQTT, now is the best moment to read a little about this interesting protocol; explaining the MQTT basics is not the purpose of this post.

A few weeks ago I set up the following architecture just to test that idea:

mqtt_schema

WebSocket gateway to Mosquitto MQTT server with JavaScript MQTT client

The browser

Now it’s time to explain this proof of concept. The HTML page contains simple JavaScript code which uses the mqttws31.js library from Paho. This code connects to the server using secure WebSockets. It doesn’t have any other security measure for now; maybe in future posts I’ll explain some interesting ideas for authenticating the WebSocket. At the end of the post you can download all the source code and configuration files. But now it’s time to understand the most important parts of the client code.

client = new Messaging.Client("ns.example.tld", 443, "unique_client_id");
client.onConnectionLost = onConnectionLost;
client.onMessageArrived = onMessageArrived;
client.connect({onSuccess:onConnect, onFailure:onFailure, useSSL:true});

This part is very simple: the client connects to the server and links some callbacks to the defined functions. Pay attention to the ‘useSSL’ connect option, which is used to force an SSL connection with the server.

Two functions linked to callbacks are especially interesting. The first one is:

function onConnect() {
  client.subscribe("/news/+/sport", {qos:1,onSuccess:onSubscribe,onFailure:onSubscribeFailure});
}

As you can imagine, this callback is called when the connection is established; when that happens, the client subscribes to all channels matching ‘/news/+/sport’, for example ‘/news/europe/sport’ or ‘/news/usa/sport’. We could also use something like ‘/news/#’, which means we want to subscribe to all channels that start with ‘/news/’. If you only want to subscribe to one channel, put the full name of the channel in that parameter. The next parameter is a dictionary with the quality of service to use and links to two more callbacks.
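
If the wildcards are new to you, this pure-Python sketch shows how a topic filter is compared with a topic name: ‘+’ stands for exactly one level and ‘#’, always at the end, for any number of levels. The function `mqtt_matches` is a name invented for the example; the real matching is done inside the MQTT server, and special cases such as topics starting with ‘$’ are ignored here.

```python
def mqtt_matches(topic_filter, topic):
    """Return True if an MQTT topic filter matches a topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level of the filter;
            # it matches the parent level and everything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # No '#' in the filter: the number of levels must be the same.
    return len(filter_levels) == len(topic_levels)


if __name__ == "__main__":
    for name in ("/news/europe/sport", "/news/usa/sport", "/news/europe/economy"):
        print(name, mqtt_matches("/news/+/sport", name), mqtt_matches("/news/#", name))
```

Note that a trailing slash creates one more (empty) level, so ‘/news/europe/sport/’ is a different topic from ‘/news/europe/sport’.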

The second interesting function to understand is:

function onMessageArrived(message) {
  console.log("onMessageArrived:"+message.payloadString);
};

It’s called when a new message is received from the server; in this example, the message is printed to the console with the log method.

The server

I used an Ubuntu 12.04 server with the following extra repositories:

# lighttpd + mod_websocket
deb http://ppa.launchpad.net/roger.light/ppa/ubuntu precise main
deb-src http://ppa.launchpad.net/roger.light/ppa/ubuntu precise main

# mosquitto
deb http://ppa.launchpad.net/mosquitto-dev/mosquitto-ppa/ubuntu precise main
deb-src http://ppa.launchpad.net/mosquitto-dev/mosquitto-ppa/ubuntu precise main

With these new repositories you can install required packages:

apt-get install lighttpd lighttpd-mod-websocket mosquitto mosquitto-clients

After installation it’s very easy to run mosquitto in test mode: open a console and run the command mosquitto. We should see something like this:

# mosquitto
1379873664: mosquitto version 1.2.1 (build date 2013-09-19 22:18:02+0000) starting
1379873664: Using default config.
1379873664: Opening ipv4 listen socket on port 1883.
1379873664: Opening ipv6 listen socket on port 1883.

The configuration file for lighttpd in testing is:

server.modules = (
        "mod_websocket",
)

websocket.server = (
        "/mqtt" => ( 
                "host" => "127.0.0.1",
                "port" => "1883",
                "type" => "bin",
                "subproto" => "mqttv3.1"
        ),
)

server.document-root        = "/var/www"
server.upload-dirs          = ( "/var/cache/lighttpd/uploads" )
server.errorlog             = "/var/log/lighttpd/error.log"
server.pid-file             = "/var/run/lighttpd.pid"
server.username             = "www-data"
server.groupname            = "www-data"
server.port                 = 80

$SERVER["socket"] == ":443" {
    ssl.engine = "enable" 
    ssl.pemfile = "/etc/lighttpd/certs/sample-certificate.pem" 
    server.name = "ns.example.tld"
}

Remember to change ‘ssl.pemfile’ to your real certificate file and ‘server.name’ to your real server name. Then restart lighttpd and validate the SSL configuration using something like:

openssl s_client -host ns.example.tld -port 443

You should see the SSL negotiation, and then you can try sending HTTP commands, for example “GET / HTTP/1.0” or something like this. Now the server is ready.

The Test

Now load the HTML test page in your browser and check that the connection reaches the server and that the mosquitto console reports receiving it. Of course, you can modify the JavaScript code to print more log information and follow how the client connects to the MQTT server and subscribes to the topic pattern.

If you want to publish something to the MQTT server, you can use the CLI command mosquitto_pub:

mosquitto_pub -h ns.example.tld -t '/news/europe/sport' -m 'this is the message about european sports'

Take a look at your browser’s JavaScript console: you should see the client print the message there. If it fails, review the steps and debug each one to solve the problem. If you need help, leave me a message. Of course, there are many different ways to publish messages; for example, you could use Python code to publish messages to the MQTT server. In the same way, browsers are not the only clients that can subscribe to topics; for example, you could subscribe with Python code:

import mosquitto

def on_connect(mosq, obj, rc):
    print("rc: "+str(rc))

def on_message(mosq, obj, msg):
    print(msg.topic+" "+str(msg.qos)+" "+str(msg.payload))

def on_publish(mosq, obj, mid):
    print("mid: "+str(mid))

def on_subscribe(mosq, obj, mid, granted_qos):
    print("Subscribed: "+str(mid)+" "+str(granted_qos))

def on_log(mosq, obj, level, string):
    print(string)

mqttc = mosquitto.Mosquitto("the_client_id")
mqttc.on_message = on_message
mqttc.on_connect = on_connect
mqttc.on_publish = on_publish
mqttc.on_subscribe = on_subscribe

mqttc.connect("ns.example.tld", 1883, 60)
mqttc.subscribe("/news/+/sport", 0)

rc = 0
while rc == 0:
    rc = mqttc.loop()

Pay attention to the server port: it isn’t the ‘https’ port (443/tcp), because now the code is using a real MQTT client and the WebSocket gateway isn’t needed.

The files

  • mqtt.tar.gz – inside this tar.gz you can find all referenced files

Sep 23

Home heating using Panstamp (Arduino + TI CC1101) and SSR

This entry is part 2 of 4 in the series heater

Reading time: 3 – 5 minutes

Last weekend I worked on setting up the home heaters using Panstamp. Panstamp is an Arduino board with a Texas Instruments radio. Next winter we’re going to control our home heaters from internet-connected devices like a laptop, tablet or mobile phone. In this post I only want to share some pictures of the process of installing the electronics inside the heaters, replacing the old electronic boards with new custom ones.

The parts:

  • AC/DC transformer with 5V output. It’s really cheap; in this case free, because I have more than 20 of them from old projects.

acdc5v_

  • A small custom PCB designed and made by Daniel Berenguer, the owner of Panstamp. Thanks again Daniel. I bought the PCBs and parts for around 10€ each.

IMG_20130923_114830

  • TMP36 temperature sensor. It costs about 1,5€ each.

IMG_20130923_114814

  • Solid state relay (SSR), bought on the AliExpress web site for less than 5€.

ssr

The process:

I used a lot of tools, because DIY isn’t my strong point.

IMG_20130921_141640

Double-sided tape and a hot-glue gun are needed…

IMG_20130921_133746

because I want to use a cork base under the PSU and PCB

IMG_20130921_133723: Parallelization of the last process
IMG_20130921_143116: Using a cutter I got the units
IMG_20130921_143434: SSR setup
IMG_20130921_164651: connecting SSR, PCB and PSU
IMG_20130921_170711: assembling everything on heater side panel
IMG_20130921_173937: finally, mounting side panel on the heater
IMG_20130921_133547

In the coming weeks I’ll come back to this subject to talk about the software part.

Sep 06

Celery logs through syslog

Reading time: 2 – 2 minutes

Celery logs are colorized by default, so the first step is to disable colored logs. It’s as easy as setting ‘CELERYD_LOG_COLOR’ to ‘False’ in ‘celery.conf’. The code could be something like this:

celery.conf.update(CELERYD_LOG_COLOR=False)

Secondly, we need a function that sets up a new handler and other settings for the Celery logging system. For example, the code could be:

from __future__ import absolute_import
from logging import BASIC_FORMAT, Formatter
from logging.handlers import SysLogHandler
from celery.log import redirect_stdouts_to_logger

def setup_log(**args):
    # redirect stdout and stderr to logger
    redirect_stdouts_to_logger(args['logger'])
    # logs to local syslog
    hl = SysLogHandler('/dev/log')
    # setting log level
    hl.setLevel(args['loglevel'])
    # setting log format
    formatter = Formatter(BASIC_FORMAT)
    hl.setFormatter(formatter)
    # add new handler to logger
    args['logger'].addHandler(hl)

Pay attention to ‘redirect_stdouts_to_logger’: it’s used to send all output, such as print statements, to syslog.
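
If you want to see what the handler level and ‘BASIC_FORMAT’ do without a syslog daemon or Celery at hand, this is a minimal sketch using only the standard library. ‘ListHandler’ is a hypothetical stand-in for ‘SysLogHandler’ that keeps the formatted lines in memory, and ‘attach’ just repeats the three handler steps of ‘setup_log’; both names are invented for this example.

```python
import logging
from logging import BASIC_FORMAT, Formatter


class ListHandler(logging.Handler):
    """Hypothetical stand-in for SysLogHandler: stores formatted lines in memory."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


def attach(logger, handler, loglevel):
    # Same three steps as in setup_log: level, format, addHandler.
    handler.setLevel(loglevel)
    handler.setFormatter(Formatter(BASIC_FORMAT))
    logger.addHandler(handler)


logger = logging.getLogger("syslog_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output away from the root logger

handler = ListHandler()
attach(logger, handler, logging.INFO)

logger.debug("dropped: below the handler level")
logger.info("kept: this line would reach syslog")
print(handler.lines)  # ['INFO:syslog_demo:kept: this line would reach syslog']
```

The handler level is what decides which records reach syslog, independently of the level of the logger itself.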

Thirdly, we want to use those settings in our Celery tasks, so we have to connect the ‘setup_log’ code to some Celery signals. Those signals are sent when ‘task_logger’ and ‘logger’ are configured. To connect the signals:

from celery.signals import after_setup_task_logger, after_setup_logger

after_setup_logger.connect(setup_log)
after_setup_task_logger.connect(setup_log)

Fourthly, we have to get the ‘logger’. We can have more than one, depending on whether we are interested in records with or without task context. For example:

from celery.utils.log import get_logger, get_task_logger

logger = get_logger('just_a_name_for_internal_use')
logger_with_task_context = get_task_logger('name_of_the_task_to_be_recorded_in_logs')

Finally, we only have to use those loggers with the common methods debug, info, warning, error and critical:

@celery.task
def the_task():
    logger.info('this is a message without task context')
    logger_with_task_context.debug('this record will have the prefix "name_of_the_task_to_be_recorded_in_logs" in syslog')

May 28

A pair of themes for ExtJS

Reading time: 1 – 2 minutes

I’m a believer in the ExtJS JavaScript framework, but there are other interesting and famous JavaScript frameworks like Bootstrap and jQuery. IMHO ExtJS is more focused on web applications than on public websites. In this post I want to share two ExtJS themes that help improve the UI look and feel.

The first one is a bootstrap look and feel for ExtJS:

extjs-bootstrap

If you want to test it, take a look at the demo site. The theme is open source and you can find the source on GitHub.

The second and last one is Clifton theme.

clifton-theme

IMHO it’s a nice theme, although it’s not free. It costs around 320€, but in some professional projects that could be a really low price if you consider the effort needed to get a professional look and feel. You can try it on the demo page.

Oct 11

Some recommendations about RESTful API design

Reading time: 4 – 6 minutes

I want to recommend to you to watch the YouTube video called RESTful API design of Brian Mulloy. In this post I make an small abstract of the most important ideas of the video, of course from my point of view:

  • Use concrete plural nouns when you are defining resources.
  • Resource URL has to be focused in access collection of elements and specific element. Example:
    • /clients – get all clients
    • /clients/23 – get the client with ID 23
  • Map HTTP methods to maintein elements (CRUD):
    • POST – CREATE
    • GET – READ
    • PUT – UPDATE
    • DELETE – DELETE
  • Workaround, if your REST client doesn’t support HTTP methods, use a parameter called ‘method’ could be a good idea. For example, when you have to use a method HTTP PUT it could be changed by method HTTP GET and the parameter ‘method=put’ in the URL.
  • Sweep complexity behind the ‘?’. Use URL parameters to filter or put some optional information to your request.
  • How to manage errors:
    • Use HTTP response codes to refer error codes. You can find a list of HTTP response codes  in Wikipedia.
    • JSON response example can be like this:
      { 'message':'problem description', 'more_info':'http://api.domain.tld/errors/12345' }
    • Workaround, if REST client doesn’t know how to capture HTTP error codes and raise up an error losing the control of the client, you can use HTTP response code 200 and put ‘response_code’ field in JSON response object. It’s a good idea use this feature as optional across URL parameter ‘supress_response_code=true’.
  • Versioning the API. Use a literal ‘v’ followed by an integer number before the resource reference in the URL. It could be the most simple and powerful solution in this case. Example: /v1/clients/
  • The selection of what information will be returned in the response can be defined in the URL parameters, like in this example: /clients/23?fields=name,address,city
  • Pagination of the response. Use the parameters ‘limit’ and ‘offset’ and keep it simple. Example: ?limit=10&offset=0
  • Format of the answer: in this case I don’t completely agree with Brian. I prefer to use the HTTP header ‘Accept’ rather than his proposal. Anyway, the two ideas are:
    • Use the HTTP header ‘Accept’ to request the format of the answer, for example, ‘Accept: application/json’ when you want a JSON response.
    • or use the extension ‘.json’ in the request URL to get the response in JSON format.
  • Use the JavaScript format for date and time information when you are formatting JSON objects.
  • Sometimes APIs need to expose actions. An action can’t be defined with a noun, so in this case use a verb. It’s common to need actions like convert, translate, calculate, etc.
  • Searching, there are two cases:
    • Search inside a resource, in this case use parameters to apply filters.
    • Search across multiple resources: here it’s useful to create the resource ‘search’.
  • Count elements inside a resource, simply add ‘/count’ after the resource. Example: /clients/count
  • As far as you can, use a single base URL for all API resources, something like this: ‘http://api.domain.tld’.
  • Authentication, simply use OAuth 2.0
  • To keep your API KISS, it’s usually a good idea to develop SDKs in several languages, where you can put higher-level features than in the API.
  • Inside an application each resource has its own API, but it’s not a good idea to publish it to the world; using a virtual API in a layer above it is more secure and powerful.
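
To make some of these ideas more concrete, here is a minimal sketch in plain Python (standard library only, no web framework) of how a GET handler could apply the versioned URL, ‘fields’ selection, ‘limit’/‘offset’ pagination, ‘/count’ and the JSON error body described above. It is not code from the video: the CLIENTS data, the function names (handle_get, select_fields, error_body) and the default limit of 10 are my own assumptions for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical in-memory data standing in for a real data store.
CLIENTS = [
    {"id": i, "name": f"client-{i}", "address": f"{i} Main St", "city": "Girona"}
    for i in range(1, 31)
]


def error_body(message, code):
    # Error shape from the list above; the URL is only a placeholder.
    return {"message": message, "more_info": f"http://api.domain.tld/errors/{code}"}


def select_fields(item, fields):
    # Keep only the requested keys; return a full copy when 'fields' is absent.
    if not fields:
        return dict(item)
    return {key: item[key] for key in fields if key in item}


def handle_get(url):
    """Return (HTTP status, body) for GET /v1/clients, /v1/clients/<id>, /v1/clients/count."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    segments = [s for s in parts.path.split("/") if s]

    # Version prefix first, then a concrete plural noun.
    if segments[:2] != ["v1", "clients"] or len(segments) > 3:
        return 404, error_body("unknown resource", 404)

    fields = query["fields"][0].split(",") if "fields" in query else None

    if len(segments) == 2:
        # Collection: complexity stays behind the '?'.
        try:
            limit = int(query.get("limit", ["10"])[0])
            offset = int(query.get("offset", ["0"])[0])
        except ValueError:
            return 400, error_body("limit and offset must be integers", 400)
        if limit < 0 or offset < 0:
            return 400, error_body("limit and offset must not be negative", 400)
        page = CLIENTS[offset:offset + limit]
        return 200, [select_fields(client, fields) for client in page]

    if segments[2] == "count":
        return 200, {"count": len(CLIENTS)}

    # Specific element, looked up by ID.
    for client in CLIENTS:
        if str(client["id"]) == segments[2]:
            return 200, select_fields(client, fields)
    return 404, error_body("client not found", 404)
```

For example, handle_get("/v1/clients/23?fields=name,city") returns status 200 with only the name and city of client 23, and handle_get("/v1/clients?limit=5&offset=10") returns the clients with IDs 11 to 15.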

 

Mar 15

What is AMQP? and the architecture

This entry is part 2 of 4 in the series AMQP and RabbitMQ

Reading time: 3 – 4 minutes

What is AMQP? (Advanced Message Queuing Protocol)

When two applications need to communicate there are a lot of solutions, like IPC; if these applications are remote we can use RPC. When two or more applications communicate with each other we can use an ESB. And there are many more solutions. But when more than two applications communicate and the systems need to be scalable, the problem is a bit more complicated. In fact, when we need to send a call to a remote process or distribute object processing among different servers, we start to think about queues.

Typical examples are rendering farms, massive mail sending, and publish/subscribe solutions like news systems. That is when we start to consider a queue-based solution. In my case the first approach to this type of solution was Gearman, a very simple queue system where workers connect to a central service and producers call the methods published by the workers; the messages are queued and delivered to the workers in a simple queue.

Another interesting solution can be to use Redis as a queue service, using features like publish/subscribe. Anyway, you can always develop your own queue system. There are probably a lot of solutions like that, but when you are interested in developing in a standard way and want a long-run solution with scalability and high availability, then you need to think about using AMQP-based solutions.

The simplest definition of AMQP is: “message-oriented middleware”. Behind this simple definition there are a lot of features available. Before AMQP there were other message-oriented middlewares, for example, JMS. But AMQP is the standard protocol to keep in mind when you choose a queue-based solution.

AMQP has features like queuing, routing, reliability and security. And most implementations of AMQP have really scalable architectures and high availability solutions.

The architecture

The basic architecture is simple: there are client applications called producers that create messages and deliver them to an AMQP server, also called a broker. Inside the broker the messages are routed and filtered until they arrive at queues, where other applications called consumers are connected and get the messages to be processed.

Once we have understood this, maybe it’s time to go deep inside the broker, where the AMQP magic happens. The broker has three parts:

  1. Exchange: where the producer applications deliver the messages; messages have a routing key and the exchange uses it to route them.
  2. Queues: where messages are stored until consumers get them.
  3. Bindings: define the relations between exchanges and queues.

When an exchange receives a message, it uses the message’s routing key and one of three different exchange methods to choose where the message goes:

    1. Direct Exchange: the message goes to the queues whose binding key exactly matches the routing key (often the binding key is simply the queue name).
    2. Fanout Exchange: the message is cloned and sent to all queues bound to this exchange.
    3. Topic Exchange: using wildcards, the message can be routed to some of the bound queues.
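
The three exchange methods can be sketched in a few lines of plain Python. This is only a toy simulation to show the routing decision, not a broker and not the API of any AMQP library; the functions route and topic_matches and the (binding key, queue name) tuples are hypothetical names I made up. For topic exchanges it assumes the usual AMQP wildcard rules: words are separated by dots, ‘*’ matches exactly one word and ‘#’ matches zero or more words.

```python
def _match_words(pattern, key):
    # Recursive word-by-word comparison of a topic pattern and a routing key.
    if not pattern:
        return not key
    if pattern[0] == "#":
        # '#' may absorb zero or more words of the routing key.
        return any(_match_words(pattern[1:], key[i:]) for i in range(len(key) + 1))
    if not key:
        return False
    # '*' stands for exactly one word; anything else must match literally.
    return pattern[0] in ("*", key[0]) and _match_words(pattern[1:], key[1:])


def topic_matches(binding_key, routing_key):
    return _match_words(binding_key.split("."), routing_key.split("."))


def route(exchange_type, bindings, routing_key):
    """Return the sorted names of the queues that would receive the message.

    bindings is a list of (binding_key, queue_name) tuples for one exchange.
    """
    if exchange_type == "fanout":
        # The routing key is ignored: every bound queue gets a copy.
        matched = [queue for _, queue in bindings]
    elif exchange_type == "direct":
        matched = [queue for key, queue in bindings if key == routing_key]
    elif exchange_type == "topic":
        matched = [queue for key, queue in bindings if topic_matches(key, routing_key)]
    else:
        raise ValueError(f"unknown exchange type: {exchange_type}")
    # A queue bound more than once still receives a single copy.
    return sorted(set(matched))


# Example bindings (made-up queue names).
bindings = [
    ("invoice.created", "billing"),
    ("invoice.*", "audit"),
    ("#", "archive"),
]
print(route("direct", bindings, "invoice.created"))
print(route("topic", bindings, "invoice.paid"))
print(route("fanout", bindings, "anything"))
```

With these example bindings, a direct exchange delivers ‘invoice.created’ only to the ‘billing’ queue, a topic exchange delivers ‘invoice.paid’ to ‘audit’ and ‘archive’, and a fanout exchange delivers to all three queues whatever the routing key is.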

This is the internal schema of a broker: