Blog Archives

Deploying Django application to AWS EC2 instance with Docker

In AWS we have several ways to deploy Django (and not Django applications) with Docker. We can use ECS or EKS clusters. If we don’t have one ECS or Kubernetes cluster up and running, maybe it can be complex. Today I want to show how deploy a Django application in production mode within a EC2 host. Let’s start.

I’m getting older to provision one host by hand I prefer to use docker. The idea is create one EC2 instance (one simple Amazon Linux AMI AWS-supported image). This host don’t have docker installed. We need to install it. When we launch one instance, when we’re configuring the instance, we can specify user data to configure an instance or run a configuration script during launch.

We only need to put this shell script to set up docker

#! /bin/bash
yum update -y
yum install -y docker
usermod -a -G docker ec2-user
curl -L https://github.com/docker/compose/releases/download/1.25.5/docker-compose-`uname -s`-`uname -m` | sudo tee /usr/local/bin/docker-compose > /dev/null
chmod +x /usr/local/bin/docker-compose
service docker start
chkconfig docker on

rm /etc/localtime
ln -s /usr/share/zoneinfo/Europe/Madrid /etc/localtime

ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose

docker swarm init

We also need to attach one IAM role to our instance. This IAM role only need to allow us the following policies:

  • AmazonEC2ContainerRegistryReadOnly (because we’re going to use AWS ECR as container registry)
  • CloudWatchAgentServerPolicy (because we’re going to emit our logs to Cloudwatch)

Also we need to set up a security group to allow incoming SSH connections to port 22 and HTTP connections (in our example to port 8000)

When we launch our instance we need to provide a keypair to connect via ssh. I like to put this keypair in my .ssh/config

Host xxx.eu-central-1.compute.amazonaws.com
    User ec2-user
    Identityfile ~/.ssh/keypair-xxx.pem

To deploy our application we need to follow those steps:

  • Build our docker images
  • Push our images to a container registry (in this case ECR)
  • Deploy the application.

I’ve created a simple shell script called deploy.sh to perform all tasks:

#!/usr/bin/env bash

set -a
[ -f deploy.env ] && . deploy.env
set +a

echo "$(tput setaf 1)Building docker images ...$(tput sgr0)"
docker build -t ec2-web -t ec2-web:latest -t $ECR/ec2-web:latest .
docker build -t ec2-nginx -t $ECR/ec2-nginx:latest .docker/nginx

echo "$(tput setaf 1)Pusing to ECR ...$(tput sgr0)"
aws ecr get-login-password --region $REGION --profile $PROFILE |
  docker login --username AWS --password-stdin $ECR
docker push $ECR/ec2-web:latest
docker push $ECR/ec2-nginx:latest

CMD="docker stack deploy -c $DOCKER_COMPOSE_YML ec2 --with-registry-auth"
echo "$(tput setaf 1)Deploying to EC2 ($CMD)...$(tput sgr0)"
echo "$CMD"

DOCKER_HOST="ssh://$HOST" $CMD
echo "$(tput setaf 1)Building finished $(date +'%Y%m%d.%H%M%S')$(tput sgr0)"

This script assumes that there’s a deploy.env file with our personal configuration (AWS profile, the host of the EC2, instance, The ECR and things like that)

PROFILE=xxxxxxx

DOKER_COMPOSE_YML=docker-compose.yml
HOST=ec2-user@xxxx.eu-central-1.compute.amazonaws.com

ECR=9999999999.dkr.ecr.eu-central-1.amazonaws.com
REGION=eu-central-1

In this example I’m using docker swarm to deploy the application. I want to play also with secrets. This dummy application don’t have any sensitive information but I’ve created one "ec2.supersecret" variable

echo "super secret text" | docker secret create ec2.supersecret -

That’s the docker-compose.yml file:

version: '3.8'
services:
  web:
    image: 999999999.dkr.ecr.eu-central-1.amazonaws.com/ec2-web:latest
    command: /bin/bash ./docker-entrypoint.sh
    environment:
      DEBUG: 'False'
    secrets:
      - ec2.supersecret
    deploy:
      replicas: 1
    logging:
      driver: awslogs
      options:
        awslogs-group: /projects/ec2
        awslogs-region: eu-central-1
        awslogs-stream: app
    volumes:
      - static_volume:/src/staticfiles
  nginx:
    image: 99999999.dkr.ecr.eu-central-1.amazonaws.com/ec2-nginx:latest
    deploy:
      replicas: 1
    logging:
      driver: awslogs
      options:
        awslogs-group: /projects/ec2
        awslogs-region: eu-central-1
        awslogs-stream: nginx
    volumes:
      - static_volume:/src/staticfiles:ro
    ports:
      - 8000:80
    depends_on:
      - web
volumes:
  static_volume:

secrets:
  ec2.supersecret:
    external: true

And that’s all. Maybe ECS or EKS are better solutions to deploy docker applications in AWS, but we also can deploy easily to one docker host in a EC2 instance that it can be ready within a couple of minutes.

Source code in my github

Django reactive users with Celery and Channels

Today I want to build a prototype. The idea is to create two Django applications. One application will be the master and the other one will the client. Both applications will have their User model but each change within master User model will be propagated through the client (or clients). Let me show you what I’ve got in my mind:

We’re going to create one signal in User model (at Master) to detect user modifications:

  • If certain fields have been changed (for example we’re going to ignore last_login, password and things like that) we’re going to emit a event
  • I normally work with AWS, so the event will be a SNS event.
  • The idea to have multiple clients, so each client will be listening to one SQS queue. Those SQSs queues will be mapped to the SNS event.
  • To decouple the SNS sending og the message we’re going to send it via Celery worker.
  • The second application (the Client) will have one listener to the SQS queue.
  • Each time the listener have a message it will persists the user information within the client’s User model
  • And also it will emit on message to one Django Channel’s consumer to be sent via websockets to the browser.

The Master

We’re going to emit the event each time the User model changes (and also when we create or delete one user). To detect changes we’re going to register on signal in the pre_save to mark if the model has been changed and later in the post_save we’re going to emit the event via Celery worker.

@receiver(pre_save, sender=User)
def pre_user_modified(sender, instance, **kwargs):
    instance.is_modified = None

    if instance.is_staff is False and instance.id is not None:
        modified_user_data = UserSerializer(instance).data
        user = User.objects.get(username=modified_user_data['username'])
        user_serializer_data = UserSerializer(user).data

        if user_serializer_data != modified_user_data:
            instance.is_modified = True

@receiver(post_save, sender=User)
def post_user_modified(sender, instance, created, **kwargs):
    if instance.is_staff is False:
        if created or instance.is_modified:
            modified_user_data = UserSerializer(instance).data
            user_changed_event.delay(modified_user_data, action=Actions.INSERT if created else Actions.UPDATE)

@receiver(post_delete, sender=User)
def post_user_deleted(sender, instance, **kwargs):
    deleted_user_data = UserSerializer(instance).data
    user_changed_event.delay(deleted_user_data, action=Actions.DELETE)

We need to register our signals in apps.py

from django.apps import AppConfig

class MasterConfig(AppConfig):
    name = 'master'

    def ready(self):
        from master.signals import pre_user_modified
        from master.signals import post_user_modified
        from master.signals import post_user_deleted

Our Celery task will send the message to sns queue

@shared_task()
def user_changed_event(body, action):
    sns = boto3.client('sns')
    message = {
        "user": body,
        "action": action
    }
    response = sns.publish(
        TargetArn=settings.SNS_REACTIVE_TABLE_ARN,
        Message=json.dumps({'default': json.dumps(message)}),
        MessageStructure='json'
    )
    logger.info(response)

AWS

In Aws We need to create one SNS messaging service and one SQS queue linked to this SNS.

The Client

First we need one command to run the listener.

class Actions:
    INSERT = 0
    UPDATE = 1
    DELETE = 2

switch_actions = {
    Actions.INSERT: insert_user,
    Actions.UPDATE: update_user,
    Actions.DELETE: delete_user,
}

class Command(BaseCommand):
    help = 'sqs listener'

    def handle(self, *args, **options):
        self.stdout.write(self.style.WARNING("starting listener"))
        sqs = boto3.client('sqs')

        queue_url = settings.SQS_REACTIVE_TABLES

        def process_message(message):
            decoded_body = json.loads(message['Body'])
            data = json.loads(decoded_body['Message'])

            switch_actions.get(data['action'])(
                data=data['user'],
                timestamp=message['Attributes']['SentTimestamp']
            )

            notify_to_user(data['user'])

            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message['ReceiptHandle'])

        def loop():
            response = sqs.receive_message(
                QueueUrl=queue_url,
                AttributeNames=[
                    'SentTimestamp'
                ],
                MaxNumberOfMessages=10,
                MessageAttributeNames=[
                    'All'
                ],
                WaitTimeSeconds=20
            )

            if 'Messages' in response:
                messages = [message for message in response['Messages'] if 'Body' in message]
                [process_message(message) for message in messages]

        try:
            while True:
                loop()
        except KeyboardInterrupt:
            sys.exit(0)

Here we persists the model in Client’s database

def insert_user(data, timestamp):
    username = data['username']
    serialized_user = UserSerializer(data=data)
    serialized_user.create(validated_data=data)
    logging.info(f"user: {username} created at {timestamp}")

def update_user(data, timestamp):
    username = data['username']
    try:
        user = User.objects.get(username=data['username'])
        serialized_user = UserSerializer(user)
        serialized_user.update(user, data)
        logging.info(f"user: {username} updated at {timestamp}")
    except User.DoesNotExist:
        logging.info(f"user: {username} don't exits. Creating ...")
        insert_user(data, timestamp)

def delete_user(data, timestamp):
    username = data['username']
    try:
        user = User.objects.get(username=username)
        user.delete()
        logging.info(f"user: {username} deleted at {timestamp}")
    except User.DoesNotExist:
        logging.info(f"user: {username} don't exits. Don't deleted")

And also emit one message to channel’s consumer

def notify_to_user(user):
    username = user['username']
    serialized_user = UserSerializer(user)
    emit_message_to_user(
        message=serialized_user.data,
        username=username, )

Here the Consumer:

class WsConsumer(AsyncWebsocketConsumer):
    @personal_consumer
    async def connect(self):
        await self.channel_layer.group_add(
            self._get_personal_room(),
            self.channel_name
        )

    @private_consumer_event
    async def emit_message(self, event):
        message = event['message']
        await self.send(text_data=json.dumps(message))

    def _get_personal_room(self):
        username = self.scope['user'].username
        return self.get_room_name(username)

    @staticmethod
    def get_room_name(room):
        return f"{'ws_room'}_{room}"

def emit_message_to_user(message, username):
    group = WsConsumer.get_room_name(username)
    channel_layer = get_channel_layer()
    async_to_sync(channel_layer.group_send)(group, {
        'type': WsConsumer.emit_message.__name__,
        'message': message
    })

Our consumer will only allow to connect only if the user is authenticated. That’s because I like Django Channels. This kind of thing are really simple to to (I’ve done similar things using PHP applications connected to a socket.io server and it was a nightmare). I’ve created a couple of decorators to ensure authentication in the consumer.

def personal_consumer(func):
    @wraps(func)
    async def wrapper_decorator(*args, **kwargs):
        self = args[0]

        async def accept():
            value = await func(*args, **kwargs)
            await self.accept()
            return value

        if self.scope['user'].is_authenticated:
            username = self.scope['user'].username
            room_name = self.scope['url_route']['kwargs']['username']
            if username == room_name:
                return await accept()

        await self.close()

    return wrapper_decorator

def private_consumer_event(func):
    @wraps(func)
    async def wrapper_decorator(*args, **kwargs):
        self = args[0]
        if self.scope['user'].is_authenticated:
            return await func(*args, **kwargs)

    return wrapper_decorator

That’s the websocket route

from django.urls import re_path

from client import consumers

websocket_urlpatterns = [
    re_path(r'ws/(?P<username>\w+)$', consumers.WsConsumer),
]

Finally we only need to connect our HTML page to the websocket

{% block title %}Example{% endblock %}
{% block header_text %}Hello <span id="name">{{ request.user.first_name }}</span>{% endblock %}

{% block extra_body %}
  <script>
    var ws_scheme = window.location.protocol === "https:" ? "wss" : "ws"
    var ws_path = ws_scheme + '://' + window.location.host + "/ws/{{ request.user.username }}"
    var ws = new ReconnectingWebSocket(ws_path)
    var render = function (key, value) {
      document.querySelector(`#${key}`).innerHTML = value
    }
    ws.onmessage = function (e) {
      const data = JSON.parse(e.data);
      render('name', data.first_name)
    }

    ws.onopen = function () {
      console.log('Connected')
    };
  </script>
{% endblock %}

Here a docker-compose with the project:

version: '3.4'

services:
  redis:
    image: redis
  master:
    image: reactive_master:latest
    command: python manage.py runserver 0.0.0.0:8001
    build:
      context: ./master
      dockerfile: Dockerfile
    depends_on:
      - "redis"
    ports:
      - 8001:8001
    environment:
      REDIS_HOST: redis
  celery:
    image: reactive_master:latest
    command: celery -A master worker --uid=nobody --gid=nogroup
    depends_on:
      - "redis"
      - "master"
    environment:
      REDIS_HOST: redis
      SNS_REACTIVE_TABLE_ARN: ${SNS_REACTIVE_TABLE_ARN}
      AWS_DEFAULT_REGION: ${AWS_DEFAULT_REGION}
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID}
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY}
  client:
    image: reactive_client:latest
    command: python manage.py runserver 0.0.0.0:8000
    build:
      context: ./client
      dockerfile: Dockerfile
    depends_on:
      - "redis"
    ports:
      - 8000:8000
    environment:
      REDIS_HOST: redis
  listener:
    image: reactive_client:latest
    command: python manage.py listener
    build:
      context: ./client
      dockerfile: Dockerfile
    depends_on:
      - "redis"
    environment:
      REDIS_HOST: redis
      SQS_REACTIVE_TABLES: ${SQS_REACTIVE_TABLES}
      AWS_DEFAULT_REGION: ${AWS_DEFAULT_REGION}
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID}
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY}

And that’s all. Here a working example of the prototype in action:

Source code in my github.

Building real time Python applications with Django Channels, Docker and Kubernetes

Three years ago I wrote an article about webockets. In fact I’ve written several articles about Websockets (Websockets and real time communications is something that I’m really passionate about), but today I would like to pick up this article. Nowadays I’m involved with several Django projects so I want to create a similar working prototype with Django. Let’s start:

In the past I normally worked with libraries such as socket.io to ensure browser compatibility with Websockets. Nowadays, at least in my world, we can assume that our users are using a modern browser with websocket support, so we’re going to use plain Websockets instead external libraries. Django has a great support to Websockets called Django Channels. It allows us to to handle Websockets (and other async protocols) thanks to Python’s ASGI’s specification. In fact is pretty straightforward to build applications with real time communication and with shared authentication (something that I have done in the past with a lot of effort. I’m getting old and now I like simple things :))

The application that I want to build is the following one: One Web application that shows the current time with seconds. Ok it’s very simple to do it with a couple of javascript lines but this time I want to create a worker that emits an event via Websockets with the current time. My web application will show that real time update. This kind of architecture always have the same problem: The initial state. In this example we can ignore it. When the user opens the browser it must show the current time. As I said before in this example we can ignore this situation. We can wait until the next event to update the initial blank information but if the event arrives each 10 seconds our user will have a blank screen until the next event arrives. In our example we’re going to take into account this situation. Each time our user connects to the Websocket it will ask to the server for the initial state.

Our initial state route will return the current time (using Redis). We can authorize our route using the standard Django’s protected routes

from django.contrib.auth.decorators import login_required
from django.http import JsonResponse
from ws.redis import redis

@login_required
def initial_state(request):
    return JsonResponse({'current': redis.get('time')})

We need to configure our channels and define a our event:

from django.urls import re_path

from ws import consumers

websocket_urlpatterns = [
    re_path(r'time/tic/$', consumers.WsConsumer),
]

As we can see here we can reuse the authentication middleware in channel’s consumers also.

import json
import json
from channels.generic.websocket import AsyncWebsocketConsumer


class WsConsumer(AsyncWebsocketConsumer):
    GROUP = 'time'

    async def connect(self):
        if self.scope["user"].is_anonymous:
            await self.close()
        else:
            await self.channel_layer.group_add(
                self.GROUP,
                self.channel_name
            )
            await self.accept()

    async def tic_message(self, event):
        if not self.scope["user"].is_anonymous:
            message = event['message']

            await self.send(text_data=json.dumps({
                'message': message
            }))

We’re going to need a worker that each second triggers the current time (to avoid problems we’re going to trigger our event each 0.5 seconds). To perform those kind of actions Django has a great tool called Celery. We can create workers and scheduled task with Celery (exactly what we need in our example). To avoid the “initial state” situation our worker will persists the initial state into a Redis storage

app = Celery('config')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()


@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
   sender.add_periodic_task(0.5, ws_beat.s(), name='beat every 0.5 seconds')


@app.task
def ws_beat(group=WsConsumer.GROUP, event='tic_message'):
   current_time = time.strftime('%X')
   redis.set('time', current_time)
   message = {'time': current_time}
   channel_layer = channels.layers.get_channel_layer()
   async_to_sync(channel_layer.group_send)(group, {'type': event, 'message': message})

Finally we need a javascript client to consume our Websockets

let getWsUri = () => {
  return window.location.protocol === "https:" ? "wss" : "ws" +
    '://' + window.location.host +
    "/time/tic/"
}

let render = value => {
  document.querySelector('#display').innerHTML = value
}

let ws = new ReconnectingWebSocket(getWsUri())

ws.onmessage = e => {
  const data = JSON.parse(e.data);
  render(data.message.time)
}

ws.onopen = async () => {
  let response = await axios.get("/api/initial_state")
  render(response.data.current)
}

Basically that’s the source code (plus Django the stuff).

Application architecture
The architecture of the application is the following one:

Frontend
The Django application. We can run this application in development with
python manage.py runserver

And in production using a asgi server (uvicorn in this case)

uvicorn config.asgi:application --port 8000 --host 0.0.0.0 --workers 1

In development mode:

celery -A ws worker -l debug

And in production

celery -A ws worker --uid=nobody --gid=nogroup

We need this scheduler to emit our event (each 0.5 seconds)

celery -A ws beat

Message Server for Celery
In this case we’re going to use Redis

Docker
With this application we can use the same dockerfile for frontend, worker and scheduler using different entrypoints

Dockerfile:

FROM python:3.8

ENV TZ 'Europe/Madrid'
RUN echo $TZ > /etc/timezone && \
apt-get update && apt-get install -y tzdata && \
rm /etc/localtime && \
ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && \
dpkg-reconfigure -f noninteractive tzdata && \
apt-get clean

ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

ADD . /src
WORKDIR /src

RUN pip install -r requirements.txt

RUN mkdir -p /var/run/celery /var/log/celery
RUN chown -R nobody:nogroup /var/run/celery /var/log/celery

And our whole application within a docker-compose file

version: '3.4'

services:
  redis:
    image: redis
  web:
    image: clock:latest
    command: /bin/bash ./docker-entrypoint.sh
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 1m30s
      timeout: 10s
      retries: 3
      start_period: 40s
    depends_on:
      - "redis"
    ports:
      - 8000:8000
    environment:
      ENVIRONMENT: prod
      REDIS_HOST: redis
  celery:
    image: clock:latest
    command: celery -A ws worker --uid=nobody --gid=nogroup
    depends_on:
      - "redis"
    environment:
      ENVIRONMENT: prod
      REDIS_HOST: redis
  cron:
    image: clock:latest
    command: celery -A ws beat
    depends_on:
      - "redis"
    environment:
      ENVIRONMENT: prod
      REDIS_HOST: redis

If we want to deploy our application in a K8s cluster we need to migrate our docker-compose file into a k8s yaml files. I assume that we’ve deployed our docker containers into a container registry (such as ECR)

Frontend:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: clock-web-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: clock-web-api
      project: clock
  template:
    metadata:
      labels:
        app: clock-web-api
        project: clock
    spec:
      containers:
        - name: web-api
          image: my-ecr-path/clock:latest
          args: ["uvicorn", "config.asgi:application", "--port", "8000", "--host", "0.0.0.0", "--workers", "1"]
          ports:
            - containerPort: 8000
          env:
            - name: REDIS_HOST
              valueFrom:
                configMapKeyRef:
                  name: clock-app-config
                  key: redis.host
---
apiVersion: v1
kind: Service
metadata:
  name: clock-web-api
spec:
  type: LoadBalancer
  selector:
    app: clock-web-api
    project: clock
  ports:
    - protocol: TCP
      port: 8000 # port exposed internally in the cluster
      targetPort: 8000 # the container port to send requests to

Celery worker

apiVersion: apps/v1
kind: Deployment
metadata:
  name: clock-web-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: clock-web-api
      project: clock
  template:
    metadata:
      labels:
        app: clock-web-api
        project: clock
    spec:
      containers:
        - name: web-api
          image: my-ecr-path/clock:latest
          args: ["uvicorn", "config.asgi:application", "--port", "8000", "--host", "0.0.0.0", "--workers", "1"]
          ports:
            - containerPort: 8000
          env:
            - name: REDIS_HOST
              valueFrom:
                configMapKeyRef:
                  name: clock-app-config
                  key: redis.host
---
apiVersion: v1
kind: Service
metadata:
  name: clock-web-api
spec:
  type: LoadBalancer
  selector:
    app: clock-web-api
    project: clock
  ports:
    - protocol: TCP
      port: 8000 # port exposed internally in the cluster
      targetPort: 8000 # the container port to send requests to

Celery scheduler

apiVersion: apps/v1
kind: Deployment
metadata:
  name: clock-cron
spec:
  replicas: 1
  selector:
    matchLabels:
      app: clock-cron
      project: clock
  template:
    metadata:
      labels:
        app: clock-cron
        project: clock
    spec:
      containers:
        - name: clock-cron
          image: my-ecr-path/clock:latest
          args: ["celery", "-A", "ws", "beat"]
          env:
            - name: REDIS_HOST
              valueFrom:
                configMapKeyRef:
                  name: clock-app-config
                  key: redis.host

Redis

apiVersion: apps/v1
kind: Deployment
metadata:
  name: clock-redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: clock-redis
      project: clock
  template:
    metadata:
      labels:
        app: clock-redis
        project: clock
    spec:
      containers:
        - name: clock-redis
          image: redis
          ports:
            - containerPort: 6379
              name: clock-redis
---
apiVersion: v1
kind: Service
metadata:
  name: clock-redis
spec:
  type: ClusterIP
  ports:
    - port: 6379
      targetPort: 6379
  selector:
    app: clock-redis

Shared configuration

apiVersion: v1
kind: ConfigMap
metadata:
  name: clock-app-config
data:
  redis.host: "clock-redis"

We can deploy or application to our k8s cluster

kubectl apply -f .k8s/

And see it running inside the cluster locally with a port forward

kubectl port-forward deployment/clock-web-api 8000:8000

And that’s all. Our Django application with Websockets using Django Channels up and running with docker and also using k8s.

Source code in my github

Deploying Python Application using Docker and Kubernetes

I’ve learning how to deploy one Python application to Kubernetes. Here you can see my notes:

Let’s start from a dummy Python application. It’s a basic Flask web API. Each time we perform a GET request to “/” we increase one counter and see the number of hits. The persistence layer will be a Redis database. The script is very simple:

from flask import Flask
import os
from redis import Redis

redis = Redis(host=os.getenv('REDIS_HOST', 'localhost'),
              port=os.getenv('REDIS_PORT', 6379))
app = Flask(__name__)

@app.route('/')
def hello():
    redis.incr('hits')
    hits = int(redis.get('hits'))
    return f"Hits: {hits}"


if __name__ == "__main__":
    app.run(host='0.0.0.0')

First of all we create a virtual environment to ensure that we’re going to install your dependencies isolatedly:

python -m venv venv

We enter in the virtualenv

source venv/bin/activate

And we install our dependencies:

pip install -r requirements.txt

To be able to run our application we must ensure that we’ve a Redis database ready. We can run one with Docker:

docker run -p 6379:6379 redis

Now we can start our application:

python app.py

We open our browser with the url: http://localhost:5000 and it works.

Now we’re going to run our application within a Docker container. First of of all we need to create one Docker image from a docker file:

FROM python:alpine3.8
ADD . /code
WORKDIR /code
RUN pip install -r requirements.txt

EXPOSE 5000

Now we can build or image:

docker build -t front .

And now we can run our front image:

docker run -p 5000:5000 front python app.py

If we open now our browser with the url http://localhost:5000 we’ll get a 500 error. That’s because our Docker container is trying to use one Redis host within localhost. It worked before, when our application and our Redis were within the same host. Now our API’s localhost isn’t the same than our host’s one.

Our script the Redis host is localhost by default but it can be passed from an environment variable,

redis = Redis(host=os.getenv('REDIS_HOST', 'localhost'),
              port=os.getenv('REDIS_PORT', 6379))

we can pass to our our Docker container the real host where our Redis resides (suposing my IP address is 192.168.1.100):

docker run -p 5000:5000 --env REDIS_HOST=192.168.1.100 front python app.py

If dont’ want the development server we also can start our API using gunicorn

docker run -p 5000:5000 --env REDIS_HOST=192.168.1.100 front gunicorn -w 1 app:app -b 0.0.0.0:5000

And that works. We can start our app manually using Docker. But it's a bit complicated. We need to run two containers (API and Redis), setting up the env variables.
Docker helps us with docker-compose. We can create a docker-compose.yaml file configuring our all application:


version: '2'

services:
  front:
    image: front
    build:
      context: ./front
      dockerfile: Dockerfile
    container_name: front
    command: gunicorn -w 1 app:app -b 0.0.0.0:5000
    depends_on:
      - redis
    ports:
      - "5000:5000"
    restart: always
    environment:
      REDIS_HOST: redis
      REDIS_PORT: 6379
  redis:
    image: redis
    ports:
      - "6379:6379"

I can execute it

docker-compose up

Docker compose is pretty straightforward. But, what happens if our production environment is a cluster? docker-compose works fine in a single host. But it our production environment is a cluster, we´ll face problems (we need to esure manually things like hight avaiavility and things like that). Docker people tried to answer to this question with Docker Swarm. Basically Swarm is docker-compose within a cluster. It uses almost the same syntax than docker-compose in a single host. Looks good, ins’t it? OK. Nobody uses it. Since Google created their Docker conainer orchestator (Kubernetes, aka K8s) it becames into the de-facto standard. The good thing about K8s is that it’s much better than Swarm (more configurable and more powerfull), but the bad part is that it isn’t as simple and easy to understand as docker-compose.

Let’s try to execute our proyect in K8s:

First I start minikube

minikube start

and I configure kubectl to connect to my minikube k8s cluster

eval $(minikube docker-env)

The API:

First we create one service:

apiVersion: v1
kind: Service
metadata:
  name: front-api
spec:
  # types:
  # - ClusterIP: (default) only accessible from within the Kubernetes cluster
  # - NodePort: accessible on a static port on each Node in the cluster
  # - LoadBalancer: accessible externally through a cloud provider's load balancer
  type: LoadBalancer
  # When the node receives a request on the static port (30163)
  # "select pods with the label 'app' set to 'front-api'"
  # and forward the request to one of them
  selector:
    app: front-api
  ports:
    - protocol: TCP
      port: 5000 # port exposed internally in the cluster
      targetPort: 5000 # the container port to send requests to
      nodePort: 30164 # a static port assigned on each the node

And one deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: front-api
spec:
  # How many copies of each pod do we want?
  replicas: 1

  selector:
    matchLabels:
      # This must match the labels we set on the pod!
      app: front-api

  # This template field is a regular pod configuration
  # nested inside the deployment spec
  template:
    metadata:
      # Set labels on the pod.
      # This is used in the deployment selector.
      labels:
        app: front-api
    spec:
      containers:
        - name: front-api
          image: front:v1
          args: ["gunicorn", "-w 1", "app:app", "-b 0.0.0.0:5000"]
          ports:
            - containerPort: 5000
          env:
            - name: REDIS_HOST
              valueFrom:
                configMapKeyRef:
                  name: api-config
                  key: redis.host

In order to learn a little bit of K8s I’m using a config map called ‘api-config’ where I put some information (such as the Redis host that I’m going to pass as a env variable)

apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
data:
  redis.host: "back-api"

The Backend: Our Redis database:

First the service:

apiVersion: v1
kind: Service
metadata:
  name: back-api
spec:
  type: ClusterIP
  ports:
    - port: 6379
      targetPort: 6379
  selector:
    app: back-api

And finally the deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: back-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: back-api
  template:
    metadata:
      labels:
        app: back-api
    spec:
      containers:
        - name: back-api
          image: redis
          ports:
            - containerPort: 6379
              name: redis

Before deploying my application to the cluster I need to build my docker image and tag it

docker build -t front .
docker tag front front:v1

Now I can deploy my application to my K8s cluster:

kubectl apply -f .k8s/

If want to know what’s the external url of my application in the cluster I can use this command

minikube service front-api --url

Then I can see it running using the browser or with curl

curl $(minikube service front-api --url)

And that’s all. I can delete all application alos

kubectl delete -f .k8s/ 

Source code available in my github

Alexa and Raspberry Pi demo

We’re keeping on playing with Alexa. This time I want to create one skill that uses a compatible device (for example one Raspberry Pi 3). Here you can see the documentation and examples that the people of Alexa provides us. Basically we need to create a new Product/Alexa gadget in the console. It gives us one amazonId and alexaGadgetSecret.

Then we need to install the SDK in our Raspberry Pi (it install several python libraries). The we can create our gadget script running on our Raspberry Pi, using the amazonId and alexaGadgetSecret. We can listen to several Alexa events. For example: When we say the wakeword, with a timer, an alarm and also we can create our custom events.

I’m going to create one skill that allows me to say something like: “Alexa, use device demo and set color to green” or “Alexa, use device demo and set color to red” and I’ll put this color in the led matrix that I’ve got (one Raspberry Pi Sense Hat)

This is the python script:

import logging
import sys
import json
from agt import AlexaGadget
from sense_hat import SenseHat

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logger = logging.getLogger(__name__)

sense = SenseHat()
sense.clear()

RED = [255, 0, 0]
GREEN = [0, 255, 0]
BLUE = [0, 0, 255]
YELLOW = [255, 255, 0]
BLACK = [0, 0, 0]

class Gadget(AlexaGadget):
    def on_alexa_gadget_statelistener_stateupdate(self, directive):
        for state in directive.payload.states:
            if state.name == 'wakeword':
                sense.clear()
                if state.value == 'active':
                    sense.show_message("Alexa", text_colour=BLUE)

    def set_color_to(self, color):
        sense.set_pixels(([color] * 64))

    def on_custom_gonzalo123_setcolor(self, directive):
        payload = json.loads(directive.payload.decode("utf-8"))
        color = payload['color']

        if color == 'red':
            logger.info('turn on RED display')
            self.set_color_to(RED)

        if color == 'green':
            logger.info('turn on GREEN display')
            self.set_color_to(GREEN)

        if color == 'yellow':
            logger.info('turn on YELLOW display')
            self.set_color_to(YELLOW)

        if color == 'black':
            logger.info('turn on YELLOW display')
            self.set_color_to(BLACK)


if __name__ == '__main__':
    Gadget().main()

And here the ini file of our gadget

[GadgetSettings]
amazonId = my_amazonId
alexaGadgetSecret = my_alexaGadgetSecret

[GadgetCapabilities]
Alexa.Gadget.StateListener = 1.0 - wakeword
Custom.gonzalo123 = 1.0

Whit this ini file I’m saying that my gadget will trigger a function when wakeword has been said and with my custom event “Custom.gonzalo123”

That’s the skill:

const Alexa = require('ask-sdk')

const RequestInterceptor = require('./interceptors/RequestInterceptor')
const ResponseInterceptor = require('./interceptors/ResponseInterceptor')
const LocalizationInterceptor = require('./interceptors/LocalizationInterceptor')
const GadgetInterceptor = require('./interceptors/GadgetInterceptor')

const LaunchRequestHandler = require('./handlers/LaunchRequestHandler')
const ColorHandler = require('./handlers/ColorHandler')
const HelpIntentHandler = require('./handlers/HelpIntentHandler')
const CancelAndStopIntentHandler = require('./handlers/CancelAndStopIntentHandler')
const SessionEndedRequestHandler = require('./handlers/SessionEndedRequestHandler')
const FallbackHandler = require('./handlers/FallbackHandler')
const ErrorHandler = require('./handlers/ErrorHandler')

let skill

module.exports.handler = async (event, context) => {
  if (!skill) {
    skill = Alexa.SkillBuilders.custom().
      addRequestInterceptors(
        RequestInterceptor,
        ResponseInterceptor,
        LocalizationInterceptor,
        GadgetInterceptor
      ).
      addRequestHandlers(
        LaunchRequestHandler,
        ColorHandler,
        HelpIntentHandler,
        CancelAndStopIntentHandler,
        SessionEndedRequestHandler,
        FallbackHandler).
      addErrorHandlers(
        ErrorHandler).
      create()
  }

  return await skill.invoke(event, context)
}

One important thing here is the GadgetInterceptor. This interceptor find the endpointId from the request and append it to the session. This endpoint exists because we’ve created our Product/Alexa gadget and also the python script is running on the device. If the endpoint isn’t present maybe our skill must say to the user something like “No gadgets found. Please try again after connecting your gadget.”.

const utils = require('../lib/utils')

const GadgetInterceptor = {
  async process (handlerInput) {
    const endpointId = await utils.getEndpointIdFromConnectedEndpoints(handlerInput)
    if (endpointId) {
      utils.appendToSession(handlerInput, 'endpointId', endpointId)
    }
  }
}

module.exports = GadgetInterceptor

And finally the handlers to trigger the custom event:

const log = require('../lib/log')
const utils = require('../lib/utils')

const buildLEDDirective = (endpointId, color) => {
  return {
    type: 'CustomInterfaceController.SendDirective',
    header: {
      name: 'setColor',
      namespace: 'Custom.gonzalo123'
    },
    endpoint: {
      endpointId: endpointId
    },
    payload: {
      color: color
    }
  }
}

const ColorHandler = {
  canHandle (handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest'
      && handlerInput.requestEnvelope.request.intent.name === 'ColorIntent'
  },
  handle (handlerInput) {
    const requestAttributes = handlerInput.attributesManager.getRequestAttributes()
    const cardTitle = requestAttributes.t('SKILL_NAME')

    const endpointId = utils.getEndpointIdFromSession(handlerInput)
    if (!endpointId) {
      log.error('endpoint', error)
      return handlerInput.responseBuilder.
        speak(error).
        getResponse()
    }

    const color = handlerInput.requestEnvelope.request.intent.slots['color'].value
    log.info('color', color)
    log.info('endpointId', endpointId)

    return handlerInput.responseBuilder.
      speak(`Ok. The selected color is ${color}`).
      withSimpleCard(cardTitle, color).
      addDirective(buildLEDDirective(endpointId, color)).
      getResponse()
  }
}

module.exports = ColorHandler

Here one video with the working example:

Source code in my github

Playing with threads and Python. Part 2

Today I want to keep on playing with python and threads (part1 here). The idea is create one simple script that prints one asterisk in the console each second. Simple, isn’t it?

from time import sleep
while True:
    print("*")
    sleep(1)

I want to keep this script running but I want to send one message externally, for example using RabbitMQ, and do something within the running script. In this demo, for example, stop the script.

In javascript we can do it with a single tread process using the setInterval function but, since the rabbit listener with pika is a blocking action, we need to use threads in Python (please tell me if I’m wrong). The idea is to create a circuit breaker condition in the main loop to check if I need to stop or not the main thread.

First I’ve created my Rabbit listener in a thread:

from queue import Queue, Empty
import threading
import pika
import os


class Listener(threading.Thread):
    def __init__(self, queue=Queue()):
        super(Listener, self).__init__()
        self.queue = queue
        self.daemon = True

    def run(self):
        channel = self._get_channel()
        channel.queue_declare(queue='stop')

        channel.basic_consume(
            queue='stop',
            on_message_callback=lambda ch, method, properties, body: self.queue.put(item=True),
            auto_ack=True)

        channel.start_consuming()

    def stop(self):
        try:
            return True if self.queue.get(timeout=0.05) is True else False
        except Empty:
            return False

    def _get_channel(self):
        credentials = pika.PlainCredentials(
            username=os.getenv('RABBITMQ_USER'),
            password=os.getenv('RABBITMQ_PASS'))

        parameters = pika.ConnectionParameters(
            host=os.getenv('RABBITMQ_HOST'),
            credentials=credentials)

        connection = pika.BlockingConnection(parameters=parameters)

        return connection.channel()

Now in the main process I start the Listener and I enter in one endless loop to print my asterisk each second but at the end of each loop I check if I need to stop the process or not

from Listener import Listener
from dotenv import load_dotenv
import logging
from time import sleep
import os

logging.basicConfig(level=logging.INFO)

current_dir = os.path.dirname(os.path.abspath(__file__))
load_dotenv(dotenv_path="{}/.env".format(current_dir))

l = Listener()
l.start()


def main():
    while True:
        logging.info("*")
        sleep(1)
        if l.stop():
            break


if __name__ == '__main__':
    main()

As we can see int the stop function we’re using the queue.Queue package to communicate with our listener loop.

And that’s all. In the example I also provide a minimal RabbitMQ server in a docker container.

version: '3.4'

services:
  rabbit:
    image: rabbitmq:3-management
    restart: always
    environment:
      RABBITMQ_ERLANG_COOKIE:
      RABBITMQ_DEFAULT_VHOST: /
      RABBITMQ_DEFAULT_USER: ${RABBITMQ_USER}
      RABBITMQ_DEFAULT_PASS: ${RABBITMQ_PASS}
    ports:
      - "15672:15672"
      - "5672:5672"

Source code available in my github

Playing with threads and Python

Today I want to play a little bit with Python and threads. This kind of post are one kind of cheat sheet that I like to publish, basically to remember how do do things.

I don’t like to use threads. They are very powerful but, as uncle Ben said: Great power comes great responsibility. I prefer decouple process with queues instead using threads, but sometimes I need to use them. Let’s start

First I will build a simple script without threads. This script will append three numbers (1, 2 and 3) to a list and it will sum them. The function that append numbers to the list sleeps a number of seconds equals to the number that we’re appending.

import time

start_time = time.time()

total = []


def sleeper(seconds):
    time.sleep(seconds)
    total.append(seconds)


for s in (1, 2, 3):
    sleeper(s)

end_time = time.time()

total_time = end_time - start_time

print("Total time: {:.3} seconds. Sum: {}".format(total_time, sum(total)))

The the outcome of the script is obviously 6 (1 + 2 + 3) and as we’re sleeping the script a number of seconds equals to the the number that we’re appending our script, it takes 6 seconds to finish.

Now we’re going to use a threading version of the script. We’re going to do the same, but we’re going to use one thread for each append (with the same sleep function)

import time
from queue import Queue
import threading

start_time = time.time()

queue = Queue()


def sleeper(seconds):
    time.sleep(seconds)
    queue.put(seconds)


threads = []
for s in (1, 2, 3):
    t = threading.Thread(target=sleeper, args=(s,))
    threads.append(t)
    t.start()

for one_thread in threads:
    one_thread.join()

total = 0
while not queue.empty():
    total = total + queue.get()

end_time = time.time()

total_time = end_time - start_time
print("Total time: {:.3} seconds. Sum: {}".format(total_time, total))

The outcome of our script is 6 again, but now it takes 3 seconds (the highest sleep).

Source code in my github

Playing with TOTP (2FA) and mobile applications with ionic

Today I want to play with Two Factor Authentication. When we speak about 2FA, TOTP come to our mind. There’re a lot of TOTP clients, for example Google Authenticator.

My idea with this prototype is to build one Mobile application (with ionic) and validate one totp token in a server (in this case a Python/Flask application). The token will be generated with a standard TOTP client. Let’s start

The sever will be a simple Flask server to handle routes. One route (GET /) will generate one QR code to allow us to configure or TOTP client. I’m using the library pyotp to handle totp operations.

from flask import Flask, jsonify, abort, render_template, request
import os
from dotenv import load_dotenv
from functools import wraps
import pyotp
from flask_qrcode import QRcode

current_dir = os.path.dirname(os.path.abspath(__file__))
load_dotenv(dotenv_path="{}/.env".format(current_dir))

totp = pyotp.TOTP(os.getenv('TOTP_BASE32_SECRET'))

app = Flask(__name__)
QRcode(app)


def verify(key):
    return totp.verify(key)


def authorize(f):
    @wraps(f)
    def decorated_function(*args, **kws):
        if not 'Authorization' in request.headers:
            abort(401)

        data = request.headers['Authorization']
        token = str.replace(str(data), 'Bearer ', '')

        if token != os.getenv('BEARER'):
            abort(401)

        return f(*args, **kws)

    return decorated_function


@app.route('/')
def index():
    return render_template('index.html', totp=pyotp.totp.TOTP(os.getenv('TOTP_BASE32_SECRET')).provisioning_uri("gonzalo123.com", issuer_name="TOTP Example"))


@app.route('/check/<key>', methods=['GET'])
@authorize
def alert(key):
    status = verify(key)
    return jsonify({'status': status})


if __name__ == "__main__":
    app.run(host='0.0.0.0')

I’ll use an standard TOTP client to generate the tokens but with pyotp we can easily create a client also

import pyotp
import time
import os
from dotenv import load_dotenv
import logging

logging.basicConfig(level=logging.INFO)

current_dir = os.path.dirname(os.path.abspath(__file__))
load_dotenv(dotenv_path="{}/.env".format(current_dir))

totp = pyotp.TOTP(os.getenv('TOTP_BASE32_SECRET'))

mem = None
while True:
    now = totp.now()
    if mem != now:
        logging.info(now)
        mem = now
        time.sleep(1)

And finally the mobile application. It’s a simple ionic application. That’s the view:

<ion-header>
  <ion-toolbar>
    <ion-title>
      TOTP Validation demo
    </ion-title>
  </ion-toolbar>
</ion-header>

<ion-content>
  <div class="ion-padding">
    <ion-item>
      <ion-label position="stacked">totp</ion-label>
      <ion-input placeholder="Enter value" [(ngModel)]="totp"></ion-input>
    </ion-item>
    <ion-button fill="solid" color="secondary" (click)="validate()" [disabled]="!totp">
      Validate
      <ion-icon slot="end" name="help-circle-outline"></ion-icon>
    </ion-button>
  </div>
</ion-content>

The controller:

import { Component } from '@angular/core'
import { ApiService } from '../sercices/api.service'
import { ToastController } from '@ionic/angular'

@Component({
  selector: 'app-home',
  templateUrl: 'home.page.html',
  styleUrls: ['home.page.scss']
})
export class HomePage {
  public totp

  constructor (private api: ApiService, public toastController: ToastController) {}

  validate () {
    this.api.get('/check/' + this.totp).then(data => this.alert(data.status))
  }

  async alert (status) {
    const toast = await this.toastController.create({
      message: status ? 'OK' : 'Not valid code',
      duration: 2000,
      color: status ? 'primary' : 'danger',
    })
    toast.present()
  }
}

I’ve also put a simple security system. In a real life application we’ll need something better, but here I’ve got a Auth Bearer harcoded and I send it en every http request. To do it I’ve created a simple api service

import { Injectable } from '@angular/core'
import { isDevMode } from '@angular/core'
import { HttpClient, HttpHeaders, HttpParams } from '@angular/common/http'
import { CONF } from './conf'

@Injectable({
  providedIn: 'root'
})
export class ApiService {

  private isDev: boolean = isDevMode()
  private apiUrl: string

  constructor (private http: HttpClient) {
    this.apiUrl = this.isDev ? CONF.API_DEV : CONF.API_PROD
  }

  public get (uri: string, params?: Object): Promise<any> {
    return new Promise((resolve, reject) => {
      this.http.get(this.apiUrl + uri, {
        headers: ApiService.getHeaders(),
        params: ApiService.getParams(params)
      }).subscribe(
        res => {this.handleHttpNext(res), resolve(res)},
        err => {this.handleHttpError(err), reject(err)},
        () => this.handleHttpComplete()
      )
    })
  }

  private static getHeaders (): HttpHeaders {

    const headers = {
      'Content-Type': 'application/json'
    }

    headers['Authorization'] = 'Bearer ' + CONF.bearer

    return new HttpHeaders(headers)
  }

  private static getParams (params?: Object): HttpParams {
    let Params = new HttpParams()
    for (const key in params) {
      if (params.hasOwnProperty(key)) {
        Params = Params.set(key, params[key])
      }
    }

    return Params
  }

  private handleHttpError (err) {
    console.log('HTTP Error', err)
  }

  private handleHttpNext (res) {
    console.log('HTTP response', res)
  }

  private handleHttpComplete () {
    console.log('HTTP request completed.')
  }
}

And that’s all. Here one video with a working example of the prototype:

Source code here

Playing with lambda, serverless and Python

Couple of weeks ago I attended to serverless course. I’ve played with lambdas from time to time (basically when AWS forced me to use them) but without knowing exactly what I was doing. After this course I know how to work with the serverless framework and I understand better lambda world. Today I want to hack a little bit and create a simple Python service to obtain random numbers. Let’s start

We don’t need Flask to create lambdas but as I’m very comfortable with it so we’ll use it here.
Basically I follow the steps that I’ve read here.

from flask import Flask

app = Flask(__name__)


@app.route("/", methods=["GET"])
def hello():
    return "Hello from lambda"


if __name__ == '__main__':
    app.run()

And serverless yaml to configure the service

service: random

plugins:
  - serverless-wsgi
  - serverless-python-requirements
  - serverless-pseudo-parameters

custom:
  defaultRegion: eu-central-1
  defaultStage: dev
  wsgi:
    app: app.app
    packRequirements: false
  pythonRequirements:
    dockerizePip: non-linux

provider:
  name: aws
  runtime: python3.7
  region: ${opt:region, self:custom.defaultRegion}
  stage: ${opt:stage, self:custom.defaultStage}

functions:
  home:
    handler: wsgi_handler.handler
    events:
      - http: GET /

We’re going to use serverless plugins. We need to install them:

npx serverless plugin install -n serverless-wsgi
npx serverless plugin install -n serverless-python-requirements
npx serverless plugin install -n serverless-pseudo-parameters

And that’s all. Our “Hello world” lambda service with Python and Flask is up and running.
Now We’re going to create a “more complex” service. We’re going to return a random number with random.randint function.
randint requires two parameters: start, end. We’re going to pass the end parameter to our service. The start value will be parameterized. I’ll parameterize it only because I want to play with AWS’s Parameter Store (SSM). It’s just an excuse.

Let’s start with the service:

from random import randint
from flask import Flask, jsonify
import boto3
from ssm_parameter_store import SSMParameterStore

import os
from dotenv import load_dotenv

current_dir = os.path.dirname(os.path.abspath(__file__))
load_dotenv(dotenv_path="{}/.env".format(current_dir))

app = Flask(__name__)

app.config.update(
    STORE=SSMParameterStore(
        prefix="{}/{}".format(os.environ.get('ssm_prefix'), os.environ.get('stage')),
        ssm_client=boto3.client('ssm', region_name=os.environ.get('region')),
        ttl=int(os.environ.get('ssm_ttl'))
    )
)


@app.route("/", methods=["GET"])
def hello():
    return "Hello from lambda"


@app.route("/random/<int:to_int>", methods=["GET"])
def get_random_quote(to_int):
    from_int = app.config['STORE']['from_int']
    return jsonify(randint(from_int, to_int))


if __name__ == '__main__':
    app.run()

Now the serverless configuration. I can use only one function, handling all routes and letting Flask do the job.

functions:
  app:
    handler: wsgi_handler.handler
    events:
      - http: ANY /
      - http: 'ANY {proxy+}'

But in this example I want to create two different functions. Only for fun (and to use different role statements and different logs in cloudwatch).

service: random

plugins:
  - serverless-wsgi
  - serverless-python-requirements
  - serverless-pseudo-parameters
  - serverless-iam-roles-per-function

custom:
  defaultRegion: eu-central-1
  defaultStage: dev
  wsgi:
    app: app.app
    packRequirements: false
  pythonRequirements:
    dockerizePip: non-linux

provider:
  name: aws
  runtime: python3.7
  region: ${opt:region, self:custom.defaultRegion}
  stage: ${opt:stage, self:custom.defaultStage}
  memorySize: 128
  environment:
    region: ${self:provider.region}
    stage: ${self:provider.stage}

functions:
  app:
    handler: wsgi_handler.handler
    events:
      - http: ANY /
      - http: 'ANY {proxy+}'
    iamRoleStatements:
      - Effect: Allow
        Action: ssm:DescribeParameters
        Resource: arn:aws:ssm:${self:provider.region}:#{AWS::AccountId}:*
      - Effect: Allow
        Action: ssm:GetParameter
        Resource: arn:aws:ssm:${self:provider.region}:#{AWS::AccountId}:parameter/random/*
  home:
    handler: wsgi_handler.handler
    events:
      - http: GET /

And that’s all. “npx serverless deploy” and my random generator is running.

Data Analysis with Python. Pivot tables with Pandas

One of the first post in my blog was about Pivot tables. I’d created a library to pivot tables in my PHP scripts. The library is not very beautiful (it throws a lot of warnings), but it works. These days I’m playing with Python Data Analysis and I’m using Pandas. The purpose of this post is something that I like a lot: Learn by doing. So I want to do the same operations that I did eight years ago in the post but now with Pandas. Let’s start.

I’ll start with the same datasource that I used almost ten years ago. One simple recordset with cliks and number of users

I create a dataframe with this data

import numpy as np
import pandas as pd

data = pd.DataFrame([
    {'host': 1, 'country': 'fr', 'year': 2010, 'month': 1, 'clicks': 123, 'users': 4},
    {'host': 1, 'country': 'fr', 'year': 2010, 'month': 2, 'clicks': 134, 'users': 5},
    {'host': 1, 'country': 'fr', 'year': 2010, 'month': 3, 'clicks': 341, 'users': 2},
    {'host': 1, 'country': 'es', 'year': 2010, 'month': 1, 'clicks': 113, 'users': 4},
    {'host': 1, 'country': 'es', 'year': 2010, 'month': 2, 'clicks': 234, 'users': 5},
    {'host': 1, 'country': 'es', 'year': 2010, 'month': 3, 'clicks': 421, 'users': 2},
    {'host': 1, 'country': 'es', 'year': 2010, 'month': 4, 'clicks': 22, 'users': 3},
    {'host': 2, 'country': 'es', 'year': 2010, 'month': 1, 'clicks': 111, 'users': 2},
    {'host': 2, 'country': 'es', 'year': 2010, 'month': 2, 'clicks': 2, 'users': 4},
    {'host': 3, 'country': 'es', 'year': 2010, 'month': 3, 'clicks': 34, 'users': 2},
    {'host': 3, 'country': 'es', 'year': 2010, 'month': 4, 'clicks': 1, 'users': 1}
])

Pivot_tables

Now we want to do a simple pivot operation. We want to pivot on host

pd.pivot_table(data,
   index=['host'],
   values=['users', 'clicks'],
   columns=['year', 'month'],
   fill_value=''
  )

Pivot_tables_2

We can add totals

pd.pivot_table(data,
               index=['host'],
               values=['users', 'clicks'],
               columns=['year', 'month'],
               fill_value='',
               aggfunc=np.sum,
               margins=True,
               margins_name='Total'
              )

Pivot_tables_3

We can also pivot on more than one column. For example host and country

pd.pivot_table(data,
               index=['host', 'country'],
               values=['users', 'clicks'],
               columns=['year', 'month'],
               fill_value=''
              )

Pivot_tables_4

and also with totals

pd.pivot_table(data,
               index=['host', 'country'],
               values=['users', 'clicks'],
               columns=['year', 'month'],
               aggfunc=np.sum,
               fill_value='',
               margins=True,
               margins_name='Total'
              )

Pivot_tables_5

We can group by dataframe and calculate subtotals

data.groupby(['host', 'country'])[('clicks', 'users')].sum()

Pivot_tables_6

data.groupby(['host', 'country'])[('clicks', 'users')].mean()

Pivot_tables_7

And finally we can mix totals and subtotals.

out = data.groupby('host').apply(lambda sub: sub.pivot_table(
    index=['host', 'country'],
    values=['users', 'clicks'],
    columns=['year', 'month'],
    aggfunc=np.sum,
    margins=True,
    margins_name='SubTotal',
))

out.loc[('', 'Max', '')] = out.max()
out.loc[('', 'Min', '')] = out.min()
out.loc[('', 'Total', '')] = out.sum()

out.index = out.index.droplevel(0)

out.fillna('', inplace=True)

Pivot_tables_8

And that’s all. A lot of to learn yet about data analysis, but Pandas will be definitely a good friend of mine.

You can see the Jupiter notebook in my github account