OAuth2 Authentication in Streamlit Applications with Nginx and OAuth2-Proxy

Normally, when I want to provide authentication to a service, I use OAuth2. There are libraries to integrate this authentication mechanism into a web application, but sometimes we cannot do this easily because it is a third-party service over which we have no control. In these cases, it is possible that this third-party service has support for OAuth2 and can also log in with OAuth2. But sometimes this is not possible, or it is too complicated. In these cases, a solution is to use a proxy that handles the authentication and communicates with the third-party service. In this example, we will use a Streamlit application as if it were a third-party application.

import streamlit as st

st.set_page_config(
    page_title="Home",
    page_icon="👋",
)
st.write("# Welcome to Streamlit! 👋")
st.markdown(
    """
    Streamlit is an open-source app framework built specifically for
    Machine Learning and Data Science projects.
    **👈 Select a demo from the sidebar** to see some examples
    of what Streamlit can do!
    ### Want to learn more?
    - Check out [streamlit.io](https://streamlit.io)
    - Jump into our [documentation](https://docs.streamlit.io)
    - Ask a question in our [community
        forums](https://discuss.streamlit.io)
    ### See more complex demos
    - Use a neural net to [analyze the Udacity Self-driving Car Image
        Dataset](https://github.com/streamlit/demo-self-driving)
    - Explore a [New York City rideshare dataset](https://github.com/streamlit/demo-uber-nyc-pickups)
"""
)

st.sidebar.success("Select a demo above.")

Our Streamlit application has a page.

from random import randint

import streamlit as st

st.set_page_config(
    page_title="Hello",
    page_icon="👋",
)

st.markdown("# Plotting Demo")
st.sidebar.header("Plotting Demo")
st.write("This demo illustrates a combination of plotting with Streamlit. Enjoy!")

data = [dict(name=f"name{i}", value=randint(1, 1000)) for i in range(1, 101)]

progress_bar = st.sidebar.progress(0)
status_text = st.sidebar.empty()
chart = st.line_chart([item['value'] for item in data])

progress_bar.empty()

st.button("Re-run")

To use OAuth authentication in the Streamlit application, we are using Nginx as a reverse proxy with the auth_request directive to direct requests to an OAuth2-proxy service deployed in our stack. OAuth2-proxy can be configured to authenticate any OAuth2 server compatible with OpenID. In my example, I am using GitHub, but you can use ActiveDirectory, Google, Keycloak, or even your own OAuth2 server. This is my Nginx configuration:

This is my Nginx configuration:

upstream app {
    server streamlit:8501;
}

upstream oauth2 {
    server oauth2-proxy:4180;
}

server {
    listen 8000;

    location / {
        auth_request /oauth2/auth;
        error_page 401 = @error401;
        try_files $uri @proxy_to_app;
    }

    location /_stcore/stream {
        auth_request /oauth2/auth;
        error_page 401 = @error401;
        proxy_pass http://app/_stcore/stream;
        proxy_http_version 1.1;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 86400;
    }

    location @error401 {
        return 302 /oauth2/sign_in;
    }

    location /oauth2/ {
        try_files $uri @proxy_to_oauth2;
    }

    location @proxy_to_oauth2 {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_pass http://oauth2;
    }

    location @proxy_to_app {
        proxy_set_header X-Forwarded-Proto https;
        proxy_set_header X-Url-Scheme $scheme;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://app;
    }
}

The complete stack can be seen in the docker-compose.yml:

version: '3.9'

services:
  streamlit:
    build: .
    environment:
      - ENVIRONMENT=docker
    command: ["streamlit", "run", "st.py", "--server.port=8501", "--server.address=0.0.0.0"]

  nginx:
    build: .docker/nginx
    ports:
      - "8000:8000"

  oauth2-proxy:
    image: quay.io/oauth2-proxy/oauth2-proxy:v7.8.1
    env_file:
      - .env

And that’s all. The advantage of using oauth2-proxy is that we don’t need to do anything within the Streamlit application to have OAuth2 authentication. This greatly simplifies the integration process, as all the authentication logic is handled outside the main application. Additionally, oauth2-proxy is compatible with any OAuth2 server that complies with OpenID, giving us the flexibility to use different authentication providers. By using Nginx as a reverse proxy, we can efficiently redirect and manage authentication requests, ensuring that only authenticated users can access our Streamlit application.

Full code available in my github account.

Deploying Python Application using Docker and Kubernetes

I’ve learning how to deploy one Python application to Kubernetes. Here you can see my notes:

Let’s start from a dummy Python application. It’s a basic Flask web API. Each time we perform a GET request to “/” we increase one counter and see the number of hits. The persistence layer will be a Redis database. The script is very simple:

from flask import Flask
import os
from redis import Redis

redis = Redis(host=os.getenv('REDIS_HOST', 'localhost'),
              port=os.getenv('REDIS_PORT', 6379))
app = Flask(__name__)

@app.route('/')
def hello():
    redis.incr('hits')
    hits = int(redis.get('hits'))
    return f"Hits: {hits}"


if __name__ == "__main__":
    app.run(host='0.0.0.0')

First of all we create a virtual environment to ensure that we’re going to install your dependencies isolatedly:

python -m venv venv

We enter in the virtualenv

source venv/bin/activate

And we install our dependencies:

pip install -r requirements.txt

To be able to run our application we must ensure that we’ve a Redis database ready. We can run one with Docker:

docker run -p 6379:6379 redis

Now we can start our application:

python app.py

We open our browser with the url: http://localhost:5000 and it works.

Now we’re going to run our application within a Docker container. First of of all we need to create one Docker image from a docker file:

FROM python:alpine3.8
ADD . /code
WORKDIR /code
RUN pip install -r requirements.txt

EXPOSE 5000

Now we can build or image:

docker build -t front .

And now we can run our front image:

docker run -p 5000:5000 front python app.py

If we open now our browser with the url http://localhost:5000 we’ll get a 500 error. That’s because our Docker container is trying to use one Redis host within localhost. It worked before, when our application and our Redis were within the same host. Now our API’s localhost isn’t the same than our host’s one.

Our script the Redis host is localhost by default but it can be passed from an environment variable,

redis = Redis(host=os.getenv('REDIS_HOST', 'localhost'),
              port=os.getenv('REDIS_PORT', 6379))

we can pass to our our Docker container the real host where our Redis resides (suposing my IP address is 192.168.1.100):

docker run -p 5000:5000 --env REDIS_HOST=192.168.1.100 front python app.py

If dont’ want the development server we also can start our API using gunicorn

docker run -p 5000:5000 --env REDIS_HOST=192.168.1.100 front gunicorn -w 1 app:app -b 0.0.0.0:5000

And that works. We can start our app manually using Docker. But it's a bit complicated. We need to run two containers (API and Redis), setting up the env variables.
Docker helps us with docker-compose. We can create a docker-compose.yaml file configuring our all application:


version: '2'

services:
  front:
    image: front
    build:
      context: ./front
      dockerfile: Dockerfile
    container_name: front
    command: gunicorn -w 1 app:app -b 0.0.0.0:5000
    depends_on:
      - redis
    ports:
      - "5000:5000"
    restart: always
    environment:
      REDIS_HOST: redis
      REDIS_PORT: 6379
  redis:
    image: redis
    ports:
      - "6379:6379"

I can execute it

docker-compose up

Docker compose is pretty straightforward. But, what happens if our production environment is a cluster? docker-compose works fine in a single host. But it our production environment is a cluster, we´ll face problems (we need to esure manually things like hight avaiavility and things like that). Docker people tried to answer to this question with Docker Swarm. Basically Swarm is docker-compose within a cluster. It uses almost the same syntax than docker-compose in a single host. Looks good, ins’t it? OK. Nobody uses it. Since Google created their Docker conainer orchestator (Kubernetes, aka K8s) it becames into the de-facto standard. The good thing about K8s is that it’s much better than Swarm (more configurable and more powerfull), but the bad part is that it isn’t as simple and easy to understand as docker-compose.

Let’s try to execute our proyect in K8s:

First I start minikube

minikube start

and I configure kubectl to connect to my minikube k8s cluster

eval $(minikube docker-env)

The API:

First we create one service:

apiVersion: v1
kind: Service
metadata:
  name: front-api
spec:
  # types:
  # - ClusterIP: (default) only accessible from within the Kubernetes cluster
  # - NodePort: accessible on a static port on each Node in the cluster
  # - LoadBalancer: accessible externally through a cloud provider's load balancer
  type: LoadBalancer
  # When the node receives a request on the static port (30163)
  # "select pods with the label 'app' set to 'front-api'"
  # and forward the request to one of them
  selector:
    app: front-api
  ports:
    - protocol: TCP
      port: 5000 # port exposed internally in the cluster
      targetPort: 5000 # the container port to send requests to
      nodePort: 30164 # a static port assigned on each the node

And one deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: front-api
spec:
  # How many copies of each pod do we want?
  replicas: 1

  selector:
    matchLabels:
      # This must match the labels we set on the pod!
      app: front-api

  # This template field is a regular pod configuration
  # nested inside the deployment spec
  template:
    metadata:
      # Set labels on the pod.
      # This is used in the deployment selector.
      labels:
        app: front-api
    spec:
      containers:
        - name: front-api
          image: front:v1
          args: ["gunicorn", "-w 1", "app:app", "-b 0.0.0.0:5000"]
          ports:
            - containerPort: 5000
          env:
            - name: REDIS_HOST
              valueFrom:
                configMapKeyRef:
                  name: api-config
                  key: redis.host

In order to learn a little bit of K8s I’m using a config map called ‘api-config’ where I put some information (such as the Redis host that I’m going to pass as a env variable)

apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
data:
  redis.host: "back-api"

The Backend: Our Redis database:

First the service:

apiVersion: v1
kind: Service
metadata:
  name: back-api
spec:
  type: ClusterIP
  ports:
    - port: 6379
      targetPort: 6379
  selector:
    app: back-api

And finally the deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: back-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: back-api
  template:
    metadata:
      labels:
        app: back-api
    spec:
      containers:
        - name: back-api
          image: redis
          ports:
            - containerPort: 6379
              name: redis

Before deploying my application to the cluster I need to build my docker image and tag it

docker build -t front .
docker tag front front:v1

Now I can deploy my application to my K8s cluster:

kubectl apply -f .k8s/

If want to know what’s the external url of my application in the cluster I can use this command

minikube service front-api --url

Then I can see it running using the browser or with curl

curl $(minikube service front-api --url)

And that’s all. I can delete all application alos

kubectl delete -f .k8s/ 

Source code available in my github