Deployment

The CLI is useful for testing the crawler, but in production you should deploy a server instead. Because Polymath is a collection of libraries, you can build your own server quite easily. Alternatively, you can use the server we provide; the documentation below shows you how.

⚠️ To expose an API over HTTP or RPC, you need to use a customised server.

Docker

We recommend deploying the Polymath crawler with Docker (or Podman). In this example, we deploy Polymath together with its extension for saving pages to Apache Solr.

Create a docker-compose.yaml file with the following content:

services:
    solr:
        image: solr:9-slim
        ports:
            - 8983:8983
        volumes:
            - data:/var/solr
        command:
            - solr-precreate
            - gettingstarted

    zookeeper:
        image: wurstmeister/zookeeper
        ports:
            - 2181:2181

    kafka:
        image: wurstmeister/kafka
        depends_on:
            - zookeeper
        ports:
            # Publish the OUTSIDE listener, which is advertised as localhost:9093.
            - 9093:9093
        environment:
            KAFKA_ADVERTISED_LISTENERS: INSIDE://kafka:9092,OUTSIDE://localhost:9093
            KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
            KAFKA_LISTENERS: INSIDE://0.0.0.0:9092,OUTSIDE://0.0.0.0:9093
            KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
            KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
            KAFKA_CREATE_TOPICS: "baeldung:1:1"


    polymath:
        image: ghcr.io/lubmminy/polymath
        depends_on:
            - solr
            - kafka

volumes:
    data:
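
Once the stack is running, it can help to confirm that the Solr core is reachable before pointing the crawler at it. The following is a minimal sketch, not part of Polymath: it assumes Solr is published on localhost:8983 and that the core is named gettingstarted, as in the compose file above, and it relies on Solr's default ping handler at /admin/ping answering with a JSON body whose status is "OK". The function names are hypothetical.

```python
import json
import urllib.request

# Assumptions taken from the compose file above: Solr is published on
# localhost:8983 and solr-precreate created a core named "gettingstarted".
SOLR_BASE_URL = "http://localhost:8983/solr"
CORE_NAME = "gettingstarted"


def ping_url(base_url: str, core: str) -> str:
    """Build the URL of Solr's ping handler for one core."""
    return f"{base_url.rstrip('/')}/{core}/admin/ping?wt=json"


def is_healthy(body: str) -> bool:
    """Return True when a ping response body reports status OK."""
    try:
        return json.loads(body).get("status") == "OK"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False


def check_solr(base_url: str = SOLR_BASE_URL,
               core: str = CORE_NAME,
               timeout: float = 5.0) -> bool:
    """Send one ping request; any network or HTTP error counts as unhealthy."""
    try:
        with urllib.request.urlopen(ping_url(base_url, core), timeout=timeout) as resp:
            return is_healthy(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # URLError and HTTPError are subclasses of OSError.
        return False
```

After starting the services with `docker compose up -d`, calling `check_solr()` should return `True` once the core answers; while Solr is still starting it returns `False`, so the call can be repeated in a loop with a short delay.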