Scoopi Cluster Containers
We can also run Scoopi Cluster as plain Docker containers.
Install Scoopi from Docker Image
Scoopi releases are available as a Docker image on Docker Hub. To run the image, you need Docker installed on the system. The following command pulls the Scoopi image, then creates and runs a container named scoopi.
docker run --name scoopi codetab/scoopi
It executes the quick-start example, which writes a single record to an output file. However, we cannot view the output file or modify the conf files because they are inside the container. To externalize these folders, run the following commands.
mkdir scoopi
cd scoopi
docker cp scoopi:/scoopi/conf .
docker cp scoopi:/scoopi/output .
docker cp scoopi:/scoopi/docker .
docker cp scoopi:/scoopi/defs .
docker cp scoopi:/scoopi/logs .
Here, we make a folder named scoopi and then copy the conf, output, docker, defs and logs folders from the container into it. Now we can modify the conf and defs files and view the output file without logging into the container. Next, remove the container, as we are going to recreate it with a new set of parameters.
docker rm scoopi
By default, Scoopi runs in solo mode. Change to cluster mode via the scoopi.cluster.enable config in the conf/scoopi.properties file. Optionally, change scoopi.defs.dir from the quickstart example to ex-13. The scoopi.properties file is in the conf directory we copied above.
scoopi.cluster.enable=true
scoopi.defs.dir=/defs/examples/fin/jsoup/ex-13
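The same change can be scripted instead of edited by hand. The following is a minimal sketch, not part of Scoopi: set_prop is a hypothetical helper name, and it assumes the properties file uses plain key=value lines with no spaces around the equals sign.

```shell
# Hypothetical helper: set key=value in a Java-style properties file.
# Replaces an existing "key=..." line, or appends the pair if the key is absent.
# Assumes the value contains no "|" or "&" characters (they are special to sed).
set_prop() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        # "|" is the sed delimiter because values such as paths contain slashes.
        # -i.bak edits in place and keeps a backup copy next to the file.
        sed -i.bak "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

From the scoopi folder, it could be called as set_prop conf/scoopi.properties scoopi.cluster.enable true, and again with scoopi.defs.dir and /defs/examples/fin/jsoup/ex-13.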
Docker Compose
It is quite easy to run a Scoopi cluster with Docker Compose. The docker folder contains docker-cluster.yml, which boots up a Scoopi cluster with three servers. Change to the folder where you installed Scoopi from the Docker image and run
cd scoopi
cp docker/docker-cluster.yml .
docker-compose -f docker-cluster.yml up
If Docker Compose is not available, we can bring up the cluster with the docker run command, as explained in the next section.
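To decide which route applies on a given machine, a small check like the one below may help. It is only a sketch: compose_cmd is a hypothetical helper name, and the second branch assumes the Compose V2 plugin, which is invoked as docker compose rather than docker-compose.

```shell
# Hypothetical helper: print the Compose command available on this machine.
# Returns non-zero (and prints nothing) when neither variant is found.
compose_cmd() {
    if command -v docker-compose >/dev/null 2>&1; then
        # Standalone docker-compose binary, as used in the command above.
        echo "docker-compose"
    elif docker compose version >/dev/null 2>&1; then
        # Compose V2 plugin bundled with newer Docker installations.
        echo "docker compose"
    else
        return 1
    fi
}
```

If compose_cmd prints a command, use it in place of docker-compose in the step above; if it fails, follow the docker run steps instead.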
With Docker Run
Install Scoopi from the Docker image and update scoopi.properties as explained above. By default, we have to run three Scoopi containers to form the cluster. Open a terminal and run the following command. Wait until the node boots up and shows the message: wait for cluster quorum of 3 nodes, timeout 60.
NODE_NAME=scoopi-node-1
JAVA_OPTS="-Dscoopi.cluster.config.file=/hazelcast-multicast.xml"
docker run --name $NODE_NAME -d \
-v $PWD/conf:/scoopi/conf -v $PWD/defs:/scoopi/defs \
-v $PWD/logs:/scoopi/logs -v $PWD/output:/scoopi/output \
-v $PWD/data:/scoopi/data \
-e JAVA_OPTS="$JAVA_OPTS" codetab/scoopi:latest
Next, run the above command again with NODE_NAME changed to scoopi-node-2. Then start the third instance with metrics enabled using the following command.
NODE_NAME=scoopi-node-3
JAVA_OPTS="-Dscoopi.cluster.config.file=/hazelcast-multicast.xml"
JAVA_OPTS+=" -Dscoopi.metrics.server.enable=true"
docker run --name $NODE_NAME -d \
-v $PWD/conf:/scoopi/conf -v $PWD/defs:/scoopi/defs \
-v $PWD/logs:/scoopi/logs -v $PWD/output:/scoopi/output \
-v $PWD/data:/scoopi/data \
-e JAVA_OPTS="$JAVA_OPTS" codetab/scoopi:latest
Once the third instance boots up, the cluster is formed. Scoopi then runs the quick-start example to completion, and the output folder will contain the scraped data.
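The three server nodes differ only in their name and in the metrics option on the third, so the steps can be wrapped in a loop. The sketch below is one possible way to do that. The function names are hypothetical, and it assumes the output folder is the one copied into the scoopi folder earlier, mounted as $PWD/output.

```shell
# Print the JAVA_OPTS for server node number $1.
# Only node 3 enables the metrics server, as in the manual steps.
node_java_opts() {
    opts="-Dscoopi.cluster.config.file=/hazelcast-multicast.xml"
    if [ "$1" -eq 3 ]; then
        opts="$opts -Dscoopi.metrics.server.enable=true"
    fi
    printf '%s\n' "$opts"
}

# Start server node number $1 as a detached container.
# Assumption: output is mounted from $PWD/output.
start_node() {
    docker run --name "scoopi-node-$1" -d \
        -v "$PWD/conf:/scoopi/conf" -v "$PWD/defs:/scoopi/defs" \
        -v "$PWD/logs:/scoopi/logs" -v "$PWD/output:/scoopi/output" \
        -v "$PWD/data:/scoopi/data" \
        -e JAVA_OPTS="$(node_java_opts "$1")" codetab/scoopi:latest
}

# Start all three server nodes, stopping at the first failure.
start_cluster() {
    for n in 1 2 3; do
        start_node "$n" || return 1
    done
}
```

Run start_cluster from the scoopi folder so that $PWD resolves to the folder holding conf, defs, logs and output.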
For clients, we need to update the conf/hazelcast-client.xml file with the addresses on the bridge network that the Docker containers use to connect to each other. When you run a server instance, the IP:Port it uses is printed to the console at startup, something like: this member address: /172.17.0.2:5701. Update the network element in conf/hazelcast-client.xml as
<network>
<cluster-members>
<address>172.17.0.1</address>
<address>172.17.0.2</address>
<address>172.17.0.3</address>
</cluster-members>
</network>
The subnet may vary on your system; use the subnet of your Docker bridge network. Now, run the client instance with
NODE_NAME=scoopi-node-4
JAVA_OPTS="-Dscoopi.cluster.mode=client"
JAVA_OPTS+=" -Dscoopi.cluster.config.file=/hazelcast-client.xml"
docker run --name $NODE_NAME -d \
-v $PWD/conf:/scoopi/conf -v $PWD/defs:/scoopi/defs \
-v $PWD/logs:/scoopi/logs -v $PWD/output:/scoopi/output \
-v $PWD/data:/scoopi/data \
-e JAVA_OPTS="$JAVA_OPTS" codetab/scoopi:latest
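The server addresses listed in conf/hazelcast-client.xml can be looked up rather than read off the console by eye. The sketch below shows two hedged options with hypothetical helper names: container_ip asks Docker for a container's address, and member_ip parses a startup log line, assuming it has the wording quoted above (the exact log format may differ).

```shell
# Print the IP address(es) Docker assigned to container $1.
container_ip() {
    docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# Read log lines on stdin and print the IP from lines shaped like
# "this member address: /172.17.0.2:5701" (assumed format).
member_ip() {
    sed -n 's|.*member address: /\([0-9.]*\):[0-9]*.*|\1|p'
}
```

For example, container_ip scoopi-node-1 queries Docker directly, while docker logs scoopi-node-1 2>&1 | member_ip extracts the address from the server's own output.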
For more about configuring the Hazelcast network in various scenarios, refer to Configuring Hazelcast in non-orchestrated Docker environments.