Guided Exercise: Creating a Connect Cluster

In this exercise, you will create and configure a Kafka Connect cluster with a predefined source connector.

Outcomes

You should be able to create a Kafka Connect cluster with AMQ Streams on the Red Hat OpenShift Container Platform.

To perform this exercise, ensure you have the following:

  • Access to a configured and running OpenShift cluster.

  • Access to an installed and running Kafka instance in the OpenShift cluster.

  • A configured Python virtual environment, including the grading scripts for this course.

  • The OpenShift CLI (oc).

  • A Quay.io account.

From your workspace directory, activate the Python virtual environment, and use the lab command to start the scenario for this exercise.

[user@host AD482]$ source .venv/bin/activate
(.venv) [user@host AD482]$ lab start connect-cluster
...output omitted...

Important

On Windows, use the Activate.ps1 script to activate your Python virtual environment.

PS C:\Users\user\AD482> ./.venv/Scripts/Activate.ps1

The lab command copies the exercise files from the AD482-apps/connect-cluster/apps/resources directory in your local Git repository into the connect-cluster directory in your workspace.

Procedure 5.1. Instructions

  1. Create a public Quay.io repository and download your Quay.io credentials. Create a push secret that the cluster build uses to push the Kafka Connect cluster image.

    1. Navigate to https://quay.io/YOUR_QUAY_USERNAME and create a public repository with the name ad482-filewatch-connect-cluster. This repository will hold your Kafka Connect image.

    2. Navigate to your account settings and click Generate Encrypted Password. Click Docker Configuration, and then click Download YOUR_QUAY_USERNAME-auth.json to download the credentials file.

    3. Create a secret by using the credentials file that you just downloaded. Specify the file path of the YOUR_QUAY_USERNAME-auth.json file with the --from-file parameter.

      (.venv) [user@host AD482]$ oc create secret generic \
       filewatch-connect-cluster-push-secret \
       --from-file=.dockerconfigjson=/absolute/path/to/YOUR_QUAY_USERNAME-auth.json \
       --type=kubernetes.io/dockerconfigjson
      secret/filewatch-connect-cluster-push-secret created
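  If you are curious about what the secret stores, the kubernetes.io/dockerconfigjson type wraps a small JSON document with one base64-encoded "user:password" entry per registry. The following Python sketch builds an equivalent payload; the username and password are placeholders, not real credentials, and in the exercise the downloaded YOUR_QUAY_USERNAME-auth.json file already contains this structure.

```python
import base64
import json

# Hypothetical placeholder credentials; in the exercise, the downloaded
# YOUR_QUAY_USERNAME-auth.json file already contains the real values.
username = "quayuser"
password = "encrypted-password"

# A dockerconfigjson document stores a base64-encoded "user:password"
# string under the registry hostname.
auth = base64.b64encode(f"{username}:{password}".encode()).decode()
dockerconfig = {"auths": {"quay.io": {"auth": auth}}}

print(json.dumps(dockerconfig))
```

  Decoding the auth field recovers the original "user:password" pair, which is how the image build authenticates against Quay.io when pushing.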
  2. Create a Kafka Connect cluster named filewatch-connect-cluster.

    1. Change to the connect-cluster directory.

      (.venv) [user@host AD482]$ cd connect-cluster
      (.venv) [user@host connect-cluster]$
    2. By using your editor of choice, open the filewatch-connect-cluster.yaml file. The file contains a Kafka Connect cluster configuration that requires the Camel File Watch Kafka connector plug-in. The plug-in adds file watching capabilities to the Kafka Connect cluster image.

      Replace YOUR_QUAY_USERNAME with your Quay.io username in the YAML file.

      apiVersion: kafka.strimzi.io/v1beta2
      kind: KafkaConnect
      ...output omitted...
        build:
          output:
            image: 'quay.io/YOUR_QUAY_USERNAME/ad482-filewatch-connect-cluster:latest'
            pushSecret: filewatch-connect-cluster-push-secret
            type: docker
          plugins:
            - artifacts:
                - type: tgz
                  url: >-
                    https://repo1.maven.org/maven2/org/apache/camel/kafkaconnector/camel-file-watch-kafka-connector/0.10.1/camel-file-watch-kafka-connector-0.10.1-package.tar.gz
              name: filewatch-source-connector
        config:
      ...output omitted...
        replicas: 1
        version: 2.8.0
    3. Run the following command to create the cluster.

      (.venv) [user@host connect-cluster]$ oc create \
       -f filewatch-connect-cluster.yaml
      kafkaconnect.kafka.strimzi.io/filewatch-connect-cluster created
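  Substituting your username into the manifest can also be scripted. This is a minimal sketch, assuming the file uses the literal YOUR_QUAY_USERNAME placeholder as shown in the excerpt; the username value is hypothetical.

```python
# Minimal sketch: replace the YOUR_QUAY_USERNAME placeholder in the
# KafkaConnect manifest. The username is a hypothetical example value.
quay_username = "quayuser"

# A fragment of the manifest shown in the exercise; in practice you would
# read the full filewatch-connect-cluster.yaml file instead.
manifest = (
    "image: 'quay.io/YOUR_QUAY_USERNAME/"
    "ad482-filewatch-connect-cluster:latest'"
)

rendered = manifest.replace("YOUR_QUAY_USERNAME", quay_username)
print(rendered)
```

  In the exercise, you would apply the same replacement to the whole YAML file and write it back before running oc create.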
  3. Verify that the Kafka Connect cluster is working properly.

    1. Run the following command to verify that the Kafka Connect cluster is in the Ready state.

      Important

      You may need to run the command several times until the output shows the desired state.

      (.venv) [user@host connect-cluster]$ oc get pods
      NAME                                                 READY   STATUS      RESTARTS   AGE
      ...output omitted...
      filewatch-connect-cluster-connect-57488b8f95-qfkv7   1/1     Running     0          2m41s
      filewatch-connect-cluster-connect-build-1-build      0/1     Completed   0          3m19s
      ...output omitted...
    2. Verify that the Kafka Connect cluster plug-ins are available in the pod.

      (.venv) [user@host connect-cluster]$ oc exec \
       filewatch-connect-cluster-connect-POD_SUFFIX \
       -- ls -R plugins/filewatch-source-connector
      ...output omitted...
      camel-file-watch-kafka-connector-0.10.1.jar
      ...output omitted...
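  The POD_SUFFIX in the preceding commands is the random suffix that OpenShift appends to the deployment name. If you want to script around it, the following sketch extracts the running Connect worker pod name from oc get pods output; the sample lines are copied from the earlier listing.

```python
# Sketch: find the running Kafka Connect worker pod in "oc get pods"
# output. The sample output is copied from the earlier verification step;
# in practice you would capture the real output of "oc get pods".
sample_output = """\
filewatch-connect-cluster-connect-57488b8f95-qfkv7   1/1     Running     0          2m41s
filewatch-connect-cluster-connect-build-1-build      0/1     Completed   0          3m19s
"""

connect_pod = None
for line in sample_output.splitlines():
    fields = line.split()
    name, status = fields[0], fields[2]
    # Skip the one-off image build pod; keep the running Connect worker.
    if (name.startswith("filewatch-connect-cluster-connect")
            and "-build" not in name and status == "Running"):
        connect_pod = name

print(connect_pod)
```

  The same filtering can be done directly in the shell with oc get pods and grep; the Python version just makes the selection logic explicit.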
  4. Create the connector resource by using the preconfigured YAML.

    (.venv) [user@host connect-cluster]$ oc create \
     -f filewatch-source-connector.yaml
    kafkaconnector.kafka.strimzi.io/filewatch-source-connector created
  5. Interact with the Kafka Connect REST API to get the cluster and connectors information.

    Note

    Because the Kafka Connect cluster is not directly accessible from outside the OpenShift cluster, you must run the curl commands inside the cluster pod to query the REST API.

    1. Run the following command to get the general cluster information.

      Important

      You might need to run the command several times. It can take time for the Kafka Connect cluster and connector to be ready.

      (.venv) [user@host connect-cluster]$ oc exec -it \
       filewatch-connect-cluster-connect-POD_SUFFIX \
       -- curl -X GET localhost:8083
      {"version":"2.8.0.redhat-00005","commit":"5568dc65ed44500b",
      "kafka_cluster_id":"pkAfTbvQTuqfo10Hlg_PxQ"}
    2. Run a curl command for the path /connectors in the pod to get the connectors list.

      (.venv) [user@host connect-cluster]$ oc exec -it \
       filewatch-connect-cluster-connect-POD_SUFFIX \
       -- curl -X GET localhost:8083/connectors
      ["filewatch-source-connector"]
    3. Run the following command to get the filewatch-source-connector details by using the REST API connectors/filewatch-source-connector path.

      (.venv) [user@host connect-cluster]$ oc exec -it \
       filewatch-connect-cluster-connect-POD_SUFFIX \
       -- curl -X GET localhost:8083/connectors/filewatch-source-connector
      {"name":"filewatch-source-connector","config":{"connector.class":"org.apache.camel.kafkaconnector.filewatch.
      CamelFilewatchSourceConnector","camel.source.path.path":"/tmp","tasks.max":"1","name":"filewatch-source-connector"},"tasks":[{"connector":"filewatch-source-connector","task":0}],"type":"source"}
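  The JSON that the connectors endpoint returns is easy to inspect programmatically. A sketch, using the response body shown in the preceding command:

```python
import json

# The response body copied from the preceding curl command.
response = (
    '{"name":"filewatch-source-connector","config":{"connector.class":'
    '"org.apache.camel.kafkaconnector.filewatch.CamelFilewatchSourceConnector",'
    '"camel.source.path.path":"/tmp","tasks.max":"1",'
    '"name":"filewatch-source-connector"},'
    '"tasks":[{"connector":"filewatch-source-connector","task":0}],'
    '"type":"source"}'
)

connector = json.loads(response)
# The connector watches the path configured in camel.source.path.path.
watched_path = connector["config"]["camel.source.path.path"]
print(connector["name"], connector["type"], watched_path)
```

  Parsing the response confirms that the connector is a source connector with one task, configured to watch the /tmp path, which matches the test in the next step.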
  6. Test the Kafka Connect cluster by checking the connector that the cluster provides. It is a file watch connector that watches the /tmp path and writes any changes to a topic.

    1. Copy the test directory to the Kafka Connect pod's /tmp path.

      (.venv) [user@host connect-cluster]$ oc rsync ./test \
       filewatch-connect-cluster-connect-POD_SUFFIX:/tmp
      sending incremental file list
      test/
      test/file1.txt
      test/file2.txt
      
      sent 208 bytes  received 70 bytes  79.43 bytes/sec
      total size is 34  speedup is 0.12
    2. Run the console-consumer command in one of the Kafka broker pods. Verify that, through the provided connector, the Kafka Connect cluster produced records for the file changes in the /tmp directory.

      (.venv) [user@host connect-cluster]$ oc exec \
      -it my-cluster-kafka-0 -- bin/kafka-console-consumer.sh \
       --bootstrap-server my-cluster-kafka-brokers:9092 \
       --topic test --from-beginning
      /tmp/test
      /tmp/test/file1.txt
      /tmp/test/file2.txt

      Stop the console consumer.

      Important

      The preceding consumer output might vary slightly depending on the transfer speed of oc rsync. A slow transfer can cause the Camel File Watch plug-in to capture temporary files, such as /tmp/test/.file1.txt.CzKvvh. You might see these files in the console output.
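  If you post-process the consumer output, you can filter out these transient entries. A sketch, assuming the temporary files follow the hidden dot-prefix naming shown in the note; the sample list is hypothetical:

```python
import os

# Hypothetical consumer output, including one transient entry of the kind
# that the Camel File Watch plug-in can capture during a slow oc rsync.
consumer_output = [
    "/tmp/test",
    "/tmp/test/file1.txt",
    "/tmp/test/.file1.txt.CzKvvh",  # transient temporary file
    "/tmp/test/file2.txt",
]

# Keep only entries whose final path component is not a hidden file.
stable = [path for path in consumer_output
          if not os.path.basename(path).startswith(".")]
print(stable)
```

  This filter relies only on the dot prefix of the temporary file names; if the plug-in used a different naming scheme, the condition would need to change accordingly.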

Finish

Return to your workspace directory and use the lab command to complete this exercise. This is important to ensure that resources from previous exercises do not impact upcoming exercises.

(.venv) [user@host AD482]$ lab finish connect-cluster

This concludes the guided exercise.

Revision: ad482-1.8-cc2ae1c