Sunday, September 17, 2017

Elastic Stack on Z

The Elastic Stack, also known as the ELK stack, is a popular choice for managing logs. ELK is an acronym for its three main components, Elasticsearch, Logstash and Kibana; Elastic Stack is the more recent name. The stack is maintained by Elastic. Elasticsearch and Logstash run on a Java Virtual Machine, while Kibana is a Node.js application. The three building blocks have a clear separation of duties:
  • Elasticsearch is a database for storing and searching the log entries
  • Logstash ingests logs in various formats and can transform them for efficient processing with Elasticsearch
  • Kibana is a graphical, web-based front end to Elasticsearch
E, L and K can operate in a Linux on IBM Z environment. IBM's Common Data Provider can even handle z/OS logs like SMF data. Here's how to run ELK on the mainframe -- of course in containers:
First, we need to create the E, L and K containers. A good starting point is https://github.com/linux-on-ibm-z/dockerfile-examples/. To get the most out of the Java VM (and thus out of ELK performance), we can tweak these Dockerfiles a bit.

Let's start with Elasticsearch: this application is known not to work with IBM Java, so on the mainframe OpenJDK is the choice. With version 9 of OpenJDK (to be officially released in just a few days), s390x gets a just-in-time (JIT) compiler in the Java Virtual Machine, which is obviously a prerequisite for decent performance. A few tweaks are necessary when building Elasticsearch, since its build does not yet take OpenJDK 9 into account much.
The official openjdk image already provides a Java Virtual Machine with JIT, so Elasticsearch can be built with a Dockerfile like this one:
FROM openjdk:9-jdk
ADD gradle.diff /tmp
ENV LANG="en_US.UTF-8" JAVA_TOOL_OPTIONS="-Dfile.encoding=UTF8" _JAVA_OPTIONS="-Xmx10g" SOURCE_DIR="/tmp/"
ENV JDK_JAVA_OPTIONS="--illegal-access=permit"
WORKDIR $SOURCE_DIR

RUN apt-get update &&  apt-get install -y \
    ant autoconf automake ca-certificates ca-certificates-java curl \
    git libtool libx11-dev libxt-dev locales-all make maven patch \
    pkg-config tar texinfo unzip wget \
 && wget https://services.gradle.org/distributions/gradle-3.3-bin.zip \
 && unzip gradle-3.3-bin.zip \
 && mv gradle-3.3/ /usr/share/gradle \
 && rm -rf gradle-3.3-bin.zip \
# Download and build the Elasticsearch source code
 && cd $SOURCE_DIR \
 && git clone https://github.com/elastic/elasticsearch \
 && cd elasticsearch \
 && git checkout v5.5.2 \
 && patch -p1 < /tmp/gradle.diff \
 && export PATH=$PATH:/usr/share/gradle/bin \
 && gradle -Dbuild.snapshot=false assemble -Djavax.net.ssl.trustStore=/usr/lib/jvm/java-9-openjdk-s390x/lib/security/cacerts -Djavax.net.ssl.trustStorePassword=changeit \
 && cd $SOURCE_DIR/elasticsearch/distribution/tar/build/distributions/ \
 && tar -C /usr/share/ -xf elasticsearch-5.5.2.tar.gz \
 && mv /usr/share/elasticsearch-5.5.2 /usr/share/elasticsearch \
 && mv /usr/share/elasticsearch/config/elasticsearch.yml /etc/ \
 && ln -s /etc/elasticsearch.yml /usr/share/elasticsearch/config/elasticsearch.yml \
# Clean up cache data and remove dependencies that are not required
 && apt-get remove -y ant autoconf automake git libtool libx11-dev libxt-dev \
    maven patch pkg-config unzip wget \
 && apt-get autoremove -y \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/* /usr/share/gradle /root/.gradle/* /tmp/elasticsearch

EXPOSE 9200 9300

ENV PATH=/usr/share/elasticsearch/bin:$PATH

CMD ["elasticsearch"]
The file gradle.diff needs to be present in the build directory -- it disables the Javadoc tasks to work around a glitch of Gradle with OpenJDK 9:
diff -Bub a/build.gradle b/build.gradle
--- a/build.gradle        2017-09-11 21:16:10.455783363 +0000
+++ b/build.gradle        2017-09-11 21:18:47.995590949 +0000
@@ -158,7 +158,7 @@
       }
     }
     // ignore missing javadocs
-    tasks.withType(Javadoc) { Javadoc javadoc ->
+    tasks.withType(Javadoc) { enabled=false } /* Javadoc javadoc ->
       // the -quiet here is because of a bug in gradle, in that adding a string option
       // by itself is not added to the options. By adding quiet, both this option and
       // the "value" -quiet is added, separated by a space. This is ok since the javadoc
@@ -166,15 +166,15 @@
       // see https://discuss.gradle.org/t/add-custom-javadoc-option-that-does-not-take-an-argument/5959
       javadoc.options.encoding='UTF8'
       javadoc.options.addStringOption('Xdoclint:all,-missing', '-quiet')
-      /*
+      / *
       TODO: building javadocs with java 9 b118 is currently broken with weird errors, so
       for now this is commented out...try again with the next ea build...
       javadoc.executable = new File(project.javaHome, 'bin/javadoc')
       if (project.javaVersion == JavaVersion.VERSION_1_9) {
         // TODO: remove this hack! gradle should be passing this...
         javadoc.options.addStringOption('source', '8')
-      }*/
-    }
+      } * /
+    } */
   }

   /* Sets up the dependencies that we build as part of this project but
(Credits for the diff go to the Toronto porting team, who also created and maintain the initial Dockerfile at the link above.) Put both files in a directory and build the container image for Elasticsearch.

For Logstash, we can use IBM Java (also a Docker official image), since Logstash won't build with Java 9 at this time. This Dockerfile does the trick:
FROM ibmjava:8-sdk
WORKDIR "/root"
ENV JAVA_HOME=/opt/ibm/java/jre
RUN apt-get update && apt-get install -y \
    ant gcc make tar unzip wget \
# Download the Logstash release archive and build jffi from GitHub for s390x
 && wget https://artifacts.elastic.co/downloads/logstash/logstash-5.5.2.zip \
 && unzip -u logstash-5.5.2.zip \
 && wget https://github.com/jnr/jffi/archive/master.zip \
 && unzip master.zip && cd jffi-master && ant && cd .. \
 && mkdir logstash-5.5.2/vendor/jruby/lib/jni/s390x-Linux  \
 && cp jffi-master/build/jni/libjffi-1.2.so logstash-5.5.2/vendor/jruby/lib/jni/s390x-Linux/libjffi-1.2.so \
 && cp -r /root/jffi-master  /usr/share \
 && cp -r /root/logstash-5.5.2 /usr/share/logstash \
# Cleanup cache data, unused packages and source files
 && apt-get remove -y ant make unzip wget \
 && apt-get autoremove -y && apt-get clean \
 && rm -rf /root/ \
 && rm -rf /var/lib/apt/lists/*

# Define mountable directory
VOLUME ["/data"]

# Expose ports
EXPOSE 514 5043 5000 8081 8202/udp 9292

ENV PATH=/usr/share/logstash/bin:$PATH
ENV LS_JAVA_OPTS="-Xms1g -Xmx10g"

CMD ["logstash","-f","/etc/logstash"]
Kibana can be built either way, with IBM Java or OpenJDK 9. Again, here is the Dockerfile:
FROM ibmjava:8-sdk
WORKDIR "/root"
ENV PATH=/usr/share/node-v6.9.1/bin:/usr/share/kibana/bin:$PATH

# Install the dependencies and NodeJS
RUN apt-get update && apt-get install -y \
    apache2 g++ gcc git make nodejs python unzip wget tar \
 && wget https://nodejs.org/dist/v6.9.1/node-v6.9.1-linux-s390x.tar.gz \
 && tar xvzf node-v6.9.1-linux-s390x.tar.gz \
 && mv /root/node-v6.9.1-linux-s390x/ /usr/share/node-v6.9.1 \
# Download and setup Kibana
 && cd /root/ \
 && wget https://artifacts.elastic.co/downloads/kibana/kibana-5.5.2-linux-x86_64.tar.gz \
 && tar xvf kibana-5.5.2-linux-x86_64.tar.gz \
 && mv /root/kibana-5.5.2-linux-x86_64 kibana-5.5.2 \
 && cd /root/kibana-5.5.2 \
 && mv node node_old \
 && ln -s /usr/share/node-v6.9.1/bin/node node \
 && mkdir /etc/kibana \
 && cp config/kibana.yml /etc/kibana \
 && mv /root/kibana-5.5.2/ /usr/share/kibana \
# Cleanup cache data, unused packages and source files
 && apt-get remove -y git make unzip  wget \
 && apt-get autoremove -y && apt-get clean \
 && rm -rf /root/kibana-5.5.2-linux-x86_64.tar.gz /root/node-v6.9.1-linux-s390x.tar.gz \
 && rm -rf /var/lib/apt/lists/*

# Expose 5601 port used by Kibana
# Expose 80 port used by apache
EXPOSE 5601 80

# Start Kibana service
CMD ["kibana","-H","0.0.0.0"]
To build these containers, put each Dockerfile into a separate directory (add the gradle.diff patch into the Elasticsearch directory) and start the builds using
docker build -t <image-name> <directory-name>
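As a sketch, the three builds can be wrapped in a small shell function. The directory names E, L and K and the image names with a 5.5.2 tag are assumptions, chosen to match the docker run commands and the Compose file used later in this post; adjust them to your layout. The function name build_elk_images is made up for this example. Setting DOCKER=echo prints the commands instead of running them.

```shell
#!/bin/sh
# Command used to talk to Docker; override with DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# build_elk_images [VERSION]: build one image per directory.
# Assumed layout: ./E, ./L and ./K each contain one Dockerfile
# (./E additionally contains gradle.diff).
build_elk_images() {
    version="${1:-5.5.2}"
    # Each entry is <directory>:<image name>
    for pair in E:elasticsearch L:logstash K:kibana; do
        dir="${pair%%:*}"
        name="${pair#*:}"
        $DOCKER build -t "$name:$version" "./$dir" || return 1
    done
}
```

Calling build_elk_images without an argument would produce the images elasticsearch:5.5.2, logstash:5.5.2 and kibana:5.5.2.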
Create kibana.yml containing:
elasticsearch.url: "http://elasticsearch:9200"
And to convince Elasticsearch that running with just one instance is fine, create elasticsearch.yml containing:
cluster.name: my-cluster
path.data: /data
http.host: 0.0.0.0
discovery.zen.minimum_master_nodes: 1
Finally, starting each container with the right configuration is all you need to do. A quick hack is something like this:
docker run --name elasticsearch -v $PWD/elasticsearch-data:/data -v $PWD/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -p 9200:9200 -p 9300:9300 -d elasticsearch:5.5.2
docker run --name logstash --link elasticsearch:elasticsearch -v $PWD/ELK:/etc/logstash -p 514:514 -p 5043:5043 -p 8081:8081 -p 8202:8202/udp -p 9292:9292 -d logstash:5.5.2
docker run --name kibana --link elasticsearch:elasticsearch -v $PWD/kibana.yml:/usr/share/kibana/config/kibana.yml -p 5601:5601 -d kibana:5.5.2
Make sure you replace the image names with the ones you used during the build.
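Elasticsearch needs a moment to start, so it can help to wait until it answers before looking at Logstash or Kibana. A minimal sketch, assuming curl is installed on the host and port 9200 is published as in the command above; the function name wait_for_es is made up for this example:

```shell
#!/bin/sh
# Command used for HTTP requests; overridable for testing.
CURL="${CURL:-curl}"

# wait_for_es [URL] [TRIES]: poll the cluster health endpoint until
# Elasticsearch answers, or give up after TRIES attempts.
wait_for_es() {
    url="${1:-http://localhost:9200}"
    tries="${2:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl return non-zero on HTTP errors
        if $CURL -fsS "$url/_cluster/health?pretty"; then
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    echo "Elasticsearch did not answer at $url" >&2
    return 1
}
```

For example, "wait_for_es http://localhost:9200 60" polls for up to about two minutes and prints the cluster health once Elasticsearch is reachable.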

Alternatively, a Compose file is a good way to build and start everything (instead of docker build and docker run). Make sure you have the Dockerfiles (plus the diff for Elasticsearch) in the directories E/, L/ and K/. Then create this docker-compose.yml file:
version: '2'
services:
  elasticsearch:
    build: ./E
    volumes:
      - ./elasticsearch-data:/data
      - ./elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
    ports:
      - "9200:9200"
      - "9300:9300"
    networks:
      - elk

  logstash:
    build: ./L
    volumes:
      - ./ELK:/etc/logstash
    ports:
      - "514:514"
      - "5043:5043"
      - "8081:8081"
      - "8202:8202/udp"
      - "9292:9292"
    networks:
      - elk
    depends_on:
      - elasticsearch

  kibana:
    build: ./K
    volumes:
      - ./kibana.yml:/usr/share/kibana/config/kibana.yml
    ports:
      - "5601:5601"
    networks:
      - elk
    depends_on:
      - elasticsearch

networks:
  elk:
    driver: bridge
A docker-compose up will build the images, if necessary, and start them (this has been updated 2017/09/19).

The ELK directory referenced during the start of Logstash contains the Logstash configuration and is mapped into that container. All files in this directory are simply concatenated and used as configuration by Logstash. This allows you to specify the inputs and outputs of Logstash, as well as the parsing of log entries on the way. For instance, put a file with this content in that directory:
input {
 syslog {
      port => 514
      type => "docker"
      }
}

filter {
}

output {
 elasticsearch {
      hosts => "elasticsearch:9200"
      }
}
and you will be able to receive logs (assuming port 514 is published when starting the container). To use this logging infrastructure, containers (on any host) just need to be started with "--log-driver=syslog --log-opt syslog-address=tcp://logstash-hostname:514" added to the "docker run" parameters. Alternatively, this can be set up permanently for the Docker daemon. All log messages then end up in the Elastic Stack for further processing, transported by the syslog protocol and the Docker syslog log driver.
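For the permanent setup, the Docker daemon reads its defaults from /etc/docker/daemon.json. The following sketch prints such a file with the syslog driver as default; the function name syslog_daemon_json is made up for this example, and logstash-hostname is the same placeholder as above.

```shell
#!/bin/sh
# syslog_daemon_json [HOST] [PORT]: print a daemon.json that makes the
# syslog log driver the default for newly created containers.
# HOST is the machine where the Logstash syslog input listens.
syslog_daemon_json() {
    host="${1:-logstash-hostname}"
    port="${2:-514}"
    cat <<EOF
{
  "log-driver": "syslog",
  "log-opts": {
    "syslog-address": "tcp://$host:$port"
  }
}
EOF
}
```

One way to use it is "syslog_daemon_json logstash-hostname 514 | sudo tee /etc/docker/daemon.json", followed by a restart of the Docker daemon. Note that this overwrites an existing daemon.json, so merge the two keys by hand if you already have one, and that the default only applies to containers created afterwards.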

An alternative is the GELF format ("Graylog Extended Log Format"). It attaches more metadata to log messages, in a form that Logstash understands. A Logstash configuration could look like this:
input {
 gelf {
      port => 8202
      }
}

filter {
}

output {
 elasticsearch {
      hosts => "elasticsearch:9200"
      }
}
Again, containers started this way will have their output show up in ELK, e.g. with "docker run -tid --log-driver=gelf --log-opt gelf-address=udp://logstash-hostname:8202 ubuntu bash".
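After changing files in the ELK directory, it can be useful to let Logstash check the configuration before restarting the container. A sketch using the --config.test_and_exit option of Logstash 5.x; the image name logstash:5.5.2 and the function name check_logstash_config are assumptions for this example:

```shell
#!/bin/sh
# Command used to talk to Docker; overridable for a dry run.
DOCKER="${DOCKER:-docker}"

# check_logstash_config [DIR] [IMAGE]: start a throwaway container that
# parses the configuration files in DIR and exits without processing logs.
check_logstash_config() {
    conf_dir="${1:-$PWD/ELK}"
    image="${2:-logstash:5.5.2}"
    $DOCKER run --rm -v "$conf_dir:/etc/logstash" "$image" \
        logstash -f /etc/logstash --config.test_and_exit
}
```

The exit status of check_logstash_config is that of the Logstash run, so it can guard a restart, as in "check_logstash_config && docker restart logstash".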

Once the three E/L/K containers are started, point your browser to port 5601 of the host running the containers to work with log entries and create your individual visualizations and dashboards.
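If Kibana shows no entries, a quick end-to-end check can tell whether log messages reach Elasticsearch at all. The sketch below sends one message through the GELF input and then searches for it. It assumes the GELF configuration from above on UDP port 8202, curl on the host, and the default index name pattern logstash-* that the elasticsearch output uses when no index is configured; the function name elk_smoke_test and the token are made up for this example.

```shell
#!/bin/sh
# Commands used for Docker and HTTP; overridable for a dry run.
DOCKER="${DOCKER:-docker}"
CURL="${CURL:-curl}"

# elk_smoke_test LOGSTASH_HOST [ES_URL]: log one token via GELF,
# wait briefly, then search Elasticsearch for it.
elk_smoke_test() {
    logstash_host="$1"
    es_url="${2:-http://localhost:9200}"
    token="elksmoketest"
    # A throwaway container whose only output is the token
    $DOCKER run --rm --log-driver=gelf \
        --log-opt "gelf-address=udp://$logstash_host:8202" \
        ubuntu echo "$token" || return 1
    # Give Logstash and Elasticsearch a moment to index the entry
    sleep "${WAIT:-5}"
    $CURL -fsS "$es_url/logstash-*/_search?q=$token&pretty"
}
```

A call such as "elk_smoke_test logstash-hostname" prints the search response; a hit count of zero points to a problem between the container, Logstash and Elasticsearch rather than in Kibana.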
