Spring Data with Spring Boot

Introduction

This blog shows how to integrate Spring Data with Spring Boot. The flow runs from the REST controller, which wires in a service that makes calls to the JPA repository.

Maven Dependency

Initialize the Spring Boot Application using start.spring.io with Web and JPA dependency added to it.

We need to add the below dependencies to support Spring Data JPA and the embedded H2 database:

  1. spring-boot-starter-data-jpa [ org.springframework.boot ]
  2. h2 [ com.h2database ]
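In the pom.xml these map to the following fragment (versions are managed by the Spring Boot parent, so none are specified here):

```xml
<!-- Spring Data JPA starter -->
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<!-- Embedded H2 database -->
<dependency>
  <groupId>com.h2database</groupId>
  <artifactId>h2</artifactId>
  <scope>runtime</scope>
</dependency>
```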

Repository Changes

Define an interface that extends JpaRepository for the entity along with the primary key. Here, we have an Employee Entity and the primary key data type for the entity is String.
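As a sketch (the interface name is illustrative, assuming a typical Spring Boot package layout):

```java
import org.springframework.data.jpa.repository.JpaRepository;

// Repository for the Employee entity, whose primary key type is String.
// Spring Data generates the implementation at runtime.
public interface EmployeeRepository extends JpaRepository<Employee, String> {
}
```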

Service Changes

Inject the repository as a dependency in the service layer and use the findAll() and getById() methods to return the list of employees or a specific employee.
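A minimal sketch of such a service (class and method names are illustrative):

```java
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class EmployeeService {

    @Autowired
    private EmployeeRepository employeeRepository;

    // Return all employees
    public List<Employee> getAllEmployees() {
        return employeeRepository.findAll();
    }

    // Return one employee by its String primary key
    public Employee getEmployee(String id) {
        return employeeRepository.getById(id);
    }
}
```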

Entity Changes

Define a model that maps the entity bean to a table.

Note : Add @JsonIgnoreProperties({"hibernateLazyInitializer", "handler"}) on the entity to avoid the below exception

com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.hibernate.proxy.pojo.bytebuddy.ByteBuddyInterceptor
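A sketch of the entity with the annotation in place (fields are illustrative; getters and setters omitted):

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;

// @JsonIgnoreProperties suppresses the Jackson error on
// Hibernate's lazy-loading proxy fields
@Entity
@JsonIgnoreProperties({"hibernateLazyInitializer", "handler"})
public class Employee {

    @Id
    private String id;
    private String name;
}
```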

Starter Changes

We need to add the below annotations to the starter class:

  1. @EnableJpaRepositories
  2. @EntityScan
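A sketch of the starter class with both annotations (the package names are illustrative):

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.domain.EntityScan;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;

@SpringBootApplication
@EnableJpaRepositories("com.example.demo.repository") // where the JPA interfaces live
@EntityScan("com.example.demo.entity")                // where the @Entity classes live
public class DemoApplication {
    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}
```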

SQL Changes

Place the SQL files ( e.g. schema.sql / data.sql ) in the src/main/resources folder so they are loaded into the DB on application startup.
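A minimal data.sql might look like this (the table and columns are illustrative and must match the entity):

```sql
-- src/main/resources/data.sql
INSERT INTO employee (id, name) VALUES ('E001', 'Alice');
INSERT INTO employee (id, name) VALUES ('E002', 'Bob');
```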

Config Changes

H2 Console Dashboard

We can access the H2 dashboard at localhost:8095/h2-console ( the port number specified in application.properties should match ).

And the JDBC URL should be : jdbc:h2:mem:testdb

Then click on Test Connection to verify it is successful.
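The relevant application.properties entries, matching the port and JDBC URL above, would be:

```properties
server.port=8095
spring.h2.console.enabled=true
spring.datasource.url=jdbc:h2:mem:testdb
```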

Testing

Test the endpoint through Postman

Happy Learning !

Bharathy Poovalingam

Docker Entrypoint Vs CMD and Kubernetes

Docker gives two ways of declaring the main process: ENTRYPOINT and CMD. I've seen a lot of confusion about these options. Both can be used to tell Docker what should be executed as the main process, so which should you use? Is one better than the other? Can you use both?

Kubernetes uses different names for these options, adding to the confusion. So let's dive in and clear it up.

ENTRYPOINT Vs CMD 

If you want to run nginx in the foreground as your main process, you can use either of the below lines in the Dockerfile and either will work fine.

CMD ["nginx", "-g", "daemon off;"]

OR 

ENTRYPOINT ["nginx", "-g", "daemon off;"]

However, what if you need to do some pre-processing before starting your main process ( in this case nginx )? In such cases you can see the real use for ENTRYPOINT vs CMD:

# Run the startup script as ENTRYPOINT, for pre-processing
ENTRYPOINT ["/app/scripts/docker-entrypoint.sh"]
# Start nginx now
CMD ["nginx", "-g", "daemon off;"]

How Docker interprets ENTRYPOINT and CMD

Docker concatenates ENTRYPOINT and CMD ( in that order ) into one array of arguments to be passed to the container. In the above example, the array passed to the container will be as follows.

["/app/scripts/docker-entrypoint.sh", "nginx", "-g", "daemon off;"]

Docker entrypoint script ( why you need exec "$@" )

Based on the above example it is easy to see why the docker-entrypoint.sh script should end with exec "$@". This way docker-entrypoint.sh can access the arguments given as CMD and execute them as the main process after completing the pre-processing.

#!/bin/sh
# Docker entrypoint script
# Find host name and IP of the container
HOSTNAME=$(hostname)
HOST_IP=$(ip addr show eth0 | grep -w inet | awk '{print $2}')
echo "Serving request from Host : ${HOSTNAME} / IP : ${HOST_IP}" >> /www/html/index.html
# Execute CMD from the Dockerfile
exec "$@"
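The effect of exec "$@" can be simulated in a plain shell, outside Docker (the function and messages here are illustrative):

```shell
#!/bin/sh
# entrypoint() stands in for docker-entrypoint.sh: pre-process, then exec the CMD
entrypoint() {
  echo "pre-processing done"   # the pre-processing step
  exec "$@"                    # replace the shell with the CMD arguments
}
# Command substitution runs in a subshell, which stands in for the container's PID 1
out=$(entrypoint echo "main process started")
echo "$out"
```

Because exec replaces the shell process instead of forking a child, the CMD becomes PID 1 in a real container, which is what lets it receive signals such as the SIGTERM sent by docker stop.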

You can also check out this PostgreSQL initialisation example on the Docker website: https://success.docker.com/article/use-a-script-to-initialize-stateful-container-data

Overriding ENTRYPOINT and CMD in local Docker environment

You can override ENTRYPOINT, CMD, or both when running a Docker container locally:

# Override ENTRYPOINT. This keeps the original CMD but skips the pre-processing done by docker-entrypoint.sh
docker run --entrypoint "" mynginx:latest
# Override CMD. Note that this does not skip the pre-processing done by docker-entrypoint.sh
docker run mynginx:latest cat /www/html/index.html
# Override both ENTRYPOINT and CMD
docker run --entrypoint "" mynginx:latest cat /www/html/index.html

This makes overriding extremely useful for troubleshooting container startup issues. Overriding ENTRYPOINT or CMD with /bin/bash lets you start the container while skipping the main process and instead get a shell into the container. Then you can go on to look at application directories and logs, run test scripts, etc.

docker run mynginx:latest /bin/bash
# OR
docker run --entrypoint /bin/bash mynginx:latest

Overriding ENTRYPOINT and CMD in Kubernetes 

You can also override ENTRYPOINT, CMD, or both when running a container in Kubernetes. However, Kubernetes manifest files use different field names compared to the Dockerfile.

This useful table is from the Kubernetes documentation.

Description                         | Docker field name | Kubernetes field name
The command run by the container    | ENTRYPOINT        | command
The arguments passed to the command | CMD               | args

The following YAML file will override CMD:

apiVersion: v1
kind: Pod
metadata:
  name: mynginx
spec:
  containers:
  - name: mynginx
    image: myrepo/mynginx:1.2
    args:
    - sh
    - "-c"
    - "printenv; sleep 10000"

This YAML will override ENTRYPOINT:

apiVersion: v1
kind: Pod
metadata:
  name: mynginx
spec:
  containers:
  - name: mynginx
    image: myrepo/mynginx:1.2
    command: ["sh", "-c", "printenv; sleep 10000"]

AWS Elastic Beanstalk

Overview

Elastic Beanstalk lets you deploy and scale web applications: developers focus on writing code and don't need to worry about infrastructure. It's a managed service from AWS ( the GCP equivalent offering is App Engine ).

It provides support for various platforms like Java, Node.js, Python, PHP, single-container Docker, multi-container Docker and preconfigured Docker.

Elastic Beanstalk helps in

Configuring Load Balancer ( ELB ) and Auto Scaling Group ( ASG )

Configuring EC2 ( Elastic Compute Cloud ) instances

Creating S3 Bucket and also provisioning RDS instance for storing the data

Integrates with CloudTrail ( for auditing events ) and CloudWatch ( for Monitoring and logging) and also comes with internal dashboard

Beanstalk Components

Application Version : deploy the code and create a version to deploy into an environment

Environment : web server and worker environments; long-running tasks can be offloaded to worker-tier environments

Configuration : configure logs to be stored in S3, configure the X-Ray daemon to instrument application traces, stream logs into CloudWatch for monitoring, and configure CloudWatch alarms, the ELB ( Elastic Load Balancer ), the ASG, etc.

RDS can also be configured along with Elastic Beanstalk, but if the Beanstalk environment is deprovisioned, an RDS instance provisioned with it goes away as well.

Once we choose the load balancer type ( Application/Network/Gateway ) for Elastic Beanstalk, it can't be altered later.

To create a sample app in Elastic Beanstalk

Navigate to the Elastic Beanstalk section in the AWS console and click on Applications

Provide the application name and choose the required platform

Use the sample application code or upload your own code

Click on Create Application and it will provision the app

Elastic Beanstalk creates buckets in S3 to hold code and configuration data. It also provisions a Security Group and an Elastic IP for deploying the application. All of this can be viewed in the Events section.

CloudFormation

CloudFormation is Infrastructure as Code for provisioning resources in AWS. It's the AWS-native infrastructure tool.

It's similar to Terraform, an IaC tool used to provision resources across cloud providers, including GCP.

Elastic Beanstalk relies on CloudFormation to provision other AWS services.

Configuration for provisioning resources can be defined in files with a ".config" extension that reside under the .ebextensions folder of the application source bundle.

Beanstalk Deployment Modes

Elastic Beanstalk deployment modes come in 4 categories:

All at once : fastest, but there is downtime during the deploy

Rolling : updates instances in batches; the application runs below full capacity during the deploy

Rolling with additional batch : spins up an extra batch first, so full capacity is maintained, at a small additional cost

Immutable : deploys to a fresh set of instances, so rollback is quick and capacity stays full, but it is the most expensive option

Elastic Beanstalk Lifecycle Management

The Elastic Beanstalk lifecycle setting helps maintain versions and clear out old ones. We can set the maximum number of versions to be retained, and choose whether to delete or retain the source bundle in S3.

Elastic Beanstalk Cloning

Cloning creates a new environment with the exact same configuration.

Even in a cloned environment, the load balancer type cannot be changed. To change it, create a new environment ( everything else can be carried over ) and choose the required load balancer there.

EB Command Line Interface

In addition to using the AWS CLI, you can install the EB CLI to work with Elastic Beanstalk.
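A typical EB CLI workflow looks like this (the application, environment and region names are illustrative; this assumes the EB CLI is installed and AWS credentials are configured):

```shell
eb init my-app --platform java --region us-east-1   # initialize the EB application
eb create my-env                                    # create an environment and deploy
eb deploy                                           # deploy the current version
eb status                                           # check environment health
eb logs                                             # fetch instance logs
```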

Happy Learning !

Bharathy Poovalingam

#Learning #AWS #ElasticBeanstalk #iGreenData

Terraform

Introduction:

Terraform is cloud-agnostic, unlike CloudFormation, which is native to AWS, and Azure Resource Manager, which is native to Azure. Terraform lets you maintain a single codebase of infrastructure code to provision resources across multiple cloud providers.

Setup:

To install Terraform, download the binary from "https://www.terraform.io/downloads.html" and put it on your PATH, e.g. run 'sudo mv terraform /usr/local/bin'.

To verify the installation, run the command terraform --version

IaC ( Infrastructure as Code ) tools fall into the following categories

  1. Adhoc Scripts
  2. Configuration Management Tools
  3. Server Templating Tools
  4. Orchestration Tools
  5. Provisioning Tools

AdHoc Scripts

  1. Use a general-purpose programming language to write scripts
  2. Good for small tasks, but not for complicated ones
  3. Code can get messy when managing many servers and load balancers; this is where IaC tools shine, with features like idempotency and API support

sudo apt-get update
sudo apt-get install apache2 -y
echo '<!doctype html><html><body><h1>Adhoc Script to install Apache server !</h1></body></html>' | sudo tee /var/www/html/index.html

Configuration Management Tools

  1. Chef, Puppet, Ansible and SaltStack : designed to install and manage software on existing servers
  2. Enforce coding conventions and support idempotence
  3. Good for managing a large number of remote servers

Server Templating Tools

  1. Docker, Packer and Vagrant
  2. Create an image of a server that captures everything ( OS, software and files ), then use a configuration management tool like Ansible to roll the image out to multiple servers
  3. A shift toward immutable infrastructure

Orchestration Tools

  1. Kubernetes,Nomad,Marathon/Mesos
  2. Container Orchestration

Provisioning Tools

  1. Terraform, CloudFormation (Specific to AWS), OpenStack Heat


Approaches

  1. Declarative vs Imperative
  2. Immutable infrastructure vs Mutable Infrastructure

Terraform Files:

To start with Terraform, we need to specify the provider for the infrastructure. It can be any cloud service provider, such as GCP or AWS.

The file can be named provider.tf and may have content like the following.

provider.tf

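A minimal provider.tf sketch for AWS might look like this (the region is illustrative):

```hcl
provider "aws" {
  region = "us-east-1"
}
```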

variables.tf:

Variables can be declared inside the variables.tf file and referenced in resource blocks or in main.tf. A variable is created using the variable keyword and can carry a default value and a description.

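A variables.tf sketch (the variable name and values are illustrative); elsewhere the variable is referenced as var.instance_type:

```hcl
variable "instance_type" {
  description = "EC2 instance type to provision"
  default     = "t2.micro"
}
```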

Terraform Commands:

  1. terraform init ( to initialize the working directory and download providers )
  2. terraform plan ( to preview the changes )
  3. terraform apply ( to apply the changes )
  4. terraform destroy ( to destroy the resources once finished )
  5. terraform fmt ( to format the configuration files )
  6. terraform refresh ( to refresh the state )
  7. terraform show ( to view the terraform.tfstate file )

Happy Learning !

Bharathy Poovalingam

#GCP #Terraform #ServerTemplating #Learning

Installing Kafka on Mac

Installation:

http://kafka.apache.org/

To download Kafka, navigate to kafka.apache.org and click on Download button

Please select the version you want to download ( here I have selected version 2.8.0 ) and choose the Scala version as Scala 2.13 to download the tgz file : https://apache.mirror.digitalpacific.com.au/kafka/2.8.0/kafka_2.13-2.8.0.tgz

Copy the tgz file to the desired location and extract it ( e.g. tar -xzf kafka_2.13-2.8.0.tgz ).

Configuration

The directory structure of Kafka would look like below

Modify config/server.properties

Using the nano editor ( nano config/server.properties ), set the below properties:

broker.id=0

listeners=PLAINTEXT://127.0.0.1:9092

zookeeper.connect=127.0.0.1:2181

Startup:

Start ZooKeeper first by running the command

bin/zookeeper-server-start.sh config/zookeeper.properties

Then start the Kafka broker:

bin/kafka-server-start.sh config/server.properties

Now if the logs look good ( without errors ), we can create a topic and list it.

Topic Creation

To create a topic, run the below command

bin/kafka-topics.sh --bootstrap-server 127.0.0.1:9092 --create --topic demo_topic

To view the list of topics, run the below command

bin/kafka-topics.sh --bootstrap-server 127.0.0.1:9092 --list
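To verify the topic end to end, the console producer and consumer that ship with Kafka can be used against the running broker:

```shell
# Produce messages (type lines, Ctrl-C to stop)
bin/kafka-console-producer.sh --bootstrap-server 127.0.0.1:9092 --topic demo_topic

# Consume them from the beginning
bin/kafka-console-consumer.sh --bootstrap-server 127.0.0.1:9092 --topic demo_topic --from-beginning
```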

Happy Learning !

Bharathy Poovalingam

#Kafka #Learning #iGreenData

GCP Data Engineer : Cloud Storage

Storage Services in GCP

This blog will focus on Cloud Storage service in GCP.

Object Storage ( Buckets ) : To store unstructured data and for archival use-cases.

Instance Storage ( Persistent Disks ) : To work with VMs, Kubernetes Clusters

SQL ( Cloud SQL, Cloud Spanner ) : For Relational DB use-case and for transaction support

NoSQL ( BigTable, DataStore ) : For storing non-relational data

Analytic ( Cloud BigQuery ) : For Data warehousing

Cloud Storage

Cloud Storage stores unstructured data: files and images are stored as objects inside buckets in GCS.

Login to cloud console : https://console.cloud.google.com/storage and move to Storage service section.

While creating a bucket, you choose from the available storage class options ( Standard, Nearline, Coldline and Archive ).

Once the bucket is created, you can use the console to upload files or folders into it.

Loading Data into Storage

For less Volume of data, we can use one of the following

gsutil : command-line utility to create buckets and to copy and move files into them. Activate Cloud Shell from the Cloud Console ( the ">_" icon in the top-right corner ).

Cloud Console

API
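As a sketch, the gsutil route looks like this (the bucket name and file are illustrative; assumes Cloud Shell or an authenticated gcloud setup):

```shell
gsutil mb gs://my-demo-bucket              # make a bucket
gsutil cp ./data.csv gs://my-demo-bucket/  # upload a file
gsutil ls gs://my-demo-bucket              # list objects
gsutil mv gs://my-demo-bucket/data.csv gs://my-demo-bucket/archive/data.csv  # move a file
```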

For bulk and large volumes of data, we can use

Cloud Storage Transfer Service : moves data from on-premises and from other cloud services, moves data between Cloud Storage buckets, and handles transfers of more than 1 TB from on-premises. Supports one-time and recurring transfers.

Transfer Appliance : a high-capacity storage server used to move very large volumes of data ( ~100 TB ).

Security

Data stored in GCP is encrypted by default. While creating a bucket, under advanced settings, you can choose the encryption option.

Data uploaded into GCP is divided into multiple chunks, each encrypted with its own Data Encryption Key ( DEK ); the data encryption keys are in turn encrypted with a Key Encryption Key ( KEK ), which is stored in KMS ( Key Management Service ).

ACLs

Uniform bucket-level access : recommended; uses IAM ( Identity and Access Management ); permissions apply at the bucket level

Fine-grained access : uses ACLs; permissions apply at both the bucket and the object level

Signed URLs

Signed URLs give time-limited read or write access to an object ( short-term access ) and allow access for users without IAM authorization. Generating a signed URL comprises two steps:

Create a service account key with the relevant permissions

Use the gsutil signurl command, passing the service account key as JSON and the validity duration via the "-d" parameter
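Putting the two steps together (the key file, service account, bucket and object names are illustrative):

```shell
# Step 1: create a key for a service account that can read the object
gcloud iam service-accounts keys create key.json \
  --iam-account my-sa@my-project.iam.gserviceaccount.com

# Step 2: generate a URL valid for 10 minutes
gsutil signurl -d 10m key.json gs://my-demo-bucket/report.pdf
```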

Signed Policy Documents

Specify what can be uploaded to a bucket with a form POST.

Allows greater control over size, content type and other upload characteristics than signed URLs.

Lifecycle Policy Management

A lifecycle policy consists of actions and conditions: the action is performed on objects in the bucket when the condition is met.

An action can be moving an object from one storage class to another.

Actions execute when their conditions apply, e.g. change storage class based on age, delete objects based on date, purge noncurrent versions, etc.
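As a sketch, a lifecycle configuration with two such rules (ages and target class are illustrative):

```json
{
  "rule": [
    { "action": { "type": "SetStorageClass", "storageClass": "NEARLINE" },
      "condition": { "age": 30 } },
    { "action": { "type": "Delete" },
      "condition": { "age": 365 } }
  ]
}
```

Saved as lifecycle.json, it can be applied with: gsutil lifecycle set lifecycle.json gs://my-demo-bucket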

Happy Learning !

Bharathy Poovalingam

GCP Learning Series_ Cloud Functions

Introduction

Cloud Functions is the FaaS ( Function as a Service ) / serverless compute service provided by GCP to handle events generated by GCP resources. It is a fully managed service.

Developers don’t have to worry about administrative infrastructure

  • runtime environments
  • Security Updates
  • OS Patches
  • Scaling
  • Capacity

Developers can focus on writing software

Events, Triggers and Functions

  • Events : a particular action happens in GCP ( a file upload/archive, a message published to Pub/Sub )
  • Triggers : respond to an event
  • Functions : functions associated with a trigger to handle the event

Supported Events

Cloud Functions supports an event-driven architecture for events emitted in GCP: for an event, we can wire up a Cloud Function to trigger an API call or an action. It does not support every event in GCP; it supports events from the below GCP resources.

HTTP Events

Cloud Storage

Cloud Pub/Sub

Firebase

Cloud Firestore

The above resource events can be tagged into two categories

  • Background Functions ( Cloud Storage / Cloud Pub/Sub )
  • HTTP Functions ( Web hook / HTTP Requests )

Language Support

Cloud Functions support below languages

Node JS

Python

Go

Java 11

It does not provide support for containers yet, i.e. containerized Docker images cannot be deployed to Cloud Functions.

Deployment of Cloud Functions

Login to Google Cloud Console and click on hamburger menu and choose Cloud Function to create /deploy cloud function

  • Provide Function name
  • Trigger
  • Event Type
  • Resource
  • Runtime

A Cloud Function can also be deployed using the gcloud command.

To deploy a Cloud Function that handles the event fired when a user uploads a file into Google Cloud Storage, provide the resource name ( the bucket name ) and the resource event ( file upload completion ):

gcloud functions deploy cloud_function_test_gcs --runtime python37 --trigger-resource gcs-bucket-test --trigger-event google.storage.object.finalize

Similarly, to deploy a cloud function that handles a Pub/Sub event:

gcloud functions deploy cloud_function_test_pubsub --runtime python37 --trigger-topic test-topic

For Cloud Pub/Sub there is only one trigger event ( a message published to the topic ), here the topic is test-topic.

Limitations

Cloud Functions time out after one minute by default, although the timeout can be extended up to 9 minutes.

Hence Cloud Functions are suitable for handling events, not for long-running processing.

#GCP #LearningContinues #Serverless

Thank You

Bharathy Poovalingam


Configuring Swagger in Spring Boot Application

SpringBoot Initializr:

Initialize a Sample Spring Boot Application using Spring Initializr : https://start.spring.io/

Key in the Group, artifact id and the name for your sample project .

Choose your favourite language, it can be Java/Kotlin/Groovy.

Also choose the language version, Packaging option and the Project to be built in. It can be Maven or Gradle.

For this demo, I have chosen Java 8 as the language, Maven as the build tool and jar packaging.

Once you click Generate, it downloads a zip file with the required dependencies added to the pom.xml. Import it as a Maven project in your IDE; I have chosen Eclipse for this demo.

Maven Dependency

Add the SpringFox Swagger Dependency as below to the pom.xml

Note: With SpringFox 3, adding the "springfox-boot-starter" dependency is enough; we no longer need the separate Swagger UI dependency or the @EnableSwagger2 annotation on the starter class.
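The dependency fragment would look like this (3.0.0 is the SpringFox release current at the time of writing):

```xml
<dependency>
  <groupId>io.springfox</groupId>
  <artifactId>springfox-boot-starter</artifactId>
  <version>3.0.0</version>
</dependency>
```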

Then run mvn clean install to download the Swagger dependency into your project.

Code Changes

To enable Swagger in the Spring Boot application, we need to make code changes in the below sections.

Spring Boot Starter Class

We need to define a Docket bean to customize the Swagger documentation. If we access the Swagger UI at http://localhost:<<PORT_NO>>/swagger-ui/index.html, it provides API docs for our controllers along with Spring's error controllers. To restrict the API definition to the controllers defined in our application, the Docket bean's PathSelectors and RequestHandlerSelectors should point to our controllers. Here I restrict the scan to the package "com.nikki.demo.controller".
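A sketch of such a Docket bean, restricting the scan to the controller package named in the post (the method name is illustrative; the bean goes in the starter or a configuration class):

```java
import org.springframework.context.annotation.Bean;
import springfox.documentation.builders.PathSelectors;
import springfox.documentation.builders.RequestHandlerSelectors;
import springfox.documentation.spi.DocumentationType;
import springfox.documentation.spring.web.plugins.Docket;

@Bean
public Docket api() {
    return new Docket(DocumentationType.SWAGGER_2)
            .select()
            // only scan our own controllers, not Spring's error controllers
            .apis(RequestHandlerSelectors.basePackage("com.nikki.demo.controller"))
            .paths(PathSelectors.any())
            .build();
}
```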

Spring Rest Controller

To have the controller APIs display additional information about the exposed endpoints, annotate them with @ApiOperation.

Here each endpoint is annotated with @ApiOperation to give more details about it.
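For example (the endpoint, service and types here are illustrative):

```java
import io.swagger.annotations.ApiOperation;
import org.springframework.web.bind.annotation.GetMapping;

@ApiOperation(value = "Fetch all employees",
              notes = "Returns the complete list of employees")
@GetMapping("/employees")
public List<Employee> getEmployees() {
    return employeeService.getAllEmployees();
}
```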

Accessing the Swagger UI

Then open the Swagger UI at http://localhost:<<PORT_NO>>/swagger-ui/index.html ( after running your application with mvn clean spring-boot:run ).

Happy Learning !

Bharathy Poovalingam

https://www.linkedin.com/in/bharathy-poovalingam-09ab681a/