Spring Data with Spring Boot

Introduction

This blog shows how to integrate Spring Data with Spring Boot. The flow runs from the REST controller, which wires in a service that makes calls to the JPA repository.

Maven Dependency

Initialize the Spring Boot Application using start.spring.io with Web and JPA dependency added to it.

We need to add the below dependencies to support Spring Data JPA and the embedded H2 database:

  1. spring-boot-starter-data-jpa [ org.springframework.boot ]
  2. h2 [ com.h2database ]
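In the pom.xml these map to the following fragment (versions are managed by the Spring Boot parent, so none are specified here):

```xml
<!-- Spring Data JPA starter -->
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<!-- Embedded H2 database -->
<dependency>
  <groupId>com.h2database</groupId>
  <artifactId>h2</artifactId>
  <scope>runtime</scope>
</dependency>
```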

Repository Changes

Define an interface that extends JpaRepository for the entity along with the primary key. Here, we have an Employee Entity and the primary key data type for the entity is String.
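As a sketch (the interface name is illustrative, assuming a typical Spring Boot package layout):

```java
import org.springframework.data.jpa.repository.JpaRepository;

// Repository for the Employee entity, whose primary key type is String.
// Spring Data generates the implementation at runtime.
public interface EmployeeRepository extends JpaRepository<Employee, String> {
}
```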

Service Changes

Inject the repository as a dependency in the service layer and use the findAll() and getById() methods to return the list of employees or a specific employee.
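A minimal sketch of such a service (class and method names are illustrative):

```java
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class EmployeeService {

    @Autowired
    private EmployeeRepository employeeRepository;

    // Return all employees
    public List<Employee> getAllEmployees() {
        return employeeRepository.findAll();
    }

    // Return one employee by its String primary key
    public Employee getEmployee(String id) {
        return employeeRepository.getById(id);
    }
}
```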

Entity Changes

Define a model that maps the entity bean to a table.

Note : Add @JsonIgnoreProperties({"hibernateLazyInitializer", "handler"}) on the entity to avoid the below exception

com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.hibernate.proxy.pojo.bytebuddy.ByteBuddyInterceptor
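A sketch of the entity with the annotation in place (fields are illustrative; getters and setters omitted):

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;

// @JsonIgnoreProperties suppresses the Jackson error on
// Hibernate's lazy-loading proxy fields
@Entity
@JsonIgnoreProperties({"hibernateLazyInitializer", "handler"})
public class Employee {

    @Id
    private String id;
    private String name;
}
```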

Starter Changes

We need to add the below annotations to the starter class:

  1. @EnableJpaRepositories
  2. @EntityScan
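A sketch of the starter class with both annotations (the package names are illustrative):

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.domain.EntityScan;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;

@SpringBootApplication
@EnableJpaRepositories("com.example.demo.repository") // where the JPA interfaces live
@EntityScan("com.example.demo.entity")                // where the @Entity classes live
public class DemoApplication {
    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}
```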

SQL Changes

Place the SQL files ( e.g. schema.sql / data.sql ) in the src/main/resources folder so they are loaded into the DB on application startup.
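A minimal data.sql might look like this (the table and columns are illustrative and must match the entity):

```sql
-- src/main/resources/data.sql
INSERT INTO employee (id, name) VALUES ('E001', 'Alice');
INSERT INTO employee (id, name) VALUES ('E002', 'Bob');
```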

Config Changes

H2 Console Dashboard

We can access the H2 dashboard at localhost:8095/h2-console ( the port number specified in application.properties should match ).

And the JDBC URL should be : jdbc:h2:mem:testdb

Then click on Test Connection to verify it is successful.
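The relevant application.properties entries, matching the port and JDBC URL above, would be:

```properties
server.port=8095
spring.h2.console.enabled=true
spring.datasource.url=jdbc:h2:mem:testdb
```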

Testing

Test the endpoint through Postman

Happy Learning !

Bharathy Poovalingam

Docker Entrypoint Vs CMD and Kubernetes

Docker gives two ways of declaring the main process: ENTRYPOINT and CMD. I've seen a lot of confusion about these options. Both can be used to tell Docker what should be executed as the main process, so which should you use? Is one better than the other? Can you use both?

Kubernetes uses different names for these options, adding to the confusion. So let's dive in and clear it up.

ENTRYPOINT Vs CMD 

If you want to run nginx in the foreground as your main process, you can use either of the below lines in the Dockerfile and either will work fine.

CMD ["nginx", "-g", "daemon off;"]

OR 

ENTRYPOINT ["nginx", "-g", "daemon off;"]

However, what if you need to do some pre-processing before starting your main process ( in this case nginx )? In such cases you can see the real use for ENTRYPOINT vs CMD:

# Run the startup script as ENTRYPOINT, for pre-processing
ENTRYPOINT ["/app/scripts/docker-entrypoint.sh"]
# Start nginx now
CMD ["nginx", "-g", "daemon off;"]

How Docker interprets ENTRYPOINT and CMD

Docker concatenates ENTRYPOINT and CMD ( in that order ) into one array of arguments to be passed to the container. In the above example, the array passed to the container will be as follows.

["/app/scripts/docker-entrypoint.sh", "nginx", "-g", "daemon off;"]

Docker entrypoint script ( why you need exec "$@" )

Based on the above example it is easy to see why the docker-entrypoint.sh script should end with exec "$@". This way docker-entrypoint.sh can access the arguments given as CMD and execute them as the main process after completing the pre-processing.

#!/bin/sh
# Docker entrypoint script
# Find host name and IP of the container
HOSTNAME=$(hostname)
HOST_IP=$(ip addr show eth0 | grep -w inet | awk '{print $2}')
echo "Serving request from Host : ${HOSTNAME} / IP : ${HOST_IP}" >> /www/html/index.html
# Execute CMD from the Dockerfile
exec "$@"
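The effect of exec "$@" can be simulated in a plain shell, outside Docker (the function and messages here are illustrative):

```shell
#!/bin/sh
# entrypoint() stands in for docker-entrypoint.sh: pre-process, then exec the CMD
entrypoint() {
  echo "pre-processing done"   # the pre-processing step
  exec "$@"                    # replace the shell with the CMD arguments
}
# Command substitution runs in a subshell, which stands in for the container's PID 1
out=$(entrypoint echo "main process started")
echo "$out"
```

Because exec replaces the shell process instead of forking a child, the CMD becomes PID 1 in a real container, which is what lets it receive signals such as the SIGTERM sent by docker stop.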

You can also check out this PostgreSQL initialisation example on the Docker website: https://success.docker.com/article/use-a-script-to-initialize-stateful-container-data

Overriding ENTRYPOINT and CMD in local Docker environment

You can override ENTRYPOINT, CMD, or both when running a Docker container locally:

# Override ENTRYPOINT. This keeps the original CMD but skips the pre-processing done by docker-entrypoint.sh
docker run --entrypoint "" mynginx:latest
# Override CMD. Note that this does not skip the pre-processing done by docker-entrypoint.sh
docker run mynginx:latest cat /www/html/index.html
# Override both ENTRYPOINT and CMD
docker run --entrypoint "" mynginx:latest cat /www/html/index.html

This makes overriding extremely useful for troubleshooting container startup issues. Overriding ENTRYPOINT or CMD with /bin/bash lets you start the container while skipping the main process and instead get a shell into the container. Then you can go on to look at application directories and logs, run test scripts, etc.

docker run mynginx:latest /bin/bash
# OR
docker run --entrypoint /bin/bash mynginx:latest

Overriding ENTRYPOINT and CMD in Kubernetes 

You can also override ENTRYPOINT, CMD, or both when running a container in Kubernetes. However, Kubernetes manifest files use different field names compared to the Dockerfile.

This useful table is from the Kubernetes documentation.

Description                         | Docker field name | Kubernetes field name
The command run by the container    | ENTRYPOINT        | command
The arguments passed to the command | CMD               | args

The following YAML file will override CMD:

apiVersion: v1
kind: Pod
metadata:
  name: mynginx
spec:
  containers:
  - name: mynginx
    image: myrepo/mynginx:1.2
    args:
    - sh
    - "-c"
    - "printenv; sleep 10000"

This YAML will override ENTRYPOINT:

apiVersion: v1
kind: Pod
metadata:
  name: mynginx
spec:
  containers:
  - name: mynginx
    image: myrepo/mynginx:1.2
    command: ["sh", "-c", "printenv; sleep 10000"]

AWS Elastic Beanstalk

Overview

Elastic Beanstalk lets you deploy and scale web applications: developers focus on writing code and don't need to worry about infrastructure. It's a managed service from AWS ( the GCP equivalent offering is App Engine ).

It provides support for various platforms like Java, Node.js, Python, PHP, single-container Docker, multi-container Docker and preconfigured Docker.

Elastic Beanstalk helps in

Configuring Load Balancer ( ELB ) and Auto Scaling Group ( ASG )

Configuring EC2 ( Elastic Compute Cloud ) instances

Creating S3 Bucket and also provisioning RDS instance for storing the data

Integrates with CloudTrail ( for auditing events ) and CloudWatch ( for Monitoring and logging) and also comes with internal dashboard

Beanstalk Components

Application Version : deploy the code and create a version to deploy into an environment

Environment : web server and worker environments; long-running tasks can be offloaded to worker-tier environments

Configuration : configure logs to be stored in S3, configure the X-Ray daemon to instrument application traces, stream logs into CloudWatch for monitoring, and configure CloudWatch alarms, the ELB ( Elastic Load Balancer ), the ASG, etc.

RDS can also be configured along with Elastic Beanstalk, but if the Beanstalk environment is deprovisioned, an RDS instance provisioned with it goes away as well.

Once we choose the load balancer type ( Application/Network/Gateway ) for Elastic Beanstalk, it can't be altered later.

To create a sample app in Elastic Beanstalk

Navigate to the Elastic Beanstalk section in the AWS console and click on Applications

Provide the application name and choose the required platform

Use the sample application code or upload your own code

Click on Create Application and it will provision the app

Elastic Beanstalk creates buckets in S3 to hold code and configuration data. It also provisions a Security Group and an Elastic IP for deploying the application. All of this can be viewed in the Events section.

CloudFormation

CloudFormation is Infrastructure as Code for provisioning resources in AWS. It's the AWS-native infrastructure tool.

It's similar to Terraform, an IaC tool used to provision resources across cloud providers, including GCP.

Elastic Beanstalk relies on CloudFormation to provision other AWS services.

Configuration for provisioning resources can be defined in files with a ".config" extension that reside under the .ebextensions folder of the application source bundle.

Beanstalk Deployment Modes

Elastic Beanstalk deployment modes come in 4 categories:

All at once : fastest, but there is downtime during the deploy

Rolling : updates instances in batches; the application runs below full capacity during the deploy

Rolling with additional batch : spins up an extra batch first, so full capacity is maintained, at a small additional cost

Immutable : deploys to a fresh set of instances, so rollback is quick and capacity stays full, but it is the most expensive option

Elastic Beanstalk Lifecycle Management

The Elastic Beanstalk lifecycle setting helps maintain versions and clear out old ones. We can set the maximum number of versions to be retained, and choose whether to delete or retain the source bundle in S3.

Elastic Beanstalk Cloning

Cloning creates a new environment with the exact same configuration.

Even in a cloned environment, the load balancer type cannot be changed. To change it, create a new environment ( everything else can be carried over ) and choose the required load balancer there.

EB Command Line Interface

In addition to using the AWS CLI, you can install the EB CLI to work with Elastic Beanstalk.
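A typical EB CLI workflow looks like this (the application, environment and region names are illustrative; this assumes the EB CLI is installed and AWS credentials are configured):

```shell
eb init my-app --platform java --region us-east-1   # initialize the EB application
eb create my-env                                    # create an environment and deploy
eb deploy                                           # deploy the current version
eb status                                           # check environment health
eb logs                                             # fetch instance logs
```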

Happy Learning !

Bharathy Poovalingam

#Learning #AWS #ElasticBeanstalk #iGreenData

Terraform

Introduction:

Terraform is cloud-agnostic, unlike CloudFormation, which is native to AWS, and Azure Resource Manager, which is native to Azure. Terraform lets you maintain a single codebase of infrastructure code to provision resources across multiple cloud providers.

Setup:

To install Terraform, download the binary from "https://www.terraform.io/downloads.html" and put it on your PATH, e.g. run 'sudo mv terraform /usr/local/bin'.

To verify the installation, run the command terraform --version

IaC ( Infrastructure as Code ) tools fall into the following categories

  1. Adhoc Scripts
  2. Configuration Management Tools
  3. Server Templating Tools
  4. Orchestration Tools
  5. Provisioning Tools

AdHoc Scripts

  1. Use a general-purpose programming language to write scripts
  2. Good for small tasks, but not for complicated ones
  3. Code can get messy when managing many servers and load balancers; this is where IaC tools shine, with features like idempotency and API support

sudo apt-get update
sudo apt-get install apache2 -y
echo '<!doctype html><html><body><h1>Adhoc Script to install Apache server !</h1></body></html>' | sudo tee /var/www/html/index.html

Configuration Management Tools

  1. Chef, Puppet, Ansible and SaltStack : designed to install and manage software on existing servers
  2. Enforce coding conventions and support idempotence
  3. Good for managing a large number of remote servers

Server Templating Tools

  1. Docker, Packer and Vagrant
  2. Create an image of a server that captures everything ( OS, software and files ), then use a configuration management tool like Ansible to roll the image out to multiple servers
  3. A shift toward immutable infrastructure

Orchestration Tools

  1. Kubernetes,Nomad,Marathon/Mesos
  2. Container Orchestration

Provisioning Tools

  1. Terraform, CloudFormation (Specific to AWS), OpenStack Heat


Approaches

  1. Declarative vs Imperative
  2. Immutable infrastructure vs Mutable Infrastructure

Terraform Files:

To start with Terraform, we need to specify the provider for the infrastructure. It can be any cloud service provider, such as GCP or AWS.

The file can be named provider.tf and may have content like the following.

provider.tf

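A minimal provider.tf sketch for AWS might look like this (the region is illustrative):

```hcl
provider "aws" {
  region = "us-east-1"
}
```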

variables.tf:

Variables can be declared inside the variables.tf file and referenced in resource blocks or in main.tf. A variable is created using the variable keyword and can carry a default value and a description.

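A variables.tf sketch (the variable name and values are illustrative); elsewhere the variable is referenced as var.instance_type:

```hcl
variable "instance_type" {
  description = "EC2 instance type to provision"
  default     = "t2.micro"
}
```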

Terraform Commands:

  1. terraform init ( to initialize the working directory and download providers )
  2. terraform plan ( to preview the changes )
  3. terraform apply ( to apply the changes )
  4. terraform destroy ( to destroy the resources once finished )
  5. terraform fmt ( to format the configuration files )
  6. terraform refresh ( to refresh the state )
  7. terraform show ( to view the terraform.tfstate file )

Happy Learning !

Bharathy Poovalingam

#GCP #Terraform #ServerTemplating #Learning

Installing Kafka on Mac

Installation:

http://kafka.apache.org/

To download Kafka, navigate to kafka.apache.org and click on Download button

Please select the version you want to download ( here I have selected version 2.8.0 ) and choose the Scala version as Scala 2.13 to download the tgz file : https://apache.mirror.digitalpacific.com.au/kafka/2.8.0/kafka_2.13-2.8.0.tgz

Copy the tgz file to the desired location and extract it ( e.g. tar -xzf kafka_2.13-2.8.0.tgz ).

Configuration

The directory structure of Kafka would look like below

Modify config/server.properties

Using the nano editor ( nano config/server.properties ), set the below properties:

broker.id=0

listeners=PLAINTEXT://127.0.0.1:9092

zookeeper.connect=127.0.0.1:2181

Startup:

Start ZooKeeper first by running the command

bin/zookeeper-server-start.sh config/zookeeper.properties

Then start the Kafka broker:

bin/kafka-server-start.sh config/server.properties

Now if the logs look good ( without errors ), we can create a topic and list it.

Topic Creation

To create a topic, run the below command

bin/kafka-topics.sh --bootstrap-server 127.0.0.1:9092 --create --topic demo_topic

To view the list of topics, run the below command

bin/kafka-topics.sh --bootstrap-server 127.0.0.1:9092 --list
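To verify the topic end to end, the console producer and consumer that ship with Kafka can be used against the running broker:

```shell
# Produce messages (type lines, Ctrl-C to stop)
bin/kafka-console-producer.sh --bootstrap-server 127.0.0.1:9092 --topic demo_topic

# Consume them from the beginning
bin/kafka-console-consumer.sh --bootstrap-server 127.0.0.1:9092 --topic demo_topic --from-beginning
```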

Happy Learning !

Bharathy Poovalingam

#Kafka #Learning #iGreenData

GCP Data Engineer : Cloud Storage

Storage Services in GCP

This blog will focus on Cloud Storage service in GCP.

Object Storage ( Buckets ) : To store unstructured data and for archival use-cases.

Instance Storage ( Persistent Disks ) : To work with VMs, Kubernetes Clusters

SQL ( Cloud SQL, Cloud Spanner ) : For Relational DB use-case and for transaction support

NoSQL ( BigTable, DataStore ) : For storing non-relational data

Analytic ( Cloud BigQuery ) : For Data warehousing

Cloud Storage

Cloud Storage stores unstructured data: files and images are stored as objects inside buckets in GCS.

Login to cloud console : https://console.cloud.google.com/storage and move to Storage service section.

While creating a bucket, you choose from the available storage class options ( Standard, Nearline, Coldline and Archive ).

Once the bucket is created, you can use the console to upload files or folders into it.

Loading Data into Storage

For less Volume of data, we can use one of the following

gsutil : command-line utility to create buckets and to copy and move files into them. Activate Cloud Shell from the Cloud Console ( the ">_" icon in the top-right corner ).

Cloud Console

API
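As a sketch, the gsutil route looks like this (the bucket name and file are illustrative; assumes Cloud Shell or an authenticated gcloud setup):

```shell
gsutil mb gs://my-demo-bucket              # make a bucket
gsutil cp ./data.csv gs://my-demo-bucket/  # upload a file
gsutil ls gs://my-demo-bucket              # list objects
gsutil mv gs://my-demo-bucket/data.csv gs://my-demo-bucket/archive/data.csv  # move a file
```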

For bulk and large volumes of data, we can use

Cloud Storage Transfer Service : moves data from on-premises and from other cloud services, moves data between Cloud Storage buckets, and handles transfers of more than 1 TB from on-premises. Supports one-time and recurring transfers.

Transfer Appliance : a high-capacity storage server used to move very large volumes of data ( ~100 TB ).

Security

Data stored in GCP is encrypted by default. While creating a bucket, under advanced settings, you can choose the encryption option.

Data uploaded into GCP is divided into multiple chunks, each encrypted with its own Data Encryption Key ( DEK ); the data encryption keys are in turn encrypted with a Key Encryption Key ( KEK ), which is stored in KMS ( Key Management Service ).

ACLs

Uniform bucket-level access : recommended; uses IAM ( Identity and Access Management ); permissions apply at the bucket level

Fine-grained access : uses ACLs; permissions apply at both the bucket and the object level

Signed URLs

Signed URLs give time-limited read or write access to an object ( short-term access ) and allow access for users without IAM authorization. Generating a signed URL comprises two steps:

Create a service account key with the relevant permissions

Use the gsutil signurl command, passing the service account key as JSON and the validity duration via the "-d" parameter
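Putting the two steps together (the key file, service account, bucket and object names are illustrative):

```shell
# Step 1: create a key for a service account that can read the object
gcloud iam service-accounts keys create key.json \
  --iam-account my-sa@my-project.iam.gserviceaccount.com

# Step 2: generate a URL valid for 10 minutes
gsutil signurl -d 10m key.json gs://my-demo-bucket/report.pdf
```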

Signed Policy Documents

Specify what can be uploaded to a bucket with a form POST.

Allows greater control over size, content type and other upload characteristics than signed URLs.

Lifecycle Policy Management

A lifecycle policy consists of actions and conditions: the action is performed on objects in the bucket when the condition is met.

An action can be moving an object from one storage class to another.

Actions execute when their conditions apply, e.g. change storage class based on age, delete objects based on date, purge noncurrent versions, etc.
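As a sketch, a lifecycle configuration with two such rules (ages and target class are illustrative):

```json
{
  "rule": [
    { "action": { "type": "SetStorageClass", "storageClass": "NEARLINE" },
      "condition": { "age": 30 } },
    { "action": { "type": "Delete" },
      "condition": { "age": 365 } }
  ]
}
```

Saved as lifecycle.json, it can be applied with: gsutil lifecycle set lifecycle.json gs://my-demo-bucket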

Happy Learning !

Bharathy Poovalingam

GCP Learning Series_ Cloud Functions

Introduction

Cloud Functions is the FaaS ( Function as a Service ) / serverless compute service provided by GCP to handle events generated by GCP resources. It is a fully managed service.

Developers don’t have to worry about administrative infrastructure

  • runtime environments
  • Security Updates
  • OS Patches
  • Scaling
  • Capacity

Developers can focus on writing software

Events, Triggers and Functions

  • Events : a particular action happens in GCP ( a file upload/archive, a message published to Pub/Sub )
  • Triggers : respond to an event
  • Functions : functions associated with a trigger to handle the event

Supported Events

Cloud Functions supports an event-driven architecture for events emitted in GCP: for an event, we can wire up a Cloud Function to trigger an API call or an action. It does not support every event in GCP; it supports events from the below GCP resources.

HTTP Events

Cloud Storage

Cloud Pub/Sub

Firebase

Cloud Firestore

The above resource events can be tagged into two categories

  • Background Functions ( Cloud Storage / Cloud Pub/Sub )
  • HTTP Functions ( Web hook / HTTP Requests )

Language Support

Cloud Functions support below languages

Node JS

Python

Go

Java 11

It does not provide support for containers yet, i.e. containerized Docker images cannot be deployed to Cloud Functions.

Deployment of Cloud Functions

Login to Google Cloud Console and click on hamburger menu and choose Cloud Function to create /deploy cloud function

  • Provide Function name
  • Trigger
  • Event Type
  • Resource
  • Runtime

A Cloud Function can also be deployed using the gcloud command.

To deploy a Cloud Function that handles the event fired when a user uploads a file into Google Cloud Storage, provide the resource name ( the bucket name ) and the resource event ( file upload completion ):

gcloud functions deploy cloud_function_test_gcs --runtime python37 --trigger-resource gcs-bucket-test --trigger-event google.storage.object.finalize

Similarly, to deploy a cloud function that handles a Pub/Sub event:

gcloud functions deploy cloud_function_test_pubsub --runtime python37 --trigger-topic test-topic

For Cloud Pub/Sub there is only one trigger event ( a message published to the topic ), here the topic is test-topic.

Limitations

Cloud Functions time out after one minute by default, although the timeout can be extended up to 9 minutes.

Hence Cloud Functions are suitable for handling events, not for long-running processing.

#GCP #LearningContinues #Serverless

Thank You

Bharathy Poovalingam


Configuring Swagger in Spring Boot Application

SpringBoot Initializr:

Initialize a Sample Spring Boot Application using Spring Initializr : https://start.spring.io/

Key in the Group, artifact id and the name for your sample project .

Choose your favourite language, it can be Java/Kotlin/Groovy.

Also choose the language version, Packaging option and the Project to be built in. It can be Maven or Gradle.

For this demo, I have chosen Java 8 as the language, Maven as the build tool and jar packaging.

Once you click Generate, it downloads a zip file with the required dependencies added to the pom.xml. Import it as a Maven project in your IDE; I have chosen Eclipse for this demo.

Maven Dependency

Add the SpringFox Swagger Dependency as below to the pom.xml

Note: With SpringFox 3, adding the "springfox-boot-starter" dependency is enough; we no longer need the separate Swagger UI dependency or the @EnableSwagger2 annotation on the starter class.
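The dependency fragment would look like this (3.0.0 is the SpringFox release current at the time of writing):

```xml
<dependency>
  <groupId>io.springfox</groupId>
  <artifactId>springfox-boot-starter</artifactId>
  <version>3.0.0</version>
</dependency>
```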

Then run mvn clean install to download the Swagger dependency into your project.

Code Changes

To enable Swagger in the Spring Boot application, we need to make code changes in the below sections.

Spring Boot Starter Class

We need to define a Docket bean to customize the Swagger documentation. If we access the Swagger UI at http://localhost:<<PORT_NO>>/swagger-ui/index.html, it provides API docs for our controllers along with Spring's error controllers. To restrict the API definition to the controllers defined in our application, the Docket bean's PathSelectors and RequestHandlerSelectors should point to our controllers. Here I restrict the scan to the package "com.nikki.demo.controller".
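A sketch of such a Docket bean, restricting the scan to the controller package named in the post (the method name is illustrative; the bean goes in the starter or a configuration class):

```java
import org.springframework.context.annotation.Bean;
import springfox.documentation.builders.PathSelectors;
import springfox.documentation.builders.RequestHandlerSelectors;
import springfox.documentation.spi.DocumentationType;
import springfox.documentation.spring.web.plugins.Docket;

@Bean
public Docket api() {
    return new Docket(DocumentationType.SWAGGER_2)
            .select()
            // only scan our own controllers, not Spring's error controllers
            .apis(RequestHandlerSelectors.basePackage("com.nikki.demo.controller"))
            .paths(PathSelectors.any())
            .build();
}
```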

Spring Rest Controller

To have the controller APIs display additional information about the exposed endpoints, annotate them with @ApiOperation.

Here each endpoint is annotated with @ApiOperation to give more details about it.
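For example (the endpoint, service and types here are illustrative):

```java
import io.swagger.annotations.ApiOperation;
import org.springframework.web.bind.annotation.GetMapping;

@ApiOperation(value = "Fetch all employees",
              notes = "Returns the complete list of employees")
@GetMapping("/employees")
public List<Employee> getEmployees() {
    return employeeService.getAllEmployees();
}
```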

Accessing the Swagger UI

Then open the Swagger UI at http://localhost:<<PORT_NO>>/swagger-ui/index.html ( after running your application with mvn clean spring-boot:run ).

Happy Learning !

Bharathy Poovalingam

https://www.linkedin.com/in/bharathy-poovalingam-09ab681a/