Deployments: A Story of Manual Chaos to Automated Salvation
Hello, I'm Shreyansh, part of the engineering team at Fyle.
Managing deployments across multiple regions has always been a critical task for us. Initially, we relied on manual processes, which, while functional, were time-consuming and prone to errors. As the product and services grew, we recognized the need for a more efficient solution and transitioned to automated deployments.
I'll share our journey from manual deployments to embracing ArgoCD for continuous delivery. We'll explore the challenges we faced, the solutions we implemented, and how tools like Kustomize were pivotal in streamlining our deployment processes.
Chapter 1: The Setup That Seemed Fine
Let me paint the picture of where we were. We had a decent setup, or so we thought. Our services ran on Kubernetes clusters in two regions - India and the US. Each service lived in its own Docker container, neatly organized into different namespaces. Our GitHub workflows automatically built new Docker images whenever developers pushed code.
On paper, it looked modern and efficient.
But the deployment part? That's where our modern infrastructure met our stone-age processes.
Chapter 2: The Old Way - A Daily Nightmare
Picture this: it's Monday morning, and we need to deploy five services across both regions. Here's what that looked like.
First, I had to SSH into our India production machine. The terminal would greet me with:
$ ssh user@india-prod-01
Welcome to Ubuntu 20.04.3 LTS
Last login: Mon Oct 23 08:15:32 2023
Then I'd navigate to our deployment repository and run our "helpful" bash script:
$ cd /opt/prod-deployment-repo
$ ./deploy default api:api_v2.1.3,worker:worker_v1.8.2,scheduler:scheduler_v3.0.1 payments payment-service:payment_v4.2.0 notifications email-service:email_v2.5.1
Look at that command. It's a monster. And this was just for one region. I had to repeat the entire process for the US region, carefully making sure I didn't mix up the versions.
The script would then use sed commands to find and replace image versions in our YAML files:
sed -i 's|api:.*|api:api_v2.1.3|g' deployments/default/api-deployment.yaml
sed -i 's|worker:.*|worker:worker_v1.8.2|g' deployments/default/worker-deployment.yaml
Then it would apply the changes:
kubectl apply -f deployments/default/api-deployment.yaml
kubectl apply -f deployments/default/worker-deployment.yaml
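We never published the script itself, but a minimal sketch of what it did - a hypothetical reconstruction assuming the argument format shown above (a namespace followed by comma-separated service:tag pairs) - would look like this:

```shell
#!/usr/bin/env bash
# Hypothetical reconstruction of the old deploy script (names and paths
# follow the examples above; this is a sketch, not the real script).
# Usage: ./deploy <namespace> <svc:tag,svc:tag,...> [<namespace> <pairs> ...]
set -euo pipefail

deploy_namespace() {
  local ns="$1" pairs="$2"
  IFS=',' read -ra entries <<< "$pairs"
  for entry in "${entries[@]}"; do
    local svc="${entry%%:*}" tag="${entry#*:}"
    local manifest="deployments/${ns}/${svc}-deployment.yaml"
    # Blind find-and-replace of the image tag - exactly as fragile as it sounds.
    sed -i "s|image: ${svc}:.*|image: ${svc}:${tag}|g" "$manifest"
    kubectl apply -f "$manifest"
  done
}

# Walk the argument list two at a time: namespace, then its service:tag pairs.
while [ "$#" -ge 2 ]; do
  deploy_namespace "$1" "$2"
  shift 2
done
```

Notice what's missing: nothing validates that the tag exists in the registry or that the service name is spelled correctly. Whatever the sed pattern matches gets rewritten and applied.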
Chapter 3: When Everything Went Wrong
We needed to deploy a critical fix for our api service, api_v4.1.0. The command we ran looked like this:
./deploy api api-service:api_v4.0.1
Due to a typo - two transposed digits - we deployed api_v4.0.1 instead of api_v4.1.0: an older version with a known bug.
The deployment "succeeded" but the service was running an unintended version of code.
To make matters worse, we couldn't quickly see what went wrong. There was no easy way to compare what was deployed versus what should have been deployed. We had to check each service's current image version manually:
kubectl get deployment api-service -o jsonpath='{.spec.template.spec.containers[0].image}'
The rollback process was equally manual and stressful. We had to remember the previous working versions and run the deployment script again with the correct images.
Chapter 4: The Problems That Kept Growing
This wasn't an isolated incident. Our manual deployment process was a ticking time bomb with multiple problems:
The SSH Bottleneck: Only certain team members had production access. When they were unavailable, deployments waited.
Command Line Nightmares: Our deployment commands grew longer as we added services. A typical deployment command spanned three lines:
./deploy default api:api_v2.1.3,worker:worker_v1.8.2,scheduler:scheduler_v3.0.1,logger:logger_v1.4.2 \
payments payment-service:payment_v4.2.0,fraud-detector:fraud_v2.1.0 \
notifications email-service:email_v2.5.1,sms-service:sms_v1.9.0,push-service:push_v3.1.4
One wrong character and the entire deployment would fail, or deploy the wrong versions.
The Secret Update Dance: When we updated database passwords or API keys, the process was brutal. First, we'd update the Kubernetes secret:
kubectl create secret generic db-secret --from-literal=password=newpassword123 --dry-run=client -o yaml | kubectl apply -f -
Then we had to manually delete all pods that used this secret:
kubectl delete pods -l app=api-service
kubectl delete pods -l app=notification-service
If the secret was used by ten services, we'd have to remember and restart all ten. Miss one, and that service keeps using the old credentials until it restarts naturally.
Version Chaos: Different team members would sometimes deploy different versions to different regions. We'd end up with API v2.1.3 in India and v2.1.1 in the US.
The Knowledge Gap: New team members took weeks to learn our deployment process. They had to shadow experienced engineers multiple times before being trusted with production deployments.
Chapter 5: The Breaking Point
The final straw came during a major product launch. We needed to deploy updates to twelve services across both regions. The deployment took hours because:
Two services failed due to typos in the deployment command
We discovered a config map needed updating, which required manually restarting six services
Version mismatches between regions caused API compatibility issues
The rollback process took another hour when we found a critical bug
Chapter 6: The Search for Something Better - Enter ArgoCD
ArgoCD is like having a dedicated deployment assistant that never sleeps, never makes typos, and never forgets to update something.
Here's how it works: You put all your deployment configurations in a Git repository. ArgoCD watches this repository continuously. When you commit changes, ArgoCD automatically applies those changes to your Kubernetes clusters.
Think of it like this - instead of manually applying changes, you declare what you want your system to look like in Git, and ArgoCD makes it happen.
The magic happens through something called the "GitOps loop":
Developer commits code changes
CI pipeline builds and pushes a new Docker image
Someone updates the deployment repository with the new image version
ArgoCD detects the change within about three minutes (or near-instantly if you have webhooks configured)
ArgoCD applies the changes to all configured clusters automatically
No SSH. No manual commands. No typos.
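Concretely, you describe each cluster/region pair to ArgoCD with an Application resource. A minimal example - the repo URL and names here are illustrative, not our actual configuration:

```yaml
# An ArgoCD Application watching one region's overlay (illustrative values).
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: india-services
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/prod-deployment-repo
    targetRevision: main
    path: overlays/india
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated:
      prune: true     # delete resources removed from Git
      selfHeal: true  # revert manual drift back to the Git state
```

With `automated` sync enabled, ArgoCD applies every change it detects without anyone touching kubectl.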
Chapter 7: Adding Structure with Kustomize
Along with ArgoCD, we adopted Kustomize to organize our deployment files better. Kustomize uses a "base and overlay" pattern that eliminates duplication while handling environment differences elegantly.
Here's how we structured it:
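Before diving into the individual files, here's the overall repository layout (simplified; only the api-service tree is shown in full):

```
prod-deployment-repo/
├── base/
│   ├── kustomization.yaml
│   ├── api-service/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── notification-service/
│   └── worker-service/
└── overlays/
    ├── india/
    │   ├── kustomization.yaml
    │   └── api-service-patch.yaml
    └── us/
        └── kustomization.yaml
```

Running `kustomize build overlays/india` locally renders the base with India's image tags, patches, and generated config applied - a handy way to preview exactly what ArgoCD will apply.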
The Base Configuration
We created base configurations that contain the common parts of our deployments:
# base/api-service/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
  labels:
    app: api-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      containers:
        - name: api
          image: api-service:latest
          ports:
            - containerPort: 8080
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: url
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
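The kustomization file below also references a service.yaml per service. It isn't reproduced in this post, but a typical one - assuming the API listens on port 8080, matching the Deployment's containerPort - would look like:

```yaml
# base/api-service/service.yaml (illustrative)
apiVersion: v1
kind: Service
metadata:
  name: api-service
spec:
  selector:
    app: api-service       # routes to pods from the Deployment above
  ports:
    - port: 80             # cluster-internal port
      targetPort: 8080     # the containerPort in the Deployment
```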
The base kustomization file references all our common resources:
# base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - api-service/deployment.yaml
  - api-service/service.yaml
  - notification-service/deployment.yaml
  - notification-service/service.yaml
  - worker-service/deployment.yaml
commonLabels:
  environment: production
The Regional Overlays
Then for each region, we create overlays that specify only the differences:
# overlays/india/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base

# Image versions for India region
images:
  - name: api-service
    newTag: v2.1.3
  - name: payment-service
    newTag: v4.2.1
  - name: worker-service
    newTag: v1.8.2

# India-specific patches
patches:
  - path: api-service-patch.yaml

# India-specific config
configMapGenerator:
  - name: region-config
    literals:
      - REGION=india
      - TIMEZONE=Asia/Kolkata
      - CURRENCY=INR

secretGenerator:
  - name: db-credentials
    literals:
      - url=postgresql://india-db.internal:5432/app
The patch file handles India-specific requirements:
# overlays/india/api-service-patch.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 5  # More replicas for higher traffic in India
  template:
    spec:
      containers:
        - name: api
          resources:
            limits:
              memory: "1Gi"  # More memory for India region
              cpu: "1"
          env:
            - name: LOG_LEVEL
              value: "INFO"  # Different log level for India
The US overlay looks similar but with US-specific values:
# overlays/us/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base

images:
  - name: api-service
    newTag: v2.1.3
  - name: payment-service
    newTag: v4.2.1
  - name: worker-service
    newTag: v1.8.2

configMapGenerator:
  - name: region-config
    literals:
      - REGION=us
      - TIMEZONE=America/New_York
      - CURRENCY=USD

secretGenerator:
  - name: db-credentials
    literals:
      - url=postgresql://us-db.internal:5432/app
Chapter 8: The New Way - Deployment Zen
Now here's what our deployment process looks like.
When we want to deploy new versions, we simply update the image tags in our kustomization files. For example, to deploy API v2.1.4 to both regions:
# overlays/india/kustomization.yaml
images:
  - name: api-service
    newTag: v2.1.4  # Updated from v2.1.3

# overlays/us/kustomization.yaml
images:
  - name: api-service
    newTag: v2.1.4  # Updated from v2.1.3
Then we commit and push:
git add .
git commit -m "Deploy API v2.1.4 to all regions"
git push origin main
That's it. ArgoCD detects the changes and deploys to both regions automatically within minutes.
The Magic of Automatic Secret Updates
Remember our painful secret update process? Now it's automatic. When we need to update database credentials, we use Kustomize's secret generator:
# overlays/india/kustomization.yaml
secretGenerator:
  - name: db-credentials
    literals:
      - url=postgresql://india-db.internal:5432/app
      - password=newSecurePassword123
When we commit this change, Kustomize generates a new secret with a unique name like db-credentials-a1b2c3d4. All services that reference this secret automatically get redeployed with the new version. No manual pod deletion required.
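This works because the generated name (including its content hash) is substituted into every manifest that references the secret. When the secret's contents change, the hash changes, the pod template changes, and Kubernetes performs a normal rolling update. The rendered output looks roughly like this (the hash value is illustrative):

```yaml
# Fragment of `kustomize build` output for the api-service Deployment
env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: db-credentials-a1b2c3d4  # rewritten from "db-credentials"
        key: url
```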
Chapter 9: The Complete Transformation
The change was dramatic. Let me show you with a real example.
Before ArgoCD: Deploying a critical security patch to all services in both regions.
Time: 1 hour | Steps: 5 manual commands | People involved: 3 engineers | Errors: 2 version mismatches | 1 forgotten service
After ArgoCD: The same deployment.
Time: 5 minutes | Steps: 3 Git commits | People involved: 1 engineer | Errors: 0
The Benefits Keep Coming
No More SSH Nightmares: We eliminated production machine access entirely. Deployments happen through Git commits that anyone can make and review.
Version Consistency: Both regions reference the same base configurations, and any version difference between overlays shows up in a single Git diff. Accidental drift became a thing of the past.
Instant Rollbacks: Rolling back became as simple as reverting a Git commit. ArgoCD automatically applies the previous state within minutes.
Automatic Secret Management: Services automatically redeploy when secrets or configs change. No more manual pod restarts or forgotten updates.
New Team Member Onboarding: New engineers can deploy safely on their first day. They just need to know Git workflows, which they already understand.
Visibility: ArgoCD's dashboard shows the deployment status of all services across all regions in real time. We can spot issues immediately.
Reduced Stress: Deployments went from high-stress, error-prone events to routine operations that anyone can perform confidently.
A Real Example of the New Process
Last week, we needed to deploy a new feature that involved updating four services. Here's exactly what happened:
Developer finished the feature and pushed code to the main branch
GitHub workflows automatically built new Docker images:
api-service:v2.2.0
worker-service:v1.9.0
notification-service:v2.6.0
scheduler-service:v3.1.0
I updated the image versions in both regional overlays:
# overlays/india/kustomization.yaml and overlays/us/kustomization.yaml
images:
  - name: api-service
    newTag: v2.2.0
  - name: worker-service
    newTag: v1.9.0
  - name: notification-service
    newTag: v2.6.0
  - name: scheduler-service
    newTag: v3.1.0
Committed and pushed the changes:
git add overlays/
git commit -m "Deploy new messaging feature v2.2.0 to all regions
- Updated API service to v2.2.0 with new messaging endpoints
- Updated worker service to v1.9.0 with message processing
- Updated notification service to v2.6.0 with new templates
- Updated scheduler service to v3.1.0 with message queuing"
git push origin main
ArgoCD automatically deployed to both regions within 5 minutes
Conclusion: From Chaos to Confidence
Our journey from manual deployments to ArgoCD wasn't just a technical upgrade - it was a transformation in how we work. We went from dreading deployments to treating them as routine operations.
Most importantly, we can focus on what matters: building features that help our users instead of fighting with deployment processes.
If you're still doing manual deployments, especially across multiple environments, you don't have to suffer through what we did. Tools like ArgoCD and Kustomize can transform your deployment experience just like they transformed ours.
The setup takes some initial work, but the payoff is immediate and compounds over time. Your future self - and your team - will thank you for making the switch.