Hello, I’m Satyam Jadhav. I joined the Fyle team as a Senior Member of the Technical Staff in March. My first challenge was to implement robust engineering practices in a particular project – a micro-service built on Python 2.7. This involved a series of steps: Dockerizing the application for seamless local development, meticulously linting and formatting the entire codebase, establishing a comprehensive CI pipeline, increasing code coverage from a mere zero to an impressive 90%, and finally transitioning the project from Python 2.7 to Python 3.11.
After investing more than two months into this initiative and successfully completing it, I had a talk with my ex-manager, Vikas, about what went well and what could have been done better. I realized that I gained a lot of technical as well as non-technical knowledge during the initiative. I'm sharing my insights here, and I hope you find them useful.
Technical learnings
"In software engineering, we follow a simple principle: automate whatever can be automated." - Yukihiro Matsumoto
Implementing coverage check on code difference
In the past I have used Bitbucket + SonarCloud, which always runs a coverage check on the code difference when a PR is raised. A coverage check on the diff has a few benefits over a global check that requires overall coverage to stay above a certain percentage. Below are the benefits of a coverage check on the code difference.
Incremental improvement: Instead of first getting coverage above a certain percentage and only then adding a check, a coverage check on the code difference gives us the ability to improve coverage incrementally. Newly added code always has to pass the coverage check, while old code will need to pass it whenever it is updated in the future.
Continuous improvement: Code coverage on the difference helps continuously increase overall coverage. If we add a check on the whole codebase, then after achieving a certain percentage there is no further improvement forced by the check; with a check on the code difference, you always have to write new test cases because you must achieve coverage on the diff. This effect becomes especially visible when smaller PRs are raised and the uncovered lines become a major part of the difference.
Since SonarCloud and other available solutions for checking coverage on the code difference are paid, we can implement such a check without any paid solution using the diff-cover library, which uses the coverage.xml report along with git diff to check coverage on the changed lines. Below is a sample GitHub Action implementing a coverage check on the code difference using diff-cover.
name: Unit Tests

on:
  push:
    branches: [ master ]
  pull_request:
    branches:
      - '**'

jobs:
  unit-test:
    name: Unit Tests
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Run pytest
        run: pytest -rA -q --cov --cov-report=term --cov-report=xml:test_reports/coverage.xml

      - name: Setup Python
        if: ${{ github.ref != 'refs/heads/master' }}
        uses: actions/setup-python@v4
        with:
          python-version: 3.11.3

      - name: Compare
        if: ${{ github.ref != 'refs/heads/master' }}
        run: |
          python3 -m pip install diff-cover
          git fetch origin master --depth=1
          diff-cover test_reports/coverage.xml --compare-branch=origin/master --fail-under=90 > result.txt 2>&1

      - name: Comment Coverage Difference
        if: ${{ always() && github.ref != 'refs/heads/master' }}
        uses: thollander/actions-comment-pull-request@v2
        with:
          filePath: result.txt
Designing mocking for a series of HTTP calls made in the API/code block under test
The project has a lot of APIs that make a series of calls to a third-party endpoint. To write unit test cases for these APIs, I had to write generic mocking for these third-party endpoints so that it could be used in every unit test case. Below is a study of the different alternatives.
Observations:
We make a series of calls to the third-party endpoint in a single API.
Every call to the third-party endpoint returns status_code, headers, and body.
The API logic depends upon the status_code, headers, and body returned.
Requirements for mocking:
For a single API under test we should be able to specify what status, headers, and body each call to the third-party endpoint returns.
Possible solutions:
Python’s unittest.mock: We can use Python’s unittest.mock module to patch calls to the third-party endpoints. It did not fit well here because a plain return_value configures only a single canned response for every call, rather than a different response per call.
Decorator: We can write a decorator which can be applied to the test function. It can take a list of responses to be returned, one for each call made to the third-party endpoint during the test.
Requests mock: requests-mock is a Python library in which we can patch multiple endpoints (all endpoints have to be different) with the required body, headers, and status. The downside is that to mock each call to the third-party endpoint, we have to specify the complete endpoint. Another downside, for the way we evaluated it, was that returning different outputs for the same endpoint when it is called more than once in a single test was not straightforward.
For our use case, the decorator solution suited us well. The code below shows a sample decorator that can be used for mocking a series of calls. Here, calls to the third-party endpoints are made by a common function, some_module.actual_call_function.
import logging
from functools import wraps

from requests import Response

logger = logging.getLogger(__name__)


def mock_calls(payloads):
    payloads = iter(payloads)

    def decorator(test_function):
        @wraps(test_function)
        def wrapper(*args, **kwargs):
            import some_module

            call_backup = some_module.actual_call_function

            def call_mock(method, url, payload):
                mock_payload = next(payloads)
                # Verify that the call is made with the expected payload.
                expected_payload_arg = mock_payload.pop("expected_payload_arg", None)
                if expected_payload_arg:
                    assert payload == expected_payload_arg
                # Build a requests.Response object from the mocked payload.
                response = Response()
                response.status_code = mock_payload["status_code"]
                response.headers = mock_payload["headers"]
                response._content = mock_payload["body"].encode()
                return response

            some_module.actual_call_function = call_mock
            try:
                return test_function(*args, **kwargs)
            except Exception as error:
                logger.exception(
                    "Error: %s occurred while running %s test"
                    % (str(error), test_function.__name__)
                )
                raise
            finally:
                some_module.actual_call_function = call_backup

        return wrapper

    return decorator
@mock_calls(
    [
        {
            "status_code": 201,
            "headers": {"Location": ""},
            "body": "",
            "expected_payload_arg": {"key": "value"},
        },
        {
            "status_code": 200,
            "headers": {},
            "body": "",
        },
    ]
)
def test_function():
    pass
Creating Swagger UI to render OpenAPI spec in Flask App
We also wanted to write documentation for the APIs in the project using the OpenAPI spec. To render OpenAPI spec files in Swagger UI, I used the swagger-ui-bundle package, which allows you to serve Swagger UI from your Flask app. The following steps set up Swagger UI.
Create a Swagger-UI blueprint.
# api_docs/resources.py
from flask import Blueprint, Response, render_template, send_from_directory, url_for
from swagger_ui_bundle import swagger_ui_path

api_docs_resources = Blueprint(
    "api_docs_resources",
    __name__,
    static_url_path="",
    static_folder=swagger_ui_path,
    template_folder=swagger_ui_path,
)


@api_docs_resources.route("/")
def index():
    """
    Renders swagger ui to view API docs.
    ref: https://pypi.org/project/swagger-ui-bundle/
    """
    # Root API doc specification, served using /spec endpoint.
    openapi_spec_url = url_for("api_docs_resources.spec", path="openapi.yaml")
    config_url = url_for("api_docs_resources.spec", path="swagger-config.json")
    return Response(
        render_template(
            "index.j2", openapi_spec_url=openapi_spec_url, configUrl=config_url
        ),
        mimetype="text/html",
    )


@api_docs_resources.route("/spec/<path:path>")
def spec(path):
    """
    Serves API doc specification files from the root directory.
    """
    return send_from_directory("..", path, max_age=None)
Register the blueprint with the app.
# server/app.py
import os

from flask import Flask

DEV_ENV = os.getenv("DEV_ENV", default=False)

app = Flask(__name__)

if DEV_ENV:
    # API docs resources
    from api_docs.resources import api_docs_resources

    app.register_blueprint(api_docs_resources, url_prefix="/api/docs")
Create an openapi.yaml file at the root level.
openapi: 3.0.0
info:
  title: My Service
  description: My Service
  version: 1.0.0
servers:
  - url: http://0.0.0.0:8895
    description: Local development server
paths:
  # endpoints
  /ready:
    $ref: server/api_docs/index.yaml
Create a swagger-config.json file at the root level.
{
  "supportedSubmitMethods": []
}
Running pip-compile on arm64 architecture to produce output for x86 architecture using pre-commit
Pre-commit is a manager for git hook scripts. It allows you to run checks like linting and formatting using tools written in different languages (other than your source code’s language) without having to set up each language environment yourself.
At Fyle, every developer uses an Apple M1 laptop (arm64 architecture) for local development, while our applications run on x86 architecture.
We wanted to use pip-compile to pin our direct and transitive dependencies via pre-commit: pre-commit would run pip-compile whenever the requirements.in file changes. On implementing the pre-commit hook to do so, I found that the CI pipeline for pre-commit checks started failing. On debugging, I realized that pip-compile was producing different output on my local system running on arm64 architecture than in the CI pipeline running on x86 architecture.
To solve this environment problem, I wrapped pip-tools in Docker and created pre-commit hooks that run inside Docker. This way we can compile requirements on an Apple M1 laptop for the x86 architecture. Refer to https://github.com/fylein/pre-commit-docker-pip-tools for hook usage.
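For illustration only, a local pre-commit hook along these lines can force pip-compile to run inside an x86_64 container. This is a minimal sketch, not the actual hook definition from the repository above; the hook id, the python:3.11-slim image, and the file names are assumptions, and it relies on Docker Desktop’s linux/amd64 emulation on Apple Silicon.
# .pre-commit-config.yaml (illustrative sketch only)
repos:
  - repo: local
    hooks:
      - id: pip-compile-x86
        name: pip-compile inside an x86_64 container
        language: system
        files: ^requirements\.in$
        pass_filenames: false
        entry: >
          bash -c 'docker run --rm --platform linux/amd64
          -v "$PWD":/src -w /src python:3.11-slim
          sh -c "pip install pip-tools && pip-compile requirements.in"'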
Breaking tests if uncommitted DB transactions are found
Unit tests should be independent of each other. Changing one test should not impact other tests. Similarly, failure of one test should not result in failure of other tests.
For APIs that change DB state, we should roll back DB changes at the end of the test so that other tests are not impacted. How do we do this in a Flask-SQLAlchemy project? There is a well-known solution for SQLAlchemy where commit calls are patched so that transactions are not persisted, and transactions are rolled back at the end of the test.
Patching commit calls creates a new problem: if the developer forgets to write a commit call in the code, the test for that code won’t fail. Ideally, we want the tests to catch such misses and fail. To solve this problem, we can check whether there are any uncommitted transactions at the end of the test and break the test if we find one. I wrote a separate blog post describing how I implemented this solution in detail here; give it a read.
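That post has the full implementation; purely as a rough, simplified sketch (the my_app, app, and db names are hypothetical placeholders for a Flask-SQLAlchemy setup, and the commit-patching part is left out), an autouse pytest fixture can roll everything back after each test and break the test if the session still holds pending, uncommitted changes:
# conftest.py — a minimal sketch, not the implementation from the linked post.
# `my_app`, `app`, and `db` are hypothetical names for a Flask-SQLAlchemy setup.
import pytest

from my_app import app, db


@pytest.fixture(autouse=True)
def fail_on_uncommitted_changes():
    """Roll back DB changes after every test and break the test if anything was left uncommitted."""
    with app.app_context():
        yield
        # Objects added, modified, or deleted in the session but never committed.
        pending = list(db.session.new) + list(db.session.dirty) + list(db.session.deleted)
        db.session.rollback()
        assert not pending, "Test finished with uncommitted DB changes: %s" % pending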
Python 2.7 to Python 3.11.3 upgrade
This was the first time I was doing a Python 2.7 to Python 3.11.3 upgrade. I followed the steps below to carry out the upgrade and it went well; having code coverage above 90% definitely helped here.
Change the Dockerfile to start using the new Python version.
Fix all Docker build issues.
We were using Alpine Linux, which was missing many of the dev dependencies needed to compile psycopg2.
We changed the OS to Debian by switching to the python-slim image.
Fix all issues needed to run the tests successfully. I encountered the following types of issues:
Issues with dependencies (older library versions).
Issues with Python 2.7 syntax and removed APIs, e.g. iteritems. To fix such issues, Python’s 2to3 tool can be run on each file (see the short example after this list).
Upgrade all libraries to the latest versions. Upgrading to the latest version of Python gives us the ability to upgrade libraries to their latest available versions. This is useful as newer versions of the libraries may have critical bug and vulnerability fixes as well as performance improvements.
Keep only direct dependencies in requirements.in, remove transitive dependencies from it, and pin transitive dependencies using pip-compile.
Upgrade all packages to the latest version.
Run tests and fix issues.
Do manual testing of critical areas of the application. Manual testing can target integrations between different units that are not covered by unit tests.
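As a trivial illustration of the syntax fixes (the config dict below is made up, not code from the project), dict.iteritems() no longer exists on Python 3, and 2to3 rewrites such calls to items():
# Python 2.7 code like `for key, value in config.iteritems():` fails on Python 3
# because dict.iteritems() was removed; 2to3 rewrites it to the items() view.
config = {"timeout": 30, "retries": 3}  # made-up example dict

for key, value in config.items():  # was: config.iteritems()
    print(f"{key}={value}")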
Non-technical learnings
"Continuous improvement is better than delayed perfection." - Mark Twain
Deploy every single unit as soon as it gets ready
Even if a change is not related to application logic (for example, unit test cases), we should deploy it as soon as it is ready. Doing so has the following advantages:
No backlogs: We don’t accumulate deployment backlogs as we keep developing new stuff. This also reflects the real progress of the initiative.
No big deployments: Deploying smaller units from time to time is less risky; we may otherwise have to put a lot of effort into stabilizing a big change. When we deploy smaller units, we can do basic sanity and manual testing and know very easily whether a particular change is stable, and if not, it is very easy to revert. Root cause analysis (RCA) of bugs also becomes easier with smaller deployments.
Improves confidence: Successfully deploying smaller changes improves the developer’s confidence. This is especially helpful if you are new to the project, as it also helps you get familiar with the deployment process. Moreover, it provides a sense of accomplishment and builds momentum for upcoming tasks.
Get code covered before doing big changes
For tasks with a big impact on application stability, like Python upgrades, linting and formatting the whole codebase at once, refactoring the codebase, and critical library upgrades, having the project’s code coverage above an acceptable percentage (e.g. greater than 90%) becomes very helpful. Good coverage gives developers peace of mind and improves confidence while working on big changes. It is somewhat like having insurance: we pay by adding tests, and whenever we make big changes, these tests pay us back!
Faster development, fewer worries: With good code coverage in place, developing big changes becomes faster, as every change is validated by already-written tests. A lot of manual testing effort is saved, and we don’t have to worry about missing cases during manual testing.
Easy POCs/experiments: With good enough code coverage, POCs/experiments like library upgrades, Python upgrades, and OS distribution changes can be done very easily, since compatibility can be validated by running the tests.
Planning linting and formatting
I once had an experience where implementing linting and formatting for the whole project at once resulted in a pile of issues raised by the linting and formatting tools, and fixing them in a single go resulted in lots of bugs. After implementing it seamlessly for the project at Fyle, I realized that with better planning and strategy such a task can be accomplished without much friction. The following aspects should be taken into account while implementing linting and formatting for an entire project all at once.
One file/module at a time: At the start, exclude all files from the linting and formatting tools. Fix one file/module at a time and then include that file/module in the linting and formatting checks.
Don’t consider it a small task: In my previous experience, the engineering team considered it a very simple task and ran a linting and formatting tool on the entire codebase, resulting in numerous changes and a large pull request. This led to an unstable application. Linting and formatting the codebase is a time-consuming task, as it may involve a lot of iterations to achieve an error-free state. Proper time should be allocated for it; it should not be considered just a one-day effort.