Unit Testing With Code Generation
Hello, I am Jay, and today I am going to tell you how I used code generation to automate part of our unit testing initiative and cut down on manual work. I have developed a CLI tool that generates a test file for a component with the basic setup already done for the user; this includes declaring any spy objects, importing dependencies, and generating test suites. What follows is a detailed description of my experience building something from scratch and the things I have learned along the way.
The Question: Why the need to make something from scratch?
We had started an initiative to achieve maximum code coverage using automated testing, primarily through unit tests. As the larger services were getting covered, I observed that certain steps were repeated for every new service we took on. These involve declaring and creating spies, importing the necessary dependencies, setting up the test bench, and finally, once the file is ready, writing the test suites themselves, which requires careful attention to ensure that the block of code being tested performs as expected.
This is where I tried to look for a tool that could do some of this work for me, but unfortunately, even the packages that seemed promising were far from useful. Some would not do exactly what I wanted; others offered too little to count as an advantage. Here I saw an opportunity to automate the first few repetitive steps of the process, which otherwise need dedicated time; automation would reduce grunt work and let the user focus solely on the critical part of the process, saving time and completing tasks faster. Although one could argue that this is not a very daunting task at first, actually reading through the code to check which functions are to be spied upon, especially if the file has ten or more dependencies, can turn tedious very quickly.
The Solution: Angular Test Generator.
Code Generation
When the idea of code generation was presented to me, I was perplexed for a bit until I realised that this is something we do every day. Code snippets are easily the simplest and most visible example of code generation that we use every day. For example, Emmet is a toolkit very popular among web developers. Using symbols like "." and ">" to structure HTML tags in a certain way helps to scaffold large HTML documents quickly. This is an example of code generation and the power it gives.
CLIs are also quite common and the de facto way for frameworks and libraries to initialise large projects and set up multiple dependencies to quickly get started. They make sure all dependencies are compatible with each other and give the user enough to start with.
Our app is written in Angular, which comes with its own CLI, but that CLI only sets up the basics in the spec file it generates. With a meagre component import and an empty test bench, along with a describe block, the user still has quite a lot to do. So I decided to go forward with a CLI, since CLIs are easily distributable and anyone with NodeJS installed can run them irrespective of the operating system, whereas code snippets would require more work.
So far, I had established that I wanted to create a CLI that would generate a setup for test suites, but a very crucial question still lingered: how could I determine which dependencies exist, and where to import them from?
Parsing The Files
In order to find a viable solution, I kept looking for ways to parse a file and came across ASTs. The Abstract Syntax Tree (AST) is a simplified version of the parse tree generated in the syntax analysis phase of a compiler. It differs from the parse tree in that it drops details dictated purely by the rules of the grammar, such as punctuation and delimiters, while each node still carries the information needed about the source code.
There are libraries in place that actually help to generate ASTs and traverse them to manipulate the source code in between the compiler phases. Now I had everything in place. So let’s make a test generator.
Putting it all together
The first step was to create the basic files and directory structure, and then I started the hunt for packages that could help me do all I wanted with minimal effort. I went ahead with the following npm packages:
inquirer: an easily embeddable tool to create interactive prompts for the CLI with questions and answers. It offers a wide variety of prompt types that let users enter strings or numbers, tick checkboxes, or choose from a given list of options. It is asynchronous in nature and returns a Promise, which lets the developer handle the answers and show proper error messages.
Babel: We all know Babel as the go-to tool for transpiling ES6 code to ensure backwards compatibility. It is also a compiler with utilities to generate and traverse ASTs, which can then be converted back to code.
chalk and gradient-string: Because the prompts look good.
ejs: A very prominent templating engine used in NodeJS projects. It will be used to generate files by injecting the content into a pre-specified setup.
The Code.
So far, we have discussed what we set out to do, why, and what tools we can use to get there. Let’s now take a deeper look at the code and see how I combined the packages mentioned above to write a script by showing some small examples.
//Importing the libraries
import { parse } from "@babel/parser";
import { traverse } from "@babel/core";
import inquirer from "inquirer";
import gradient from "gradient-string";
import chalk from "chalk";
import { readFileSync, writeFile } from "fs";
import * as ejs from "ejs";
import path from "path";
import * as prettier from "prettier"; //To prettify the file once it is generated
Here we have imported all the required packages; notice there are a few extras, like the fs module to interact with the file system and prettier to format the output file, as ejs does not take care of this step. Something new here is parse and traverse from Babel. They are very important functions that form the core logic of our script: parse generates the AST from the source text, and traverse walks the tree so that nodes can be inspected or modified before the tree is converted back to code.
Next, we create a prompt for our user to input the path to the file we want to generate tests for.
// Prompting the user for input
inquirer
  .prompt([
    {
      name: "filepath", // Name of the prompt
      message: "What is the absolute path to the file?", // Message to the user
      type: "input", // The type of answers we are expecting
    },
  ])
  .then((answers) => {
    // Do something
  });
So, inquirer.prompt accepts a list of questions that we would like to ask the user. It returns a Promise, so we attach a success callback with then to access the user's answers.
Now that we have the path to the file, we can use the functions provided by the fs module in Node to get the text content of the file, which we will then pass to the parser.
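As a minimal sketch of this step, reading the file could look like the snippet below. The helper name readSource and the sample file are assumptions made for illustration; the actual tool may read the file differently.

```javascript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import path from "node:path";

// Hypothetical helper: resolve the path the user typed and return the file's text.
function readSource(filepath) {
  const absolutePath = path.resolve(filepath.trim());
  return readFileSync(absolutePath, "utf8");
}

// Demo only: write a one-line sample file to a temp directory, then read it back.
const sampleDir = mkdtempSync(path.join(tmpdir(), "test-gen-"));
const samplePath = path.join(sampleDir, "sample.component.ts");
writeFileSync(samplePath, "let a = 10;\n");

const content = readSource(samplePath);
console.log(JSON.stringify(content));
```

In the prompt's then callback, the same helper would be called with answers.filepath instead of the sample path.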
const ast = parse(content, { // Passing the content of the file as text
  plugins: ["typescript", "decorators", "throwExpressions"],
  sourceType: "module",
});
The above snippet is the most important part of the tool; here we pass the file content to the parse function along with some configuration options. The plugins option tells the parser which syntax extensions to enable. The typescript plugin is used to parse TypeScript code, while the decorators plugin handles decorators, which are prevalent in Angular codebases. Setting sourceType to "module" parses the file as an ES module, which allows import and export declarations and implies strict mode.
Now that we have parsed the code, we get a nested object like the one below.
// If the content of the file is:
// let a = 10;
{
"type": "File",
"start": 0,
"end": 11,
....
},
"errors": [],
"program": {
"type": "Program",
"start": 0,
"end": 11,
....
"sourceType": "module",
"interpreter": null,
"body": [
{
"type": "VariableDeclaration",
"start": 0,
....
"declarations": [
{
"type": "VariableDeclarator",
"start": 4,
....
"id": {
"type": "Identifier",
"start": 4,
....
"identifierName": "a"
},
"name": "a"
},
"init": {
"type": "NumericLiteral",
"start": 8,
"end": 10,
"loc": {
"start": {
"line": 1,
"column": 8,
"index": 8
...
],
"directives": []
},
"comments": []
}
Don't worry, I've just shown the crucial sections of the tree and left the non-essentials out. Even so, you can see that the tree is already huge for a single line of code. The source is split into smaller tokens, which are then assembled into a tree of nodes carrying further information about each piece of code. As you can see, each node includes properties such as type, name, and value, and blocks of code have a body property that contains further nodes.
Now that we have our tree, we can traverse it to read information from its nodes, or we can alter the nodes to make changes to the code.
For example:
//Original Code snippet: let a = 10
console.log(ast.program.body[0].type) // VariableDeclaration
console.log(ast.program.body[0].declarations[0].id.name) // a
console.log(ast.program.body[0].declarations[0].init.value) // 10
Already seems daunting? 😆. We can use the visitor pattern, a very common design pattern used in object-oriented languages, to separate the algorithm from the actual object structure so we can perform operations on it.
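To make the pattern concrete, here is a rough hand-rolled sketch. The walk helper and the trimmed, hand-written tree are assumptions for illustration only; a tree produced by Babel carries many more fields, such as start, end, and loc.

```javascript
// Hand-written AST for `let a = 10`, trimmed to the fields used here.
const ast = {
  type: "File",
  program: {
    type: "Program",
    body: [
      {
        type: "VariableDeclaration",
        kind: "let",
        declarations: [
          {
            type: "VariableDeclarator",
            id: { type: "Identifier", name: "a" },
            init: { type: "NumericLiteral", value: 10 },
          },
        ],
      },
    ],
  },
};

// Hypothetical helper: depth-first walk that calls visitor[node.type] when it exists.
function walk(node, visitor) {
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visitor));
    return;
  }
  if (node === null || typeof node !== "object") return;
  if (typeof node.type === "string" && typeof visitor[node.type] === "function") {
    visitor[node.type](node);
  }
  // Recurse into every property; primitives are ignored by the guard above.
  Object.values(node).forEach((child) => walk(child, visitor));
}

// The visitor only names the node types it cares about.
const seen = [];
walk(ast, {
  Identifier(node) {
    seen.push(`Identifier:${node.name}`);
  },
  NumericLiteral(node) {
    seen.push(`NumericLiteral:${node.value}`);
  },
});
console.log(seen);
```

The traversal logic lives in walk, while the per-node behaviour lives in the visitor object, so new operations can be added without touching the walker.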
So we can either write our own algorithm to walk the tree above, or we can use the traverse utility provided by Babel, which applies the visitor pattern to the generated tree and lets us perform whatever manipulations we like on various kinds of nodes. Let’s see how we can do that.
traverse(ast, {
  VariableDeclaration(path) {
    // path wraps the node; the AST node itself is available as path.node
    path.node.declarations[0].id.name = "b"; // Renaming a to b
  },
});
In the above code snippet, I am simply reassigning the name stored on the node, so when we generate the tree back into code, the variable will be named b. The traverse function takes two parameters: the AST and a visitor object with a method for each type of node you want to operate on. Want to change the name of a class? Use ClassDeclaration. Want to change the parameter type for a particular method? Use FunctionDeclaration and modify its params property. You can even add or replace nodes like this, and the result can be turned back into code using the generate utility, also provided by Babel.
So, using the process above, I parse the entire file and use the node types to collect names and values, building lists of imports and function declarations. These lists are then passed on to the templates to generate entire spec.ts files for our components.
E.g.
<% for (const imp of importList) {%>
import { <%= imp.import%> } from "<%= imp.source%>";
<% } %>
A small portion of our template. The above code emits one import statement in the test file for each named import found in our main file.
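As a rough sketch of how an importList like the one consumed by the template might be built, the snippet below walks the top-level statements of a tree. The collectImports helper, the hand-written tree fragment, and the UserService import are assumptions for illustration; only the ImportDeclaration and ImportSpecifier node shapes follow what Babel produces.

```javascript
// Hand-written fragment in the shape Babel produces for:
//   import { Component, OnInit } from "@angular/core";
//   import { UserService } from "./user.service";
//   let a = 10;
const programBody = [
  {
    type: "ImportDeclaration",
    specifiers: [
      { type: "ImportSpecifier", imported: { type: "Identifier", name: "Component" } },
      { type: "ImportSpecifier", imported: { type: "Identifier", name: "OnInit" } },
    ],
    source: { type: "StringLiteral", value: "@angular/core" },
  },
  {
    type: "ImportDeclaration",
    specifiers: [
      { type: "ImportSpecifier", imported: { type: "Identifier", name: "UserService" } },
    ],
    source: { type: "StringLiteral", value: "./user.service" },
  },
  { type: "VariableDeclaration", kind: "let", declarations: [] },
];

// Hypothetical helper: turn import declarations into { import, source } entries,
// matching the field names the template reads (imp.import and imp.source).
function collectImports(body) {
  const importList = [];
  for (const node of body) {
    if (node.type !== "ImportDeclaration") continue;
    for (const specifier of node.specifiers) {
      // Only named imports are handled; default and namespace imports are skipped.
      if (specifier.type !== "ImportSpecifier") continue;
      importList.push({ import: specifier.imported.name, source: node.source.value });
    }
  }
  return importList;
}

const importList = collectImports(programBody);
console.log(importList);
```

With a real tree, the same function would be called with ast.program.body, and the resulting list handed to the template as importList.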
What’s next?
In its current version, the tool can import dependencies for you, create test suites named after the functions, and declare spy variables for you. Although this isn’t much, the time it saves is considerable. There are a few features planned for the future, like the creation of spies and the setting up of test benches; there’s also the possibility of injecting more content into the test suites themselves. As of now, we are using it internally to test it out in real-life scenarios. I hope to make it available to others as soon as possible.
The extras
The process of finding the right set of tools to create a prototype has been discussed in the blog above. Here, I would like to talk about things that didn’t make it and what else can be done with the knowledge gained.
There are multiple parsers available, like Acorn, Esprima, and Espree. Even TypeScript exposes a part of its compiler to generate and traverse ASTs. But I ran into issues with all of them; they were either too difficult to work with or did not offer enough. There were a lot of considerations regarding code generators as well. Between Yeoman and Hygen, Hygen seemed like a very good option for generating files; with front matter in its templates and a language-agnostic approach, it is a really nice default choice for a fast and easy entry into code generation. But the setup Hygen requires, and the fact that templates have to be written for every project, did not suit our use case. So I decided to use a general-purpose templating system and forgo the additional functionality of Hygen. There’s still a need for a module bundler to package the application before publishing it, among many other things, before it is ready.
There are many other use cases for code generation, like generating classes from UML diagrams, running mass migrations across a large codebase, and using context-aware language models such as GitHub Copilot to write code from prompts. I expect code generation to become more common in smaller codebases as well, since larger companies already use it to enforce their design patterns and ensure that any new code follows their guidelines. AI tools are also already generating webpages and designs based on the user’s description.
I am going to keep my eyes peeled for further use cases and opportunities to apply what I’ve learned to other projects in Fyle. Till then, goodbye! 👋🏻
References
Try out the tool here: https://github.com/fylein/angular-test-gen
Code generation:
https://tomassetti.me/code-generation/ (a great starter for code generation)
TypeScript Compiler API:
https://github.com/microsoft/TypeScript/wiki/Using-the-Compiler-API (the only official guide to the TypeScript compiler API)
https://levelup.gitconnected.com/typescript-compiler-and-compiler-api-part-1-4bb0d24a565e (explaining the TypeScript compiler and how to use it)
Abstract Syntax Trees:
https://medium.com/basecs/leveling-up-ones-parsing-game-with-asts-d7a6fc2400ff (really nice blog about ASTs using visuals to explain concepts)
https://dev.to/balapriya/abstract-syntax-tree-ast-explained-in-plain-english-1h38