Exploring the Abstract Syntax Tree
Hacktoberfest had me get out of my comfort zone and contribute to different codebases. One of them, a python linter gave me exposure using the python Abstract Syntax Tree.
What is an Abstract Syntax Tree?
An abstract syntax tree (AST) is a way of representing the syntax of a programming language as a hierarchical tree-like structure.
In essence we can take a line of code such as this:
pressure = 30
Enter fullscreen mode Exit fullscreen mode
and convert it into a tree structure:
Wikipedia has a slightly different definition:
An AST is usually the result of the syntax analysis phase of a compiler. It often serves as an intermediate representation of the program through several stages that the compiler requires, and has a strong impact on the final output of the compiler.
So the AST is one of the stages towards creating compiled code. Definitely feels like we are getting closer to what the machine understands!
What this enables us to do is to step through the structure of a program and report any issues back (similar to intellisense/linters) or even change the code that is written.
Python provides a library for parsing and navigating an Abstract Syntax Tree and is aptly called ast.
Using the previous example we can create an ast by using the following command:
import ast
code = 'pressure = 3'
tree = ast.parse(code)
Enter fullscreen mode Exit fullscreen mode
Simply printing the tree won’t display the structure and nodes that we want. Instead we can create a node visitor that will traverse the tree and give us details on each of the nodes:
class Visitor(ast.NodeVisitor):
def generic_visit(self, node):
print(type(node).__name__)
ast.NodeVisitor.generic_visit(self, node)
Enter fullscreen mode Exit fullscreen mode
Now if we create an instance of this visitor class, when we call visitor.visit(code) we will get the following output:
Module
Assign
Name
Store
Num
For a linter this is quite useful to see if any of these node orderings are invalid. A good example of this is when you have if True:…
a redundant if statement since it always evaluates to the same result.
When we run visit on an if True:… we get the following:
Module
If
NameConstant
Expr
Ellipsis
In this case the True
value is the NameConstant
. The ast visitor allows us to create specific methods that get invoked when a particular ClassName
is visited. This is achieved using the following syntax:
def visit_{className}(self, node):
Enter fullscreen mode Exit fullscreen mode
In this case we are wanting to visit any If node and check it’s test condition to ensure that it isn’t just a NameConstant
(True, False or None). This is where the ast documentation is quite useful as you can see what properties are available for each node type. We can access the node.test condition like so:
statement = "If statement always evaluates to {} on line {} "
def visit_If(self, node):
condition= node.test
if isinstance(condition, ast.NameConstant):
print(statement.format(condition.value, node.lineno))
ast.NodeVisitor.generic_visit(self, node)
Enter fullscreen mode Exit fullscreen mode
Running this on our previous example gives us a nice detailed message:
If statement always evaluates to True on line 1
You are not limited to only ‘visiting’ a node in the AST. Using another one of pythons classes you can use ast.NodeTransformer
to modify them too! This leads to really cool possibilities like inserting temporary lines of code to test code coverage of your program or even transpile to other languages
I recommend checking out the following resources if you are looking to make use of ast in python:
Green Tree Snakes – the missing Python AST docs
^ This one even includes a live web AST visualizer which can help see the code structure quickly!
A copy of the code in this post can be found here
The next thing I would like to investigate is using the NodeTransformer to potentially transpile from python over to another language like Javascript.
Thanks for reading!
Share your experience/use cases for AST in the comments below!
暂无评论内容