Aumit Leon

“If you want to build a ship, don’t drum up the men to gather wood, divide the work, and give orders. Instead, teach them to yearn for the vast and endless sea.” – Antoine de Saint-Exupéry


Writing and Publishing Python Modules

There is a time and a place for specific solutions and for generalizable ones. This article is aimed at the latter, using Python as an example. When building out large systems, it’s important to keep in mind the DRY principle: Don’t Repeat Yourself! Repetition in a codebase can turn a simple change into a tangled mess riddled with human errors, in other words, spaghetti code. One way to reduce repetition in a single codebase is to modularize functionality: once repetitious code is decomposed into modular functions, an update to a single function replaces the arduous task of updating code in different corners of the codebase. But what if you’re building something bigger, and shared code exists beyond a single repo?
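As a minimal sketch of the modularization described above, the snippet below pulls one piece of repeated logic into a single function that two callers share. The names `normalize_email`, `register_user`, and `subscribe_to_newsletter` are hypothetical and chosen only for illustration; they do not come from the article.

```python
def normalize_email(raw: str) -> str:
    """Trim surrounding whitespace and lowercase an email address."""
    return raw.strip().lower()


def register_user(email: str) -> dict:
    # Reuses the shared helper instead of repeating strip/lower inline.
    return {"email": normalize_email(email), "status": "registered"}


def subscribe_to_newsletter(email: str) -> dict:
    # Same helper, so a change to the normalization rule applies here too.
    return {"email": normalize_email(email), "subscribed": True}


if __name__ == "__main__":
    print(register_user("  Ada@Example.COM "))
    print(subscribe_to_newsletter("ada@example.com"))
```

If the normalization rule ever changes, only `normalize_email` needs to be edited, and both callers pick up the new behavior.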

Enter packages. Every major programming language has some mechanism through which code can be shared and used by different people: Ruby has Gems, Node has npm, and Python...


Programming and Allegory

The universe is filled with disorder. With a little hand waving and a lot of generalization, you might consider the mark of an intelligent species to be its attempt to bring order to the chaos. Order is in the eye of the beholder. For the thousands of years that humans have been on Earth, order has been synonymous with civilization. In fomenting the uniquely human perception of civilization, we erected buildings where forests once stood, developed cars that guzzle fossil fuels, and marred the environment in irreversible ways. The imposition of human order and the development of civilization are two sides of the same coin.

The human need to impose order is everywhere, from the bridges we build to the software we write. While the connection between physical and digital infrastructure may seem tenuous, the two share many similarities. In fact, digital infrastructure is often inspired...


Federated Learning and the Future of Edge Computing

The use of federated learning to train models on distributed training data is a privacy-first approach to machine learning. With the application of machine learning techniques to domains as diverse as game-playing, image recognition, and predictive text, access to troves of data is an increasingly valuable component in the effective development of high-value algorithms. Data is often likened to the new oil, a nod to the ways in which information has become a commodity. The value of data, much like oil, is dependent on how much of it you have. Deep learning methods that define the state of the art across a number of domains require millions of training examples: AlexNet, a landmark image recognition model, was trained on roughly 1.2 million labeled images from ImageNet, a dataset that contains over 14 million images in total. GPT-2, a text generation model released by OpenAI, was trained on 8 million webpages and over 40GB of text from...


Leveraging AWS to Scale R&D Workflows

Originally posted to Indigo Ag’s Engineering Blog.

To identify and deliver commercially viable products for our growers, Indigo’s Research and Development teams analyze bacterial and fungal microbes at scale using bioinformatic analysis tools. To deliver the scalability and efficiency that the high-throughput nature of Indigo’s R&D pipelines requires, the Biomation team has built out cloud-native solutions on AWS.

The tools that Biomation builds infrastructure around are written by Indigo’s data scientists, usually in the form of Python modules. These tools require varying amounts of compute resources, making it difficult and highly inefficient to run these pipelines locally. Difficult because every tool requires special setup within whatever local environment it runs in (specific binaries, configurations, etc.), and inefficient because compute...


Learning About Design Patterns in Software

One of the most effective ways to develop your skills as a software engineer is to familiarize yourself with design patterns. You might already have some understanding of patterns from projects you’ve worked on or used, but the mark of an effective problem solver is a deep and insightful repertoire. A deep repertoire helps you as an engineer arrive at the best-fitting solution for a given problem. It’s often the case that an engineer will apply one generalizable solution to a wide suite of problems. While this might solve the immediate problem, a solution that does not account for the particular nuances of the problem at hand will fail to scale, produce technical debt, and almost certainly cause growing pains down the line that might force a costly redesign. With an understanding of why we should build our knowledge of design patterns...
