raag nair dot com

A place to disorganize my thoughts. Writing and building and writing about building, to keep sane.

Tech Lead & Staff Engineer @ Coinbase
ex Google, Goldman Sachs

Stuttering in Programming Languages

This is an effort to document the difficulty I’ve faced in choosing a programming language for my toy projects. To dispense with the obvious suspicions, I will clarify that I am not a perfectionist; I expect my language to be neither the fastest nor the neatest. Yet I have found that a few traits of my chosen language tug incessantly at my attention, never letting me stray far from the all-pervading question of whether I’m making the best choice.

For this article we will assume that I haven’t conquered my urge to second-guess myself, and to go back to the drawing board for no reason other than the call of analysis paralysis. In other words, we will bypass the obvious advice to shut up and just build—however correct that advice may be—in favor of pursuing a perfect language for its own sake.


Scala

Pros:

  • Memory managed
  • Very strong static types
    • Monads everywhere
  • Null safety
  • Pekko’s Actor model

Cons:

  • Dead in the industry
  • Cannot achieve hard realtime
  • Unhappy web app experience
    • Zio is ugly
    • Cats Effect is ugly
  • No easy web or mobile experience

I mention Scala first because it comes as close as I can imagine to a perfect language. The type system is so expressive that entire domains can be conveyed as a tapestry of traits, and when combined with the Akka/Pekko ecosystem, entire behaviors are easy to enshrine in code. Paired with a reasonable review policy discouraging the overuse of DSLs and implicit conversions, a Scala codebase is an absolute pleasure to read and modify.

My experience using Scala 2.13 at Trumid was so enjoyable that I despaired at having to one day regress to any other language.

That brings me to the great tragedy of Scala. It’s a dead language. Notwithstanding the stalwart optimism of those over at r/scala, it is indeed the case that Scala never quite hit critical mass, and the window of opportunity for it to take off has passed. All that’s left of its perennially early adopters are some rare industry holdouts and tinkerers in academia.

Should you ask why the popularity of the language matters, I have no satisfactory answer to offer. My intention in choosing a language for my toy projects is in a way more a meditation on the language than on the project for its own sake. Hence I am not satisfied in hiding away in a distant cave, honing an increasingly esoteric set of skills. I want mastery of the ugly details of this language to materialize to my benefit out in the open, and Scala is not up to that task.


Typescript

Pros:

  • High demand in the industry
  • Memory managed
  • Very easy full stack web apps
    • React & Node
  • Strong (compile-time) static types
  • Default parameter values

Cons:

  • Not fast
  • Nullable objects
  • No viable gaming story
  • Questionable job quality

On the subject of industry viability, it would be remiss not to speak of Typescript (and by extension, Javascript). It’s true that this is the most in-demand of all languages. No one can contest the dominance of “HTML”, “CSS”, and “JS” at the top of language rankings that scrape job postings, and most of those “HTML” and “CSS” mentions really point to Front End developer openings, which is to say Javascript jobs.

That Typescript is not thereby the winner among my candidate languages is the exception that proves the rule: maximizing industry marketability isn’t the be-all and end-all. I want to enjoy writing programs in this language, and more specifically, I want the options of performance and gaming available to me.

In playing with Typescript code, I am ever aware of the strangeness of prototypal inheritance, of how any class can have any method attached to it somewhere along the chain of my imports. I am aware that the program runs on a single thread. But none of these quirks weigh as heavily on my mind as the foreknowledge of the limits of the language. Should I ever want real performance from a backend, I would be shooting myself in the foot by sticking to any language that compiles down to Javascript. The absence of any serious gaming pathway in the Javascript world attests to this limitation.
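To make that prototype quirk concrete, here is the kind of patching that any module in the import chain can perform in plain Javascript (the `second` method is invented for illustration):

```javascript
// Any imported module can bolt a method onto a shared, built-in prototype,
// and every consumer of that type sees it from then on. `second` is a
// made-up method for illustration.
Array.prototype.second = function () {
  return this[1];
};

const picked = [10, 20, 30].second();
console.log(picked); // 20
```

One stray `import "some-polyfill"` three levels deep is enough to change what every array in your program can do.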


Rust

Pros:

  • Fast, low memory
  • Very strong static types
  • Null safety

Cons:

  • Niche in the industry
  • Borrow checker
    • Painful prototyping and refactoring
    • Lifetime annotations
  • Ugly syntax

Rust is the salad that I forgo for pizza.


Java

Pros:

  • High demand in the industry
  • Memory managed
  • Easy full stack web apps
    • Quarkus & Vertx for performance
  • Static types
  • Gaming library with libGDX

Cons:

  • Type erasure in generics
  • Cannot achieve hard realtime
    • Unless using ugly Java
  • Checked exceptions
  • Nullable objects
  • No default parameters
  • No easy web or mobile experience

Java is Old Faithful, the first programming language I used in earnest. As of writing, it appears to be among the top 3 most in-demand languages in both Big Tech and Finance.

Festival

Late to arrive,
later than the rest,
we sneak and we snake our way to the midst

Hardly invited,
but here nonetheless,
on time—right on time—we insist we insist

Curious and starved,
parched and alone,
we worry and fret over chances we missed

Maybe this feeling,
is initiation,
maybe the spoils are for those who persist

Still these reveries,
of food and of friends,
land in our hand, then pass through our fist

This yearning and want,
is a fast of its own,
a fast we withstand, to withhold our wits

Until the night song,
delivers its spell,
divines us a hymn, disarms with a kiss

Enchanted we dance,
devoted and true,
this smile our offering, this laugh our gift

Others who join,
we love for a while,
a while is forever, if time would permit

But time will forget,
this festival will end,
the night will reclaim, the lives that we lived

hope monsoon

brief candle in the storm,
cold hands hold a star
by doubt and the rain,
by the darkness my love,
we are surrounded my love

how long have we walked,
too far to go home
the night asks of me,
do you light the way
do you

Babylon 5, a dumb reason to drop a show

I have always envied fans of Star Trek. There’s something pure about loving a show that’s already finished, and exists untouched and in the past. They don’t have to deal with new installments coming out from different directors, with different visions, and different world views, each serving as a new battleground for the fans to duke it out.

Incidentally, they also don’t have to find out that the writers of the show are leaving for greener pastures, and will be rushing the last few seasons of the show, giving a modest effort at tying off all loose ends and pushing the story to an end as soon as possible. 

Star Trek fans get to have their story, told at its own leisurely pace, insulated away from the noise of modern internet culture.

So I was excited when I found Babylon 5, the entire series, on HBO Max. It seemed like a new frontier to explore – a story I could bite into without feeling like I had to catch up to the existing ocean of Star Trek fans. Youtube insisted that the show was fantastic, so I took a gander at the first episode. Instantly, I was hooked.

Armed with my friend’s password and a lazy Sunday afternoon, I dove right in.

The world of Babylon 5 feels long established, lived in, but still welcoming to the fresh spectator. There were previous space stations before the titular fifth one, previous tragedies that fertilize the traumatic pasts of our characters, and previous wars which have forged the strong personalities of the show. All of these details were begging to be uncovered through passionate dialog and poorly aged CGI fight scenes. In short, it had all the things I love in any sci fi or fantasy show.

So you can imagine my utter disappointment when the main protagonist of the story, Jeffrey Sinclair, was nowhere to be found in Season 2 Episode 1. Oddly enough, the episode spent a great deal of time focusing on some new face, John Sheridan. What was at first a lingering suspicion grew with every passing minute of the episode, until I reached the final act and just had to pause it to google: “babylon 5 captain change”, and my worst fear was confirmed.

Due to health complications, the lead actor Michael O’Hare had to leave production. In response to this, the show creator wrote out his character, rather than keeping the character and replacing the actor. To me, this felt like the greatest of creative betrayals. 

  • So now the rough-around-the-edges Chief of Security, Michael Garibaldi, who was angled as Sinclair’s confidant and best friend, is going to become some other character’s best friend?
  • And the enigmatic Delenn, who was slated to be Sinclair’s long term love interest and has ostensibly already become his fiancee, is going to romance up some other bloke?  

Ew.

Arya Stark shanking the Night King was enough expectation subversion for me. 

Sure, there’s something to be said about how the show is more than just the protagonist. Maybe if I watched the first season again with a greater focus on the world and the other characters, I could muster up some interest in seeing the continuation of the world. But the whiplash in narrative was too bitter to swallow. 

So, just like that, I dropped the show from my rotation.

How I Faked a Blog

First I should clarify my terms. When I say that I faked a blog, what I mean to say is that I faked having a CMS (content management system) on my website (raagnair.com).

But why?

  1. I don’t want to host the posts/media that I upload
  2. I don’t want to ever worry about migration/backups
  3. I don’t want to pay for the bandwidth

TLDR:

I uploaded my posts/photos on my tumblr blog, then used JavaScript inside of a simple index.html file to load them, and synthetically mimic a “front page” along with individual “blog post” pages.

Background:

I blame it on ego, really. In my college databases class, my TA had a personal website hosted on WordPress, and somehow that offended me. In my over-eager college brain I thought: How can someone who, ostensibly, knows how to program, rely on the same medium of content management that the layman does?

Well, years later it came time for me to make my own website, and I ran into the issue of how to host a blog under the same domain name, without succumbing to the devil of WordPress.

Approach:

As it so happens, my friends at the time were avid fans of Tumblr. And with a bit of sniffing around I found out that Tumblr has a simple querying API that can be accessed via Ajax. No OAuth, no API key, just open fearless access to all of the internet.

After a little bit of playing around, I put together a website that:

  1. Queries my Tumblr blog for a handful of posts
  2. Grabs the first <img> from the post to use as the preview pic
  3. Presents these several posts, along with their titles, dates, and preview pic

Then, when the user clicks on any one of the previews, the URL parameter changes to indicate the specific blog ID. The JavaScript then queries my Tumblr blog for that blog ID and shows the full post body.
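As a sketch of the mechanics, here is roughly what the fetching and preview-extraction looks like, run offline against a canned response. The Tumblr v1 read endpoint returns a JavaScript variable assignment rather than bare JSON, so the wrapper has to be stripped first; the `regular-title`/`regular-body` field names are as I recall them from the v1 payload, so treat the specifics as assumptions:

```javascript
// Offline sketch of the client-side flow, using a canned stand-in for a
// real Tumblr v1 response (https://BLOG.tumblr.com/api/read/json).
const rawResponse =
  'var tumblr_api_read = {"posts":[{"id":"123","regular-title":"Hello",' +
  '"regular-body":"<p>Hi</p><img src=\\"pic.jpg\\"><img src=\\"other.jpg\\">"}]};';

// 1. Strip the `var tumblr_api_read = ...;` wrapper to get plain JSON.
function parseTumblrResponse(text) {
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  return JSON.parse(text.slice(start, end + 1));
}

// 2. Grab the first <img> src from a post body to use as the preview pic.
function firstImageSrc(body) {
  const match = /<img[^>]*src="([^"]+)"/i.exec(body);
  return match ? match[1] : null;
}

// 3. Build the data for one front-page preview tile.
const posts = parseTumblrResponse(rawResponse).posts;
const preview = {
  id: posts[0]["id"],
  title: posts[0]["regular-title"],
  img: firstImageSrc(posts[0]["regular-body"]),
};
console.log(preview); // { id: '123', title: 'Hello', img: 'pic.jpg' }
```

In the real page, the same parsing runs against a live response, and clicking a preview re-queries the API with `?id=` for the full post body.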

github.com/raagnair/tumblrsed

Shortcomings:

Barely any SEO to speak of. With my homepage online for a few months, search engines struggled to index the blog posts and to attach meaningful captions to the few pages they did capture. This is because none of the blog posts have static URLs; they are just URL params tacked onto the end of index.html.

Formatting was a nightmare. Sure, the Tumblr v1 API lets me use a raw HTML editor to format my Tumblr posts, which means I somewhat control how they display on my website. But Tumblr is notorious for infecting posts with their custom nonsense, like figure tags I never wanted, and metadata all over the place. Writing posts on Tumblr was fraught, because if I ever clicked into the WYSIWYG editor by mistake, all of my custom HTML was instantly nuked.

Conclusion:

An impractical, albeit fun, excursion.

Bad Analysis – How Data Migration Turned Zen Parable to Zeno’s Paradox

This will be my first addition to my #errlog diaries – a chronicling of different failures in my past. Before I begin, I’d like to assure the reader that this entire post isn’t just an excuse to use that title.  

This failure is localized to a single year-long project, and as such, I’m able to break it up into smaller, easy to understand, parts.

  1. Preface
  2. Challenge
  3. Failure
  4. Consequence

Let’s dive right in.

PREFACE

The entirety of the firm’s data layer is based on Cassandra 3.0. The decision to use Cassandra stemmed from a few core characteristics of our system.

  • Required: Fast Insertions – The vast majority of inserts into our database are ‘timeseries’, which is to say that inserts happen in the order of real-time events.
  • Required: Fast Seeks – We wanted constant-time fetching of data if we have a sufficient set of query parameters.
  • Not Required: Immediate Consistency – All real-time relevant information in the middle of the trading day is communicated from one JVM to another via Java Multicast. The data layer is for the vast ocean of analytics that consumes this data post hoc.
  • Not Required: Low Maintenance – We hosted the Cassandra installation on local ny4 dataservers over in Secaucus, New Jersey. That’s light jogging distance from our New York Headquarters. This means we could buy powerful machines close enough for us to service within the hour of any problems arising.

The above profile of requirements and non-requirements paints a pretty clear picture. We started off, as a company, trying to attack the US Equities market. So it made sense to have a local datacenter. It made sense to store a single piece of data several times in different tables so that we could have constant-time seek calls even if our queries were very different.
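That store-it-several-times pattern can be sketched with plain Maps standing in for Cassandra tables (the trade fields here are made up):

```javascript
// Denormalization sketch: one logical record is written into several
// query-specific "tables" so every read is a single constant-time seek.
const byOrderId = new Map();      // keyed for one query pattern
const bySymbolAndDay = new Map(); // the same data, keyed for another

function insertTrade(trade) {
  // One logical insert fans out to every query-specific table.
  byOrderId.set(trade.orderId, trade);
  bySymbolAndDay.set(`${trade.symbol}:${trade.day}`, trade);
}

insertTrade({ orderId: "o-1", symbol: "ABC", day: "2020-01-02", qty: 100 });

// Each lookup is one seek, no scanning or joining.
console.log(byOrderId.get("o-1").qty);                 // 100
console.log(bySymbolAndDay.get("ABC:2020-01-02").qty); // 100
```

The cost is exactly what the migration had to reckon with: the write path and the storage footprint multiply with every query pattern you want to serve in constant time.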

I’m describing our legacy data layer in the ‘Preface’ section because the data layer is intricately tied to the application layer of all of Clearpool’s code.

Every single create, read, update, or delete statement ever written was written with our Cassandra setup in mind. But this becomes a problem when words like “global” and “scalable” come into view.

No one ever promised that our local Cassandra setup was going to be able to serve requests from Canada in an appropriate amount of time. Hell, Clearpool is even flirting with the idea of letting the clients ‘own’ the data in their own data silos.

And that doesn’t mesh at all with a local Cassandra database, right? How do we ‘give’ them their own data? How do we open our European Clearpool branch when each query has to travel across the Atlantic Ocean?

In comes the technical architect.

He’s got the answer to this problem, and his answer is magical. We’re going to migrate our entire data layer into the Amazon Cloud, and we’re going to set it up in such a way that any new customer can ‘spin up’ an instance of our analytics software and run with it.

CHALLENGE

Our current setup is a local Cassandra 3 cluster running a couple miles away, in Secaucus, NJ. Now all we’ve got to do is migrate it all into the Amazon cloud. This can be our opportunity to assess inefficiencies in our data setup!

Thought 1: “We want to keep the ‘shape’ of our data generally the same.”
This means that we are looking for a NoSQL Cloud database. Oh look, Amazon DynamoDB offers just what we want!

Objection! If we wanted to write the same volume of data into DynamoDB that we did into our local Cassandra, I’d probably have to start working pro bono, because the firm would be bleeding money into Amazon.

Thought 2: “We’ll port everything over from NoSQL to a Relational database.”
I mean, this could be a really good thing! Our old data model involved serializing all objects using Google Protobuf before storing them into the Cassandra table. Serializing everything made inserts and reads super fast, but it came with a cost.

This means that all queries that didn’t hit a specific key needed to read entire chunks of data into the JVM, deserialize the data into Java objects, then apply filtering logic. We had gotten used to it, but we had several developers that would salivate at the thought of being able to run complex SQL statements against our data!

Objection! The producers and consumers of our data have become accustomed to Cassandra’s NoSQL features. A bait-and-switch under the covers that replaces Cassandra’s ‘column family’ with a Postgres ‘table’ is not even remotely close to smooth.

Example: The table definition is now forced to be constantly up-to-date.

Let’s say an object has 3 fields in version 1, but gains a 4th field in version 2. Imagine, then, that Production is running version 1 while version 2 is still in Development. In Cassandra, you can still insert objects into the database using version 2 of the code, because Google Protobuf is forward and backward compatible as long as no one changes the semantic meaning of pre-existing fields.

In Postgres, however, inserting with version 2 of the code leads to a PSQLException complaining that the 4th column doesn’t exist.
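The mismatch can be sketched like so, with JSON standing in for Protobuf and plain objects standing in for the two stores (the field names are made up):

```javascript
// v2 of the object has a 4th field that v1 of the code knows nothing about.
const v2Order = { id: 7, symbol: "ABC", qty: 100, venue: "XYZ" };

// Blob-style storage (the Cassandra + Protobuf model): the whole object is
// serialized into one cell, so a v1 reader simply ignores the unknown field.
const blobCell = JSON.stringify(v2Order);
const { id, symbol, qty } = JSON.parse(blobCell); // v1 reader: venue ignored
console.log(id, symbol, qty); // 7 ABC 100

// Column-style storage (the Postgres model): the insert must match the
// declared columns, so a v1 schema rejects the v2 object outright.
const v1Columns = ["id", "symbol", "qty"]; // production schema still on v1
const unknownColumns = Object.keys(v2Order).filter((k) => !v1Columns.includes(k));
if (unknownColumns.length > 0) {
  // A real Postgres insert would fail here, roughly:
  //   ERROR: column "venue" of relation "orders" does not exist
  console.log("insert rejected, unknown columns:", unknownColumns);
}
```

Same object, same insert, but one storage model absorbs the schema drift and the other turns it into a deployment-ordering problem.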

I don’t mean to make the above challenges sound insurmountable. They’re not. Every project will face challenges, otherwise engineers wouldn’t be paid the kind of money we are. But the way we handled these challenges is what brings us to the next section.

FAILURE

Solving problems is our forte.

But we seem to be terrible at grasping how long the problem will take to solve. The classical approach to making an accurate estimate for a project is to break it down into smaller parts that more closely resemble previously completed projects. At this point, apply estimates to the smaller parts, then add them all up.

Breaking up our data migration project into smaller parts wasn’t very difficult. Below is the general gist of how it came out. Please note that 1.0M = 1 Man-Month.

Step 1: We went through this exercise and came to a 20 Man-Month estimate.

Step 2: The development team consisted of me and a party of 2 developers.

Step 3: So we did some very simple arithmetic, took the 20M number above, and divided by 3 to arrive at roughly 7 months for a 3-man team to finish this project.

Step 4: The architect of the project committed to finishing the Cloud project in 7 months.

Failure 1.a: I didn’t voice my immediate reservations about the timeline of this project. This was perhaps a subconsciously political move on my part because I was, after all, coming onto a new team and trying to make a good impression.

Failure 1.b: I didn’t bother going through all assumptions made at the time that this 20M estimate was made. The architect’s expertise in the Cloud space was a comfortable safety net that dulled my natural skepticism.

Failure 2: As new information arose that broke assumptions made during the estimation phase, both the architect and I chose to simply pick up the pace to meet deadlines, instead of formally publicizing that we had heavily underestimated the project.

“Measure twice, cut once.”

Who has time for all that measuring?

If you’re curious, here’s an example of an unexpected problem…
Postgres detects deadlock scenarios (for example, transaction A waiting on a lock held by transaction B, while B waits on one held by A) and resolves them by aborting one of the transactions with an error. The retry burden lands on the application, which threw a wrench into the scalability of our database insertion infrastructure.
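When Postgres aborts the losing transaction it reports SQLSTATE 40P01 (deadlock_detected), and the application owns the retry. A minimal retry wrapper might look like this; the `runTxn` callback and the `err.code` error shape are assumptions about the client library, not a specific API:

```javascript
// Minimal deadlock-retry sketch. Assumes the database client surfaces the
// SQLSTATE in `err.code`; 40P01 is Postgres's deadlock_detected code.
async function withDeadlockRetry(runTxn, maxAttempts = 3) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await runTxn(); // runTxn performs one whole transaction
    } catch (err) {
      if (err.code !== "40P01" || attempt >= maxAttempts) throw err;
      // Back off briefly, then rerun the aborted transaction from the top.
      await new Promise((resolve) => setTimeout(resolve, 10 * attempt));
    }
  }
}

// Demo: a fake transaction that deadlocks twice before succeeding.
let calls = 0;
const flakyTxn = async () => {
  calls++;
  if (calls < 3) throw Object.assign(new Error("deadlock"), { code: "40P01" });
  return "committed";
};

withDeadlockRetry(flakyTxn).then((result) => {
  console.log(result, "after", calls, "attempts"); // committed after 3 attempts
});
```

Simple enough in isolation, but every insert path in the codebase had to grow this kind of logic, which is exactly the sort of cost that never showed up in the original estimate.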

CONSEQUENCE

Achilles races a tortoise, and you won’t believe what happens next!

Achilles and a tortoise set out on a race. And to even the odds, Achilles lets the tortoise get a head start.

Let’s assume that both Achilles and the tortoise are constantly moving towards the end goal.

This means that whenever Achilles runs to where the tortoise previously was, the tortoise has moved forward, because both of them are constantly moving. Which seems to imply that Achilles will never catch up to the tortoise, even though he’s obviously faster.

Every time he’s about to catch up, he’s just a tiny bit behind.

Several corners were cut during the design and analysis phase of the project. And because of this, the stakeholders were met with a string of excuses, over a series of months, explaining why the delivery was delayed yet again.

Predictably, this eroded the confidence that the stakeholders had in the merit of the entire project. While the architect and I found meaningful bugs and discovered insightful shortcomings in the architecture, people on the outside just saw missed deadlines.

Momentum and stakeholder confidence are everything.

And as with everything precious, they are hard to gain and easy to lose.

Interesting Failures

Well, I hope they’re interesting.

My failures haven’t been very cheap, so the least they can do is provide a good story; a chronicle of cautionary tales for myself and others that happen to stop by.

Now I’ve found myself with some time on my hands, and I’ve realized that it’s been ages since I’ve written anything remotely long form. So I’ll be doing my best to document these stories on my blog, under the #errlog tag.

Claustrophilia

An abnormal desire for confinement in an enclosed space

Merriam-Webster; claustrophilia

Merriam-Webster’s medical definition sounds a bit judgmental. Let’s ignore the word ‘abnormal’ in there. Ever since I was a kid, I’ve been enamored with the idea of surviving in some small space, where people found lesser, cheaper alternatives to everyday life. This would be strange to me if I didn’t have an explanation for it all.

I know where it all began.

This is a picture I found in a Daily Mail piece about how they’re redoing the New Delhi Station. Juxtaposed next to this image is another one done by an artist – concept art of what it could look like with tremendously high ceilings, white tiles, and sleek trains. I guess the concept art was the focal point of the article. But I’m stuck staring at the before picture.

My cousin sister and I would be on the second-story bunk, while my mother and my cousin’s mother would be on the third-story. The elders had the luxury of the bottom bunk, where the window and leg-room gave the illusion that you weren’t in the company of 5 other people inside of a 5 cubic meter area.

There wasn’t much of a difference between the second story and the third, beyond the fact that perhaps the third was a bit roomier. And you didn’t have the nagging worry of the bunk above collapsing onto you, which proved to be more of a naïve fear of my own than a widespread concern.

But I think the most important lure for the third-story – the reason why my mother and her sister got it – was that it was the most distant from the pandemonium going on below. Children zoom by, beggars clap coins together in their hands, chaiwalas repeatedly announce their brand of tea, scammers and entrepreneurs and lovers and cripples blend into a mesh of orange and white and green and blue.

This exploding microcosm existed, impossibly, inside and around every train. Is it so hard to believe that an imaginative kid could let his mind run wild in a place like this? We were on a steam engine, roaring away from the end of the world. Outside, in the howling night wind, were the demons. And inside was our army, men and women of different origins, with different powers. The rhythmic thumps of the train car against the tracks tolled the war that raged on behind us. And in between my mother and my grandmother’s bunk, on the second story, I clutched on excitedly to a blanket that left me feeling slightly cold, drunk from the jittery excitement of being surrounded by possibility.

I’ve long since forgotten that feeling. I remember what it felt like to feel that way. But I can’t bring myself to feel it again.

Hello world! :(

Finally bit the bullet and created a WordPress blog.

Before creating this WordPress, I had hosted my website as a single .html file, with JavaScript to query my Tumblr blog and simulate a “front page” as well as pages for individual blog posts. I’ll write a post about how I made this happen, the pros and cons, and even share the code.