Sohan's Blog

Things I'm Learning

2021 → 2022

My 2021 resolution was to be more mindful. But I made the rookie mistake of setting a goal without metrics or a clear strategy. So, I can’t tell whether I was actually more mindful. In this post, I’m sharing some highlights from my 2021, in no particular order. I personally find it fascinating to look back at these later.

Family. Thanks to working from home for the entire year, we spent a lot of time together as a family. Unfortunately, this was another year when we didn’t meet any of our extended family members due to Covid. That makes it two years in a row, and it’s definitely the lowlight of the year. Without the help of extended family, I felt the joy of parenting was often lost in the grind of never-ending responsibilities. After all, one can only find so much joy in the infinite loop of loading and unloading dishwashers.

Relationships. I made some new relationships with some old friends. I found these new relationships helped us support one another through the pandemic and beyond. I’ve been blessed with the kindness of so many people as they spent their time to meet, cook, talk, or text with me. They’ve inspired me with words of encouragement and I hope I’ve made their lives a little better, too. Thank you folks, you know who you are.

Moves. We moved to Vancouver from Calgary this year. This was all-consuming for the whole family. We touched up the Calgary house, prepared it for showing, and entertained dozens of showings while still living in the house and working from home. It was super exhausting. The move itself was actually two moves. First we moved to a corporate apartment for two months, and then we moved to a rental house. The company-arranged move was as good as it gets. Yet, it was an extraordinarily difficult summer. Across the two moves, we ended up filling out hundreds of forms for medical, banking, schools, etc.

Books. This year I read and listened to many books, mostly on the topic of leadership. A friend got me Audible credits as a gift and that got me into audiobooks, too. I listened to four audiobooks in the last two months. Thank you.

Walks. Since moving to Vancouver, I’ve developed a habit of walking. I always liked to go for walks, but the refreshing weather and greenery here have made me walk more. This is also a positive side effect of not having a second car.

Tennis. In 2020, I played a lot of tennis. I found an enthusiastic partner who lived nearby. Also, I didn’t want to play soccer due to Covid. I kept on playing tennis in 2021, too. It was a little harder after we moved to Vancouver. Despite a large number of courts, it’s always busy here. I don’t have a partner who lives nearby, and I don’t have a second car since we moved. But overall, I think I enjoyed tennis this year as my primary sport.

Health. I was super worried about my health vitals because of the amount of stress from all things moving. My last health check in November turned out to be ok. This was a major relief.

Social media. With some new relationships with old friends, I lost interest in the passive ways to connect with friends over social media. However, I still keep an eye on Twitter and use it mostly as an information radar.

Investments. The stock market indexes overall did great this year, and so did my passive investments. But I actually didn’t have any meaningful gain or loss from my active investments. I learned that I can’t time the market and nobody has a clue about valuations of companies anyway.

Work. I had a blast at Microsoft in 2021. I built strong relationships with some of my colleagues. I built my org, which comprises three teams. Together we built some cool products for Azure Communication Services and I learned a ton in the process. I focused on honing my leadership skills by carefully observing others and experimenting with ideas that I learned from podcasts, talks and books on leadership and motivation.

2022. As a family, we’ve decided to move to the US after twelve years in Canada. We love the Vancouver area a lot and we can see ourselves returning to this beautiful city at a later time. But at this time, we have concerns about affordability. With this family decision, I looked for job options to move to the US. It was incredibly stressful to make a decision because I was fortunate to have several competing options to choose from. In the end, I accepted an offer from Google for an engineering manager role starting in January. This makes 2022 both an exciting and a worrisome year for me. Seeing how much I learned from Microsoft in 2021, I can imagine the new job at Google will be a learning journey of leadership, technology, culture and more. Also, seeing how disruptive it was to move within the same country, I’m terrified of what’s to come when we move to another country.

To keep it simple, I’ve decided to go into 2022 with one resolution - every time I’m stressed, I’ll take a pause and ask myself, “what would Scott do in this situation?”

Open-source: If You Build It Right, They’ll Come

I find open-source to be a rather loaded term; it means different things to different people. For this blog post, I’ll imagine there are two kinds of open-source products:

  1. Company-owned open-source
  2. Crowd-owned open-source

I’m writing this blog post specifically about company-owned open-source products. In a typical company-owned open-source product, the company pays us to develop and maintain the product. I’ve been working on the Azure Communication Services UI Library, which falls into this category. We have full-time developers at Microsoft who work on this product. In the past few weeks, I was researching to come up with an open-source strategy for us. Here I’m sharing what I learned in the process. These findings can help others who are in the same boat.

Here’s a quick table of the community engagement modes for open-source products. The higher the layer, the more complex it gets.

Engagement Mode                       | Key Enabler
Change existing code / architecture   | Community meetups and conferences
Add new features                      | Extensibility API, public-facing product backlog
Fix bugs                              | CI, contribution guide, coding style guides
Fork and customize                    | Permissive license
Discuss issues and questions          | Developer forums
Report issues                         | Issue reporting tools and triage process
Browse code                           | Readable code and docs
Download and use                      | Product is released as a package

I highlighted the topic of extensibility APIs because that was my key observation about making successful company-owned open-source products. For example, VSCode is a hugely popular product and benefits greatly from community contributions in the form of extensions. The beauty of this approach is that it provides a clear separation of concerns. The core team can focus on their company-mandated work while the community can build and maintain the plugins independently. This also allows the community to contribute without having to understand the complex internals of the core. An extensibility API is useful beyond just the open-source community.

If you want the community to come and contribute to your open-source project, carefully design the core product with extension points in mind. Then, ensure the ergonomics are great for the extension developers. Give them well-documented and thoughtfully-designed APIs. Ensure that they have automated testing and publishing tools for the plugins. Make the plugins easily discoverable so that users can find and use compatible plugins while using the open-source product.
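
To make the idea of extension points concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Core` class, the `register` and `discover` methods, and the notion of a plugin as a text transformer are made up for illustration and are not taken from VSCode or from our UI Library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical plugin contract: a plugin is a function that transforms text.
Plugin = Callable[[str], str]


@dataclass
class Core:
    """Hypothetical core product exposing a single, documented extension point."""

    _plugins: Dict[str, Plugin] = field(default_factory=dict)

    def register(self, name: str, plugin: Plugin) -> None:
        # Refuse duplicate names so two community plugins cannot silently clash.
        if name in self._plugins:
            raise ValueError(f"plugin {name!r} is already registered")
        self._plugins[name] = plugin

    def discover(self) -> List[str]:
        # Discoverability: users can list the plugins that are available.
        return sorted(self._plugins)

    def render(self, text: str) -> str:
        # The core only knows the contract; plugins run in registration order.
        for plugin in self._plugins.values():
            text = plugin(text)
        return text


core = Core()
core.register("upper", str.upper)
core.register("exclaim", lambda s: s + "!")
print(core.discover())       # ['exclaim', 'upper']
print(core.render("hello"))  # HELLO!
```

The point of the sketch is the separation of concerns: plugin authors only need to understand the small `Plugin` contract, not the internals of `Core`.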

I have another recommendation. A company-owned open-source product should publish its community engagement strategy for clarity. For example, it’s better to be upfront if the core team isn’t ready to accept community code contributions. We should choose a suitable strategy and evolve it over time as the product itself matures.

UI SDK Design Principles

At Microsoft, I’ve been working with my team on a multi-platform UI SDK for Azure Communication Services. We empower developers to build visually delightful communication experiences (chat, audio-video calling, etc.) for everyone. I’ve participated in numerous design discussions on this UI SDK and I realized many of the things I learned here apply to the domain of UI SDKs beyond just the one we are creating. So, this blog post is an attempt to make a list of design principles that developers of UI SDKs can follow to create a bigger impact out of their work.

Another reason I’m writing this blog post is that writing a UI SDK requires a different mindset than developing a frontend application. While many of us are quite familiar with writing frontend applications and using UI SDKs, very few of us have had the opportunity to write a UI SDK ourselves. This was the case in our team, too. I hope the following list of design principles will help developers see the differences between developing a UI SDK and developing a frontend application. In our team, we discovered these principles over time and could’ve saved a bunch of time if we’d had the foresight that I’m sharing in this blog post.

Let’s get started.

It’s a 4-way journey: UX, UI, API, and DevX. The output of a UI SDK needs to optimize for all 4 of the above. It must be eye candy 💄 to look at so that it inspires developers to want to peek into the API surface. And a seamless DevX is critical to convert the inspired developers to actual users of the UI SDK. UI SDKs must be accompanied by visual documentation, possibly including design assets and rich media where a developer can get a feel for the UX and UI. Better yet, give them interactive documentation using tools such as Storybook and Expo where they can play with the UI SDK without any dev setup. Give them a familiar API that seamlessly maps to the visual documentation. Make it straightforward to download your package and use it within an existing application, or give them starter projects where they can have a fully working app to run a spike using the UI SDK. Reduce frustrations and improve discoverability by letting them use tools such as IntelliSense or auto-complete suggestions.

Accessible by default. Because we care, and as developers we must do our part in making the world more inclusive.

Composable. UI SDKs are used within frontend applications that may have their own design assets and reusable components. Developers often end up with a mix and match of UI elements from various sources. In addition to visual customizations, UI SDKs must also take care of namespace isolation so that the properties of one UI element don’t inadvertently leak into another one.

Customizable look and feel. A frontend application can and should be opinionated about its look and feel. However, the target of a UI SDK is to be usable within many such opinionated applications. As a result, UI SDKs need to allow developers to customize the look and feel in many areas such as branding, theming, colours, typography, layouts, positioning, sizes, styles, text, etc. UI SDKs that render nested UI elements may also need to allow developers to replace some of those nested elements with their own, to fit their unique needs.

Customizable behaviour. UI SDKs are often involved with handling user and system events. Even if the UI SDK has default “event handlers” for such events, it should allow developers to hook in their custom event-handling code, potentially discarding the default event handler if needed.
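
As a rough illustration of this principle, the following Python sketch shows a component with a default handler that the host app can extend or replace. The `ChatInput` class and its `on_send` and `run_default` parameters are hypothetical names invented for this example, not part of any real SDK.

```python
from typing import Callable, List, Optional

Handler = Callable[[dict], None]


class ChatInput:
    """Hypothetical UI component with a default handler for the send event."""

    def __init__(self, on_send: Optional[Handler] = None, run_default: bool = True):
        self.sent: List[str] = []
        self._on_send = on_send          # custom handler supplied by the host app
        self._run_default = run_default  # False discards the default behaviour

    def _default_on_send(self, event: dict) -> None:
        # Default behaviour: record the message as sent.
        self.sent.append(event["text"])

    def send(self, text: str) -> None:
        event = {"type": "send", "text": text}
        if self._on_send is not None:
            self._on_send(event)
        if self._run_default:
            self._default_on_send(event)


audit: List[str] = []

# Custom handler runs in addition to the default one.
both = ChatInput(on_send=lambda e: audit.append(e["text"]))
both.send("hi")

# Custom handler replaces the default one.
custom_only = ChatInput(on_send=lambda e: audit.append(e["text"]), run_default=False)
custom_only.send("bye")

print(audit)             # ['hi', 'bye']
print(both.sent)         # ['hi']
print(custom_only.sent)  # []
```

Making the default opt-out rather than opt-in keeps the component useful with zero configuration while still leaving the host app in control.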

Responsive by default. Frontend applications run on many different form factors and device capabilities. It’s delightful if the visual output of a UI SDK just works on all devices. That said, it must provide APIs for developers to opt in to or opt out of default responsive behaviours. For example, not all applications support landscape orientation, so even if a UI SDK supports it, the host app may want to opt out of automated landscape mode for consistency.

Localizable. While a customizable UI SDK is also localizable, I’m calling this out because a UI SDK must offer standard localization APIs including features such as being able to choose a different locale in addition to the system’s default locale setting.
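
One way to picture “choose a different locale in addition to the system’s default” is a lookup with a fallback chain. This is only a sketch under my own assumptions: the `STRINGS` table, the `translate` function, and the fallback order (requested locale, then its base language, then English) are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical string tables; a real SDK would load these from resource files.
STRINGS: Dict[str, Dict[str, str]] = {
    "en": {"send": "Send", "typing": "Someone is typing"},
    "fr": {"send": "Envoyer"},
}


def translate(key: str, locale: Optional[str] = None, system_locale: str = "en") -> str:
    """Resolve a UI string, preferring an explicit locale over the system one."""
    requested = locale or system_locale
    # Fallback chain: exact locale, then base language, then English.
    for candidate in (requested, requested.split("-")[0], "en"):
        text = STRINGS.get(candidate, {}).get(key)
        if text is not None:
            return text
    return key  # last resort: show the key rather than crash


print(translate("send"))                   # Send
print(translate("send", locale="fr-CA"))   # Envoyer
print(translate("typing", locale="fr-CA")) # Someone is typing
```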

Backward compatible for both API and UI. Developers rely on many 3rd-party libraries and SDKs because they want to focus and innovate on their primary business domain. Developers don’t have time to upgrade just because a new version of an SDK is released with improved features. Since we want UI SDK users to use our latest and greatest, we must make it effortless by making both the API and UI backward compatible. In our own experience, we found UI compatibility to be much harder than API compatibility. Most new API features are non-breaking because existing code works just fine, but even additive changes to the UI can be breaking. For example, adding a drop-shadow to an element in a UI SDK is a breaking change because it may look out of place in a host app that doesn’t use drop-shadow anywhere else. For this same reason, the change-log for a UI SDK must show both API and visual changes.

Testable. Testability impacts trustability. Automated testing is especially brittle and costly to maintain at the UI layer. On one hand, UI SDK developers must ensure reliable automated testing across many different devices and screen configurations to be able to maintain developer productivity and high-quality releases. On the other hand, app developers who use the UI SDK must be able to write tests for their own use-cases. To enable this, a UI SDK needs to allow mocking and stubbing its dependencies.

Secure by default. UI SDKs handle user inputs and outputs, and often produce logs, so they must be secure by default. There are well-known attack vectors that affect the UI such as XSS, various kinds of script injections, session hijacking, and beyond. Since UI SDKs are used within many applications, the attack surface of a vulnerability in a UI SDK can be very large. It’s important for UI SDKs to prioritize the secure-by-default agenda. Additionally, there must be a process for developers to report security and vulnerability issues. To mitigate security vulnerabilities, UI SDK dev teams must have a well-defined process of patching and vulnerability disclosure so that app developers can stay in the loop to protect their systems from being compromised.

Scalable. UI SDKs can be used within applications that deal with a lot of data or backend systems that operate at different levels of speed. As a result, UI SDKs need to be designed with concepts such as async data loading, progress indicators, and optimizing memory usage with paging and appropriate caching. While intelligent defaults work great, UI SDKs still need to expose APIs that allow developers to tune such scale-related optimizations based on their unique needs.

Robust. UI SDKs, like any other software applications, will eventually run into exceptions. So, it’s important to design the SDKs such that the blast radius is minimized to avoid a total application crash. Additionally, UI SDKs should provide feedback both visually and through the APIs to allow app developers to respond to exceptions in a way that best fits the app’s use-case.
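
A minimal sketch of limiting the blast radius, assuming a hypothetical `render_safely` helper: a failing element degrades to a visual fallback, and the host app is told about the error through a callback instead of crashing.

```python
from typing import Callable, List, Optional


def render_safely(
    render: Callable[[], str],
    fallback: str = "[unavailable]",
    on_error: Optional[Callable[[Exception], None]] = None,
) -> str:
    """Render one UI element; contain any failure to that element alone."""
    try:
        return render()
    except Exception as error:
        # Feedback through the API: let the host app decide how to react.
        if on_error is not None:
            on_error(error)
        # Feedback visually: show a placeholder instead of crashing the app.
        return fallback


def broken_avatar() -> str:
    raise RuntimeError("image failed to load")


errors: List[Exception] = []
page = [
    render_safely(lambda: "header"),
    render_safely(broken_avatar, on_error=errors.append),
    render_safely(lambda: "footer"),
]
print(page)         # ['header', '[unavailable]', 'footer']
print(len(errors))  # 1
```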

Lean and standalone. Transitive dependencies are awful for any SDK, not just UI SDKs. At the UI layer, it’s tempting to take a runtime dependency on an existing package, and that is often the preferred way to build frontend applications. But a UI SDK that introduces transitive dependencies may not be usable for developers due to conflicts with their app, the potential bloat, or implications on licensing.

Observable. Developers need to build metrics and monitors for business and technical analytics. They also need to debug edge cases and report issues with sufficient context so that the UI SDK team can quickly identify and potentially fix the issues. To empower these use-cases, UI SDKs may need to trigger their own events, and emit complete stack-traces and meaningful messages with exceptions. They also need to produce logs with varying levels of detail for development and production use, of course keeping security and privacy in mind.

I admit that these design principles are not exhaustive and not necessarily ordered by their relative importance. While designing our UI SDK, we’ve come across each of these areas and have made conscious decisions on how to best achieve these goals within our constraints. If you’ve read it up to this point, thank you for your time on this long post and I hope you found it useful.

Onboarding as a New Engineering Manager

“I promise to take care of you” - believe it or not, I spent hours coming up with this message to make a first impression as a newly hired manager at Microsoft. Everyone in my team had more context than me. I was a complete newbie, yet their manager. While I had been a manager for several years, I had never started a new job as a manager. I felt super vulnerable. In my previous job, I made an impression as a technical leader before transitioning into an engineering manager. But as a new hire and a manager at Microsoft, it was a brand new experience for me. I was worried. I really wanted to start on a positive note from the get-go. I had three weeks in between jobs and I went back to my favorite leadership books, podcasts and talks for ideas. I took many long walks thinking about a heartfelt message to provide psychological safety to my new team. After many iterations, I eventually came up with that sentence, carefully choosing each word.

Based on the feedback I received, I can tell my obsession with making a great first impression has been very effective. I focused on building trust as my number one priority. I proactively reached out and connected with my team members, my peers, and leaders above me. Following suggestions from Radical Candor, I shared a bit of my life story to start the conversation. Then, I actively listened to them, asked for suggestions, and made sure I took notes and always followed up if there was an action item for me.

In this post, I am sharing an actual artifact from my first call with my peer and fellow engineering manager Peter Hess. Peter has over 3 decades of industry experience, has been at Microsoft for over 15 years, and has worked on many key Microsoft products in several countries. During my first one-on-one with Peter, I shared my screen and asked him to talk about what he does as an engineering manager at Microsoft. It helped me to get a great visual of his job, but more importantly, it helped me understand the expectations of my job as a new hire. Click on the “See the board” button below to see the output of this hour-long conversation between me and Peter. Even if you are not at Microsoft, I think this mind map shows a pretty well-rounded view of the engineering manager’s job.

Of course, to succeed in a new job as a manager, you need to do more than just create a first impression. The meta point is: prepare for and take leadership of your own onboarding journey. It’s worth it.

Software Architecture - Topic 6 - Slack and Microsoft Teams

Most applications have a request-response based single-channel data-flow. In such systems, human- or software-triggered requests are served by software-provided responses. For example, when you make your DuckDuckGo search, you initiate a request and their server produces a response back to you. Realtime multiplayer systems are quite different because the pattern of information flow is more complex, often being a two-way or many-to-many data flow, with strict latency constraints. For example, when you chat with a bunch of friends, or join them for a video call, the data-flow is quite different from when you watch a YouTube video.

I found two great talks from the folks at Slack about “How Slack Works” and “Scaling Slack”. The nice thing about these two talks is that they were presented one year apart, and together they give a great view into the challenges of designing a realtime multiplayer system in the first place, and then evolving the design to meet scaling needs. To an aspiring architect, these two talks can provide a real-life example of thinking in terms of evolutionary architecture as a vital tool to strike a balance between upfront and just-in-time design.

I also recommend checking out the architecture of Microsoft Teams. The contrast between the Slack and Teams designs will show you the stark difference between the two approaches. The key difference between the two is that Slack was built from scratch, and Teams was built on top of a whole bunch of existing services such as Skype, OneDrive, SharePoint, etc. As a result of different organizational dynamics, the two products are quite different in their architecture even though there’s a major overlap in the features offered by both.

Once you get a chance to watch these talks, I recommend taking some time to think about the main conceptual design elements of realtime multiplayer systems. For example, patterns of many-to-many communication channels, low latency data-flow, security and access control, etc.

Software Architecture - Topic 5 - MongoDB

Let’s continue this architecture series of posts. Similar to the post on Redis, this time let’s focus on another hugely popular distributed database called MongoDB. If you aren’t familiar with MongoDB, it’s a distributed database that allows you to store and query humongous amounts of JSON-like data.

To get an overview of MongoDB and its architecture, you can watch the following YouTube video:

Of course, you can also download the official architecture guide to learn more. Let’s focus on the rationale behind MongoDB’s architectural decisions. You can find answers to all of the following questions if you go through the guide or the video. But my goal is to get you to think about the why behind the decisions.

MongoDB has databases and collections. While collections are conceptually parallel to tables in traditional relational databases, MongoDB doesn’t impose any restrictions on the schema of a collection. Can you think of reasons behind this decision? What trade-offs come to your mind? For example, not having a schema allows you to store anything within a collection, essentially reducing the need for schema migrations on very large databases. But at the same time, it may complicate consumption of the data. What other trade-offs can you think of?

MongoDB uses a router named Mongos and needs config servers to route queries when data is sharded into multiple partitions. What benefits and challenges do you see with the addition of this router component?

MongoDB recommends that you have an odd number of voting replicas to deal with failover when a new primary node needs to be elected. How do you judge this design decision?
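
As a starting point for thinking about this question, here is a small arithmetic sketch in Python. It assumes only that electing a primary needs a strict majority of voting members; the helper names `majority` and `tolerated_failures` are mine, not MongoDB’s.

```python
def majority(voters: int) -> int:
    """Smallest number of votes that is a strict majority."""
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """How many voters can be down while a majority is still reachable."""
    return voters - majority(voters)


for n in (3, 4, 5):
    print(f"{n} voters: majority {majority(n)}, tolerates {tolerated_failures(n)} down")
# 3 voters: majority 2, tolerates 1 down
# 4 voters: majority 3, tolerates 1 down
# 5 voters: majority 3, tolerates 2 down
```

Notice that going from 3 to 4 voters adds a server without adding any fault tolerance, which is one angle from which to judge the decision.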

By design, MongoDB allows you to choose a write concern at write time to allow you to balance your needs in the CAP triangle. Can you explain what you must do to ensure you’ll not lose data if 2 out of 3 of your replica servers were to become corrupt?

MongoDB allows you to run complex map-reduce queries in the form of aggregation pipelines. In a distributed database, which component within the MongoDB ecosystem is responsible for aggregating data that spans multiple nodes?

If you followed the Redis post, what would you say are the top 3 architectural differences between MongoDB and Redis? Can you reason about the why behind the differences?

Based on MongoDB’s architecture, where do you see a potential bottleneck that can affect scaling a MongoDB cluster? What can you do to work around the limitations?

Software Architecture - Topic 4 - Redis

Welcome back to the Software Architecture series. I know at least a few people from my team are following, and that’s a great encouragement.

For today’s post, let’s focus on learning from a very popular and commonly used open-source project called Redis. To the developers, Redis is a dead simple key-value store with a super simple API as follows:

$ set today 'Thursday'
OK

$ get today
Thursday

$ set temp 20
OK

$ incr temp
21

$ incrby temp 3
24

$ get temp
24

Of course Redis has more advanced features, but not too many. I think Redis is a delightful system. It’s fun to use and has a reputation for being incredibly fast and scalable. I recommend spending some time on the Redis architecture to see if you understand the concepts well enough to confidently answer the following questions:

  1. Why is Redis so fast?
  2. What can you do to prevent data loss when using Redis?
  3. How does Redis distribute its data to multiple nodes?
  4. What happens when you add a new node to a Redis cluster?
  5. What happens when you remove a node from a Redis cluster?
  6. How does Redis allow you to have an even distribution of the data in your cluster?
  7. How can you build resilience using Redis when a whole datacenter fails?
  8. How does a Redis client discover which Redis server to go to?
  9. How do you know if you have enough capacity in your Redis cluster?
  10. How does Redis provide end-to-end encryption?
  11. If a Redis cluster dies, how can you restore it?
  12. What metrics would you use to monitor if your Redis cluster is healthy?
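
If you want a hint for question 3 before digging into the docs, here is a small Python sketch of how Redis Cluster maps a key to one of its 16384 hash slots: CRC16 of the key, modulo 16384, hashing only the part inside `{...}` when a non-empty hash tag is present. The function name `key_hash_slot` is my own, and I’m relying on `binascii.crc_hqx` with an initial value of 0 as the CRC16 variant; treat it as a learning aid, not a replacement for a real client library.

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots in a Redis Cluster


def key_hash_slot(key: bytes) -> int:
    """Map a key to a hash slot: CRC16(key) mod 16384."""
    # Hash tags: if the key contains a non-empty {...} section,
    # only the text between the first '{' and the next '}' is hashed.
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


# Keys that share a hash tag land in the same slot, so they live on the same node.
a = key_hash_slot(b"{user1000}.following")
b = key_hash_slot(b"{user1000}.followers")
print(a == b)  # True
```

Each node in the cluster owns a subset of the slots, which is a useful lens for the questions about adding and removing nodes.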

My plan is to introduce you to a bunch of open-source systems like Redis and ask similar questions. The idea is, after going through a few of these systems, you’ll start to see patterns and trade-offs for each. Being familiar with real-world systems and seeing the patterns in use, I hope you’ll be able to pick and choose the patterns that best fit your system requirements, environment, and people.

Happy learning!

Can I Have a Career as a Frontend Engineer?

In my current role at Microsoft, I’m working on a UI SDK product. I hear this concern from some of my team members. More specifically, here’s a paraphrased version of what I hear:

“I talked to my friends in software and they told me it’s better to work on the backend or full-stack to have a fast-tracked career.”

First, I do agree that there are generally more full-stack or backend engineering jobs than purely frontend jobs. I also agree that there’s a general perception that frontend is easy / bunch of scripts / not real engineering, yada yada…

However, I think it’s a short-sighted view. Let me make my point here.

Let’s imagine the UI and UX of a familiar product, Google Maps. You can use Google Maps in your browser or natively on the phone. You can embed Google Maps within your own app on these platforms as well. You can ask Google Maps to give you navigation directions for walking, biking, driving, or ask it to show realtime transit and traffic information. If you take a detour, it’ll show you new directions on the fly. You can see the map view, or the 3D view, or a camera view of a location. At night, you can see the dark-mode. It’ll show you how many lanes you have on a road, and how fast you can legally go, in realtime. It’ll let you share your location in realtime with your buddy. You can search using your voice and it’ll also give you turn-by-turn voice guidance. Let’s not even mention avoiding toll roads, U-turns, etc.

I hope you see the engineering challenge I see in the above paragraph. Very few engineers I know can architect a system such as Google Maps. Since a lot of engineers choose the path of backend engineering, it’s incredibly hard to find frontend engineers who can pull off high-impact projects. If you consider yourself above average, and you probably are if you’re reading this blog, you should not follow the path of the average. Instead, if you like building visual products and obsessing about delighting millions or billions of users, you can have a very rewarding and fast-paced career in frontend engineering.

Most of your engineering knowledge is transferable, irrespective of what part of the stack you work on. After all, you’ll learn to work with people, delight customers, build systems that are robust, scalable, secure, compliant, testable… So, you can move into a different part of the stack at will as long as your foundation is strong.

Modern frontend engineering is complex, but it’s powered by innovative tools. Most of the tooling is open-source and the community is vibrant with lots of conferences and meetups around the world. As a frontend engineer, you can create an outsized impact and differentiate yourself from the masses.

Software Architecture - Topic 3: Writing

An architect first needs to write for herself, and then for her team. Let me explain a bit.

An architect takes the trio of requirements, people, and environment, and does her research to design the most delightful system. In her research process, she uses her past experience as well as the experience of others. Even a moderately complex system design involves a lot of trade-offs without clear winners. For example, given that there are tens of different databases one can choose, how can she recommend a specific one? Only a clear mind can produce a logically sound write-up. So, in the process of writing and rewriting her design rationale, an architect can strengthen the soundness of her own logic behind the choices made. This is also known as covering one’s ass.

Secondly, an architect is a busy person and can’t scale her time if she has to personally explain her design rationale to everyone. In fact, as she’s designing the system and making certain assumptions, she must seek feedback from the team to help her find alternatives or blind-spots. Writing scales infinitely (e.g. J.K. Rowling), and after all, an architect must use a scalable system for herself, right?

In this episode, I have two all-time great books to recommend, The Elements of Style and On Writing Well. If you want to learn to write with a fascinating biographic story, I loved On Writing (A Memoir of the Craft) by Stephen King. Do yourself a favor and get these books. Even if you are a native English speaker, I recommend reading these books to make your writing interesting. As you can imagine, these books on writing are fun reads; it’d be quite ironic otherwise.

As promised before, this is my last soft-skills related post in the software architecture series. Only a few senior engineers will break the glass ceiling and become architects. Fewer will become great architects. All the great ones I’ve met had exemplary soft-skills.

What Drives Me?

First, working with good people for a good purpose. Then, moving fast and delivering incrementally. Then, a good business case. And finally, a reasonably modern tech-stack.

Other things, such as pay, promotion and prestige, are important, but they don’t drive me like the above.

This content is suitable for a Twitter post, but I’m putting it here for long-term retention. I want to look back after a few years to see how timeless these drivers are for me.