• Basic FBThrift example

    Facebook re-open-sourced its fork of Apache Thrift some 4 years ago, yet there is still relatively little documentation or independent comparison showing whether moving from Apache Thrift to FBThrift brings any performance gain. The two are not drop-in replacements for each other, so switching from one to the other takes effort. This post first looks at setting up and running FBThrift. The complete code example used in this post can be found here.

  • Floating point binary representation in JavaScript

    Inspired by this post, which explains how a computer stores floating-point numbers, here is a bit of JavaScript code that prints a float32 number in binary format. The main idea is that a floating-point number is represented much like a number in scientific notation (E notation).
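
    As a rough illustration of that idea, here is a minimal sketch (not necessarily the code from the post itself). It assumes the standard IEEE 754 single-precision layout of 1 sign bit, 8 exponent bits and 23 mantissa bits; the function name `float32ToBinary` is made up for this example.

    ```javascript
    // Split a number, stored as an IEEE 754 float32, into its three bit fields.
    // float32ToBinary is a hypothetical helper name used only for this sketch.
    function float32ToBinary(x) {
      const view = new DataView(new ArrayBuffer(4));
      view.setFloat32(0, x); // DataView writes big-endian by default
      // Read the same 4 bytes back as an unsigned integer and print it in base 2.
      const bits = view.getUint32(0).toString(2).padStart(32, "0");
      return {
        sign: bits.slice(0, 1),      // 1 bit: 0 = positive, 1 = negative
        exponent: bits.slice(1, 9),  // 8 bits, biased by 127
        mantissa: bits.slice(9),     // 23 bits, the fraction after the implicit leading 1
      };
    }

    // 0.15625 = 1.25 * 2^-3, so the exponent field is 127 - 3 = 124
    console.log(float32ToBinary(0.15625));
    // { sign: '0', exponent: '01111100', mantissa: '01000000000000000000000' }
    ```

    The mantissa plays the role of the digits in scientific notation, and the exponent field plays the role of the power after the E, only in base 2 instead of base 10.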

  • Derivative of loss function in softmax classification

    Though frameworks like TensorFlow and PyTorch have done the heavy lifting of implementing gradient descent, it helps to understand the nuts and bolts of how it works. After all, a neural network is pretty much a series of derivative functions. In this blog post, let’s look at getting the gradient of the loss function used in multi-class logistic regression.
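
    For orientation, the result can be sketched as follows. The notation is an assumption made for this summary and may differ from the post: $z$ is the vector of scores (logits), $p$ the softmax output, $y$ a one-hot label vector, and $L$ the cross-entropy loss.

    ```latex
    p_i = \frac{e^{z_i}}{\sum_k e^{z_k}}, \qquad
    L = -\sum_i y_i \log p_i

    \frac{\partial p_i}{\partial z_j} = p_i\,(\delta_{ij} - p_j)

    \frac{\partial L}{\partial z_j}
      = -\sum_i \frac{y_i}{p_i}\, p_i\,(\delta_{ij} - p_j)
      = -y_j + p_j \sum_i y_i
      = p_j - y_j
    ```

    The last step uses $\sum_i y_i = 1$, which holds because $y$ is one-hot.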

  • Perspectives in designing ML cost function

    For many different machine learning problems, finding a solution involves similar steps.

  • A few things Firefox does better

    Recently, Microsoft ditched EdgeHTML in favor of Chromium’s rendering engine in its Edge browser. This led to mourning and concern over the uncertain future of the open web, in which a single big company may have a great deal of control over the standards and features of the web. Mozilla rallied internet users to give Firefox another try. They have a point. The web has always been a messy place, with changing standards that browser vendors catch up with at different speeds. An application may look good in Firefox yet be broken in Chrome, and vice versa. Testing effort is duplicated to ensure a consistent look across browsers. Yet that messiness and duplication of effort create a larger space for experiments, more opportunities for participation and a better chance to make something good. Because, I think, innovation has always come from allowing independent minds to find different approaches to solving the same problem. And because technology is always about weighing trade-offs: the fastest implementation is not necessarily the most modular and embeddable one.

  • How dk.brics.automaton regex library works

    This brics regex library is by far the fastest when compared with OpenJDK java.util.regex and com.google.re2j. Let’s look at what lies under the hood. dk.brics.automaton is a finite automata library with applications in regex. The idea is similar to Google re2j: construct a DFA from the regex string, so that matching an input string means advancing from one state to another. (Google re2j is surprisingly the slowest in my test case, which is probably due to my particular regex input or some bug in the Java port. I have not examined it yet.)
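
    To make "advancing from one state to another" concrete, here is a minimal sketch of DFA matching. It is not the API of dk.brics.automaton or re2j: the transition table is written by hand for the regex `ab*c`, and the names `dfa` and `matches` are made up for this illustration.

    ```javascript
    // Hand-built DFA for the regex ab*c (an assumption for illustration;
    // a real library would construct this table from the regex string).
    const dfa = {
      start: 0,
      accepting: new Set([2]),
      transitions: [
        { a: 1 },       // state 0: expect 'a'
        { b: 1, c: 2 }, // state 1: loop on 'b', or finish with 'c'
        {},             // state 2: accepting, no outgoing edges
      ],
    };

    // Walk the input one character at a time, following one edge per character.
    function matches(dfa, input) {
      let state = dfa.start;
      for (const ch of input) {
        const next = dfa.transitions[state][ch];
        if (next === undefined) return false; // no edge for this character: reject
        state = next;
      }
      return dfa.accepting.has(state);
    }

    console.log(matches(dfa, "abbbc")); // true
    console.log(matches(dfa, "ab"));    // false
    ```

    Each input character costs one table lookup and there is no backtracking, which is the usual reason given for DFA-based matchers being fast.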

  • Misconceptions when scaling applications

    Once in a while, engineering managers and software engineers triumphantly tell me to use a particular piece of technology because it is scalable or good for concurrency. Unfortunately, scaling an application means trading off different factors that affect an app’s performance, and no single technology is a silver bullet that solves them all. Below are some of the misconceptions I often hear.

  • How regex is implemented

    Recently at work, I needed to take a deeper look at how different regex libraries perform optimization and how they combine regex patterns. This document examines two regex implementations: java.util.regex from OpenJDK and re2 from Google.

  • How Java loads classes

    For a while, I have delegated the task of managing the java run command to the IDE. Eclipse, NetBeans and IntelliJ all seem to do a decent job of masking the complexity of supplying java with JVM parameters, classpath, debugging options… Compared to other languages, the java run command can get horrendously long and is not very suitable for handcrafting. Recently, while working on some command generation at work and playing around with java command-line arguments, I encountered some minor gotchas.

  • I no longer use dual monitors at work

    When I first started my career as a software engineer, professionals working with multiple displays seemed cool and powerful. Such a desk arrangement creates an impression of lots of things going on. People would quickly shift their gaze from one monitor to another, with open files scattered across the screens. Perhaps that helps multitasking, as you can code on one monitor and refer to documents on another. I used to do that too. But now I’m more comfortable using a single monitor. It helps with the neck since I don’t have to turn my head. And screen real estate is limited. That limitation forces me to close unnecessary stuff and focus on one thing at a time.
