Nelson Liu's Blog

Easy Progress Bars For Python File Reading with tqdm

I've been a fan of the tqdm Python module for quite some time, but I've struggled to find a reason to use it; generally, loops run fast enough...

(GSoC Week 8) MAE PR #6667 Reflection: 15x speedup from beginning to end

If you've been following this blog, you'll notice that I've been talking a lot about the weighted median problem, as it is intricately related to optimizing the mean absolute error...

(GSoC Week 6) Efficient Calculation of Weighted Medians

In my previous blog post, I discussed a method for using two heaps to efficiently find the median for use in the MAE criterion for finding the best split. However,...

(GSoC Week 4) MAE and Median Calculation

In the first part of my project, I am implementing the Mean Absolute Error criterion for the scikit-learn DecisionTreeRegressor. In this blog post, I'll talk about what the criterion does,...

(GSoC Week 2) Intro to decision trees

Apologies for the late post; I had this sitting in my drafts and forgot to publish it! Decision Trees (DTs) are a non-parametric supervised learning method used for classification and...

(GSoC Week 0) How fast is fast, how slow is slow? A look into Cython and Python

The scikit-learn tree module relies heavily on Cython to perform fast operations on NumPy arrays, so I've been learning the language (if you can even call it that) in order...

An Intro to Google Summer of Code

I'm participating in the Google Summer of Code, a program in which students work with an open source organization on a three-month programming project over the summer; I'll be working...