Nelson Liu's Blog

gsoc

My Journey in Open Source / How to Get Started Contributing

I just finished the Google Summer of Code Program, wherein I worked on the Python machine learning package scikit-learn. Since I began working with the project in November 2015, I've...

scikit-learn GSoC Summary, Lessons Learned, and Future Work

This summer, I was quite fortunate to work on the scikit-learn project with my mentors Jacob Schreiber and Raghav RV as part of the Google Summer of Code Program. I...

(GSoC Week 10) scikit-learn PR #6954: Adding pre-pruning to decision trees

The scikit-learn pull request I opened to add impurity-based pre-pruning to DecisionTrees and the classes that use them (e.g. the RandomForest, ExtraTrees, and GradientBoosting ensemble regressors and classifiers) was...

(GSoC Week 8) MAE PR #6667 Reflection: 15x speedup from beginning to end

If you've been following this blog, you'll have noticed that I've been talking a lot about the weighted median problem, as it is intricately related to optimizing the mean absolute error...

(GSoC Week 6) Efficient Calculation of Weighted Medians

In my previous blog post, I discussed a method for using two heaps to efficiently find the median for use in the MAE criterion for finding the best split. However,...

(GSoC Week 4) MAE and Median Calculation

In the first part of my project, I am implementing the Mean Absolute Error criterion for the scikit-learn DecisionTreeRegressor. In this blog post, I'll talk about what the criterion does,...

(GSoC Week 2) Intro to decision trees

Apologies for the late post; I had this sitting in my drafts and forgot to publish it! Decision Trees (DTs) are a non-parametric supervised learning method used for classification and...

(GSoC Week 0) How fast is fast, how slow is slow? A look into Cython and Python

The scikit-learn tree module relies heavily on Cython to perform fast operations on NumPy arrays, so I've been learning the language (if you can even call it that) in order...

An Intro to Google Summer of Code

I'm participating in the Google Summer of Code, a program in which students work with an open source organization on a three-month programming project over the summer; I'll be working...