Jenn Wong Data Science & Machine Learning

Machine Learning & Experimentation

While doing code review for a Machine Learning project, I saw that a colleague approached data cleaning differently than I do: they applied data transformations to the entire dataframe, while I have always applied them to select columns. I thought it was interesting and wondered which was more computationally efficient (if there was a clear difference at all). After scouring Google and StackExchange, I could not find an answer to my question. I decided this was a perfect example to use for a mental exercise in Experimentation and Statistical Analysis.
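To make the comparison concrete, here is a minimal sketch of how the two approaches could be timed against each other. Everything in it is an assumption for illustration: the synthetic data, the column names, the whitespace-stripping transformation, and the number of repeats are hypothetical and not taken from the original project.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 10,000 rows, 20 string columns with stray whitespace.
rng = np.random.default_rng(0)
n_rows, n_cols = 10_000, 20
df = pd.DataFrame(
    {f"col_{i}": rng.choice([" a ", "b ", " c"], size=n_rows) for i in range(n_cols)}
)
target_cols = ["col_0", "col_1", "col_2"]  # assumed columns of interest


def clean_whole_frame(frame):
    """Strip whitespace in every column, then keep the target columns."""
    cleaned = frame.apply(lambda s: s.str.strip())
    return cleaned[target_cols]


def clean_selected_columns(frame):
    """Strip whitespace only in the target columns."""
    return frame[target_cols].apply(lambda s: s.str.strip())


# Both approaches should agree on the target columns before we time them.
assert clean_whole_frame(df).equals(clean_selected_columns(df))

# Repeat each measurement to collect a sample of timings, not a single number.
whole_times = timeit.repeat(lambda: clean_whole_frame(df), number=1, repeat=10)
selected_times = timeit.repeat(lambda: clean_selected_columns(df), number=1, repeat=10)

print(f"whole frame:      mean {np.mean(whole_times):.4f}s")
print(f"selected columns: mean {np.mean(selected_times):.4f}s")
```

Collecting repeated timings, instead of one measurement per approach, is what lets the two samples be compared with a statistical test afterwards.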

Read more

Topic Modeling with Non-Negative Matrix Factorization

Unstructured text data provides a wealth of information, but parsing through a corpus requires time and resources. As the corpus grows in volume, the energy and resource investment grows as well. Luckily, with Topic Modeling we can use Statistics and Linear Algebra to extract insights quickly and efficiently. This creates a framework and structure for analyzing data in the future, along with defined dimensions that allow for clearer communication.

Read more