I’m a data scientist with 6 years of experience developing and delivering machine learning solutions. This includes 5 years with the Tennessee Valley Authority, as well as 2 years as an undergraduate analytics consultant at the University of Evansville. I’m currently enrolled in Georgia Tech’s Online Master’s of Computer Science program, where I continue to advance my skills.
For more questions or examples of my work, please reach out to proofthatnicklewisexists@gmail.com.
-
I have professional experience leveraging tree-based methods, neural networks, and linear models in production settings and managing MLOps pipelines. I’m a believer in the No Free Lunch theorem and that understanding the structure of a problem is essential to determining the structure of your solution.
At various points in my academic career, I’ve focused on reinforcement learning, deep learning, and computer vision.
-
I have experience building custom retrieval-augmented generation (RAG) pipelines, evaluating their performance, and communicating their limitations. I approach these problems within ‘traditional’ AI patterns such as planning, meta-reasoning, and correcting mistakes.
-
I have experience in distributed, cloud computing pipelines and SQL Server database design. I’ve identified and fixed bugs, implemented upgrades and improvements, and supported new downstream analytics offerings.
-
Plenty of my work also falls into the ‘other’ category. This includes automation, data visualization, project management, R Shiny app development, exploratory data analysis (EDA), and teaching.
TECHNICAL SKILLS
Machine Learning: Time Series Modeling, Reinforcement Learning, Deep Learning, RAG Techniques
Data Engineering & MLOps: Databricks, Dataiku, SQL Server, SSIS
Programming: Python (PyTorch, PySpark, etc.), R (Shiny, dplyr, etc.), SQL
Data Visualization: Power BI, Plotly, ggplot2
PROFESSIONAL EXPERIENCE
Quantitative Analyst II | TENNESSEE VALLEY AUTHORITY | May 2023-Present
Facilitated adoption of Dataiku as TVA’s MLOps solution across 3 data science teams.
Saved 2,000 hours annually by improving enterprise content search with retrieval-augmented generation (RAG).
Executed seven 30-year energy efficiency impact forecasts, re-engineered ~200 regression models (up to 0.5 improvement in adjusted R-squared), and redesigned associated R Shiny tools.
Designed TVA’s Program Evaluation and Analytics database (SQL Server) to store and manage 4.1 million records of evaluation, measurement, and verification data for energy efficiency and electrification programs.
Supported TVA’s 2022 Energy Efficiency Expansion Study, 2025 Strategic Load Forecast, and 2026 Integrated Resource Plan, allocating $1.5B to be spent on energy efficiency through 2027.
Identified, implemented, and optimized solutions in a Databricks pipeline to manage ~100 billion rows of time series meter data and customer info, leveraging Azure SQL Databases and Data Lakes for storage.
Monitored and maintained novel near-term load and demand forecasting modules using XGBoost.
Associate | TENNESSEE VALLEY AUTHORITY | June 2021-May 2023
Leveraged relative usage patterns in meter data to perform customer clustering and segmentation.
Stochastically simulated coal and gas dispatch heuristics using mean-reversion and jump-diffusion.
Intern | TENNESSEE VALLEY AUTHORITY | Summer 2019, May 2020-June 2021
Assessed TVA’s smart thermostat demand response pilot by constructing participant-level baselines and presented results at the annual AEIC conference, to the Knoxville Utilities Board, and to the TVA CFO.
Used logistic regression and random forest models to predict product returns for TVA’s supply chain.
Graduate Projects | GEORGIA INSTITUTE OF TECHNOLOGY | January 2023 – March 2024
Developed and tested a knowledge-based AI agent built to solve Raven’s Progressive Matrices, passing 92% of official provided test cases.
Designed independent deep Q-learning networks (IDQN) and value decomposition networks (VDN) to solve single- and multi-agent Markov decision problems such as Cart Pole and Overcooked.
Trained CNNs to detect AI-generated images in the CIFAKE dataset and used transfer learning to reach 93.46% testing accuracy, compared to a benchmark of 92.98%.
EDUCATION
University of Evansville
B.S. in Stat. and Data Science and Applied Math
(Specialization: Computer Science)
2017 - 2021
Georgia Institute of Technology
M.S. in Computer Science
(Specialization: Machine Learning)
2023 - Present