I’m a data scientist with 6 years of experience developing and delivering machine learning solutions. This includes 5 years with the Tennessee Valley Authority. I’m currently enrolled in Georgia Tech’s Online Master’s of Computer Science program, where I continue to advance my skills.
For more questions or examples of my work, please reach out to proofthatnicklewisexists@gmail.com.
-
I have professional experience leveraging tree-based methods, neural networks, and linear models in production settings and managing MLOps pipelines. I subscribe to the No Free Lunch theorem and believe that understanding the structure of a problem is essential to determining the structure of its solution.
At various points in my academic career, I’ve been able to focus on reinforcement learning, deep learning, and computer vision.
-
I have experience building custom retrieval-augmented generation (RAG) pipelines, evaluating their performance, and communicating their limitations. I approach these problems within ‘traditional’ AI patterns such as planning, meta-reasoning, and correcting mistakes.
-
I have experience in distributed, cloud computing pipelines and SQL Server database design. I’ve identified and fixed bugs, implemented upgrades and improvements, and supported new downstream analytics offerings.
-
Plenty of my work also falls into the ‘other’ category. This includes automation, data visualization, project management, R Shiny App development, various EDA, and teaching.
TECHNICAL SKILLS
Machine Learning: Time Series Modeling, Reinforcement Learning, Deep Learning, RAG Systems
Data Engineering & MLOps: Databricks, Dataiku, SQL Server, SSIS
Programming: Python (PyTorch, PySpark, Dash, etc.), R (Shiny, etc.), SQL
PROFESSIONAL EXPERIENCE
Data Scientist | TENNESSEE VALLEY AUTHORITY | September 2025 – Present
Quantitative Analyst II | May 2023 – September 2025
Associate | June 2021 – May 2023
Intern | Summer 2019, May 2020 – June 2021
Implemented enterprise CMS improvements through RAG-generated summaries and keyword-based search algorithms, estimated to save 2,000 hours annually.
Supported the adoption of Dataiku as TVA’s MLOps solution across multiple data science teams.
Deployed R Shiny Applications critical to TVA’s Coal Combustion Residuals compliance and monitoring.
Executed 30-year energy efficiency impact forecasts for seven Power Supply Plans, re-engineered ~200 regression models (average improvement of 0.1 in adjusted R-squared), and redesigned associated R Shiny tools.
Supported TVA’s 2022 Energy Efficiency Expansion Study, 2025 Strategic Load Forecast, and 2026 Integrated Resource Plan, which allocated $1.5B to be spent on energy efficiency through 2027.
Designed TVA’s Program Evaluation and Analytics database, which stores and manages 4.1 million records of evaluation, measurement, and verification data for energy efficiency and electrification programs.
Identified, implemented, and optimized solutions in a Databricks pipeline to manage ~100 billion rows of time-series meter data and customer information, leveraging Azure SQL Databases and data lakes for storage.
Monitored and maintained XGBoost-based energy and demand forecasting modules for 16 local power companies.
Leveraged relative usage patterns in meter data to perform customer clustering and segmentation.
Stochastically simulated coal and gas dispatch heuristics using a mean-reversion, jump-diffusion framework.
Assessed TVA’s smart thermostat demand response pilot by constructing participant-level baselines and presented results at the annual AEIC conference, to the Knoxville Utilities Board, and to the TVA CFO.
Trained logistic regression and random forest models to predict product returns for TVA’s supply chain.
Graduate Projects | GEORGIA INSTITUTE OF TECHNOLOGY | January 2023 – March 2024
Developed and tested a knowledge-based AI agent to solve Raven’s Progressive Matrices, passing 92% of the provided official test cases.
Designed independent deep Q-learning networks (IDQN) and value decomposition networks (VDN) to solve single- and multi-agent Markov decision problems such as Cart Pole and Overcooked.
Trained CNNs to detect AI-generated images in the CIFAKE dataset and used transfer learning to reach 93.46% testing accuracy, compared to a benchmark of 92.98%.
EDUCATION
University of Evansville
B.S. in Stat. and Data Science and Applied Math
(Specialization: Computer Science)
2017 - 2021
Georgia Institute of Technology
M.S. in Computer Science
(Specialization: Machine Learning)
2023 - Present