Dexter Antonio

1672 Spring Street · Davis, CA 95616 · (415) 971-3026 · dexter.d.antonio@gmail.com

I'm a data scientist and developer passionate about applying next-gen data science techniques to speed up research and draw novel insights from unusual data.

Relevant Experience (Please see my LinkedIn for an updated resume)

Graduate Student Researcher

UC Davis, Department of Chemical Engineering

Automated experimentalist’s workflow by using a trained ML algorithm to classify thousands of Raman spectra
Applied machine learning algorithms to identify cancerous and noncancerous blood samples
Coordinated the development of a Python software package with Git and GitHub
Screened materials for their CO2 capture ability with Grand Canonical Monte Carlo (GCMC) simulations
Designed and trained a convolutional neural network to identify chemical concentrations from simulated spectra

September 2018 - Present

Teaching Assistant

UC Davis, Department of Chemical Engineering

Led live Python labs on utilizing NumPy and Matplotlib to analyze and visualize energy transport phenomena
Independently led labs consisting of 30+ students and ensured a safe and positive environment
Taught complex chemical engineering concepts in an accessible and exciting way through the design of coffee

September 2019 - Present

Participant

UC Davis, DataLab’s Hack for California Research Cluster

Implemented Elasticsearch to scan tens of thousands of pages of converted PDF documents for key phrases
Created a California General Plan mapping website with R Shiny that visualizes general plan policies

January 2020 - Present

Data Science Intern

Allstate

Improved lead ranking by developing a model to predict clients’ likelihoods to bind using billions of rows of data
Applied SQL to filter, clean and build a dataset used to train and identify a high performing ML model
Engineered a temporal feature which captured seasonal trends while avoiding bias, improving model accuracy
Communicated with the team to ensure ML models were well understood and incorporated into business processes

June 2020 - September 2020


Education

University of California, Davis

Master of Science

Chemical Engineering

GPA: 3.85/4.00

June 2018 - Present (expected June 2021)

Dual Bachelor of Science and Bachelor of Arts Degree Program

Columbia University

Bachelor of Science

Chemical Engineering

GPA: 3.55/4.00

University of Puget Sound

Bachelor of Arts

Chemistry

GPA: 3.84/4.00

September 2012 - May 2017

Projects

Machine Learning Assisted Sampling of SERS Substrates for Early Cancer Detection

Surface Enhanced Raman Spectroscopy (SERS) is an analytical technique used for detecting low-abundance biomolecules in biological fluids. One particularly promising application of this technique is the detection of cancer-cell-originating extracellular vesicles in human blood. The presence of these extracellular vesicles indicates the presence of cancerous cells in the body, thus providing an early warning of cancer. A major barrier to the widespread application of this technique is the challenge associated with distinguishing cellular signals from background signals in the SERS spectra.

I addressed this problem through careful data engineering and the application of modern machine learning algorithms. First, I built a program that allowed experienced researchers to label spectra as either “good” (i.e., a real signal) or “bad” (i.e., background). Once the labeling was complete, I used a random forest algorithm to automatically classify unlabeled spectra as “good” or “bad”. Finally, I demonstrated that this algorithm could be integrated with LabVIEW. This demonstration laid the groundwork for classifying spectra automatically during acquisition, allowing for more efficient data collection.
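The labeling-then-classification step can be illustrated with a minimal sketch. It assumes scikit-learn's RandomForestClassifier and uses synthetic spectra in place of the real labeled SERS data; the peak shape, noise level, and the helper name make_synthetic_spectra are hypothetical and are not taken from the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def make_synthetic_spectra(n_spectra, n_points=200, seed=0):
    """Toy stand-in for labeled spectra (hypothetical data, not real SERS).

    "Good" spectra (label 1) carry a Gaussian peak on top of noise;
    "bad" spectra (label 0) are noise only.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    labels = rng.integers(0, 2, size=n_spectra)
    noise = rng.normal(0.0, 0.2, size=(n_spectra, n_points))
    peak = np.exp(-((x - 0.5) ** 2) / (2 * 0.02 ** 2))
    spectra = noise + labels[:, None] * peak
    return spectra, labels


# Each spectrum is one row; each intensity value is one feature.
spectra, labels = make_synthetic_spectra(400)
X_train, X_test, y_train, y_test = train_test_split(
    spectra, labels, test_size=0.25, random_state=0
)

# Train on the "researcher-labeled" portion, then score on held-out spectra.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Real spectra would likely need preprocessing such as baseline correction and normalization before training, which this sketch omits.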

Multiscale Atomic Zeolite Simulation Environment (MAZE) Package Development

I am currently building the Multiscale Atomic Zeolite Simulation Environment (MAZE) Python package to streamline the calculations performed in my research group. The MAZE project extends the Atomic Simulation Environment (ASE) to naturally represent zeolites, facilitating the calculations required to determine their properties. The core functionality comes from classes that represent zeolites and their derivatives. These zeolite classes inherit from ASE’s Atoms object and can be treated as one. MAZE also includes additional functionality for performing calculations and quickly generating zeolite derivatives.

Screening Zeolites for Their CO2 Capture Ability with Grand Canonical Monte Carlo (GCMC) Simulations

Global climate change is one of the biggest challenges humanity faces today. One promising solution is retrofitting existing fossil fuel power plants with CO2 capture technology. Zeolites, which are nano-porous materials used industrially for gas separation, hold considerable promise for solid-state CO2 capture. The diversity of zeolite chemical structures makes identifying the best zeolite for this purpose challenging, so computational chemistry experiments are needed to narrow down the number of potential candidates.

To identify promising zeolites, I used Grand Canonical Monte Carlo (GCMC) simulations to predict the CO2 capture ability of a number of zeolites. I also built a host of tools that simplify setting up these calculations. For example, I extended the Atomic Simulation Environment (ASE), which is commonly used in computational experiments, to better support zeolite calculations. I have also written a Python wrapper for the C-coded GCMC simulation package to rapidly perform calculations.
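For readers unfamiliar with GCMC, the idea can be sketched on a toy model. The example below is not the C-coded simulation package or its Python wrapper; it is a minimal lattice gas with independent adsorption sites, where the function names and all parameter values (site energy eps, chemical potential mu, inverse temperature beta) are assumptions chosen for illustration. For this model the average coverage should approach the Langmuir result, which gives a simple check.

```python
import math
import random


def gcmc_lattice_gas(n_sites, beta, mu, eps, n_steps, seed=0):
    """Grand canonical Monte Carlo on a lattice of independent sites.

    Each step picks a random site and proposes an insertion (if empty)
    or a deletion (if occupied), accepted with the Metropolis rule for
    the grand canonical weight exp(-beta * (E - mu * N)).
    Returns the average fractional coverage after burn-in.
    """
    rng = random.Random(seed)
    occupied = [False] * n_sites
    n_adsorbed = 0
    coverage_sum = 0.0
    n_samples = 0
    burn_in = n_steps // 5  # discard the first 20% of steps

    for step in range(n_steps):
        site = rng.randrange(n_sites)
        # Change in beta * (E - mu * N) if this site's occupancy flips.
        if occupied[site]:
            delta = -beta * (eps - mu)  # deletion
        else:
            delta = beta * (eps - mu)   # insertion
        if delta <= 0 or rng.random() < math.exp(-delta):
            occupied[site] = not occupied[site]
            n_adsorbed += 1 if occupied[site] else -1
        if step >= burn_in:
            coverage_sum += n_adsorbed / n_sites
            n_samples += 1

    return coverage_sum / n_samples


def langmuir_coverage(beta, mu, eps):
    """Exact coverage for independent sites (Langmuir isotherm)."""
    return 1.0 / (1.0 + math.exp(-beta * (mu - eps)))


# Hypothetical parameters in reduced units.
simulated = gcmc_lattice_gas(n_sites=100, beta=1.0, mu=-1.0, eps=-1.5,
                             n_steps=200_000)
expected = langmuir_coverage(beta=1.0, mu=-1.0, eps=-1.5)
print(f"simulated coverage: {simulated:.3f}, Langmuir: {expected:.3f}")
```

A production zeolite simulation replaces the single site energy with guest-host and guest-guest interaction potentials and adds translation and rotation moves, but the insertion and deletion acceptance logic follows the same pattern.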

General Plan Mapping Project

I am currently a participant in UC Davis DataLab’s Hack for California Research Cluster, where I lead the General Plan Mapping Project. This project aims to make the hundreds of general plans prepared by California cities and counties easily searchable.

A major challenge of this project was finding a way to search the tens of thousands of document pages efficiently. To meet this requirement, I developed a custom word indexer in C++, which integrated with the initial mapping dashboard and achieved high performance. The website was then reprogrammed in Python, which allowed me to use Elasticsearch to improve the search capabilities significantly. The first version of the R Shiny website can be found here. The new Elasticsearch-based website can be found here. The GitHub repo for the Python version of this project can be found here.
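The idea behind a word indexer can be sketched with a small positional inverted index. This is an illustration in Python rather than the original C++ implementation, and the page identifiers, sample text, and function names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to {page_id: [positions where the word occurs]}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for position, word in enumerate(tokenize(text)):
            index[word].setdefault(page_id, []).append(position)
    return index


def search_phrase(index, phrase):
    """Return sorted page ids that contain the words of phrase consecutively."""
    words = tokenize(phrase)
    if not words or any(word not in index for word in words):
        return []
    # Only pages containing every word can contain the phrase.
    candidates = set(index[words[0]])
    for word in words[1:]:
        candidates &= set(index[word])
    hits = []
    for page_id in candidates:
        for start in index[words[0]][page_id]:
            # The i-th word must appear i positions after the first word.
            if all(start + i in index[word][page_id]
                   for i, word in enumerate(words)):
                hits.append(page_id)
                break
    return sorted(hits)


# Hypothetical pages standing in for converted general plan documents.
pages = {
    "city_a_p1": "The city shall expand affordable housing near transit.",
    "city_b_p7": "Housing policy: affordable units are encouraged.",
    "county_c_p3": "Protect open space and expand bike lanes.",
}
index = build_index(pages)
print(search_phrase(index, "affordable housing"))
print(search_phrase(index, "expand"))
```

Building the index once means each query touches only the pages that contain the query words instead of rescanning every document, which is the same principle Elasticsearch applies at larger scale.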

Skills

Programming Languages & Tools

Workflow

Pragmatically using both Object-Oriented Programming and Functional Programming Styles
Utilizing Git and GitHub for version control to collaborate on team projects
Agile Development & Scrum

Keywords

Data Science, Machine Learning, Agile Project Management, Industry Knowledge, Mathematics, Chemistry, Laboratory Skills, Chemical Engineering, Agile Methodologies, Strategic Planning, Project Management, Data Analysis, Requirements Gathering, Software Development, Quantitative Analysis, Big Data, Artificial Intelligence (AI), Cell Culture, Computer Vision, Python, R (Programming Language), SQL, Python (Programming Language), MATLAB, Git, Elasticsearch, C, C++, Cross-functional Teamwork, Communication, Presentations, Problem Solving, Leadership, ML/AI, Machine Learning Engineering, openCV, Data Pipelines, ML Models, CI/CD, Data Analytics, FinTech, NumPy, Pandas, Scikit-learn