Predictive Analytics and Data-Driven Decision-Making
For data science nerds like myself, the modeling step is often the most fun, in part because it represents the most tangible value-add deliverable in the data science life cycle. It is also the point at which some certainty regarding decision opportunities begins to emerge. Lastly, it is the stage at which all of the hard work of setting up pipelines, cleaning and transforming data, generating features, and performing exploratory analyses begins to pay off.
I tend to break modeling products into two categories. The first encompasses models designed to test specific hypotheses. The second comprises models designed to make unbiased predictions and/or forecasts. The primary distinction between the two is the level of effort devoted to testing the statistical appropriateness of the data and model. For hypothesis testing, it is critical that the model includes reliable and valid measures of the phenomena of interest, and that the statistical properties of those measures conform to the model's assumptions. For prediction and forecasting, there are times when statistical properties and assumptions are relevant, but by and large the focus is on developing a model that yields the most accurate and precise predictions.
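To make the distinction concrete, here is a minimal Python sketch of the two mindsets. Everything in it is an illustrative assumption rather than a description of my actual workflow: the data are simulated, the function names (permutation_test, fit_line, holdout_rmse) are hypothetical, and a permutation test and a one-variable least-squares fit simply stand in for whatever hypothesis test or predictive model a real project would call for. The first half asks whether a group difference is likely to be real; the second half ignores significance entirely and asks only how well the model predicts data it has not seen.

```python
import random
from statistics import fmean


def permutation_test(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = fmean(a) - fmean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle group labels and recompute the difference in means.
        rng.shuffle(pooled)
        diff = fmean(pooled[:len(a)]) - fmean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def holdout_rmse(x, y, train_frac=0.8):
    """Fit on the first part of the data, score on the held-out remainder."""
    cut = int(len(x) * train_frac)
    intercept, slope = fit_line(x[:cut], y[:cut])
    sq_errors = [(yi - (intercept + slope * xi)) ** 2
                 for xi, yi in zip(x[cut:], y[cut:])]
    return (sum(sq_errors) / len(sq_errors)) ** 0.5


rng = random.Random(42)

# Hypothesis-testing mindset: is the treated group really different?
control = [rng.gauss(10.0, 2.0) for _ in range(100)]  # simulated outcomes
treated = [rng.gauss(11.0, 2.0) for _ in range(100)]  # simulated outcomes
effect, p_value = permutation_test(treated, control)

# Prediction mindset: how accurate is the model on unseen data?
x = [rng.uniform(0.0, 10.0) for _ in range(200)]         # simulated feature
y = [3.0 + 1.5 * xi + rng.gauss(0.0, 1.0) for xi in x]   # simulated target
rmse = holdout_rmse(x, y)

print(f"estimated effect: {effect:.2f}, permutation p-value: {p_value:.4f}")
print(f"hold-out RMSE: {rmse:.2f}")
```

The deliverables differ accordingly: the first half yields an effect estimate and a p-value to interpret, while the second yields a single out-of-sample error figure to drive down.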
Research labs and businesses interested in analyzing experimental or quasi-experimental data (think marketing campaigns, product placement, etc.) tend to care most about the first modeling family. To date, my statistical work has been published in several of my discipline’s top peer-reviewed journals, including Clinical Psychology Review, Psychological Medicine, and Emotion. I have used a mix of frequentist and Bayesian approaches in this work and continue to generate manuscripts targeted for publication in these and other outlets.
As for the second set of models, my full-time work is as a Principal Data Scientist at Capital One Financial. Note that this means I cannot take on any clients working in the financial or financial technology sectors due to obvious conflicts of interest. Capital One is an industry leader in leveraging cloud computing and data storage to support its strategic goals, and, as a member of its valuations and infrastructure team, I apply data science and software engineering principles in my work daily. I am happy to apply these same skills in designing custom data science and modeling solutions for your group’s needs.