Interview with America First Credit Union’s Richard Woolston
Arize AI / America First Credit Union
Last month, we hosted an event on “Best Practices in ML Observability for Lending & Insurance” that featured a fireside chat with America First Credit Union’s Data Science Manager Richard Woolston.
In the wide-ranging interview, Woolston shared his thoughts on the use of AI in the financial services industry and how his team approaches model monitoring and ML observability. Here are some of the more pointed questions from that session, and Woolston’s takeaways.
Arize: America First Credit Union is one of the largest credit unions in the country. How do you help the organization continue to stand out in a competitive market?
Speed. Everyone these days wants a quick response. Models are the best way to generate lending decisions faster. Another way our team stands out is in identifying segments that we may not have targeted and finding ways to capture that group. So it’s speed and ensuring we’re analyzing the portfolio as a whole as effectively as possible.
Arize: What are your biggest challenges in terms of model development and keeping an eye on model performance in production?
It’s a couple of things. The first one is delayed actuals. The feedback time for amortizing loans, such as auto loans and RV loans, can be quite substantial because the label you care about most is the loan’s final outcome: did it pay off, did it write off, or is it still active?
It can take a couple of years depending on the product. One of our main challenges is identifying proxy metrics that arrive ahead of those outcomes, such as high levels of delinquencies or whether a member is paying down ahead of schedule. By working those into our monitoring framework, we don’t have to wait two to five years for the loan to pay off or write off to get a sense of portfolio health.
Another challenge: legacy systems. We have some actuals that people update manually in spreadsheets, so we work with the analysts to move those actuals into the database, where systems can access them in a much friendlier way.
Arize: Let’s talk about proxy metrics. Is drift one that you’re looking at actively and are there other proxy metrics that are top-of-mind?
As we deploy a model into production, drift is the first thing we look at as predictions start to come across over the hours or days to know that things are stable. If not, we go ahead and roll back and identify what pieces were missing.
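To make the idea of an early drift check concrete, here is a minimal sketch that compares a baseline sample of a feature or prediction against early production values using the population stability index (PSI). The interview does not say which drift statistic the team or Arize uses; PSI is swapped in here as one common choice, and the function name, bin count, and sample data are all illustrative assumptions.

```python
import math

def population_stability_index(baseline, production, n_bins=10, eps=1e-6):
    """PSI between two numeric samples, using equal-width bins over the baseline range.

    Hypothetical helper for illustration; not the method described in the interview.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # Floor each fraction at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    base = bin_fractions(baseline)
    prod = bin_fractions(production)
    return sum((p - b) * math.log(p / b) for b, p in zip(base, prod))

# Made-up data: a stable sample and one shifted upward by 30 units.
baseline_scores = list(range(100))
stable_scores = list(range(100))
shifted_scores = [v + 30 for v in baseline_scores]

psi_stable = population_stability_index(baseline_scores, stable_scores)
psi_shifted = population_stability_index(baseline_scores, shifted_scores)
```

A PSI near zero suggests the production distribution matches the baseline, while larger values suggest a shift worth investigating. Any alerting threshold is a judgment call that would need to be tuned per feature.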
As I mentioned earlier, delinquency is a huge proxy metric: did someone make their first payment, are they 45 days behind, 60 days behind, and those types of things. Eventually we get the final metrics, like whether they paid the loan off or whether it is still active, which can arrive substantially later in the life cycle depending on the product.
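As a rough illustration of turning delinquency into a proxy metric, the sketch below buckets loans by days past due and reports the share of the portfolio in each bucket. The 45- and 60-day cutoffs come from the answer above; the bucket names, function names, and input shape are assumptions made for the example.

```python
from collections import Counter

BUCKETS = ("current", "1-44", "45-59", "60+")

def delinquency_bucket(days_past_due):
    """Map days past due to a coarse delinquency bucket (hypothetical labels)."""
    if days_past_due >= 60:
        return "60+"
    if days_past_due >= 45:
        return "45-59"
    if days_past_due > 0:
        return "1-44"
    return "current"

def delinquency_rates(days_past_due_by_loan):
    """Return the fraction of loans in each bucket, given {loan_id: days_past_due}."""
    counts = Counter(delinquency_bucket(d) for d in days_past_due_by_loan.values())
    total = len(days_past_due_by_loan)
    return {bucket: counts[bucket] / total for bucket in BUCKETS}

# Made-up portfolio of four loans.
rates = delinquency_rates({"loan_a": 0, "loan_b": 10, "loan_c": 45, "loan_d": 75})
```

Tracking these shares over time gives an early read on portfolio health well before pay-off or write-off labels arrive.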
Arize: How are you accounting for fair lending regulations and bias assessment as it relates to race, gender, sex and other protected groups?
That’s a great question and it’s always at the forefront of our minds. Obviously we don’t use any of these attributes within our models, but the nice thing about Arize is that we can submit gender and age as features for the model even though we didn’t use them in the model itself. That allows us to double check the breakdowns, so we’re able to watch those groups very specifically, and even tag them with monitors.
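The kind of breakdown described here can be sketched in a few lines: log a protected attribute alongside each decision without ever feeding it to the model, then compare approval rates across groups. This is a minimal illustration, not Arize’s API; the record layout, the `age_band` field, and the function names are hypothetical.

```python
from collections import defaultdict

def approval_rate_by_group(records, group_key):
    """Approval rate per group for records shaped like {"approved": bool, group_key: str}."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record["approved"]:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}

def min_max_rate_ratio(rates):
    """Ratio of the lowest to the highest group approval rate (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0

# Made-up decisions; age_band is logged for monitoring only, never used as a model input.
decisions = (
    [{"age_band": "under_40", "approved": a} for a in (True, True, True, False)]
    + [{"age_band": "40_plus", "approved": a} for a in (True, True, False, False)]
)
rates = approval_rate_by_group(decisions, "age_band")
ratio = min_max_rate_ratio(rates)
```

A ratio well below 1.0 does not prove bias on its own, but it flags a segment to examine more closely.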
Arize: When monitoring your model’s health and performance, what are the specific metrics you’re looking at? Are there intermittent estimates of performance given delayed actual values?
The metrics depend on the domain, but specifically within lending we tend to focus on three key metrics: recall, precision, and mean absolute error (MAE). As loans come through our system, we withhold five percent of the responses. Those loans are then moved to manual underwriting, where an underwriter goes ahead and performs their natural operation and says this is approved or declined.
We monitor those specifically, using recall and precision – ensuring the model hasn’t drifted away from underwriting, whether we are capturing enough of those approved loans, and then whether we label it correctly.
Then, based upon those loans, we look at what we missed. For things like credit cards, we look at how our limit compares to underwriting’s limit, which is where MAE and mean squared error come into play. That’s our first line. Then, as we get into delinquency, we once again use precision and recall but don’t use MAE because there’s nothing really to measure against. With write-off, it’s just recall and precision.
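For readers less familiar with these metrics, the sketch below computes precision and recall for model approvals against underwriter decisions on a held-out sample, plus MAE between model and underwriter credit limits. It treats the underwriter’s decision as the reference label, which is an assumption for the example, and all data and function names are made up.

```python
def precision_recall(model_approved, underwriter_approved):
    """Precision and recall of model approvals, with underwriter decisions as reference."""
    pairs = list(zip(model_approved, underwriter_approved))
    true_pos = sum(1 for m, u in pairs if m and u)
    false_pos = sum(1 for m, u in pairs if m and not u)
    false_neg = sum(1 for m, u in pairs if not m and u)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

def mean_absolute_error(model_values, reference_values):
    """Average absolute gap between paired values, e.g. credit limits in dollars."""
    gaps = [abs(m - r) for m, r in zip(model_values, reference_values)]
    return sum(gaps) / len(gaps)

# Made-up holdout sample of five applications sent to manual underwriting.
model_decisions = [True, True, False, True, False]
underwriter_decisions = [True, False, True, True, False]
precision, recall = precision_recall(model_decisions, underwriter_decisions)

# Made-up credit limits for two loans that both sides approved.
limit_mae = mean_absolute_error([5000, 3000], [4000, 3500])
```

Here precision answers "of the loans the model approved, how many did underwriting also approve," and recall answers "of the loans underwriting approved, how many did the model capture."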
Arize: Once you’re alerted of an issue, how do you troubleshoot?
The workflow starts with an email from Arize that automatically gets turned into a Jira ticket that says something like “your credit score average is out of whack in some way.” Whoever is on call works that issue to determine whether we need a longer discussion or whether it is a short-term seasonality effect.
We also have great product teams, and we’ll have a back-and-forth with them where we say we’ve identified this problem and ask how we want to work through it. Here’s the impact on the model, and here are the actions: we can either retrain or we can say there is drift but it’s not a problem yet.
It’s really this interaction between the product owners and my team that is critical to any major troubleshooting because we don’t actually own any of the models – we act more as an internal consultancy at America First Credit Union. We build and deploy the models and then we work with the product owners, who are even more well-versed in the product. Drift issues often require that deeper understanding for the proper context.
Arize: How do you see the future of lending continue to be shaped by ML models?
I think that access to credit becomes easier, especially for groups that are traditionally excluded – and especially as we bring in more and more data. I also think regulation and understanding become easier over time, as the systems needed to support these ecosystems naturally become more self-documenting.

