zlacker

1. montan+(OP) 2022-10-20 02:48:26
I think what you're missing is that XGBoost is worthless without data to use for inference. That data can come from inside the process, or over the wire. One is fast, the other is not.
replies(1): >>theamk+h5
2. theamk+h5 2022-10-20 03:40:27
>>montan+(OP)
Well, imagine an nginx plugin that runs XGBoost. Or even a standalone Rust/C++ microservice which provides XGBoost via a standard HTTP interface. The data might come from the filesystem, or be loaded from a network location on startup/reload and kept in memory.

Basically, PostgreSQL is a stateful service, and stateful services are always a major pain to manage -- you need to back them up, migrate them, think about scaling... Sometimes they are inevitable, but that does not seem to be the case here.

If you have CI/CD set up and do frequent deploys, it will be much easier and more reproducible to include the models in the build artifact and have them loaded from the filesystem along with the rest of the code.
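
To make the idea concrete, here is a rough sketch of "model ships with the build, gets loaded from the filesystem, stays in memory". It is only an illustration under assumptions: the `ModelStore` class, the `model.json` file name, and the tiny linear model are all made up for this example, and the linear model stands in for XGBoost so the snippet runs with the standard library alone. With real XGBoost the `reload` step would presumably call `xgboost.Booster().load_model(path)` instead of `json.load`.

```python
import json
import os
import tempfile


class ModelStore:
    """Loads a model artifact from the filesystem and keeps it in memory."""

    def __init__(self, path):
        self.path = path
        self.model = None
        self.reload()

    def reload(self):
        # Re-read the artifact from disk; call this at startup and after a deploy.
        with open(self.path) as f:
            self.model = json.load(f)

    def predict(self, features):
        # Stand-in for real inference (NOT XGBoost): bias plus a dot product.
        weights = self.model["weights"]
        return self.model["bias"] + sum(w * x for w, x in zip(weights, features))


def write_demo_artifact(directory):
    # Hypothetical artifact, standing in for a file produced by the CI build.
    path = os.path.join(directory, "model.json")
    with open(path, "w") as f:
        json.dump({"weights": [0.5, 2.0], "bias": 1.0}, f)
    return path


if __name__ == "__main__":
    artifact = write_demo_artifact(tempfile.mkdtemp())
    store = ModelStore(artifact)      # loaded once, then served from memory
    print(store.predict([2.0, 3.0]))  # features still have to arrive from somewhere
```

Note that this only covers the model side. The feature vector passed to `predict` still has to be fetched by the caller, which is the part the parent comment is pointing at.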

replies(1): >>montan+n7
3. montan+n7 2022-10-20 04:08:39
>>theamk+h5
Stateful services are indeed more painful to manage than stateless ones. Ignoring state (data fetch time) for ML, as if the model artifact were the only important component, is... not a winning strategy.