- replace JSON (storing data as strings? really?) with a binary format like Protobuf, or better yet Parquet
- replace Redis with DuckDB for zero-copy reads
- replace Pandas with Polars for faster transformations
- use a modern, asynchronous web framework like FastAPI for the microservices
- tune XGBoost's CPU resource usage with semaphores
- I don't think doing one less `memcpy` will make Redis faster over the network.
- We didn't use Pandas during inference, only a Python list (sketched below). You'd have to get pretty creative to do less work than that.
- That will certainly use less CPU, but I don't think it'll be faster: we still have to wait on a network resource to serve a prediction, and then on the GIL to deserialize the response (also sketched below).
- Tuning XGBoost is fun, but I don't think that's where the bottleneck is.
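
To make the Pandas point concrete, here is a minimal sketch of what inference-time feature handling amounts to. The names are illustrative (the `model.json` path and the `predict_from_cache_payload` helper are hypothetical), but the shape is the point: the cached payload is decoded straight into a plain list and turned into a single NumPy row, and no DataFrame is ever built.

```python
import json

import numpy as np
import xgboost as xgb

model = xgb.Booster()
model.load_model("model.json")  # hypothetical path to the trained model


def predict_from_cache_payload(raw: bytes) -> float:
    features = json.loads(raw)                      # plain Python list, e.g. [0.3, 12.0, 1.0]
    row = np.asarray([features], dtype=np.float32)  # one 2-D row; no DataFrame is ever built
    return float(model.predict(xgb.DMatrix(row))[0])
```

With the feature vector already a flat list, there is no columnar transformation left for Polars to speed up.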
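And to make the framework point concrete, here is a hedged sketch of what the async endpoint would look like; the `features:{user_id}` key layout and the `score` function are stand-ins, not our real code. The two commented lines are where the time goes, and neither depends on the web framework.

```python
import json

from fastapi import FastAPI
from redis import asyncio as aioredis

app = FastAPI()
cache = aioredis.Redis(host="localhost", port=6379)


def score(features: list[float]) -> float:
    return sum(features)  # stand-in for the real model call


@app.get("/predict/{user_id}")
async def predict(user_id: str) -> dict:
    raw = await cache.get(f"features:{user_id}")  # latency: a full Redis round-trip, async or not
    features = json.loads(raw)                    # latency: GIL-bound deserialization, async or not
    return {"score": score(features)}
```

An async handler lets the process wait more cheaply, which helps throughput under concurrency, but the latency of any single prediction is still the Redis round-trip plus the GIL-bound `json.loads`.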