I’ve achieved a 40x reduction in table storage size with Postgres, placed top 5 in a compiler code-size optimization contest at school, love Lisp macros, and led an optimization effort that cut p50 GraphQL latency by 50% across the board.
Love to teach people about the benefits of clustered data in databases. I really want to try pg_repack; I figure it could cut costs for many transactional DBs by 30%, since clustered data uses less memory and compute for similar QPS, with only mildly higher storage needs.
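The intuition behind that claim can be shown with a toy sketch (not Postgres internals; the rows-per-page figure and table size are made up): a range scan over clustered rows touches a few contiguous pages, while the same scan over a scattered heap touches nearly every page in the table.

```python
import random

PAGE_SIZE = 100  # hypothetical rows per page, just for illustration

def pages_touched(row_positions, wanted_keys):
    # A row at physical position p lives on page p // PAGE_SIZE.
    return len({row_positions[k] // PAGE_SIZE for k in wanted_keys})

keys = list(range(10_000))

# Clustered: physical order matches key order.
clustered = {k: i for i, k in enumerate(keys)}

# Unclustered: rows scattered across the heap (e.g. after random inserts/updates).
random.seed(0)
shuffled = keys[:]
random.shuffle(shuffled)
unclustered = {k: i for i, k in enumerate(shuffled)}

# A range scan over 500 consecutive keys:
wanted = range(2_000, 2_500)
print(pages_touched(clustered, wanted))    # → 5 pages
print(pages_touched(unclustered, wanted))  # nearly every page in the table
```

Fewer distinct pages per query means a smaller working set in shared buffers and OS cache, which is where the memory savings would come from.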
Contract or full time is ok
Location: Waterloo ON
Remote: On-site or remote
Willing to relocate: no
Technologies: Postgres, MySQL
Résumé/CV: https://linkedin.com/in/srcreigh
Email: shaneecy at gmail.com
The code size optimization used the standard passes: dead code elimination, constant propagation and folding, and peephole ASM optimizations. I also found that the hidden test suite had a defect where some valid code could be eliminated and its associated tests wouldn’t catch it, though I forget the details. I informed them, but they let me keep the savings.
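As a rough flavor of those passes (a toy sketch in Python, not the contest code, which worked on real ASM): constant folding collapses subtrees whose operands are known, and simple algebraic identities act as dead-code elimination for this tiny expression IR.

```python
def fold(expr):
    # expr is an int literal, a variable name (str), or ("+"/"*", lhs, rhs).
    if isinstance(expr, (int, str)):
        return expr
    op, a, b = expr
    a, b = fold(a), fold(b)
    # Constant folding: both operands known at compile time.
    if isinstance(a, int) and isinstance(b, int):
        return a + b if op == "+" else a * b
    # Identities that eliminate dead subtrees (safe here since the toy IR
    # has no side effects):
    if op == "*" and 0 in (a, b):
        return 0
    if op == "+" and a == 0:
        return b
    if op == "+" and b == 0:
        return a
    if op == "*" and a == 1:
        return b
    if op == "*" and b == 1:
        return a
    return (op, a, b)

print(fold(("+", ("*", 2, 3), ("*", "x", 0))))  # → 6
```

The real passes worked on instruction sequences rather than expression trees, but the shape of the reasoning is the same.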
I know how to raise compile-time errors in Racket IDEs using macros (static code checks). I also wrote a macro that reads a DB table’s column names and generates variables from them, so a typo’d column name breaks the build when checked against the actual table schema.
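The real version was a Racket macro failing at expansion time; the same idea sketched in Python (hypothetical names, schema hardcoded instead of read from the live DB) fails at the point of the typo instead of silently producing a wrong query:

```python
class Columns:
    """Attribute namespace built from a table's actual column list."""

    def __init__(self, table, column_names):
        self._table = table
        for name in column_names:
            # One attribute per real column, e.g. users.email -> "users.email".
            setattr(self, name, f"{table}.{name}")

    def __getattr__(self, name):
        # Only reached for names that were never defined above, i.e. typos.
        raise AttributeError(f"table {self._table!r} has no column {name!r}")

# In the real macro the names came from the schema; hardcoded for the sketch.
users = Columns("users", ["id", "email", "created_at"])

print(users.email)   # → users.email
# users.emial        -> AttributeError right where the typo is
```

The Racket version is strictly better here since the check happens at compile time rather than first use, but the failure mode it prevents is the same.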
The GraphQL optimization came from noticing suspiciously high serialization costs in a Python Apollo GraphQL server: something like 70% of request-handling time was spent in serialization. Someone else figured out the change that helped; it took some prodding to keep them looking for the right fix, since 70% of time in serialization seemed absolutely nuts.
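The kind of measurement that surfaces a number like that is a straightforward cProfile run over a request handler. A minimal sketch with stand-in functions (the real handler and resolver names are not from the original):

```python
import cProfile
import io
import pstats

def serialize(rows):
    # Stand-in for the expensive response-serialization step.
    return [{"id": r, "name": f"user-{r}"} for r in rows]

def handle_request():
    rows = range(50_000)  # pretend this came from the resolver layer
    return serialize(rows)

profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())  # serialize() dominating cumulative time is the red flag
```

When one support function like this owns most of the cumulative time, that is the cue to keep digging rather than accept it as the cost of doing business.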