The website is meant for people who look for jobs in the HackerNews "Who's Hiring" threads and want to focus on fresh ads and companies, or to quickly look up the ad history of any company.
Github repository: https://github.com/nemanjam/hn-new-jobs
Demo website: https://hackernews-new-jobs.arm1.nemanjamitic.com
I used the Algolia API as a data source, along with a scheduled task that parses new threads a few times at the beginning of each month. The extracted data is then stored in a SQLite database for fast querying, and the results are cached with Keyv for faster page responses. I will see in the future how much traffic the website receives and whether this stack is performant enough. For the website I used a Next.js app with default ShadcnUI components and charts. I just wanted a quick functional prototype to test how much public interest there is in an app with functionality like this.
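To illustrate the "fresh companies" idea on top of SQLite, here is a minimal sketch in Python. It is not the project's actual schema or code (the real app is a Next.js project; see the repository for that). The table name `company`, its columns, and the helper names `create_schema` and `new_companies` are all assumptions made up for this example.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one row per (company, month) pair.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS company (
            name       TEXT NOT NULL,
            month      TEXT NOT NULL,      -- 'YYYY-MM', sorts lexicographically
            comment_id INTEGER NOT NULL,
            PRIMARY KEY (name, month)
        )
        """
    )


def new_companies(conn, month):
    """Return names of companies whose earliest stored ad is in `month`."""
    rows = conn.execute(
        """
        SELECT name
        FROM company
        GROUP BY name
        HAVING MIN(month) = ?
        ORDER BY name
        """,
        (month,),
    ).fetchall()
    return [row[0] for row in rows]
```

With a few placeholder rows, say "Acme" in 2024-07 and 2024-08 and "Globex" only in 2024-08, `new_companies(conn, "2024-08")` would return only "Globex", since "Acme" had already posted a month earlier.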
If you are interested in more implementation details you can find them in the Readme file on Github.
The project is free and open source. Feel free to use, self-host, fork and modify, and contribute. I would love to hear your impressions and suggestions and look forward to discussing features and technical details.
It gets you 80% of the way there on any HN data project.
I don't understand what is missing in 2024-08. You can link to a month by its slug, and I see nothing unusual here:
https://hackernews-new-jobs.arm1.nemanjamitic.com/2024-08
But as a side note, yes, this is not meant as an exact analytic tool, rather just a best effort website that gives some interesting insights.
To clarify, only the "Who's hiring" thread is parsed; you can see it clearly in this constants file:
https://github.com/nemanjam/hn-new-jobs/blob/main/constants/...
Also, in there you can see how simple the parsing regex is: it just looks for the "|" separator in the comment title.
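For readers who want a feel for that kind of parsing, here is a rough sketch in Python. It is not the regex from the constants file, just an illustration of the same idea under the assumption that the comment is plain text and the company name is whatever precedes the first "|" on the first line. The function name `parse_company` is hypothetical.

```python
import re

# Split on "|" with optional surrounding whitespace.
SEPARATOR = re.compile(r"\s*\|\s*")


def parse_company(comment_text):
    """Return the text before the first '|' on the first line, or None."""
    stripped = comment_text.strip()
    if not stripped:
        return None
    first_line = stripped.splitlines()[0]
    if "|" not in first_line:
        # No separator: this is probably a reply, not a job ad.
        return None
    company = SEPARATOR.split(first_line, maxsplit=1)[0].strip()
    return company or None
```

A comment that starts with `ChartMogul | Remote (EU) | Full-time` yields `ChartMogul`, while a comment without any "|" in its first line is skipped.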
Another thing I noticed: some companies used different letter casing for their name in some comments, and the company name is part of the primary key, so the same company is treated as different companies. I should probably handle this better.
https://github.com/nemanjam/hn-new-jobs/blob/main/modules/da...
For example you can search for "ConsenSys" on the Search page:
https://hackernews-new-jobs.arm1.nemanjamitic.com/search?com...
https://github.com/nemanjam/hn-new-jobs/blob/main/constants/...
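One possible way to handle the casing issue is to derive a case-insensitive key from the company name and use that key, rather than the raw name, for grouping or as part of the primary key. The sketch below is in Python and is only an assumption about how this could be done, not what the repository does. `company_key` and `group_by_company` are hypothetical helpers.

```python
import unicodedata


def company_key(name):
    """Build a case-insensitive key: Unicode-normalize, squeeze spaces, casefold."""
    normalized = unicodedata.normalize("NFKC", name)
    return " ".join(normalized.split()).casefold()


def group_by_company(names):
    """Group raw names by their normalized key, keeping the original spellings."""
    groups = {}
    for name in names:
        groups.setdefault(company_key(name), []).append(name)
    return groups
```

With this, "ConsenSys" and "Consensys" map to the same key, while the original spelling can still be stored separately for display.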
If people show constant interest I can iterate on it further, enhance it, add features, etc.
There's also Indeed postings data, which unfortunately only goes back to 2020 but is similarly bleak: https://fred.stlouisfed.org/series/IHLIDXUSTPSOFTDEVE
If it's there, my website has it too. But Algolia does a pretty good job parsing HackerNews; I am pretty confident 99.9% of comments are included.
1: https://irc.bloombergtax.com/public/uscode/doc/irc/section_1...
I was also going to say $(git commit -a) is evil based on <https://github.com/nemanjam/hn-new-jobs/blob/main/data/datab...> but it seems that you just want an always changing binary blob to make your git repo grow without bound :-( https://github.com/nemanjam/hn-new-jobs/blob/main/.gitignore...
I guess this is because they changed their ad’s headline from:
ChartMogul | Remote (EU) | Full-time
to:
ChartMogul (https://chartmogul.com )| Remote | Full-time
Other companies are similarly affected, e.g. Medusa:
- listed: https://hackernews-new-jobs.arm1.nemanjamitic.com/search?com...
- earliest ad listed: >>34222858
- more recent ad on hn: >>42315828
Perhaps it’s the “)|” bit causing problems with some regexps.
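If that guess is right, one way to make the parsing more tolerant would be to take everything before the first "|" and then drop a trailing parenthesized URL. The Python sketch below is only an illustration of that idea, not the site's actual regexp. `parse_company_tolerant` is a hypothetical name, and the assumption is that the headline is plain text.

```python
import re

# A parenthesized URL at the end of the company part, e.g. " (https://example.com )".
TRAILING_URL = re.compile(r"\s*\(\s*https?://[^)]*\)\s*$")


def parse_company_tolerant(headline):
    """Return the company name before the first '|', without a trailing '(url)'."""
    if "|" not in headline:
        return None
    head = headline.split("|", 1)[0]
    company = TRAILING_URL.sub("", head).strip()
    return company or None
```

Both ChartMogul headlines quoted above would then resolve to the same name, `ChartMogul`, so the ad history would not be split in two.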
"Note: these changes were signed into law in 2017 but came into effect in 2022"
Like I thought, changes were meant to be a time bomb if the GOP wasn't in control.