Using proprietary software can be pricey at best, and is often frustrating to those who are used to being able to look under the hood and customize their code as desired. Here is a quick overview of open source software for CV parsing.
What is open source?
Open source software has code that is completely accessible for modification and sharing. It is often created by open collaboration between developers, and is available under a free open source software license for both commercial and non-commercial use.
Open Source Resume Software Parser Options
GitHub is a software development platform that allows users to create, maintain and store their code all in one place. There are a few resume parsers currently being developed on GitHub that are open source.
One such parser on GitHub takes a rule-based approach to understanding the semantics of a resume. The code is ingenious, and the developer has obviously worked hard on the project. But natural language processing is a complex field, and parsing resumes accurately is no easy task. Like the other open source parsers available on GitHub, this one is still in development and doesn’t have its kinks ironed out yet.
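To make "rule-based" concrete, here is a minimal sketch in Python. It is not the code of any particular GitHub project; the patterns, the heading list and the `parse_resume` function are all assumptions made up for illustration. It pulls an email address, a phone number and a few sections out of a plain-text resume using hand-written rules.

```python
import re

# Hypothetical rules for illustration only; real resumes need far more than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d ().-]{7,}\d")
SECTION_HEADINGS = {"education", "experience", "skills"}


def parse_resume(text):
    """Apply simple hand-written rules to a plain-text resume."""
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)

    # Rule: a line consisting only of a known heading starts a new section.
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        key = stripped.lower().rstrip(":")
        if key in SECTION_HEADINGS:
            current = key
            sections[current] = []
        elif current and stripped:
            sections[current].append(stripped)

    return {
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
        "sections": sections,
    }


sample = """Jane Doe
jane.doe@example.com
+44 20 7946 0958

Experience
Software Engineer, Example Ltd, 2019-2023

Education:
BSc Computer Science, Example University, 2018
"""

result = parse_resume(sample)
print(result)
```

Even on this tidy sample the rules never find the candidate’s name, and a heading such as "Work History" or a two-column layout would break the section rule entirely. That gap is where the real effort goes.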
Free software is great and the open source initiative is worth supporting, but let's face it: every software developer needs to eat, and the teams working on an open source solution for resume parsing cannot contribute the huge amounts of time and resources that accurate CV parsing requires.
The sad reality is that there are no good open source resume parser options available for those with a genuine commercial need. The resources required to create a good resume parser make a high quality open source project almost impossible to set up, manage and iterate. Recruitment offices, hiring managers, and companies who deal with large numbers of applicants find themselves hard pressed to find reasonable resume parser options.
Friday Media Group, a group of companies headquartered in the UK, found itself in this exact position before finding Affinda. The group was unhappy with its previous CV parsing provider and decided to investigate how difficult it would be to create its own. It discovered the same thing that you will if you attempt to go down this path: resume parsing is a fiendishly difficult problem to solve.
No CV looks quite like another, and the number of edge cases is mind-boggling. This is why various commercial providers exist. Most, if not all, of the biggest job boards and applicant tracking systems in the world use external providers to supply their parsing technology. (They won’t say so on their websites, but they do.)
If you try to build your own parser, the likely outcome, like that of dozens before you, is that you will give up after a while and come to a commercial software provider like us or one of our competitors. If you do succeed, you’ve either got deep pockets and extensive resources, or you have a mind worthy of Einstein, and we’d love for you to join our team.
Alternatives to Open Source Software – The Best Resume Parser APIs and Alternatives
Affinda, Sovren, Rchilli and Daxtra are four well-known closed source alternatives to open source.
All of them offer a resume parser API that can be integrated with an applicant tracking system. They also provide software development kits in various languages to reduce integration headaches.
In many competitive tests, Affinda has been chosen as the best option; in others, a competitor has come out ahead. None of the available commercial parsers wins 100% of its competitive tests, which is why we recommend that you test a few systems to discover which is best for your needs.
Why is there no option that is definitively ‘the best’? Because the answer depends on your needs and the trade-offs you’re willing to make, not just in speed, accuracy and affordability, but also in the specific nuances of your situation. For example, we work with 56 languages, and our parser does a better job in some languages than others.
What ‘good’ looks like tends to change a lot from client to client, based on their priorities. To give you another example, Affinda frequently wins competitive tests where analysing the education and experience of the candidate is the most important criterion for the client.
We also win most of our competitive tests where the client has very specific requirements, because we are able to customise our output to reflect the client’s needs.
About Affinda’s Resume Parser
Affinda has created an incredibly accurate and affordable resume parser used by job boards, recruiters, HR teams, and more. Our technology measurably improves hiring outcomes using deep learning models.
We use the latest AI technology, specifically adapted to CVs, to scan each document and detect the most relevant fields. Because our models understand the language used on the page, we can identify and segment information into useful fields, ensuring that no crucial details get left behind.
What’s the Best Option?
We believe Affinda offers the best option overall. While it’s not an open source product, it combines the best of deep learning from open source language models with our proprietary technology, which was developed by our founder, Ben Toner, a well-known physicist who also happens to be an expert in AI.
We have also trained all our models on thousands of resumes to create parsing software specifically trained to read CVs.
But Let’s be Honest for a Moment
Let’s be real: at Affinda, we don’t win every contract we pitch for. Not yet, anyway!
Resume parsing is a complex space, and while we are often the preferred option, there are cases in which a competitor has a commercial software product that more closely matches a purchaser’s needs.
That’s why you should never take anyone’s word for it – ours, or anyone else’s!
You should always make a shortlist and then conduct a thorough test of your top 3 or 4 options.
Taking the time to choose the right CV Parser API is wise, as once an integration is complete, a relationship can last for 5, 10 or more years.
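If you want to put numbers on a shortlist test like the one recommended above, one simple way is to hand-label a small set of resumes and measure how many fields each parser gets right. The sketch below assumes you have already collected each vendor’s output as plain Python dictionaries. The parser names, the field names and the exact-match scoring rule are all hypothetical, and no vendor’s API is called.

```python
def field_accuracy(predicted, expected):
    """Fraction of expected fields reproduced exactly (ignoring case and padding)."""
    if not expected:
        return 0.0
    hits = sum(
        1
        for field, value in expected.items()
        if str(predicted.get(field, "")).strip().lower() == str(value).strip().lower()
    )
    return hits / len(expected)


def rank_parsers(outputs, ground_truth):
    """Average field accuracy per parser over all test resumes, best first."""
    scores = {}
    for name, parsed_resumes in outputs.items():
        per_resume = [field_accuracy(p, e) for p, e in zip(parsed_resumes, ground_truth)]
        scores[name] = sum(per_resume) / len(per_resume) if per_resume else 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hand-labelled answers for two test resumes (made-up data).
ground_truth = [
    {"name": "Jane Doe", "email": "jane.doe@example.com", "degree": "BSc Computer Science"},
    {"name": "John Smith", "email": "john.smith@example.com", "degree": "MSc Physics"},
]

# Output collected from two hypothetical parsers for the same two resumes.
outputs = {
    "parser_a": [
        {"name": "Jane Doe", "email": "jane.doe@example.com", "degree": "BSc"},
        {"name": "John Smith", "email": "john.smith@example.com", "degree": "MSc Physics"},
    ],
    "parser_b": [
        {"name": "jane doe", "email": "jane.doe@example.com", "degree": "BSc Computer Science"},
        {"name": "J. Smith", "degree": "MSc Physics"},
    ],
}

ranking = rank_parsers(outputs, ground_truth)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Exact matching is deliberately strict. In a real test you would probably weight the fields that matter most to you, such as education and experience, and use a sample of your own documents in the languages you actually receive.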